Processing Math: Done
No jsMath TeX fonts found -- using unicode fonts instead.
This may be slow and might not print well.
Use the jsMath control panel to get additional information.
jsMath Control PanelHide this Message


jsMath
Differences between revisions 7 and 20 (spanning 13 versions)
Revision 7 as of 2008-06-13 04:11:12
Size: 1301
Comment:
Revision 20 as of 2008-11-14 13:41:56
Size: 4048
Editor: anonymous
Comment: converted to 1.6 markup
Deletions are marked like this. Additions are marked like this.
Line 3: Line 3:
  * Michael Abshoff
  * Martin Albrecht
  * Nick Alexander
  * Gregory Bard
  * Michael Abshoff -> LinBox debianization
  * MartinAlbrecht -> M4RI GF(2)
  * Nick Alexander -> cylcotomic linalg
  * Gregory Bard -> M4RI
Line 9: Line 9:
  * Craig Citro
  * Burcin Erocal
  * Robert Miller (especially sparse GF(2))
  * David Roe (p-adic linear algebra?)
  * Arne Storjohann
  * William Stein
  * Ralf-Philipp Weinmann
  * Craig Citro -> cyclotomic linalg, p-adic linalg
  * Burcin Erocal -> nullspace of polynomial matrices
  * Robert Miller (especially sparse GF(2)) -> sparse GF(2)
  * David Roe (p-adic linear algebra?) -> cyclotomic linear algebra
  * Arne Storjohann -> HNF, Sparse GF(2) Linalg
  * William Stein -> Cyclotomic linear algebra, HNF (LLL)
  * Ralf-Philipp Weinmann -> Sparse Elimination, Nullspace
Line 17: Line 17:
== GF(2) ==
 * implement LQUP decomposition [Clement, Martin]
   * implement LQUP routine [Clement]
   * implement TRSM routine [Clement]
   * implement efficient column swaps/rotations [Martin]
     * SSE2 might help a lot here
   * implement memory efficient mzd_addmul_strassen [Martin]
== Dense GF(2) ==
 * implement LQUP decomposition [ClementPernet, MartinAlbrecht]
   * --(implement LQUP routine)-- [Clement]
   * implement {{{_mzd_addmul_weird}}} [Clement]
   * implement LQUP basecase routine based on M4RI {{{mzd_lqup_m4ri}}} [MartinAlbrecht]
     * dont' zero out in _mzd_gauss_submatrix (!)
     * don't zero out below
     * remember rowswaps => P
     * remember Q[r] = c (translate to Lapack style later :-) )
   * --(implement TRSM routine)-- [Clement]
   * implement inplace triangular matrix inversion {{{mzd_trtri}}} [Clement]
   * implement triangular triangular matrix multiplication {{{mzd_trtrm}}} [Clement]
   * --(implement column swaps)-- [MartinAlbrecht]
   * --(implement column rotations)-- [MartinAlbrecht]
   * --(implement memory efficient mzd_addmul_strassen)-- [Martin]
Line 25: Line 33:
 * implement Arne's asymptotically fast elimination algorithm [Martin]  * --(implement Arne's asymptotically fast elimination algorithm)-- [MartinAlbrecht]
   * It was agreed that this is not what we want for M4RI, a toy implementation is available at: http://www.informatik.uni-bremen.de/cgi-bin/cgiwrap/malb/blosxom.pl/2008/06/13#gauss-jordan-storjohann
Line 30: Line 39:
   * try 7 instead of 8 Gray code tables to leave room for the actual matrix in L1
   * try to fit three matrices rather than two into L2 or understand why it works so good for two
   * detect L1/L2 cache sizes at runtime and choose optimal parameters for them
   * implement Bill's half table idea and benchmark it
   * --(try 7 instead of 8 Gray code tables to leave room for the actual matrix in L1)-- [MartinAlbrecht]
     * It seems slower to be slower to use 7 tables rather than 8
   * try say 16 tables instead of 8 on the Core2Duo [MartinAlbrecht]
   * try Bill Hart's idea for L1 cache efficiency on the Core2Duo [MartinAlbrecht]
     * The idea is: construct 8 gray code tables but use one once it is in L1 a lot to reduce before using the second
   * --(try to fit three matrices rather than two into L2 or understand why it works so good for two)-- [MartinAlbrecht]
     * it works since once we've written the date we can go on computing and let a cache handle the rest
     * a cache miss for reading on the other hand really prevents us from computing
   * --(detect L1/L2 cache sizes at runtime and choose optimal parameters for them)-- [MartinAlbrecht]
   * Implement memory management s.t. the sys time on Opterons decreases.
   * implement Bill's [[http://groups.google.com/group/sage-devel/msg/6279228095b3d9f7|half table idea]] and benchmark it

== Sparse GF(2) (and other small finite fields) ==
 [Arne, Ralph, Clément, Rob Miller]
 * Sparse Reduced Echelon form (RPW)
   * Sparse Elimination:
     * improve LinBox gauss-domain
     * eclib sparse elimination
     * ....
 * ....


== Hermite Normal Form ==
 [Arne, Clément, William]
 * new algorithm, based on system solving (Arne Storjohann) -> already an implementation
   * integrate it in IML
   * benchmark it against Sage
 * generalization: block vector system solving
   * implementation
   * benchmark
 * LLL trick for the existing implementation in Sage (William & Clément)

== Polynomial Matrix computations ==
 [Burcin]
 * Nullspace computation (Burcin)

== Cyclotomic linear algebra ==
 [Rob Bradshaw, Craig, William]

== P-adic linear algebra ==
 [David Roe, Craig, David Cohet]

== LinBox debianization and 1.1.6 spkgization ==
 [mabshoff, Clément]
 * get rid of gmp++
 * generate and test linbox-1.1.6rc0.spkg

Dev Days 1: Exact Linear Algebra

  • Michael Abshoff -> LinBox debianization

  • MartinAlbrecht -> M4RI GF(2)

  • Nick Alexander -> cylcotomic linalg

  • Gregory Bard -> M4RI

  • Rob Beezer
  • Tom Boothby
  • Craig Citro -> cyclotomic linalg, p-adic linalg

  • Burcin Erocal -> nullspace of polynomial matrices

  • Robert Miller (especially sparse GF(2)) -> sparse GF(2)

  • David Roe (p-adic linear algebra?) -> cyclotomic linear algebra

  • Arne Storjohann -> HNF, Sparse GF(2) Linalg

  • William Stein -> Cyclotomic linear algebra, HNF (LLL)

  • Ralf-Philipp Weinmann -> Sparse Elimination, Nullspace

Dense GF(2)

  • implement LQUP decomposition [ClementPernet, MartinAlbrecht]

    • implement LQUP routine [Clement]

    • implement _mzd_addmul_weird [Clement]

    • implement LQUP basecase routine based on M4RI mzd_lqup_m4ri [MartinAlbrecht]

      • dont' zero out in _mzd_gauss_submatrix (!)

      • don't zero out below
      • remember rowswaps => P

      • remember Q[r] = c (translate to Lapack style later :-) )

    • implement TRSM routine [Clement]

    • implement inplace triangular matrix inversion mzd_trtri [Clement]

    • implement triangular triangular matrix multiplication mzd_trtrm [Clement]

    • implement column swaps [MartinAlbrecht]

    • implement column rotations [MartinAlbrecht]

    • implement memory efficient mzd_addmul_strassen [Martin]

      • See Clement's et al. paper on memory efficient Strassen-Winograd
  • implement Arne's asymptotically fast elimination algorithm [MartinAlbrecht]

  • implement multi-core multiplication with optimal speed-up
    • OpenMP seems to be nice and easy
    • 2 cores probably main target, but think about 4 cores too
  • improve efficiency of M4RM
    • try 7 instead of 8 Gray code tables to leave room for the actual matrix in L1 [MartinAlbrecht]

      • It seems slower to be slower to use 7 tables rather than 8
    • try say 16 tables instead of 8 on the Core2Duo [MartinAlbrecht]

    • try Bill Hart's idea for L1 cache efficiency on the Core2Duo [MartinAlbrecht]

      • The idea is: construct 8 gray code tables but use one once it is in L1 a lot to reduce before using the second
    • try to fit three matrices rather than two into L2 or understand why it works so good for two [MartinAlbrecht]

      • it works since once we've written the date we can go on computing and let a cache handle the rest
      • a cache miss for reading on the other hand really prevents us from computing
    • detect L1/L2 cache sizes at runtime and choose optimal parameters for them [MartinAlbrecht]

    • Implement memory management s.t. the sys time on Opterons decreases.
    • implement Bill's half table idea and benchmark it

Sparse GF(2) (and other small finite fields)

  • [Arne, Ralph, Clément, Rob Miller]
  • Sparse Reduced Echelon form (RPW)
    • Sparse Elimination:
      • improve LinBox gauss-domain

      • eclib sparse elimination
      • ....
  • ....

Hermite Normal Form

  • [Arne, Clément, William]
  • new algorithm, based on system solving (Arne Storjohann) -> already an implementation

    • integrate it in IML
    • benchmark it against Sage
  • generalization: block vector system solving
    • implementation
    • benchmark
  • LLL trick for the existing implementation in Sage (William & Clément)

Polynomial Matrix computations

  • [Burcin]
  • Nullspace computation (Burcin)

Cyclotomic linear algebra

  • [Rob Bradshaw, Craig, William]

P-adic linear algebra

  • [David Roe, Craig, David Cohet]

LinBox debianization and 1.1.6 spkgization

  • [mabshoff, Clément]
  • get rid of gmp++
  • generate and test linbox-1.1.6rc0.spkg

dev1/linalg (last edited 2008-11-14 13:41:56 by anonymous)