Differences between revisions 2 and 8 (spanning 6 versions)

Dev Days 1: Exact Linear Algebra

Michael Abshoff
Martin Albrecht
Nick Alexander
Gregory Bard
Rob Beezer
Tom Boothby
Craig Citro
Burcin Erocal
Robert Miller (especially sparse GF(2))
David Roe (p-adic linear algebra?)
Arne Storjohann
William Stein
Ralf-Philipp Weinmann

Dense GF(2)

implement LQUP decomposition [Clement, Martin]
- implement LQUP routine [Clement]
- implement TRSM routine [Clement]
- implement efficient column swaps/rotations [Martin]
  - SSE2 might help a lot here
- implement memory efficient mzd_addmul_strassen [Martin]
  - See Clement's et al. paper on memory efficient Strassen-Winograd
implement Arne's asymptotically fast elimination algorithm [Martin]
implement multi-core multiplication with optimal speed-up
- OpenMP seems to be nice and easy
- 2 cores probably main target, but think about 4 cores too
improve efficiency of M4RM
- try 7 instead of 8 Gray code tables to leave room for the actual matrix in L1
- try to fit three matrices rather than two into L2 or understand why it works so good for two
- detect L1/L2 cache sizes at runtime and choose optimal parameters for them
- implement Bill's half table idea and benchmark it

Sparse GF(2) (and other small finite fields)

Sparse Reduced Echelon form (RPW)
- Sparse Elimination:
  - improve LinBox gauss-domain
  - eclib sparse elimination
  - ....
....

Hermite Normal Form

new algorithm (Storjohann)

-  ⇤ ← Revision 2 as of 2008-05-20 16:12:09 → 
  Size: 648
  Editor: MartinAlbrecht
  Comment:
+   ← Revision 8 as of 2008-06-13 23:05:54 → ⇥
  Size: 1577
  Editor: Clément Pernet
  Comment:
-Deletions are marked like this.
+Additions are marked like this.
 Line 11:
+  * Robert Miller (especially sparse GF(2))
-Line 16:
+Line 17:
-== GF(2) ==
 * Implement Strassen (variant) for dense matrix reduction. The result should be faster than anything else available. Greg has two strategies in mind. We'll implement both and compare their speed.
 * Improve/implement multi-core matrix multiplication and if feasible multi-core reduction
 * If there is time: sparse matrix techniques
+== Dense GF(2) ==
 * implement LQUP decomposition [Clement, Martin]
   * implement LQUP routine [Clement]
   * implement TRSM routine [Clement]
   * implement efficient column swaps/rotations [Martin]
     * SSE2 might help a lot here
   * implement memory efficient mzd_addmul_strassen [Martin]
     * See Clement's et al. paper on memory efficient Strassen-Winograd
 * implement Arne's asymptotically fast elimination algorithm [Martin]
 * implement multi-core multiplication with optimal speed-up
   * OpenMP seems to be nice and easy
   * 2 cores probably main target, but think about 4 cores too
 * improve efficiency of M4RM
   * try 7 instead of 8 Gray code tables to leave room for the actual matrix in L1
   * try to fit three matrices rather than two into L2 or understand why it works so good for two
   * detect L1/L2 cache sizes at runtime and choose optimal parameters for them
   * implement Bill's half table idea and benchmark it

== Sparse GF(2) (and other small finite fields) ==
 * Sparse Reduced Echelon form (RPW)
   * Sparse Elimination: 
     * improve LinBox gauss-domain
     * eclib sparse elimination
     * ....
 * ....

== Hermite Normal Form ==
 * new algorithm (Storjohann)

Diff for "dev1/linalg"

Dev Days 1: Exact Linear Algebra

Dense GF(2)

Sparse GF(2) (and other small finite fields)

Hermite Normal Form