(Preliminary) SEP 2: Optimizing the SAGE library using Parallel Techniques

AUTHOR: William Stein

COPYRIGHT: GNU Free Documentation License, 2007.

Basic Principles

  1. Parallel methods should be viewed as a means to an end -- speedups. Never parallelize any computation except to speed up a calculation beyond what can be done using sequential techniques.
  2. Parallel methods should never completely replace sequential implementations. Parallel algorithms are often very complicated to understand and test, so we need to at a minimum have a randomized test function that compares with the output of purely sequential code.

  3. Do not write extremely complicated parallel code that nobody can understand or maintain. Because SAGE is an open source system that is widely developed, it is crucial that it be readable.

  4. It is *crucial* that implementation of parallel methods in SAGE have the following properties:
    • It can be done incrementally. One must be able to start with almost any specific operation or algorithm in SAGE and make a parallel version without having to drastically change code all over SAGE. Any proposed solutions that violate this fails our needs.
    • It doesn't depend on any libraries or tools that are not open source and free, and all dependencies must work on the SAGE target platforms: Linux, OS X, Windows, (and soon Solaris).
    • All dependencies for parallel algorithms must be included standard with SAGE.

Architecture

We propose that parallel optimizations for SAGE are carried out (in parallel!) using three complementary approaches: fine (multithreaded), medium (mpi), and coarse (dsage task farming).

COMMENTS:

  1. Collapse the two bottom levels (i.e., thread and message passing)!
  2. Avoid threads if at all possible!
    • But sometimes use them.
    • Parallel databases?!
    • Scheduling of threads is very hard.
    • Scaling not the same for threads... (something is possible but...)
  3. MPI -- harder to write, but maybe less subtle.
  4. Not at all clear how to manage how parallel each component should be.
  5. Be realistic. Don't try to solve the general problem...
  6. If you want up and running soon (I do!); the only choice is MPI. Duh.
  7. Definitely want support library to make it easy to use MPI by programmers.
  8. SAGE -- target audience = working mathematician researcher; quick prototyping; not an industrial applications...
  9. Resources -- evil malloc; pthreads can very quickly give your application level info. Need some sort of "how much 'parallel power' is available". Be single threaded....
  1. TWO LEVELS: i.e., collapse the top two.
    • coarse -- proudly parallel
    • fine -- maybe use gasnet.
  2. THREE LEVELS OF USERS:
    1. task farmer
    2. i do parallel programming, e.g., using SAGE constructs.
    3. i use a function that happens to be parallel (and maybe don't know)
  3. MPI:
    • you might think it's static from tutorials...
    • but it's dynamic (!) MPI-2.* only though.

    • parallel programming is hard.
  4. pthreads -- programming model is "simpler", since you can ask for bazillions of threads... and get a speedup.
  5. java is nicely parallel already, so why don't you just write everything in java!! because: (1) maintenance issues and frustration, (2)
  6. cilk
  7. what will help the target (research mathematicians) the most with their research?? (task farming.) Enable people to do what they would otherwise not do.
  8. distinguish: ideal versus reality -- KISS.
  9. Task farming is first thing people are interested in. Step 1. BUT, then I will want to *parallelize my algorithms* transparently. Optimize iteratively.
  10. Loosing the intermediate level:
    • - low level: - Python people LOOSE something in between - coarse grain:
  11. moving around complicated mathematical objects is interesting and subtle.

1. Fine level -- shared memory (mostly multicore desktop/laptop)

Proposed tool: the standard POSIX thread library pthread

Justification:

Design issues:

Problems:

2. Medium level -- homogeneous trusted cluster

Proposed tool: ipython1 with MPI under the hood.

Justification:

Example problems:

3. Coarse level -- heterogenous task farm (both trusted and untrusted)

Proposed tool: dsage

Justification:

Example problems: