Revision 1 as of 2009-10-12 11:10:26
Guidelines for Using the Sage Cluster
This document sets out guidelines for using the Sage cluster. The Sage cluster consists of four similar computers:
boxen.math --- mainly for virtual machines and web services
geom.math --- mainly for geometry research
mod.math --- mainly for number theory research
sage.math --- mainly for Sage development
The machine sage.math is primarily for Sage development. Ideally, you should use that machine to develop code, upgrade or update packages, port packages and code, and review or work on tickets. If you have a long job to run on the Sage cluster, first consider whether your job is related to any of these goals.
Some questions relating to using any of the above machines include:
If the job would take days, weeks, or longer, does it relate to number-theoretic computation? If so, then mod.math is the machine to use: its stated purpose is number theory research, which includes number-theoretic computation.
Does your job relate to geometry computation? If so, then geom.math is the machine to use, since that is its intended purpose.
Most of the time, you shouldn't run long jobs on boxen.math because that machine is for web services. We want to minimize the downtime of the public notebook server, the Sage wiki server, the trac bug server, the Sage main website, and websites of other projects hosted on boxen.math. Please first consider using geom.math or mod.math before running long jobs on sage.math.
The machines mod.math and geom.math can be used for running very long jobs. Jobs on either machine are less likely to be disrupted, because release managers don't usually compile, run, and doctest Sage there unless absolutely necessary. Long jobs on sage.math, by contrast, are likely to be disrupted because many people use sage.math to compile, run, and doctest Sage. Doctesting Sage is usually performed in parallel, which can take computing time away from other running jobs.
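The guidance above can be summarized as a small lookup. The following is only a sketch: the function name `suggest_machine` and the job-category labels are hypothetical, invented here for illustration, and the fallback for uncategorized long jobs follows the advice to consider geom.math or mod.math before sage.math.

```shell
# Hypothetical helper: suggest a cluster machine for a kind of job.
# The category names are illustrative assumptions, not cluster policy.
suggest_machine() {
    case "$1" in
        number-theory) echo "mod.math" ;;    # long number-theoretic computations
        geometry)      echo "geom.math" ;;   # geometry computations
        sage-dev)      echo "sage.math" ;;   # building, doctesting, ticket work
        web-service)   echo "boxen.math" ;;  # virtual machines and web services
        *)             echo "mod.math or geom.math" ;;  # other long jobs: avoid sage.math
    esac
}

suggest_machine geometry
```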
Running a long job on the machine sage.math --- where the job can take days, weeks, or months --- can significantly affect the development, compilation, and doctesting of the Sage library. When you work on a ticket, whether that be developing code or reviewing other people's code, you can use sage.math to parallel doctest the Sage library with that new code using 6 to 10 threads. This should significantly reduce the development and doctesting time from about 3 to 6 hours with one thread, to about 30 minutes with 16 threads.
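As a rough sketch of what such a parallel doctest run might look like: the helper below only prints a command line, it does not run anything. It assumes that Sage's `-tp` option (test in parallel with a given number of threads) is available in your Sage version and that the library source sits under `devel/sage/sage` in your Sage root. Both are assumptions to check against your own installation, and the function name `doctest_cmd` is hypothetical.

```shell
# Hypothetical helper: print the command for a parallel doctest run.
# Defaults: 8 threads (within the 6 to 10 range mentioned above) and
# an assumed library path relative to the Sage root directory.
doctest_cmd() {
    threads="${1:-8}"
    target="${2:-devel/sage/sage}"
    printf './sage -tp %s %s\n' "$threads" "$target"
}

doctest_cmd 8
```

Run the printed command from your Sage root directory once you have confirmed that it matches your setup.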
The sooner that tickets and code get merged in Sage, the sooner that users get to use new code and be grateful to developers, patch authors and reviewers for providing useful software. So before running any long jobs on sage.math, please consider whether a job can be run on any of the other machines instead.
Navigating between machines
From any of the machines on the Sage cluster, you can ssh to any of the other three machines. Whenever ssh'ing to another server, you should use the syntax
ssh -C -x -a <remote-machine>
Here's an explanation of these options:
With the option "-C", all data transferred between your local machine and the remote server are compressed. The option to compress data comes in really handy if you have a limited Internet quota.
The option "-x" disables X11 forwarding. If you don't want to transfer X11 graphical data between machines, you should explicitly disable X11 forwarding. With a text-based SSH session, X11 forwarding involves transferring more data than necessary.
The option "-a" disables the forwarding of the authentication agent. If you ssh from one server to another server, agent forwarding is a security risk. An attacker who has compromised the second server can then work back to the first server and get access to two servers just because you forwarded the authentication agent.
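To avoid retyping the three options, you could wrap them in a shell function. This is a minimal sketch: the name `cluster_ssh` and the `SSH_BIN` override variable are hypothetical conveniences, not part of the cluster setup. The comments also list what are, to the best of our knowledge, the corresponding OpenSSH client configuration keywords, in case you prefer to set them in `~/.ssh/config`.

```shell
# Hypothetical wrapper: always pass the recommended options to ssh.
#   -C  compress data       (ssh_config keyword: Compression yes)
#   -x  no X11 forwarding   (ssh_config keyword: ForwardX11 no)
#   -a  no agent forwarding (ssh_config keyword: ForwardAgent no)
# SSH_BIN is an assumed override hook, useful for dry runs.
cluster_ssh() {
    "${SSH_BIN:-ssh}" -C -x -a "$@"
}

# Usage: cluster_ssh mod.math
```

Setting `SSH_BIN=echo` before calling the function prints the arguments instead of connecting, which is a quick way to check what would be run.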