Differences between revisions 1 and 22 (spanning 21 versions)
Revision 1 as of 2010-11-20 22:21:25
Size: 745
Editor: TomBoothby
Comment:
Revision 22 as of 2011-01-12 06:22:32
Size: 3929
Editor: was
Comment:
Line 3: Line 3:
The notebook was written with the intent of being a local GUI, and as a side-effect of taking advantage of the browser, it's usable over the net. But the performance of sagenb.org is ''terrible'', so we need to make the server more robust overall. The plan is currently to tackle this during [[days27|Sage Days 27]].
The notebook was written with the intent of being a local GUI, and as a side-effect of taking advantage of the browser, it's usable over the net. But the performance of sagenb.org is ''terrible'' since there are around 40,000 users, so we need to make the server much more scalable and robust overall. The plan is currently to tackle this during [[days27|Sage Days 27]].


== Scratch documents ==

  * [[https://docs.google.com/document/d/19R8AEc-l_MytmOduyPpsTxcXk_9cwnStOCrZuFTFgLs/edit?hl=en|Google Docs Scratchpad]] to keep track of tasks

  * [[/worksheet|Idea for how to rewrite worksheet.py in new flask code.]]

== Google Code Repositories ==

 
  * [[http://code.google.com/p/simple-python-db-compute/|A simple Python compute server using flask and a database]]
  * [[http://code.google.com/p/sage-device/|William Stein's sage-device Branch at Google Code]]
  * [[http://code.google.com/r/mhansen-flask/|Mike Hansen's Flask Branch]]
  * [[http://code.google.com/r/wstein-sd27/source/browse|William Stein's Sage Days 27 Notebook Branch]]
  * [[https://github.com/acleone/sageserver|Alex Leone's Github repo]]

== Technologies ==

  * http://projects.unbit.it/uwsgi/ -- micro wsgi
Line 8: Line 28:
 1. Convert all notebook data structures to a database architecture.
 1. rewrite twist.py to use flask: http://flask.pocoo.org/.
 1. Use mod_wsgi and apache to make it scale massively: http://code.google.com/p/modwsgi/.
 1. Convert notebook data structures to a database architecture to allow for concurrent scalable access to a centralized data store by different processes.
 1. Rewrite twist.py to use [[http://flask.pocoo.org/|flask]]. The notebook will then depend on Flask and no longer use Twisted. The main advantage to using flask is the excellent support for mod_wsgi.
 1. Use [[http://code.google.com/p/modwsgi/|mod_wsgi]] and Apache (say) to make the server scale massively.

== Notes ==

 * Be aware of the [[http://groups.google.com/group/sage-devel/browse_thread/thread/e6eb1d3f8b85b4fd|sage-devel discussion]].
 * It is OK if the "highly scalable notebook server" has dependencies that the usual single (or small group) server doesn't have. E.g., depending on Apache and a database (like MongoDB) would be just fine, even though neither will be included in Sage.
 * It is, of course, important that the notebook still have a zero configuration mode, where it works fine for a small number of users, but without any complicated dependencies.
 * We are not going to shoot too high with this project. In particular, our goals do *not* include adding new authentication systems or making it easy to organize worksheets into folders, etc. We just want to solve exactly one problem: make the notebook scalable. This is of course by far the biggest bug with the Sage notebook, and it is only an issue because the notebook is used so heavily. That said, it will be useful to think through how to implement everything on the wishlist before deciding on how to implement scalability, so it is easier to implement the other features later.
 * A proposal for a scalable server: [[https://docs.google.com/document/d/1uYJXPAWypGgb92QStJ19cW-29y4-hn5hi8oXMR-11TU/edit?hl=en&authkey=CISp9cQB|google docs link]] and [[https://groups.google.com/d/msg/sage-notebook/3jgKp8CWPWI/lby3su8S2fYJ|discussion]]
 * To simulate a slow or lossy connection (limited bandwidth, packet loss, etc.) on OS X and FreeBSD, use ipfw. On Linux, apparently you can use netem (search for netem or tc). For example, the following limits every connection to 128Kbit/s with a packet loss rate of 10% and a queue of 50Kbytes:
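{{{
# Send all IP traffic through pipe 1
sudo ipfw add pipe 1 ip from any to any
# Configure pipe 1: 128Kbit/s bandwidth, 50Kbytes queue, 10% packet loss rate (plr)
sudo ipfw pipe 1 config bw 128Kbit/s queue 50Kbytes plr 0.1
}}}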

If you lower the bandwidth, shrink the queue as well. For example, 64Kbit/s might use a 10Kbytes queue.
Line 15: Line 50:
 * Robert Bradshaw
 * Jason Grout
Line 16: Line 53:
 * Alex Leon
 * Alex Leone
Line 18: Line 55:

== Motivation ==

[[http://xkcd.com/844/|{{http://imgs.xkcd.com/comics/good_code.png}}]]

Preamble

The notebook was written with the intent of being a local GUI, and as a side-effect of taking advantage of the browser, it's usable over the net. But the performance of sagenb.org is terrible since there are around 40,000 users, so we need to make the server much more scalable and robust overall. The plan is currently to tackle this during Sage Days 27.

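Scratch documents

  • Google Docs Scratchpad to keep track of tasks

  • Idea for how to rewrite worksheet.py in new flask code.

Google Code Repositories

  • A simple Python compute server using flask and a database
  • William Stein's sage-device Branch at Google Code
  • Mike Hansen's Flask Branch
  • William Stein's Sage Days 27 Notebook Branch
  • Alex Leone's Github repo

Technologies

  • http://projects.unbit.it/uwsgi/ -- micro wsgi
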
Tasks

  1. Write testing code to identify bottlenecks, and generally improve robustness.
  2. Convert notebook data structures to a database architecture to allow for concurrent scalable access to a centralized data store by different processes.
  3. Rewrite twist.py to use flask. The notebook will then depend on Flask and no longer use Twisted. The main advantage to using flask is the excellent support for mod_wsgi.

  4. Use mod_wsgi and Apache (say) to make the server scale massively.
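
Tasks 2 to 4 above could be sketched roughly as follows. This is only an illustration under stated assumptions, not the notebook's actual design: SQLite stands in for whichever database is eventually chosen, the table layout, the file name worksheets.db, and the helper names (connect, save_worksheet, load_worksheet, make_application) are all hypothetical, and a plain WSGI callable is used in place of a Flask app so that the sketch needs only the standard library. A Flask application object is itself a WSGI callable, so it would fill the same role.

```python
import sqlite3

DB_PATH = "worksheets.db"  # hypothetical location of the shared data store


def connect(path=DB_PATH):
    """Open a fresh connection; every request or worker process gets its own."""
    conn = sqlite3.connect(path, timeout=10)  # wait up to 10 s if the file is locked
    conn.execute(
        "CREATE TABLE IF NOT EXISTS worksheets ("
        "id INTEGER PRIMARY KEY, owner TEXT NOT NULL, body TEXT NOT NULL)"
    )
    return conn


def save_worksheet(owner, body, path=DB_PATH):
    """Insert a worksheet and return its id."""
    conn = connect(path)
    try:
        with conn:  # commits on success, rolls back on an exception
            cur = conn.execute(
                "INSERT INTO worksheets (owner, body) VALUES (?, ?)", (owner, body)
            )
            return cur.lastrowid
    finally:
        conn.close()


def load_worksheet(worksheet_id, path=DB_PATH):
    """Return the body of a worksheet, or None if the id is unknown."""
    conn = connect(path)
    try:
        row = conn.execute(
            "SELECT body FROM worksheets WHERE id = ?", (worksheet_id,)
        ).fetchone()
        return None if row is None else row[0]
    finally:
        conn.close()


def make_application(path=DB_PATH):
    """Build a WSGI callable that serves GET /<id> from the given database."""

    def application(environ, start_response):
        try:
            worksheet_id = int(environ.get("PATH_INFO", "").strip("/"))
        except ValueError:
            worksheet_id = None
        body = None if worksheet_id is None else load_worksheet(worksheet_id, path)
        headers = [("Content-Type", "text/plain; charset=utf-8")]
        if body is None:
            start_response("404 Not Found", headers)
            return [b"no such worksheet\n"]
        start_response("200 OK", headers)
        return [body.encode("utf-8")]

    return application


# mod_wsgi looks for a module-level callable named "application" by default.
application = make_application()
```

Because the handler keeps no state of its own and opens a new connection per request, several server processes can run the same code against one data store, which is the property task 2 is after.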

Notes

  • Be aware of the sage-devel discussion.

  • It is OK if the "highly scalable notebook server" has dependencies that the usual single (or small group) server doesn't have. E.g., depending on Apache and a database (like MongoDB) would be just fine, even though neither will be included in Sage.
  • It is, of course, important that the notebook still have a zero configuration mode, where it works fine for a small number of users, but without any complicated dependencies.
  • We are not going to shoot too high with this project. In particular, our goals do *not* include adding new authentication systems or making it easy to organize worksheets into folders, etc. We just want to solve exactly one problem: make the notebook scalable. This is of course by far the biggest bug with the Sage notebook, and it is only an issue because the notebook is used so heavily. That said, it will be useful to think through how to implement everything on the wishlist before deciding on how to implement scalability, so it is easier to implement the other features later.
  • A proposal for a scalable server: google docs link and discussion

  • To simulate a slow or lossy connection (limited bandwidth, packet loss, etc.) on OS X and FreeBSD, use ipfw. On Linux, apparently you can use netem (search for netem or tc). For example, the following limits every connection to 128Kbit/s with a packet loss rate of 10% and a queue of 50Kbytes:

sudo ipfw add pipe 1 ip from any to any
sudo ipfw pipe 1 config bw 128Kbit/s queue 50Kbytes plr 0.1

If you lower the bandwidth, shrink the queue as well. For example, 64Kbit/s might use a 10Kbytes queue.
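
One rough way to reason about the queue sizes above is to ask how long a completely full queue takes to drain at the configured bandwidth, since that is the extra delay a packet at the back of the queue would see. The helper below is a back-of-the-envelope sketch, not part of ipfw; the function name is hypothetical, and it assumes "K" means the same multiplier in both Kbytes and Kbit/s.

```python
def queue_drain_seconds(queue_kbytes, bandwidth_kbits):
    """Seconds needed to empty a full queue at the given bandwidth.

    Assumes the same "K" multiplier for Kbytes and Kbit/s, so it cancels out.
    """
    return queue_kbytes * 8 / bandwidth_kbits


print(queue_drain_seconds(50, 128))  # 3.125
print(queue_drain_seconds(10, 64))   # 1.25
```

Under this model, keeping the 50Kbytes queue at 64Kbit/s would let a full queue add more than six seconds of delay, which is one plausible reason to shrink it along with the bandwidth.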

People

  • Tom Boothby
  • Robert Bradshaw
  • Jason Grout
  • Mike Hansen
  • Alex Leone
  • William Stein

Motivation

http://xkcd.com/844/