Drake Sage Group
This page documents activities of the Drake University Sage group.
Our initial work is on a single-cell compute server: a webpage that can execute an arbitrary block of Sage code.
For more information, please contact Jason Grout at jason#[email protected] (replace the # with a .)
03 Feb 2010
Meet in Howard Hall 308 at 2pm (room reserved from 1:30-3, so come early if you want).
Agenda
- Introductions
- What Sage is
- Overview of the simple-db-compute project and its architecture
- Resources
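  - DrakeSageGroup webpage
  - Google code repository: http://code.google.com/r/jasongrout-db-compute/
  - Sage Notebook mailing list (http://groups.google.com/group/sage-notebook): subscribe to this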
- Installfest--get the simple compute server up and running on as many people's computers as possible
  - install Sage or Python, ipython, and mercurial
  - install mongodb
  - install the PyMongo and Flask python modules:
      # from within python
      from setuptools.command import easy_install
      easy_install.main(["flask"])
      easy_install.main(["pymongo"])
  - configure mercurial: put this in your ~/.hgrc file
      [ui]
      username = YOUR NAME <YOUR EMAIL>

      [extensions]
      record=
      convert=
      hgext.mq=
      hgext.extdiff=
      hgk=
      transplant=
      fetch=
  - create a Google code account
  - clone the simple-db-compute repository (either just clone it locally, or clone it on Google code and then pull from your clone)
- First goal of project
  - familiarize yourself with the simple-db-compute source code
  - add a "compute id" that is returned to the user. The answers page then queries for just that computation's result.
  - add necessary files to get this running on Windows (for example, a .bat file to start mongodb)
  - Look at making the device more parallel/scalable. See multiprocessing (which includes functionality for pools of worker processes), or maybe use the parallel code from Sage. The new experiments in forking Sage to start it up also seem relevant.
- Next meeting time
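As a starting point for the "more parallel/scalable device" goal in the agenda above, here is a rough sketch of a device loop that keeps a pool of workers busy without ever blocking on a running computation. Everything named here (`Device`, `evaluate`, `run_until_done`, and the plain dict used in place of the MongoDB collection) is made up for illustration and is not taken from the simple-db-compute source. It uses `multiprocessing.pool.ThreadPool` so that it runs anywhere; `multiprocessing.Pool` exposes the same `apply_async` interface if real worker processes are wanted.

```python
import time
from multiprocessing.pool import ThreadPool  # same apply_async API as multiprocessing.Pool


def evaluate(code):
    """Hypothetical worker task: evaluate one expression, return its text output."""
    try:
        return str(eval(code, {}))
    except Exception as exc:
        return "Error: %s" % exc


class Device(object):
    """Keeps a pool busy with computations taken from a database.

    `db` is a plain dict standing in for the MongoDB collection (an assumption
    of this sketch): computation id -> {"code": str, "output": str or None}.
    """

    def __init__(self, db, pool, max_running=4):
        self.db = db
        self.pool = pool
        self.max_running = max_running
        self.running = {}  # computation id -> AsyncResult

    def poll_once(self):
        """One polling step; never waits on a computation that is still running."""
        # Collect only the workers that have already finished.
        finished = {cid: res.get() for cid, res in self.running.items() if res.ready()}
        # Store all finished outputs together, once per polling step.
        for cid, output in finished.items():
            self.db[cid]["output"] = output
            del self.running[cid]
        # Hand new computations to the pool, up to the configured limit.
        for cid, doc in self.db.items():
            if len(self.running) >= self.max_running:
                break
            if doc["output"] is None and cid not in self.running:
                self.running[cid] = self.pool.apply_async(evaluate, (doc["code"],))
        return len(finished)


def run_until_done(device, interval=0.01):
    """Poll until every computation in the database has an output."""
    while any(doc["output"] is None for doc in device.db.values()):
        device.poll_once()
        time.sleep(interval)
```

To try it, build a dict of computations, create `ThreadPool(2)`, pass both to `Device`, and call `run_until_done`. With a real database, the "store all finished outputs" step is where a single batched update per polling interval would go.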
Projects
Here are some project ideas for simple-db-compute, along with some hopefully helpful pointers to resources.
- Make Flask assign a random computation id to each incoming request and return that computation id to the browser as the return value of the AJAX call. Then make the browser keep asking the server for the results of that particular computation id. When results exist, send the result back to the browser. Otherwise, send back a message saying that the results are not computed yet.
- Write a test script that will hammer the site hard to test how scalable it is. Maybe the python multiprocessing module could be used to make a pool of workers, with each worker submitting computations to the website at a configurable rate. Record the round-trip time of each request and the time it takes for the computation's result to appear. See if the site scales to hundreds of requests each minute or so. Maybe Tsung (http://tsung.erlang-projects.org/) or one of the tools listed at http://www.opensourcetesting.org/performance.php is a good way to do this, rather than writing our own.
- On the backend, write a device that keeps a pool of workers (maybe using the python multiprocessing library) and keeps those workers busy with computations from the database. Ideally, the polling of the database should not be blocked by worker computations. Instead, on each poll, workers that are finished should have output put into the database, and new computations should be pulled out of the database. It seems like a good idea to avoid trying to put each output in the database as it happens. Rather, batch up the database updates to happen once in a polling interval.
- Configure Apache, nginx, or lighttpd to serve up simple-db-compute. Test its scalability compared to the default, really simple python http server. It looks like nginx with uWSGI might be an interesting option to explore.
- Figure out what needs to happen to get this all working on Windows and write up a "Getting started with developing simple-db-compute on Windows" page. For example, make a start_mongo.bat file or something.
- Output Streams
  - Write a Python library to make output "streams" which could represent different objects. For example, one stream could be stdout (text), while another stream could be html code. The workers can call the library functions to make a new stream of a specific type. The function writes a marker to stdout indicating that a new stream is starting. The device recognizes that marker and inserts the stream information into the database. The web front end also recognizes the streams and has special code to handle each type (for example, text streams are put inside of a <pre>, while html streams are just added to the document, maybe inside of a div). For a longer explanation of this, see the Notebook design page.
  - Make the new_stream functions recognize any files created and automatically make new streams for each file (this may involve copying the files to a temporary directory so they can be inserted into the database). Make Flask able to create a URL resource for each file, which fetches the file data out of the database and sends it to the browser when needed. Again, see the design walkthrough in Notebook design.
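To make the stream-marker idea above a bit more concrete, here is a small sketch of the three pieces: a worker-side `new_stream` function, a device-side parser, and front-end rendering. The marker format, `split_streams`, `render`, and the `{"type", "content"}` document layout are all assumptions for illustration; the Notebook design page may specify something different.

```python
import html
import re
import sys

# Hypothetical marker format; a real one should be impossible to produce by accident.
MARKER = "__STREAM__%s__"
MARKER_RE = re.compile(r"__STREAM__([a-z]+)__\n")


def new_stream(kind, out=None):
    """Worker side: announce that the output that follows belongs to a new stream."""
    if out is None:
        out = sys.stdout
    out.write(MARKER % kind + "\n")


def split_streams(raw, default="text"):
    """Device side: split captured stdout into one document per stream."""
    # With one capture group, re.split returns
    # [text before first marker, kind1, content1, kind2, content2, ...].
    parts = MARKER_RE.split(raw)
    streams = []
    if parts[0]:
        streams.append({"type": default, "content": parts[0]})
    for kind, content in zip(parts[1::2], parts[2::2]):
        streams.append({"type": kind, "content": content})
    return streams


def render(stream):
    """Front end: html streams go into a div, everything else into an escaped <pre>."""
    if stream["type"] == "html":
        return "<div>%s</div>" % stream["content"]
    return "<pre>%s</pre>" % html.escape(stream["content"])
```

One weakness of this particular marker is that user code printing the same text would confuse the parser, so something like a per-computation random token in the marker is probably worth considering.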
Projects that are done
These projects are done. They could still be improved, though.
- DONE: Make the web interface use AJAX to send a computation and display a result. Helpful resources: the jQuery AJAX documentation (http://api.jquery.com/jQuery.ajax/) and the numerous jQuery AJAX tutorials you can find by searching. You will probably want to create a javascript file in the static/ directory and add it to templates/base.html (follow the examples already there, such as the one adding jQuery). I would suggest using something like JSON to send and receive messages with the server. Long polling may also be worth trying (see https://github.com/RobertFischer/JQuery-PeriodicalUpdater/ for example).
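Once the computation-id project idea is in place, the JSON messages between the browser and the server could look roughly like the sketch below. The handler names, the JSON field names (`code`, `computation_id`, `status`, `output`), and the in-memory dict standing in for the database are all assumptions for illustration. In the real application these function bodies would sit inside Flask view functions and read from MongoDB.

```python
import json
import uuid

# Stand-in for the database collection (an assumption of this sketch):
# computation id -> {"code": str, "output": str or None}
computations = {}


def handle_submit(request_body):
    """Store the submitted code and reply with a random computation id."""
    code = json.loads(request_body)["code"]
    computation_id = uuid.uuid4().hex
    computations[computation_id] = {"code": code, "output": None}
    return json.dumps({"computation_id": computation_id})


def handle_poll(request_body):
    """Reply with the result for one computation id, or say it is not ready."""
    computation_id = json.loads(request_body)["computation_id"]
    doc = computations.get(computation_id)
    if doc is None:
        return json.dumps({"status": "unknown"})
    if doc["output"] is None:
        return json.dumps({"status": "pending"})
    return json.dumps({"status": "done", "output": doc["output"]})
```

The browser would call the submit handler once, remember the returned id, and then repeat the poll call until the status is "done". Returning an explicit "pending" status keeps the client logic simple whether it uses a timer or long polling.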