Google Summer of Code 2010
This is the main organization page for the Google Summer of Code efforts of the Sage project.
Our proposal as a mentor organization was not accepted.
Introduction
Sage is an open-source mathematical software system that combines numerous packages under one umbrella, with the goal of providing an alternative to the major proprietary mathematical software systems (aka the Ma*'s). The software included in Sage is written in many different languages, such as C, C++, and Fortran. However, the Sage library, which provides a unified interface to these components as well as implementations of novel mathematical algorithms, is written in Python and Cython. Sage also includes a web-based user interface where worksheets are stored for each user.
With its friendly development community and diverse challenges, including
- linking together software systems intended to be used through a command line interface,
- efficient implementation of novel mathematical ideas,
- making sure all the components build without problems on a wide range of platforms, and
- providing a web-based user interface for easy experimentation and collaboration in mathematics
Sage provides projects that might appeal to contributors with different interests and skill levels.
If you're a student interested in working on any of the projects described below, note that these are mostly rough ideas. Feel free to ask questions or suggest other projects by writing to [email protected]. Here is the student application template we recommend students use for their applications.
If you're a Sage developer, please take some time to organize the list below and add more ideas. The notes section contains some guidelines from the GSOC FAQ. These projects should be doable in less than 3 months of full-time work. Projects should generally have (copied from the KDE list):
- a brief explanation
- the expected results
- pre-requisites for working on the project
- names of possible mentors, information on how to contact them
Important Dates
Here is the original timeline. Some highlights:
- March 8 - 12: application window
- March 18: accepted mentoring organizations announced
- March 29 - April 9: student application window
- ...
Project Ideas
All #numbers below refer to trac tickets.
Notebook
The Sage notebook is an AJAX application similar to Google Documents that provides a user interface to mathematical software (not just Sage, also including Singular, GAP, Pari, Maxima, R, KASH etc.) somewhat like Mathematica notebooks. It was written from scratch (in Javascript and Python) by the Sage development team, and has been used daily by thousands of people over the last year. A number of universities have deployed Sage notebook servers for significant student use. See here for a list of Sage notebook servers. The notebook is one of the killer features of Sage.
This project is about improving the notebook. No special mathematical knowledge is required. Knowledge of Javascript, jQuery, Python, and general AJAX techniques is needed.
Authentication backend
The Sage notebook is used in many universities. Signing in to the Sage notebook with a university's existing user database and authentication facilities is a frequently requested feature. There have been earlier attempts to integrate LDAP and Kerberos (#4309) authentication.
Slideshow mode
This project would involve adding a slideshow mode to the Sage notebook. Since Sage worksheets run in any modern web browser and already support nice jsmath typesetting, TinyMCE text editing, and of course computation, an attractive and usable slideshow mode would be extremely useful for anyone giving a presentation that involves some kind of computing.
See #6342 for a previous attempt to implement this before the notebook became a separate component.
Potential mentor: William Stein
Export to something printable
Enhance export capabilities of Sage worksheets: create methods for well designed PDF, LaTeX (with or without SageTeX) or ODF output. This would be an inverse of sorts of Rob Beezer's tex2sws work, which takes TeX files and converts them into Sage worksheets.
Community tools
Related to enhanced publishing and export abilities, we would like the Sage notebook to have better community/social tools. Projects for this could involve:
- Enhance publishing of notebook documents (e.g. as on http://www.sagenb.org/pub).
- Wiki-like platform for editing notebooks for publishing mathematical, physical, statistical and other content.
- tagging support, listings by tags
- efficiently exchange usage examples, tips and ideas.
- a repository of useful examples, interacts, etc., distributed with Sage and hosted on a website, that users can query and easily use/adapt.
Internationalization of the notebook
This project would involve changing the Sage notebook so that the user interface language can be translated and changed on the fly. This project will require knowledge of Python, Mercurial, and basic web coding; knowing the GNU gettext utilities, Javascript, and the Jinja web templating system will be helpful. No knowledge of (human!) languages other than English is necessary.
Currently, the user interface for the Sage notebook is all in English. Several one-off translations have been done (Korean; Russian); these involved going through the source code and translating each string individually. The goal of the Sage project is to produce a viable alternative to Maple, Mathematica, Magma, and Matlab; having the user interface available in non-English languages would have a tremendous impact on that goal and vastly increase the number of people who can benefit from Sage.
Proper internationalization (i18n) involves wrapping each string in a function that looks up the correct translation, depending on the current language selected.
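As a rough illustration of the wrapping idea, here is a minimal sketch using Python's standard gettext module. The class InMemoryTranslations, the CATALOGS table, and the function translator_for are hypothetical names made up for this example and are not part of the notebook; a real setup would load compiled catalogs with gettext.translation() rather than a dict.

```python
import gettext


class InMemoryTranslations(gettext.NullTranslations):
    """Hypothetical catalog backed by a dict instead of a compiled .mo file."""

    def __init__(self, messages):
        super().__init__()
        self.messages = messages

    def gettext(self, message):
        # Fall back to the original (English) string when no translation exists.
        return self.messages.get(message, message)


# Illustrative catalogs only; a real deployment would load them with
# gettext.translation() from files produced by the GNU gettext tools.
CATALOGS = {
    "en": gettext.NullTranslations(),
    "fr": InMemoryTranslations({"Save": "Enregistrer", "Log out": "Se déconnecter"}),
}


def translator_for(language, default="en"):
    """Return the lookup function for `language`, or for the site default."""
    return CATALOGS.get(language, CATALOGS[default]).gettext


_ = translator_for("fr")   # e.g. the language chosen by the current user
print(_("Save"))           # a string that has a French translation
print(_("Help"))           # untranslated strings pass through unchanged
```

Binding the lookup function to the conventional name _ keeps the wrapped strings readable in the source and lets the GNU gettext extraction tools find them.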
Deliverables for this project would include:
- wrapping strings in the Python, Javascript, and templating code of the Sage notebook with appropriate translation calls;
- updating the Sage notebook so that the language can be set to a default language (so that, say, a French site could have everything default to French) and can be changed on a per-user basis (so each user can choose a preferred language);
- developing a workflow for adding new translations and updating existing translations when strings change.
It would also be nice to work on support for more significant localization, perhaps using the Python Babel tools; this would include more thorough localization abilities, such as proper pluralization, thousands/decimal separators, ordinals, date and time display, and so on.
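To give a flavour of what pluralization and number formatting involve, the sketch below uses only the standard library: ngettext picks between a singular and a plural message, and format adds a thousands separator. The function name worksheet_count_label and the message strings are assumptions for illustration. The English-only rules shown here are exactly what a tool such as Babel would replace with per-locale rules.

```python
import gettext

# With no catalog loaded, NullTranslations applies the English rule:
# singular when the count is exactly 1, plural otherwise.
translations = gettext.NullTranslations()


def worksheet_count_label(count):
    """Build a label such as '3 worksheets' for a hypothetical worksheet list."""
    template = translations.ngettext(
        "%(num)s worksheet", "%(num)s worksheets", count
    )
    # format(count, ",") uses a comma as thousands separator; other locales
    # need different separators, which is where fuller localization comes in.
    return template % {"num": format(count, ",")}


print(worksheet_count_label(1))
print(worksheet_count_label(12345))
```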
This project will not involve any actual translation, just making it possible for the Sage notebook UI to be localized. This is probably a medium-difficulty project, and will not require any specialized knowledge of mathematics or mathematical programming.
Potential mentor: DanDrake
Other ideas for the notebook
Here is a list of other possible improvements to the Sage notebook. Feel free to explore these ideas and make them into a proper project. The Sage tasks list also contains many other similar items. You can email the Sage notebook discussion group if you are interested in working on these.
Improvements to @interact
- a 2d locator widget (there is already a separate bounty for this, around $100 USD or so, I think)
- flexible layout of controls, and interacts within interacts
- master-worksheet, collection of other worksheets for a script or book.
- permanent hyperlinks between worksheets, independent of worksheet numbering, to support multi-worksheet documents (i.e. books)
- enhance history and snapshot capabilities.
- concurrent editing of one single document: only altered cells are updated and "collision" warnings issued if more than one change happens with appropriate methods to solve it
- read/write permission management for groups with roles (a teacher is able to read notebooks, but students are not able to read each other's)
- basic datasheets (simple Google Docs-like spreadsheets) that can be shared with worksheets in read-only or read-write mode, and could serve as an editor for SQLite tables.
- content-editable divs for the code cells, so we can support javascript widgets for inputting code (for example, we could have a slider right in the code cell, representing a numeric value, or a javascript graph editor representing a graph, or a wysiwyg formula editor, or color syntax highlighting)
- support various types of text cells, so that, at the user's option, we could have a ReST text cell, an HTML (TinyMCE) text cell, a plain text cell, a latex text cell, etc.
- ...
Interfaces to/from Sage
Sage provides a unified interface to lots of different specialized mathematics software. This is all due to the various interfaces written by Sage developers. The sophistication level of these varies greatly, from passing messages on a text terminal and parsing results (see interpreter interfaces in the reference manual) to dynamically importing functions defined in an interpreted language from Singular and converting Sage types to Singular data structures on the fly (see libraries section of Sage-4.3.2 release tour).
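The simplest style of interface mentioned above, sending text to a program and parsing the text that comes back, can be sketched as follows. This is only an illustration under stated assumptions: the Python interpreter itself stands in for the external mathematics program, the function name evaluate_in_child is hypothetical, and a fresh child process is started for every call, whereas a real interface would typically keep one long-running session.

```python
import subprocess
import sys

# Stand-in for an external command-line system: it reads one line of input,
# evaluates it, and prints the result as text.
CHILD_PROGRAM = "print(eval(input()))"


def evaluate_in_child(expression):
    """Send `expression` as text to a child process and parse its reply."""
    completed = subprocess.run(
        [sys.executable, "-c", CHILD_PROGRAM],
        input=expression + "\n",
        capture_output=True,
        text=True,
        check=True,  # raise if the child exits with an error
    )
    # The reply is plain text; converting it back to a typed value is the
    # "parsing results" half of this kind of interface.
    return int(completed.stdout.strip())


print(evaluate_in_child("2**10"))
```

The cost of this approach is visible even in the sketch: every value crosses the boundary as a string, which is what the library-level interfaces avoid.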
Sage as a C library
Besides being a distribution of the best open source mathematical software, Sage is also a library for implementing mathematical structures and experimenting with new algorithms. At the moment, this library is only usable from Python. Since Python can be embedded in C/C++, it is possible to call any Sage function from other (mathematics) libraries/software. This is especially interesting since the capabilities of Sage are increasing very rapidly and it includes very efficient (if not the fastest) implementations of many basic algorithms (for example Hermite normal form).
Make it easier to call Sage from other applications. Example code here:
Potential mentor: Burcin Erocal
libGAP
Potential mentor: William Stein
Google Wave robot
Support for sessions across multiple input lines, graphics, images, 3D graphics, collaborative work, authentication via a login, etc. On Sage's side there needs to be an implementation of the federation protocol to communicate with the Wave server, or communication could be routed through App Engine. A similar approach is Wolfram Alpha in Google Wave.
Portable C99 libm
Sage relies on a fairly complete C99 libm. In particular, it expects the "long double" and "complex" variants of most functions to be present. Not all these functions are present on Cygwin, FreeBSD or older Solaris, causing porting problems on those platforms. The objective of this task would be to either locate and port or write a libm that is sufficient to meet Sage's requirements.
One possible option would be to use glibc and only compile the libm bits. (Though glibc is a bit dodgy on the precision side in some areas.)
Another possible thing to look at is http://lipforge.ens-lyon.fr/www/crlibm/index.html.
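A first step on any platform is finding out which functions are actually missing. The sketch below probes the system math library through ctypes. The list C99_SAMPLE is an assumed handful of "long double" and "complex" names chosen for illustration, not the actual set Sage requires, and the helper names are hypothetical.

```python
import ctypes
import ctypes.util

# Assumed sample of C99 names ("l" suffix = long double, "c" prefix = complex).
C99_SAMPLE = ["expl", "logl", "sqrtl", "cexp", "csqrtl"]


def missing_functions(library, names):
    """Return the names that cannot be resolved as symbols in `library`."""
    # Looking up an absent symbol on a ctypes library raises AttributeError,
    # so hasattr() reports whether the symbol is exported.
    return [name for name in names if not hasattr(library, name)]


def probe_system_libm(names=C99_SAMPLE):
    """Check the platform's libm; return None if no libm could be located."""
    path = ctypes.util.find_library("m")
    if path is None:
        return None
    try:
        library = ctypes.CDLL(path)
    except OSError:
        return None
    return missing_functions(library, names)
```

Calling probe_system_libm() on each target platform would give a concrete list of gaps to fill, whichever replacement libm is chosen.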
Potential mentor: Peter Jeremy
pynac
As the symbolics backend, Pynac is a fundamental component of Sage. With some work and optimization, it could also be used for arithmetic with other mathematical structures like generic polynomial rings. It is based on GiNaC, a solid library with great documentation and very readable code.
Skills: C++ (the necessary Cython and Python can be picked up easily)
Potential mentor: Burcin Erocal
Optimization / better data structures
This project would have two steps. The first would be a major optimization for Pynac, which would also serve as a good introduction to the library, its coding conventions, type hierarchy, etc. The second would involve replacing the basic data types (vectors with heaps), along with lots of timings and experiments to improve performance.
- allow different printing orders
- GiNaC uses the Maple approach to printing symbolic expressions. Variable orders depend on their creation order at runtime. This is not acceptable for Sage, so we (=Burcin) added some code to do an approximation to a graded lexicographic order. This is the wrong design, and it slowed things down considerably. Doing it the right way, and allowing different orders would also let us use Pynac for other structures in Sage.
- consider heaps instead of vectors for storage of add and mul objects
Data structures for polynomials are a well-studied topic. Geobuckets (used by Singular) or hash tables have been used successfully until now. Recently heaps, based on work by Stephen Johnson in the 70s, were shown to perform much better than the alternatives.
Pynac uses vectors to keep add and mul objects. Here is a detailed explanation of the data structure. Using heaps could lead to a much more efficient data structure allowing us to handle much larger expressions. This would mean a major restructuring of the basic data types in Pynac.
(Geobuckets) http://dx.doi.org/10.1006/jsco.1997.0176
(Johnson's paper) http://doi.acm.org/10.1145/1086837.1086847
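To make the heap idea concrete, here is a small sketch of heap-based multiplication of sparse univariate polynomials, in the spirit of Johnson's approach. It is a simplified illustration, not Pynac's or Singular's implementation: polynomials are assumed to be lists of (exponent, coefficient) pairs sorted by strictly decreasing exponent, and the function name heap_multiply is made up for this example.

```python
import heapq


def heap_multiply(f, g):
    """Multiply two sparse polynomials given as [(exponent, coeff), ...].

    Both inputs must be sorted by strictly decreasing exponent; the result
    is returned in the same form with zero terms removed.
    """
    if not f or not g:
        return []
    # One heap entry per term of f, each paired with the current term of g.
    # heapq is a min-heap, so exponents are negated to pop the largest first.
    heap = [(-(f_exp + g[0][0]), i, 0) for i, (f_exp, _) in enumerate(f)]
    heapq.heapify(heap)
    result = []
    while heap:
        neg_exp, i, j = heapq.heappop(heap)
        exponent = -neg_exp
        coeff = f[i][1] * g[j][1]
        if result and result[-1][0] == exponent:
            # Same exponent as the last output term: merge coefficients.
            result[-1] = (exponent, result[-1][1] + coeff)
        else:
            if result and result[-1][1] == 0:
                result.pop()  # drop a term that cancelled to zero
            result.append((exponent, coeff))
        if j + 1 < len(g):
            # Advance term i of f to the next term of g.
            heapq.heappush(heap, (-(f[i][0] + g[j + 1][0]), i, j + 1))
    if result and result[-1][1] == 0:
        result.pop()
    return result


# (x + 1) * (x - 1): the x terms cancel, leaving x^2 - 1.
print(heap_multiply([(1, 1), (0, 1)], [(1, 1), (0, -1)]))
```

The heap never holds more than one entry per term of f, and output terms are produced already sorted, so no large intermediate sum has to be stored and re-sorted. That property is what makes the approach attractive for very large expressions.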
Pretty printing
Development Process
The Sage development process (detailed here) involves posting patches on our trac server, getting them reviewed, and then merged into our source code. Then the code is compiled and tested. Right now, many of these steps are done manually and could be automated and improved. Possible ideas for projects in this area would involve:
- setting up a buildbot with trac integration for tickets with patches (list failing doctests, ...)
- setting up a code review system like Rietveld
- ...
Porting
Sage is mostly "native" to Linux, OS X & Solaris. A Cygwin port is underway (it may be working by this summer). But porting to other systems is very useful, both to increase our potential user base, and to help eliminate bugs (see the section "Computational Technique; the Pentium Flaw" in this paper). This project would involve assisting our ports to the following platforms:
- AIX
- Cygwin
- FreeBSD
- HP-UX
- Solaris 10 64-bit (the 32-bit version is complete).
- OpenSolaris on x64 hardware.
- Solaris 10 on x86
It could also involve improvements to the build system -- can we use the same system on different platforms, including Windows? Skills required: knowledge of porting software to new compilers and platforms.
Another idea in this vein would be setting up a Launchpad PPA for Sage. The Ubuntu packages for Sage are hopelessly out of date, and because of the large and heterogeneous nature of Sage, packaging for various Linux distributions is quite difficult. Having Ubuntu packages available would be very useful and possibly help us get Sage packaged for other distributions.
Potential mentors: Peter Jeremy, William Stein
Others
Here are some other task lists:
Potential Mentors
- Burcin Erocal
- Peter Jeremy (FreeBSD port, possibly libm task)
- William Stein
Notes:
We should take care to define deliverables for the items below. These should be doable with less than 3 months of work.
Here is what the FAQ says for "Ideas" lists:
- An "Ideas" list should be a list of suggested student projects. This list is meant to introduce contributors to your project's needs and to provide inspiration to would-be student applicants. It is useful to classify each idea as specifically as possible, e.g. "must know Python" or "easier project; good for a student with more limited experience with C++." If your organization plans to provide an application template, you should include it on your Ideas list.
- Keep in mind that your Ideas list should be a starting point for student applications; we've heard from past mentoring organization participants that some of their best student projects are those that greatly expanded on a proposed idea or were blue-sky proposals not mentioned on the Ideas list at all.
And this is from the notes on organization selection criteria:
- 2) Do the projects on your ideas list look feasible for student developers? Is your ideas list thorough and well-organized? Your ideas list is the first place that student participants are going to look to get information on participating in GSoC, so putting a lot of effort into this list is a good thing(tm). One thing we noticed and really appreciate is how some organizations classified their ideas by easy, medium and difficult, and specifically listed the skills and background required to complete a given task. It might also be cool to expand on each idea with some places to get started research-wise (pointers to documentation or specific bugs), as well as the impact finishing a given idea will have for the organization.