On development models for sharing (experimental) code

One core aim of Sage is to foster code sharing, and to encourage groups of researchers, teachers, and other users to get together to develop new features they need either on top of or within Sage, and share them.

Over the years, various groups have tried many development workflows to improve Sage in certain areas, such as Sage-Combinat for (algebraic) combinatorics, Sage-Words for combinatorics on words, SageManifolds for differential geometry, purple-sage for number theory, ...

The goal of this document is to discuss the different workflows that have been tried, with their pros and cons, to share best practices, and to brainstorm about what support and recommendations Sage could provide for various use cases. Eventually, this could become a section of the developer's manual (though it may also interest people who want to start sharing code without necessarily contributing to Sage), or a page of the sagemath.org website.

At this point this is a collection of notes by N. Thiéry; please hack in and contribute your own vision!

See also:

Objectives of a development workflow

Of course the mileage will vary from project to project, but the objectives of a development workflow can typically be to:

1. Support fast paced development within a group of users working on

2. Support rapid dissemination of experimental features.

3. Foster high quality code by promoting documentation, tests, code reviews.

4. Foster intrinsic high quality code by providing an *ecosystem*

5. Strike a balance between centralized and decentralized.

6. Minimize *maintenance* overhead, and in particular code rotting.

7. Remain flexible between the all-in-one versus packages development models

8. Promote extending existing Sage classes and modules with additional features.

Existing workflows

Direct integration into Sage

In this workflow, each feature is shared by integrating it directly into Sage.

Pros:

- Simplicity for the user: all stable features are directly available in Sage

- Simplicity for Sage developers: no additional workflow to learn

- No need to worry about release, distribution, test infrastructure, ...

- Promotes early integration of code and objective 3

- Makes objective 8 straightforward

Cons:

- Limited support for objective 2

- Slows down the development: once a feature is in Sage, any change

- Getting the latest feature forces updating to the latest version of Sage

- Introduces a bias toward code bloat (when in doubt, features tend to be added to Sage)

- When development is faster than reviews, the maintenance effort of having many open tickets gets heavy, because minor changes to an early ticket have to be merged into all later ones.

Examples:

- `SageManifolds <http://sagemanifolds.obspm.fr/>`_, cf. the metaticket `#18528 <http://trac.sagemath.org/ticket/18528>`_

- `ACTIS: Algebraic Coding Theory for Sage <http://bitbucket.org/lucasdavid/sage_coding_project/wiki/Home>`_, cf. the metaticket `#18846 <http://trac.sagemath.org/ticket/18846>`_

Discussion:

- Soften the model using an external repo: In the beginning of ACTIS (see above), we maintained a public clone of Sage on Bitbucket where each major feature set was a branch. Once our main design was mature enough, the first few branches were made into Trac tickets and merged into Sage. This fully achieved objectives 2 and 4 in this phase. When choosing the scope of a branch, attention was given to minimising dependencies, which eased the maintenance burden of parallel development. However, extracting tickets from branches was manual and error-prone, and changes made in the Trac review phase were annoying to port back to the public repo. So after the most volatile period of design, we abandoned this model.

- Use the @experimental decorator to mitigate the backward compatibility issue while the code is not yet fully mature. The decorator is a bit clumsy to use because of doc-testing in Sphinx (tricks are needed to avoid printing the experimental warning on each doc-test), see e.g. `AsymptoticRing <http://doc.sagemath.org/html/en/reference/asymptotic/sage/rings/asymptotic/growth_group.html>`_.
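To make the idea of an experimental marker concrete, here is a minimal plain-Python sketch of a decorator that warns, on first use only, that an interface may still change. It is a stand-in written for illustration, not Sage's own decorator; the names ``experimental_sketch`` and ``growth_bound`` and the ticket number are made up.

```python
import functools
import warnings


def experimental_sketch(ticket):
    """Hypothetical stand-in for an 'experimental' marker (not Sage's decorator)."""
    def decorator(func):
        warned = False

        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            nonlocal warned
            if not warned:
                # Warn on the first call only, so repeated use stays quiet.
                warnings.warn(
                    f"{func.__name__} is experimental (see ticket #{ticket}); "
                    "its interface may change without a deprecation period.",
                    FutureWarning,
                    stacklevel=2,
                )
                warned = True
            return func(*args, **kwargs)

        return wrapper
    return decorator


@experimental_sketch(ticket=12345)  # made-up ticket number
def growth_bound(n):
    # Placeholder body standing in for a feature whose interface is not settled.
    return 2 * n
```

Because the warning is printed at the first call, whichever doc-test happens to run first would show it in its output, which is one way to picture the doc-testing awkwardness mentioned above.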

Experimental feature branches

In this workflow, experimental features or feature sets are implemented as branches on the Sage sources.

Pros:

- Makes objective 8 straightforward

- Encourages integration into Sage

- Development history is automatically kept upon integration into Sage

Cons:

- Branch needs to be regularly updated to prevent code rotting due to

- Objective 2 requires basic git knowledge from end-users

- Lack of modularity for objective 2: due to potential conflicts, it's not easy

- Cherry picking certain mature features for integration in Sage is

- It's hard to strike the right granularity in terms of feature /

- Because of the above, this workflow does not work well for objective 4

- Introduces a bias toward the all-in-one development model

A proposal: Features Trac

To promote this workflow, we may have a public features trac similar to the current development trac. A ticket in the features trac keeps a Git feature branch that provides a feature, that is, special functionality that can be merged into the Sage core at the user's build time. The ticket is not reviewed (in the development trac sense) and its branch is not supposed to be merged into Sage. The user can select a feature set at build time. Then "make-sage-with feature_set" will fetch the feature branches from the features trac, merge them with the master branch of the Sage core, and start the build in the usual way.
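As a rough illustration of what such a "make-sage-with" command might do, the sketch below only computes the list of git and make commands for a given feature set, without running them. The remote name ``features``, the scratch branch ``feature-build``, and the function name are assumptions made for this example; the proposal does not specify them.

```python
def plan_feature_build(features, remote="features", base="master"):
    """Return the commands a hypothetical 'make-sage-with' could run, in order."""
    # Work on a scratch branch so that the base branch itself is left untouched.
    commands = [["git", "checkout", "-B", "feature-build", base]]
    for branch in features:
        commands.append(["git", "fetch", remote, branch])
        # FETCH_HEAD refers to the tip of the branch that was just fetched.
        commands.append(["git", "merge", "--no-edit", "FETCH_HEAD"])
    # Finally build as usual.
    commands.append(["make"])
    return commands


if __name__ == "__main__":
    for command in plan_feature_build(["feature/klee/rings/super_field"]):
        print(" ".join(command))
```

Returning the plan instead of executing it keeps the sketch free of side effects; a real tool would also have to stop and report when a merge conflicts.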

The features trac can provide

- Description of the feature.

- Info about authors or maintainers.

- Info about the latest Sage release with which the feature works. We may have "feature-bots" for doctesting.

- Info about dependencies, that is, other features that this feature depends on.

- Link to the host at which actual development occurs, like Github repos.

- Branch name in a standardized format. Example: "feature/klee/rings/super_field" where "klee" is the author's id.
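The information listed above could be held in a small record per feature, and the dependency information is enough to compute an order in which feature branches can be merged. The sketch below assumes Python 3.9 or later (for ``graphlib``); the class and function names are hypothetical, and reading the part of the branch name after the author's id as a module path is an assumption of this example.

```python
from dataclasses import dataclass, field
from graphlib import TopologicalSorter  # standard library, Python 3.9+


@dataclass
class Feature:
    """Hypothetical record for one ticket of the features trac."""
    branch: str                       # e.g. "feature/klee/rings/super_field"
    description: str = ""
    last_sage_release: str = ""       # latest release the feature is known to work with
    depends_on: list = field(default_factory=list)  # branch names of required features


def parse_branch(branch):
    """Split 'feature/<author>/<path>' into (author, path)."""
    parts = branch.split("/")
    if len(parts) < 3 or parts[0] != "feature":
        raise ValueError(f"not a feature branch name: {branch!r}")
    return parts[1], "/".join(parts[2:])


def merge_order(features):
    """Order branch names so that every feature comes after its dependencies."""
    graph = {f.branch: set(f.depends_on) for f in features}
    # static_order() raises graphlib.CycleError on circular dependencies.
    return list(TopologicalSorter(graph).static_order())
```

For instance, ``parse_branch("feature/klee/rings/super_field")`` separates the author's id ``klee`` from the rest of the name, and ``merge_order`` places a feature after every feature it depends on.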

Some remarks:

* The feature branch should contain source code and documentation. The documentation may have links toward the documentation of the Sage core but not vice versa. After the build, the user will have a single documentation as usual.

* In terms of functionality, the features in the features trac are either orthogonal to other features or competing with them.

* Some parts of the present Sage library may be turned into features. For example, we may have "feature/sage/modular/abvar".

Patch queue as used by Sage-Combinat between 2009 and 2013

See also the bottom of this page.

TODO: description

This section is just for reference: there used to be a strong rationale for this workflow, given the former Sage development workflow and its context, but not any more.

Pros:

- Relatively good for objective 1 (except for objective 6)

- Relatively good for objective 2 (thanks to "sage -combinat install"), except

- Objective 8 is straightforward

Cons:

- Complexity of working at the meta level (version control on the patches)

- Really bad at objective 6: Horrible maintenance overhead due to syntactic conflicts

- Introduces a strong bias toward code death, or at least non-integration into Sage

- Monolithic: one could not use several patch queues at once, so this

Standalone (pip) packages

Here the idea is to implement feature sets as independent Python packages on top of Sage. Converting a bunch of Python files into such a package to make it `easy to install <http://python-packaging-user-guide.readthedocs.io/en/latest/distributing/>`_ is straightforward.
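As an illustration of how little is needed, the sketch below writes the skeleton of such a package: one importable directory plus a ``pyproject.toml`` that uses setuptools as build backend. The package name, the version number, and the helper ``scaffold_package`` are placeholders invented for this example; the existing Python files would then be moved into the package directory.

```python
import tempfile
from pathlib import Path

# Minimal build metadata; name and version are placeholders.
PYPROJECT = """\
[build-system]
requires = ["setuptools>=61"]
build-backend = "setuptools.build_meta"

[project]
name = "{name}"
version = "0.1.0"
"""


def scaffold_package(root, name):
    """Create a minimal installable layout under `root`; return the package directory."""
    root = Path(root)
    package_dir = root / name.replace("-", "_")  # importable module name
    package_dir.mkdir(parents=True, exist_ok=True)
    (package_dir / "__init__.py").write_text("")
    (root / "pyproject.toml").write_text(PYPROJECT.format(name=name))
    return package_dir


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        scaffold_package(tmp, "my-sage-feature")
        for path in sorted(Path(tmp).rglob("*")):
            print(path.relative_to(tmp).as_posix())
```

With such a layout in place, the package can be installed from its root directory with pip, for example with ``sage -pip install .`` so that it lands in Sage's own Python environment.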

Examples:

- `Template for creating Sage packages <https://github.com/cswiercz/sage_packages>`_

- `Modular Abelian Varieties <https://github.com/williamstein/sage_modabvar>`_

- `Python implementation of chebfun <https://github.com/cswiercz/pychebfun>`_

- `Purple Sage <https://github.com/williamstein/psage>`_

- `slabbe-0.2.spkg <http://www.slabbe.org/blogue/categorie/slabbe-spkg/>`_

NON-Examples:

- `SageManifolds <http://sagemanifolds.obspm.fr/>`_

- `CHA <https://bitbucket.org/nborie/cha>`_ "It is recommended to use the more recent implementation from the branch attached to this ticket rather than this library."; I think this is just some code to copy into the Sage library or run directly, with no package support at all.

Pros:

- Good for objectives 1, 2, 4

Cons:

- Handling of compatibility with various versions of the dependencies (in particular Sage)

- Risk of code rotting (as Sage evolves over time) or death (if it's not maintained)

- Requires coordination with Sage and related packages to not step on each other

Standalone (pip) packages with an integration mission

This is a variant on the previous development workflow, with an explicit focus on easing (or even promoting) the integration of mature code into Sage.

Specifics:

- Lay out the code as in the Sage library, with top module called

- Use recursive monkey patching to insert all the code dynamically in
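The general technique of recursive monkey patching can be sketched in a few lines of plain Python: attributes of a source class are copied into an existing target class, and when both sides define a nested class of the same name, the nested classes are merged recursively instead of being overwritten. This is an illustration under simplifying assumptions, not the code used by the packages listed here; ``monkey_patch`` and the toy classes are made up for the example.

```python
def monkey_patch(source, target):
    """Copy the attributes of class `source` into class `target`, recursively."""
    for name, value in vars(source).items():
        if name.startswith("__") and name.endswith("__"):
            continue  # leave __dict__, __module__, __doc__, ... untouched
        existing = vars(target).get(name)
        if isinstance(value, type) and isinstance(existing, type):
            # Both sides have a nested class of that name: merge them.
            monkey_patch(value, existing)
        else:
            setattr(target, name, value)


# Toy stand-in for a class that already exists in a library.
class Semigroup:
    class Element:
        def old_method(self):
            return "old"

    def size(self):
        return 1


# Toy stand-in for the extension shipped by a standalone package.
class SemigroupExtension:
    class Element:
        def new_method(self):
            return "new"

    def is_trivial(self):
        return self.size() == 1


monkey_patch(SemigroupExtension, Semigroup)
```

After the call, instances of ``Semigroup`` have ``is_trivial`` and instances of ``Semigroup.Element`` have both ``old_method`` and ``new_method``, while the extension lives in its own source file laid out like the class it extends.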

Examples:

- `Sage-semigroups <https://github.com/nthiery/sage-semigroups/>`_ (quite preliminary!!!)

Pros:

- Same as above

- Objective 8 is straightforward

- Lighter maintenance overhead compared to branches or patch queues:

- The integration of mature code into Sage helps for objective 3 and for the

- Depending on how strongly one pushes toward the integration of

Cons:

- The concept has not yet been really battle-tested!

- Moving code into the Sage library is done by copy-pasting. This

What is this Sage-Combinat queue madness about???

Sage-Combinat is a software project whose mission is: "to improve the open source mathematical system Sage as an extensible toolbox for computer exploration in (algebraic) combinatorics, and foster code sharing between researchers in this area".

In practice it's a community of a dozen regular contributors, 20 occasional ones and, maybe, 30 users. They collaborate on a collection of experimental patches (i.e. extensions) on top of Sage. Each one describes a relatively atomic modification which may span several files; it may fix a bug, implement a new feature, or improve some documentation. The intent is that most of those extensions get integrated into Sage as soon as they are mature enough, with a typical life cycle ranging from a few days to a couple of months. On average 20 extensions are merged in each version of Sage (42 in Sage 5.0!), and more than 200 are under development.

Why do we want to share our experimental code

Here are our goals in using the Sage-Combinat queue for sharing patches:

- Preintegration

- Pair programming (or more than pair!)

- Easy review even with many dependencies

- Maturation

- Overview of what's developed by whom

- Sharing code with beginner colleagues

What are our constraints

Some random questions

Foreseeable future

CodeSharingWorkflow (last edited 2017-02-06 20:13:41 by mrennekamp)