Packaging and distributing SageMath

A Sage Days 77 project

Part of the Days 77 workshop was dedicated to studying possible improvements to SageMath's package system and its availability in popular distributions.

Interest in these issues is recurrent. Past Sage Days that also dealt with them were Days 4 and Days 7.

When talking about modularization, packaging, distribution, etc., Sage devs may mean several different things at the same time:

Although these are separate problems, the interactions are non-trivial.

Package maintainers from several distributions were present, in person or remotely, at Days 77:

The workshop was the occasion to share knowledge and concerns on packaging SageMath for Linux distributions. Given that some packagers are not involved in SageMath development proper, it was decided to create a separate sage-packaging mailing list exclusively devoted to packaging issues.

Gentoo

At the time of writing, SageMath ebuilds (packages) have existed for Gentoo for six years. The layout closely follows the Sage layout:

They are distributed as an overlay, though some individual ebuilds have made it into the standard distro.

At each new SageMath beta, the Sage on Gentoo overlay is updated, usually in less than 24 hours. This requires a couple of hours of manual work on average:

Additional notes

Debian

There have been repeated efforts to package SageMath for Debian, as shown by this (outdated) wiki page. At some point in 2009, a Debian package existed in stable, installing the monolithic Sage in /opt/sage; the package became unmaintained and was eventually removed.

The DebianScience project maintains experimental packages for SageMath, as documented in this wiki page. It also tracks dependency version discrepancies between Debian and SageMath's master and develop branches. (source code for the tracker)
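The kind of comparison such a tracker performs can be sketched as follows. This is a minimal illustration, not the DebianScience tracker's actual code: the function name `version_discrepancies`, the package names, and the version strings are all hypothetical placeholders.

```python
from __future__ import annotations


def version_discrepancies(
    distro: dict[str, str], sage: dict[str, str]
) -> dict[str, tuple[str | None, str]]:
    """Map each package Sage depends on to (distro version, Sage version)
    whenever the two differ. A distro version of None means the package
    is not available in the distribution at all."""
    report = {}
    for name, wanted in sorted(sage.items()):
        packaged = distro.get(name)
        if packaged != wanted:
            report[name] = (packaged, wanted)
    return report


if __name__ == "__main__":
    # Hypothetical data, for illustration only.
    distro_versions = {"libfoo": "1.2", "libbar": "2.0"}
    sage_versions = {"libfoo": "1.3", "libbar": "2.0", "libbaz": "0.1"}
    for name, (packaged, wanted) in version_discrepancies(
        distro_versions, sage_versions
    ).items():
        print(f"{name}: distribution has {packaged}, Sage expects {wanted}")
```

The sketch compares version strings for exact equality only; a real tracker would presumably also need to decide which mismatches are acceptable.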

At Days 77, Julien and others worked on:

Arch

Antonio Rojas maintains a very up-to-date SageMath package for Arch/community (v7.1 at the time of writing). Arch statistics show that more than 3% of users (who report package statistics) have the package installed.

SageMath is split into separate packages, and some patches are applied.

Current issues

Note: it would be hard to have a stable package that always works for all Arch environments. There are distributions based on Arch that aim at slower but more stable progress (e.g. Antergos). Aiming for a stable package for such distributions would be more reasonable.

Guix/Nix

Guix and Nix are two very similar distribution-agnostic functional package managers. Both projects focus on reproducible builds: in particular, building or installing a package twice should give the same result byte-for-byte.
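To make the byte-for-byte criterion concrete, here is a minimal sketch of how two build output directories could be compared. It is not how Guix or Nix check reproducibility internally; the helper names `tree_digest` and `is_reproducible` are hypothetical, and the sketch ignores file metadata such as permissions and timestamps.

```python
import hashlib
from pathlib import Path


def tree_digest(root: Path) -> str:
    """Return one SHA-256 digest covering the relative path and the
    contents of every regular file under root."""
    h = hashlib.sha256()
    relative_paths = sorted(
        p.relative_to(root).as_posix() for p in root.rglob("*") if p.is_file()
    )
    for rel in relative_paths:
        h.update(rel.encode("utf-8"))
        h.update(b"\0")
        # Hash each file separately so the combined input is unambiguous.
        h.update(hashlib.sha256((root / rel).read_bytes()).digest())
    return h.hexdigest()


def is_reproducible(build_a: Path, build_b: Path) -> bool:
    """True if both build trees contain the same files with identical bytes."""
    return tree_digest(build_a) == tree_digest(build_b)
```

Calling `is_reproducible` on the outputs of two independent builds of the same package gives a quick yes/no answer; finding which file differs would require comparing per-file digests instead.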

Efforts were started at Days 77 to package SageMath for Guix (Andreas Enge) and Nix (Julian Rüth). Andreas managed to make a Guix recipe for SageMath, and run it, but at the moment it is not up to Guix standards.

Citing Julian: "I am not sure that Nix/Guix will in the end help much with the problem of distributing or developing Sage. Anyway, if we can get to the point where an unpatched Sage builds in the very restrictive setting those two impose, then it should be relatively easy to build for any distribution."

Anaconda

Anaconda is a user-space distribution and package manager for scientific software. Born in the Python ecosystem, it is becoming a de facto standard for scientific software.

Anaconda is mostly oriented towards binary packages, though Erik noted that nothing prevents shipping source packages with it. At the moment, no serious experiment with packaging SageMath for Anaconda has been done, but there was a consensus at Days 77 that having SageMath in Anaconda is highly desirable, because of the overlapping interests with the Anaconda community, and because it has the potential to bridge the fracture between pure and applied math communities.

Common obstacles to packaging SageMath for distributions

SageMath as a distribution, candidates to replace the SPKG system

The second topic addressed at Sage Days 77 was internal packaging. SageMath has its own package system (SPKGs), with its pros and cons. Here are some common complaints about spkgs:

The workshop investigated possible alternatives to the spkg system. Two mostly orthogonal goals for such a system are:

The two goals are not necessarily achieved by the same system. For example, Anaconda is a very good candidate for the first one, but it does very little for the second (and, potentially, it makes it worse). However, nothing prevents having two systems complementing each other (except that two such systems might not exist yet).

Wanted features for an spkg replacement are:

Desirable features:

The following systems were considered at Days 77: Pip/PyPI, Anaconda, Guix/Nix, Gentoo prefix, hashdist.

Pip/PyPI

Pip is NOT a package manager. Pip is just a Python module installer: it does very little to help install non-Python dependencies, and is not very smart about version handling.

However, many components of SageMath are stock Python modules available on PyPI, and they could be installed with pip install. Work on this is underway, see #20218.
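As a rough illustration of what installing stock components through pip might involve, the sketch below checks which distributions are absent from the current Python environment and builds the corresponding pip command. It is not the approach taken in #20218; the helper names `missing_distributions` and `pip_install_command` are hypothetical, and the list of component names is left to the caller.

```python
import sys
from importlib import metadata


def missing_distributions(names):
    """Return the names for which no installed distribution metadata
    is found in the current environment."""
    missing = []
    for name in names:
        try:
            metadata.version(name)
        except metadata.PackageNotFoundError:
            missing.append(name)
    return missing


def pip_install_command(names):
    """Build (without running) the pip command that would install names
    using the interpreter that is currently executing."""
    return [sys.executable, "-m", "pip", "install", *names]
```

The command is only constructed, not executed, so the caller can decide whether to run it, for example with `subprocess.run`. Note that this does nothing for the non-Python dependencies mentioned above.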

A common wish in the community is that more SageMath components which would be useful outside SageMath be shipped as separate Python/Cython modules on PyPI, so they benefit a larger community. This has recently happened with CySignals, and is happening with the Pari interface.

To some extent, pip+PyPI already offer a way for users to distribute SageMath code via sage -pip. However, this is not well documented, and not explicitly supported.

Anaconda

Anaconda is a user-space distribution and package manager for scientific software. Born in the Python ecosystem, it is becoming a de facto standard for scientific software.

Its most interesting features are:

- Supports Linux, Windows, Mac.
- There is a condahub (binstar) where people can submit their packages, and communities create channels (see also Anaconda cloud).
- Very advanced dependency handling. Its developers say "Anaconda has solved the packaging problem".

However, as it is put here:

"Of course, solving the packaging problem and removing it are different things. Conda does not make it easier to compile difficult packages. It only makes it so that fewer people have to do it. And there is still work to be done before Conda really takes over the world."

Because Anaconda is mostly oriented towards binary packages, it does very little to help developers handle a modular distribution such as SageMath (it is possible in principle to package sources for Anaconda, though). Some people have explored options to make it easy to compile complex distributions, while (semi-)automatically generating Anaconda packages. Some pointers here:

although this seems to have stalled for the moment.

To some extent, with respect to its host system, Anaconda is a monolith as much as SageMath, albeit with larger adoption, better integration, and a better packaging system. Among other things, transitioning from spkgs to Anaconda would shift the "monolith blame" from SageMath to Anaconda, which would not be bad.

Guix/Nix

Gentoo prefix

hashdist

Discussions outside Days 77

Roughly in parallel with Days 77, a great deal of discussion on packaging-related topics took place in sage-devel:

It would be extremely useful to summarize these discussions, but a better place for this would be a separate wiki page.