Cython

Jeroen Demeyer (Universiteit Gent)
 
Disclaimer: I am not a Cython developer!

What is Cython?

Uses of Cython in Sage

Within Sage, these are the applications of Cython:

Getting started

To quickly experiment with Cython, the easiest is to use the cython() function:

{{{id=1| %time cython(""" print "Hello from Cython" """) /// }}}

Note that this takes a long wall time, because this code needs to be compiled.

Let's implement trial division in Python:

{{{id=6| def trialdivision_py(n): i = Integer(2) while i*i <= n: if n % i == 0: return i i = i+1 /// }}}

And now exactly the same code in Cython:

{{{id=17| cython(""" from sage.rings.integer import Integer def trialdivision(n): i = Integer(2) while i*i <= n: if n % i == 0: return i i = i+1 """) /// }}}

Just compiling this code in Cython already gives a modest speed-up:

{{{id=18| n = next_prime(10^7)*next_prime(10^8) /// }}} {{{id=20| %time trialdivision_py(n) /// }}} {{{id=19| %time trialdivision(n) /// }}}

Using C types

The previous example was compiled in Cython, but was still using the Python API. We can speed up things massively using native C types:

{{{id=21| cython(""" def trialdivision(unsigned long long n): cdef unsigned long long i = 2 while i*i <= n: if n % i == 0: return i i = i+1 """) /// }}} {{{id=22| %time trialdivision(n) /// }}}

Interrupts

Unfortunately, this code cannot be interrupted with CTRL-C or alarm():

{{{id=26| %time try: alarm(0.1) trialdivision(next_prime(10^17)) except KeyboardInterrupt: pass /// }}}

To solve this problem, we add a sig_check() inside the loop (much more about this later!)

{{{id=28| cython(""" def trialdivision(unsigned long long n): cdef unsigned long long i = 2 while i*i <= n: sig_check() if n % i == 0: return i i = i+1 """) /// }}} {{{id=29| %time try: alarm(0.1) trialdivision(next_prime(10^17)) except KeyboardInterrupt: pass /// }}}

Compiler directives

Division by zero is caught as usual when executing this:

{{{id=9| cython(""" def trialdivision(unsigned long long n): cdef unsigned long long i = 0 # MISTAKE! while i*i <= n: sig_check() if n % i == 0: return i i = i+1 """) /// }}} {{{id=35| trialdivision(n) /// }}}

This check for division-by-zero (and also the different rounding when dividing negative numbers) causes some runtime overhead.

To get pure C division, use the cdivision=True compiler directive:

{{{id=41| cython(""" cimport cython @cython.cdivision(True) def trialdivision(unsigned long long n): cdef unsigned long long i = 0 # MISTAKE! while i*i <= n: sig_check() if n % i == 0: return i i = i+1 """) /// }}} {{{id=38| trialdivision(n) /// }}}

The default in Cython is cdivision=False, but it is cdivision=True for Sage source code.

There are similar directives for list indexing: boundscheck and wraparound.

More about type declarations

The code with unsigned long long is limited to 64 bits:

{{{id=43| trialdivision(2^100-1) /// }}}

We will now write the trial division code using Sage's Integer type:

{{{id=46| cython(""" from sage.rings.integer cimport Integer def trialdivision(Integer n): cdef Integer i = Integer(2) while i*i <= n: sig_check() if n % i == 0: return i i = i+1 """) /// }}}

Just this change doesn't speed up anything:

{{{id=48| %time trialdivision(n) /// }}}

We are no longer limited to 64 bits:

{{{id=60| trialdivision(2^100) /// }}}

Note the cimport statement. This statement looks at the file src/sage/rings/integer.pxd. That file contains declarations of the Integer type. It should contain all declarations of cdef, cpdef functions and attributes which are meant to be used by different modules.

Let us optimize the addition i = i+1 by using the cdef method _add_long:

{{{id=49| cython(""" from sage.rings.integer cimport Integer def trialdivision(Integer n): cdef Integer i = Integer(2) while i*i <= n: sig_check() if n % i == 0: return i i = i._add_long(1) """) /// }}} {{{id=50| %time trialdivision(n) /// }}}

Integers in Sage are implemented using the MPIR library (a fork of GMP). Note the cdef attribute cdef mpz_t value. Here, mpz_t is some C type defined by MPIR. We optimize the divisibility check using the low-level MPIR function mpz_divisible_p. First, we need to cimport that function (I'm lazy so I just cimport everything from sage.libs.gmp.mpz).

{{{id=55| cython(""" from sage.libs.gmp.mpz cimport * from sage.rings.integer cimport Integer def trialdivision(Integer n): cdef Integer i = Integer(2) while i*i <= n: sig_check() if mpz_divisible_p(n.value, i.value): return i i = i._add_long(1) """) /// }}} {{{id=56| %time trialdivision(n) /// }}}

Type casts

The construction <typename>obj can be used to "cast" obj to typename. This is literally a cast like (typename)obj in C. There is also <typename?>obj which is a checked cast: it raises TypeError if obj is of the wrong type. Note the difference between

{{{id=95| cython(""" from sage.rings.integer cimport Integer cpdef Integer nocast(x): return x cpdef Integer cast(x): return x cpdef Integer checked_cast(x): return x cpdef Integer convert(x): return Integer(x) """) /// }}} {{{id=101| nocast(5) /// }}} {{{id=102| nocast(int(5)) /// }}} {{{id=103| nocast(None) /// }}} {{{id=100| cast(int(5)) /// }}} {{{id=107| type(cast(int(5))) /// }}} {{{id=99| checked_cast(5) /// }}} {{{id=105| checked_cast(QQ(5)) /// }}} {{{id=104| checked_cast(None) /// }}} {{{id=106| convert(5.0) /// }}}

cdef and cpdef functions

In Cython you can define Python functions the usual way using def. You can also define a cdef function, which is compiled as a C function without Python calling overhead. A cdef function cannot be imported, it must be cimported (and declared in the .pxd file). Within one Cython module, there is no difference between these two types of function. There is also cdef inline for inline functions.

There is also a cpdef function, which behaves like a cdef function with a Python wrapper. In the case of methods, a cpdef function can be overwritten by a def method (a cdef method only by a cdef method). That is why a class inheriting from Element can be implemented in Cython or Python and _mul_ will "just work".

{{{id=62| cython(""" from sage.rings.integer cimport Integer cdef inline bint is_divisible(n,d): return n % d == 0 def trialdivision(Integer n): cdef Integer i = Integer(2) while i*i <= n: sig_check() if is_divisible(n,i): return i i = i+1 """) /// }}}

Note the use of the special type bint. In a C context, it behaves exactly like int, but it gets converted to Python as True/False. For the example above, there is no difference, but using bint is more explicit about the meaning.

{{{id=66| trialdivision(n) /// }}}

Extension types

Cython allows you to define a class just like in Python. However, with cdef class, you can define a so-called "extension type". These are much faster than classes, but do not support all features of classes (like arbitrary attribute assignment):

{{{id=67| Integer.foo = 1 /// }}}

Methods can be defined using def, cdef or cpdef. Attributes can be defined using cdef (Cython only), cdef public (Cython + Python), cdef readonly (Cython + read-only in Python):

{{{id=74| cython(""" cdef class MyType(object): cdef str mystr cdef public long myint cdef readonly list mylist def __init__(self, x, y, z): self.mystr = x self.myint = y self.mylist = z def __repr__(self): return "MyType({}, {}, {})".format(self.mystr, self.myint, self.mylist) """) /// }}} {{{id=78| x = MyType("Hello world", 42, range(10)) /// }}} {{{id=91| x /// }}} {{{id=80| x.mystr /// }}} {{{id=85| x.myint /// }}} {{{id=86| x.myint = 43.0 /// }}} {{{id=87| x /// }}} {{{id=88| type(x.myint) /// }}} {{{id=79| x.mylist /// }}} {{{id=84| x.mylist = [] /// }}}

Final notes

{{{id=92| /// }}}