Within Sage, these are the applications of Cython:
To quickly experiment with Cython, the easiest is to use the cython() function:
{{{id=1| %time cython(""" print "Hello from Cython" """) /// }}}Note that this takes a long wall time, because this code needs to be compiled.
Let's implement trial division in Python:
{{{id=6| def trialdivision_py(n): i = Integer(2) while i*i <= n: if n % i == 0: return i i = i+1 /// }}}And now exactly the same code in Cython:
{{{id=17| cython(""" from sage.rings.integer import Integer def trialdivision(n): i = Integer(2) while i*i <= n: if n % i == 0: return i i = i+1 """) /// }}}Just compiling this code in Cython already gives a modest speed-up:
{{{id=18| n = next_prime(10^7)*next_prime(10^8) /// }}} {{{id=20| %time trialdivision_py(n) /// }}} {{{id=19| %time trialdivision(n) /// }}}The previous example was compiled in Cython, but was still using the Python API. We can speed up things massively using native C types:
{{{id=21| cython(""" def trialdivision(unsigned long long n): cdef unsigned long long i = 2 while i*i <= n: if n % i == 0: return i i = i+1 """) /// }}} {{{id=22| %time trialdivision(n) /// }}}Unfortunately, this code cannot be interrupted with CTRL-C or alarm():
{{{id=26| %time try: alarm(0.1) trialdivision(next_prime(10^17)) except KeyboardInterrupt: pass /// }}}To solve this problem, we add a sig_check() inside the loop (much more about this later!)
{{{id=28| cython(""" def trialdivision(unsigned long long n): cdef unsigned long long i = 2 while i*i <= n: sig_check() if n % i == 0: return i i = i+1 """) /// }}} {{{id=29| %time try: alarm(0.1) trialdivision(next_prime(10^17)) except KeyboardInterrupt: pass /// }}}Division by zero is caught as usual when executing this:
{{{id=9| cython(""" def trialdivision(unsigned long long n): cdef unsigned long long i = 0 # MISTAKE! while i*i <= n: sig_check() if n % i == 0: return i i = i+1 """) /// }}} {{{id=35| trialdivision(n) /// }}}This check for division-by-zero (and also the different rounding when dividing negative numbers) causes some runtime overhead.
To get pure C division, use the cdivision=True compiler directive:
The default in Cython is cdivision=False, but it is cdivision=True for Sage source code.
There are similar directives for list indexing: boundscheck and wraparound.
The code with unsigned long long is limited to 64 bits:
{{{id=43| trialdivision(2^100-1) /// }}}We will now write the trial division code using Sage's Integer type:
{{{id=46| cython(""" from sage.rings.integer cimport Integer def trialdivision(Integer n): cdef Integer i = Integer(2) while i*i <= n: sig_check() if n % i == 0: return i i = i+1 """) /// }}}Just this change doesn't speed up anything:
{{{id=48| %time trialdivision(n) /// }}}We are no longer limited to 64 bits:
{{{id=60| trialdivision(2^100) /// }}}Note the cimport statement. This statement looks at the file src/sage/rings/integer.pxd. That file contains declarations of the Integer type. It should contain all declarations of cdef, cpdef functions and attributes which are meant to be used by different modules.
Let us optimize the addition i = i+1 by using the cdef method _add_long:
{{{id=49| cython(""" from sage.rings.integer cimport Integer def trialdivision(Integer n): cdef Integer i = Integer(2) while i*i <= n: sig_check() if n % i == 0: return i i = i._add_long(1) """) /// }}} {{{id=50| %time trialdivision(n) /// }}}Integers in Sage are implemented using the MPIR library (a fork of GMP). Note the cdef attribute cdef mpz_t value. Here, mpz_t is some C type defined by MPIR. We optimize the divisibility check using the low-level MPIR function mpz_divisible_p. First, we need to cimport that function (I'm lazy so I just cimport everything from sage.libs.gmp.mpz).
{{{id=55| cython(""" from sage.libs.gmp.mpz cimport * from sage.rings.integer cimport Integer def trialdivision(Integer n): cdef Integer i = Integer(2) while i*i <= n: sig_check() if mpz_divisible_p(n.value, i.value): return i i = i._add_long(1) """) /// }}} {{{id=56| %time trialdivision(n) /// }}}The construction <typename>obj can be used to "cast" obj to typename. This is literally a cast like (typename)obj in C. There is also <typename?>obj which is a checked cast: it raises TypeError if obj is of the wrong type. Note the difference between
In Cython you can define Python functions the usual way using def. You can also define a cdef function, which is compiled as a C function without Python calling overhead. A cdef function cannot be imported, it must be cimported (and declared in the .pxd file). Within one Cython module, there is no difference between these two types of function. There is also cdef inline for inline functions.
There is also a cpdef function, which behaves like a cdef function with a Python wrapper. In the case of methods, a cpdef function can be overwritten by a def method (a cdef method only by a cdef method). That is why a class inheriting from Element can be implemented in Cython or Python and _mul_ will "just work".
{{{id=62| cython(""" from sage.rings.integer cimport Integer cdef inline bint is_divisible(n,d): return n % d == 0 def trialdivision(Integer n): cdef Integer i = Integer(2) while i*i <= n: sig_check() if is_divisible(n,i): return i i = i+1 """) /// }}}Note the use of the special type bint. In a C context, it behaves exactly like int, but it gets converted to Python as True/False. For the example above, there is no difference, but using bint is more explicit about the meaning.
{{{id=66| trialdivision(n) /// }}}Cython allows you to define a class just like in Python. However, with cdef class, you can define a so-called "extension type". These are much faster than classes, but do not support all features of classes (like arbitrary attribute assignment):
{{{id=67| Integer.foo = 1 /// }}}Methods can be defined using def, cdef or cpdef. Attributes can be defined using cdef (Cython only), cdef public (Cython + Python), cdef readonly (Cython + read-only in Python):
{{{id=74| cython(""" cdef class MyType(object): cdef str mystr cdef public long myint cdef readonly list mylist def __init__(self, x, y, z): self.mystr = x self.myint = y self.mylist = z def __repr__(self): return "MyType({}, {}, {})".format(self.mystr, self.myint, self.mylist) """) /// }}} {{{id=78| x = MyType("Hello world", 42, range(10)) /// }}} {{{id=91| x /// }}} {{{id=80| x.mystr /// }}} {{{id=85| x.myint /// }}} {{{id=86| x.myint = 43.0 /// }}} {{{id=87| x /// }}} {{{id=88| type(x.myint) /// }}} {{{id=79| x.mylist /// }}} {{{id=84| x.mylist = [] /// }}}