Canadam 2009 - Sage-words Library
system:sage

<h1 style="text-align: center;">The Sage-words library</h1>
<p style="text-align: center;">by S&eacute;bastien Labb&eacute; and Arnaud Bergeron</p>
<p style="text-align: center;">inspired from a Sage worksheet written by Franco Saliola in November 2008</p>
<h1 style="text-align: center;">Canadam 2009</h1>
<p style="text-align: center;">Montr&eacute;al, May 25th, 2009</p>
<h1>Outline</h1>
<ol>
<li>This talk's objective</li>
<li>A word on Sage</li>
<li>An historic of the sage-words library at LaCIM</li>
<li>New Design to come</li>
<li>Many examples</li>
<li>An example on the study of palindrome complexity</li>
<li>For more informations</li>
</ol>
<h1>This talk's objectives</h1>
<ul>
<li>Existence of an open-source library for research in combinatorics on words.</li>
<li>The community is encourage to use it for their research and improve it.</li>
<li>Give many examples.</li>
<li>Explain how to get more informations.</li>
</ul>
<p>Attention : In 25 minutes, we do <strong>not </strong>have time enough to :</p>
<ul>
<li>Explain Python and Sage syntax</li>
<li>Show many cool possibilities of Sage</li>
<li>Cython</li>
<li>sagetex</li>
</ul>
<h1>A word on sage</h1>
<ul>
<li>Sage is a free open-source  mathematics software system licensed under the GPL. It combines the power of many existing open-source packages into a common Python-based interface.  Its mission is to create a viable free open source alternative to Magma, Maple, Mathematica and Matlab.</li>
<li>Started at Harvard in January 2005 by William Stein.</li>
<li>There are currently <span id="contrib-nb">143</span> contributors in <span id="contrib-places">86</span> different places from   all around the world. </li>
<li>An introductory talk on sage is usely one hour.</li>
</ul>

{{{id=221|
m = matrix(2, [1,2,3,4])
m
///




[1 2]
[3 4]
}}}

{{{id=220|
m.determinant()
///




-2
}}}

{{{id=225|
m.determinant?
///





<built-in method determinant of sage.matrix.matrix_integer_dense.Matrix_integer_dense object at 0xbd6fa4c>
}}}

{{{id=226|
latex(m)
///




\left(\begin{array}{rr}
1 & 2 \\
3 & 4
\end{array}\right)
}}}

{{{id=246|
m.parent()
///




Full MatrixSpace of 2 by 2 dense matrices over Integer Ring
}}}

{{{id=247|

///
}}}

{{{id=227|

///
}}}

<h1>Historic of the combinatorics on words library at LaCIM, UQAM<br /></h1>
<ul>
<li> <span style="font-size: medium;">1990 : Srecko Brlek developped the idea of a tool to study Combinatorics on words. </span></li>
</ul>
<ul>
<li><span style="font-size: medium;">1995 : Patricia Lamas, a student of Brlek, implemented a set of functions for words in the language Scheme. </span></li>
</ul>
<ul>
<li><span style="font-size: medium;">1996 : Workshop organized in Montreal by Brlek (Val&eacute;rie Berth&eacute;, Julien Cassaigne, Sebastien Ferenczi, Michel Koskas, Dominique Bernardi, Jean Paul Allouche, etc) and discussion about a project named CABAC.<br /></span></li>
</ul>
<ul>
<li><span style="font-size: medium;">1997 : Annie Ladouceur, student supervised by Brlek and supported by Dominique Bernardi, Michel Koskas and Jean-Paul Allouche, rewrote in C a combinatorics on words library. </span></li>
</ul>
<ul>
<li><span style="font-size: medium;">2003 : Xavier Proven&ccedil;al continued the precedent work of A. Ladouceur. </span></li>
</ul>
<ul>
<li><span style="font-size: medium;">2006 : Arnaud Bergeron translated into Ruby language all the work of Annie Ladouceur while correcting many bugs. Thierry Montheil (LIRMM) showed an interest. <em>This Ruby package didn't have any interface and the lack of friendliness was also blocking its use. </em></span></li>
</ul>
<ul>
<li><span style="font-size: medium;">Summer 2008: Arnaud Bergeron (Ruby package), Franco Saliola (suffix tree python code), Amy Glen (episturmian code) and S&eacute;bastien Labb&eacute; (morphisms and palindrome complexity python code) merged their work in what is now called the sage-words project. </span></li>
</ul>
<ul>
<li><span style="font-size: medium;">December 2008 : sage-words got merged into sage-3.2.2.The code showed quickly to be slow.<br /></span></li>
</ul>
<ul>
<li><span style="font-size: medium;">Spring 2009: Franco Saliola and S&eacute;bastien Labb&eacute; rewrote the design of sage-words in order to increase its speed. Vincent Delecroix wrote a C datatype that will improve even more the speed of the library. We hope all this to get merged into Sage in the summer. </span></li>
</ul>
<p>&nbsp;</p>
<p>&nbsp;</p>

{{{id=218|
plot(sin(1/x), (x,0,1))
///





<html><font color='black'><img src='cell://sage0.png'></font></html>
}}}

<h1>New Design (summer 2009)<br /></h1>
<p>Goal : Separate the data structures from the mathematical objects.</p>
<p>Mathematical Objects :</p>
<ul>
<li>Classes of words
<ul>
<li>Combinatorial class of all words</li>
<li>Combinatorial class of all words over a given alphabet </li>
</ul>
</li>
<li>Words
<ul>
<li>Finite words</li>
<li>Infinite words</li>
</ul>
</li>
</ul>
<p>Data Structures :</p>
<ul>
<li>Python lists</li>
<li>Python string</li>
<li>Python tuple</li>
<li>Python functions</li>
<li>Python iterators</li>
<li>C++ vector (by Vincent Delecroix, Marseille)</li>
</ul>

{{{id=228|

///
}}}

{{{id=215|

///
}}}

<h1>Many examples<br /></h1>
<p>&nbsp;</p>
<h2>Finite words from python strings, lists and tuples.</h2>
<p>You can also use the <strong>Word</strong> command to construct a word.</p>

{{{id=92|
Word("abbabaab")
///




word: abbabaab
}}}

{{{id=129|
Word([0,1,1,0,1,0,0,1])
///




word: 01101001
}}}

{{{id=195|
Word( ('a', 0, 5, 7, 'b', 9, 8) )
///




word: a057b98
}}}

{{{id=196|

///
}}}

<h2>Finite words from words.</h2>

Words can be concatenated.

{{{id=31|
u = Word("abcccabba")
v = Word([0, 4, 8, 8, 3])
///
}}}

{{{id=131|
u * v
///




word: abcccabba04883
}}}

{{{id=132|
v * u
///




word: 04883abcccabba
}}}

{{{id=230|
u + v
///




word: abcccabba04883
}}}

{{{id=231|
u^3 * v^(8/5)
///




word: abcccabbaabcccabbaabcccabba04883048
}}}

{{{id=97|

///
}}}

<h2>Infinite words from finite words.</h2>

{{{id=233|
vv = v^Infinity
vv
///




word: 0488304883048830488304883048830488304883...
}}}

{{{id=33|

///
}}}

<h2>Finite words from infinite words.</h2>
<p>If you have an infinite word, then you can slice it to get a finite word.</p>

{{{id=99|
vv[10000:10015]
///




word: 048830488304883
}}}

{{{id=237|

///
}}}

{{{id=32|

///
}}}

<h1>Constructing infinite words.</h1>
<h2>Infinite words from functions.</h2>
<p>An <strong>infinite word</strong> can be described by a function $f:\mathbb{N}\rightarrow A$ that takes values in the alphabet $A$.</p>

{{{id=137|
def f(n):
    return n%3
///
}}}

{{{id=138|
u = Word(f)
u
///




word: 0120120120120120120120120120120120120120...
}}}

{{{id=139|
u[:13]
///




word: 0120120120120
}}}

{{{id=248|

///
}}}

{{{id=141|

///
}}}

{{{id=140|
def t(n):
    return add(Integer(n).digits(base=2)) % 2
///
}}}

{{{id=142|
tm = Word(t, alphabet = [0, 1])
tm
///




word: 0110100110010110100101100110100110010110...
}}}

{{{id=143|
tm[:37]
///




word: 0110100110010110100101100110100110010
}}}

{{{id=244|

///
}}}

{{{id=243|
Word(lambda n : add(Integer(n).digits(base=2)) % 2, alphabet = [0, 1])
///




word: 0110100110010110100101100110100110010110...
}}}

{{{id=242|

///
}}}

{{{id=245|

///
}}}

<p>As matrix and many other sage objects, words have a parent.</p>

{{{id=241|
u.parent()
///




Words
}}}

{{{id=144|
tm.parent()
///




Words over Ordered Alphabet [0, 1]
}}}

{{{id=239|

///
}}}

<h2>Collection of all words over an alphabet.</h2>
<p>To create the collection of all words over an alphabet, use the <strong>Words</strong> command.</p>

{{{id=41|
Words([0,1,2])
///




Words over Ordered Alphabet [0, 1, 2]
}}}

{{{id=43|
A = Words("ab")
A
///




Words over Ordered Alphabet ['a', 'b']
}}}

To create a word in this set, pass data that describes the word.

{{{id=93|
A("abbabaab")
///




word: abbabaab
}}}

{{{id=133|
A(["a","b","b","a","b","a","a","b"])
///




word: abbabaab
}}}

{{{id=134|

///
}}}

{{{id=136|
W = Words([0,1,2], length=3)
W
///




Finite Words of length 3 over Ordered Alphabet [0, 1, 2]
}}}

{{{id=135|
W.list()
///




[word: 000, word: 001, word: 002, word: 010, word: 011, word: 012, word: 020, word: 021, word: 022, word: 100, word: 101, word: 102, word: 110, word: 111, word: 112, word: 120, word: 121, word: 122, word: 200, word: 201, word: 202, word: 210, word: 211, word: 212, word: 220, word: 221, word: 222]
}}}

{{{id=250|

///
}}}

{{{id=42|

///
}}}

<h2>Words from iterators.</h2>
<p>Words (finite or infinite) can be constructed using an iterative process.</p>

{{{id=251|
it=iter('abc')
///
}}}

{{{id=252|
it.next()
///




'a'
}}}

{{{id=256|
it.next()
///




'b'
}}}

{{{id=255|
it.next()
///




'c'
}}}

{{{id=257|
it.next()
///




Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/slabbe/.sage/sage_notebook/worksheets/admin/10/code/253.py", line 6, in <module>
    exec compile(ur'it.next()' + '\n', '', 'single')
  File "/home/slabbe/sage-3.4.2/local/lib/python2.5/site-packages/Jinja-1.2-py2.5-linux-i686.egg/", line 1, in <module>
    
StopIteration
}}}

{{{id=259|
Word( iter('abbccdef') )
///




word: abbccdef
}}}

{{{id=260|

///
}}}

{{{id=258|
from itertools import count,repeat
Word( repeat(4) )
///




word: 4444444444444444444444444444444444444444...
}}}

{{{id=261|
Word( count() )
///




word: 0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,...
}}}

{{{id=254|

///
}}}

<h2>Infinite words from morphisms.</h2>
<p>Let $A=\{a,b\}$ and $\mu : A^* \rightarrow A^*$ be the morphism defined by $a\mapsto ab, b\mapsto ba$.</p>

{{{id=22|
mu = WordMorphism('a->ab,b->ba'); mu
///




Morphism from Words over Ordered Alphabet ['a', 'b'] to Words over Ordered Alphabet ['a', 'b']
}}}

{{{id=203|
print mu
///




WordMorphism: a->ab, b->ba
}}}

{{{id=21|
mu('a')
///




word: ab
}}}

{{{id=20|
mu('a', order=2)
///




word: abba
}}}

{{{id=19|
mu('a', order=3)
///




word: abbabaab
}}}

{{{id=1|
mu('a', order=4)
///




word: abbabaabbaababba
}}}

{{{id=172|
mu('a', order=5)
///




word: abbabaabbaababbabaababbaabbabaab
}}}

{{{id=104|
tm = mu('a', order=Infinity)
tm
///




Fixed point beginning with 'a' of the morphism WordMorphism: a->ab, b->ba
}}}

{{{id=159|
tm[:37]
///




word: abbabaabbaababbabaababbaabbabaabbaaba
}}}

{{{id=25|

///
}}}

<h2>Pre-defined words.</h2>

{{{id=45|
words.FibonacciWord()
///




Fibonacci word over Ordered Alphabet [0, 1], defined recursively
}}}

{{{id=46|
words.FibonacciWord("ab")
///




Fibonacci word over Ordered Alphabet ['a', 'b'], defined recursively
}}}

{{{id=2|
words.ThueMorseWord("ab")
///




Thue-Morse word over Ordered Alphabet ['a', 'b']
}}}

{{{id=105|
words.FixedPointOfMorphism(mu,'a')
///




Fixed point beginning with 'a' of the morphism WordMorphism: a->ab, b->ba
}}}

{{{id=177|
words.ChristoffelWord(7,3,"xy")
///




Lower Christoffel word of slope 7/3 over Ordered Alphabet ['x', 'y']
}}}

{{{id=178|
words.RandomWord(18,5)
///




word: 311004213304004223
}}}

{{{id=179|
Tribonacci = words.StandardEpisturmianWord(Word('abc'))
Tribonacci
///




Standard episturmian word over Python objects
}}}

{{{id=63|
Tribonacci[:40]
///




word: abacabaabacababacabaabacabacabaabacababa
}}}

{{{id=167|

///
}}}

{{{id=68|

///
}}}

{{{id=113|

///
}}}

<h1>Interrogating words</h1>

{{{id=26|
w = Word('abaabbba'); w
///




word: abaabbba
}}}

{{{id=3|
w.is_palindrome()
///




False
}}}

{{{id=49|
w.is_lyndon()
///




False
}}}

{{{id=54|
print w.lyndon_factorization()
///




(ab.aabbb.a)
}}}

{{{id=55|
print w.crochemore_factorization()
///




(a.b.a.ab.bb.a)
}}}

{{{id=205|
st = w.suffix_tree()
st
///




Implicit Suffix Tree of the word: abaabbba
}}}

{{{id=206|
st.show(word_labels=True)
///




<html><font color='black'><img src='cell://sage0.png'></font></html>
}}}

{{{id=7|
w.number_of_factors()
///




28
}}}

{{{id=50|
w.factor_set()
///




{word: , word: aba, word: baa, word: b, word: ab, word: bba, word: ba, word: abbba, word: aabbba, word: baabbba, word: bb, word: abaabbba, word: a, word: aab, word: baabbb, word: aabb, word: abbb, word: bbba, word: aa, word: abb, word: baab, word: bbb, word: abaa, word: baabb, word: aabbb, word: abaabb, word: abaab, word: abaabbb}
}}}

{{{id=182|
T = words.FibonacciWord('ab')
T.longest_common_prefix(Word('abaabababbbbbb'))
///




word: abaababa
}}}

<h2>Currently available commands</h2>

<p>For words:</p>

{{{id=118|
w = Word('asodhfa')
///
}}}

{{{id=279|
w.[TAB]
///


Syntax Error:
    w.[TAB]
}}}

{{{id=268|
for s in dir(w):
   if not s.startswith("_"):
       print s
///


WARNING: Output truncated!  
<html><a target='_new' href='/home/admin/10/cells/268/full_output.txt' class='file_link'>full_output.txt</a></html>



BWT
alphabet
apply_morphism
apply_permutation_to_letters
apply_permutation_to_positions
border
category
charge
coerce
colored_vector
commutes_with
complete_return_words
concatenate
conjugate
conjugate_position
conjugates
count
critical_exponent
crochemore_factorization
db
defect
deg_inv_lex_less
deg_lex_less
deg_rev_lex_less
degree
delta
delta_derivate
delta_derivate_left
delta_derivate_right
delta_inv
dump
dumps
evaluation
evaluation_dict
evaluation_partition
evaluation_sparse
exponent
factor_iterator
factor_occurrences_in
factor_set
find
first_pos_in
freq
good_suffix_table
implicit_suffix_tree
inv_lex_less
inversions
is_balanced
is_cadence
is_conjugate_with
is_cube
is_cube_free
is_empty
is_factor
is_factor_of
is_full
is_lyndon
is_overlap
is_palindrome

...

is_suffix
is_suffix_of
is_symmetric
iterated_left_palindromic_closure
iterated_palindromic_closure
iterated_right_palindromic_closure
lacunas
last_position_dict
last_position_table
length
length_border
lengths_lps
lengths_unioccurrent_lps
letters
lex_greater
lex_less
longest_common_prefix
longest_common_suffix
lps
lyndon_factorization
minimal_period
nb_factor_occurrences_in
nb_subword_occurrences_in
number_of_factors
order
overlap_partition
palindromes
palindromic_closure
palindromic_lacunas_study
parent
parikh_vector
phi
phi_inv
prefix_function_table
primitive
primitive_length
quasiperiods
rename
reset_name
return_words
return_words_derivate
rev_lex_less
reversal
rfind
save
shifted_shuffle
shuffle
size_of_alphabet
standard_factorization
standard_factorization_of_lyndon_factorization
standard_permutation
string_rep
suffix_tree
suffix_trie
swap
swap_decrease
swap_increase
to_integer_list
to_integer_word
version
}}}

<p>For classes of words :</p>

{{{id=270|
W = Words('ab')
///
}}}

{{{id=276|
W.[TAB]
///



Syntax Error:
    W.[TAB]
}}}

{{{id=271|
W.size_of_alphabet()
///




2
}}}

<p>For morphisms:</p>

{{{id=273|
m = WordMorphism({'a':'ab','b':'ababb'})
///
}}}

{{{id=275|
m.[TAB]
///


Syntax Error:
    m.[TAB]
}}}

{{{id=272|
for n in m.list_of_conjugates(): print m
///



WordMorphism: a->ab, b->ababb
WordMorphism: a->ab, b->ababb
WordMorphism: a->ab, b->ababb
WordMorphism: a->ab, b->ababb
WordMorphism: a->ab, b->ababb
WordMorphism: a->ab, b->ababb
}}}

{{{id=265|

///
}}}

<h2>Get help.</h2>

{{{id=280|
Word?
///
}}}

{{{id=277|
w = Word('abbbbababab')
///
}}}

{{{id=278|
w.[TAB]
///


Syntax Error:
    w.[TAB]
}}}

{{{id=209|
w.number_of_factors?
///




44
}}}

{{{id=181|

///
}}}

<p><strong><span style="font-size: x-large;">A study of palindrome complexity.</span></strong></p>
<p><span style="font-size: medium;">Let $w=w_0w_1w_2\cdots$ be a (finite or infinite) word.</span> Let $Pal(w)$ be the set of all palindromes factors of $w$. We define<span style="font-size: medium;"> the palindrome complexity function</span><span style="font-size: medium;"> of the prefixes of $w$ by</span></p>
<p><span style="font-size: medium;">$$\begin{array}{lcll}g_w : &amp; \mathbb{N}&amp;\rightarrow&amp;\mathbb{N}\\ &amp; i &amp; \mapsto &amp; |Pal(w_0w_1\cdots w_{i-1})| \\ \end{array}</span><span style="font-size: medium;">.$$</span></p>
<p><span style="font-size: medium;">, i.e. the number of distincts palindromes in the prefix of length $i$ of $w$. It was shown by Droubay, Justin and Pirillo (2001) that</span></p>
<p><span style="font-size: medium;">$$|Pal(w)| \leq |w|+1.$$</span></p>
<p><span style="font-size: medium;">Then, Brlek, Hamel, Nivat, Reutenauer (200?) defined the defect $D(w)$ of a word $w$ by </span></p>
<p><span style="font-size: medium;">$$D(w)=|w|+1 - |Pal(w)|.$$</span></p>
<p><span style="font-size: medium;">Fixed point of morphisms are divided into four groups according to their palindrome complexity and defect.<br /></span></p>

{{{id=185|
def palindrome_complexity_function(word):
    r"""
    Returns the palindrome complexity function that given an integer n returns
    the number of palindrome of the prefix of length n.
    """
    liste_zero_un = [1]*word.length()
    for lacuna in word.lacunas():
        liste_zero_un[lacuna] = 0
    liste_sum_partielle = [0]
    sum = 0
    for i in liste_zero_un:
        sum += i
        liste_sum_partielle.append(sum)

    return lambda n: liste_sum_partielle[n]

def discrete_plot(f, domain, **kwds):
    r"""
    Returns a discrete plot of the function f.
    """
    return points([(a,f(a)) for a in domain],**kwds)
///
}}}

{{{id=191|
thue = palindrome_complexity_function(words.ThueMorseWord()[:1000])
fibo = palindrome_complexity_function(words.FibonacciWord()[:1000])
fix = palindrome_complexity_function(words.FixedPointOfMorphism('a->abb,b->ba','a')[:1000])
periodic = palindrome_complexity_function((Word('aababbaabbabaa')^Infinity)[:1000])
///
}}}

{{{id=187|
%hide
@interact
def _(length=(10..1000), bound_c=checkbox(default=False, label='Upper bound complexity (red)'), thue_c=checkbox(default=True, label='Thue-Morse word (blue)'), fibo_c=checkbox(default=False, label='Fibonacci word (green)'), fix_c=checkbox(default=False, label='FixPt of a->abb, b->ba (orange)'), per_c=checkbox(default=False, label='Periodic word (black)')):
    rep = None
    if bound_c:
        upper_bound = line([(0,0),(length,length)], rgbcolor='red')
        rep = upper_bound if rep is None else rep + upper_bound
    if fibo_c:
        p = discrete_plot(fibo, range(length), rgbcolor='green' )
        rep = p if rep is None else rep + p
    if thue_c:
        p = discrete_plot(thue, range(length),rgbcolor='blue' )
        rep = p if rep is None else rep + p
    if fix_c:
        p = discrete_plot(fix, range(length),rgbcolor='orange' )
        rep = p if rep is None else rep + p
    if per_c:
        p = discrete_plot(periodic, range(length),rgbcolor='black' )
        rep = p if rep is None else rep + p
    show(rep)
///
}}}

{{{id=188|

///
}}}

{{{id=211|

///
}}}

{{{id=210|

///
}}}

<h2>For more informations.<br /></h2>
<p>On Sage:</p>
<ul>
<li>http://www.sagemath.org/</li>
<li>http://wiki.sagemath.org/Talks</li>
</ul>
<p>On Sage-Combinat:</p>
<ul>
<li>http://wiki.sagemath.org/combinat/</li>
</ul>
<p>... or ask me this week!!</p>
<p>Next conferences:</p>
<ul>
<li class="gap">July 25-29, 2009: *-Combinat 2009.  An International Sage Workshop on <a class="http" href="http://wiki.sagemath.org/combinat/FPSAC09">Free and Practical Software for Algebraic Combinatorics</a> at RISC, Linz, Austria, right after <a class="http" href="http://www.risc.uni-linz.ac.at/about/conferences/fpsac2009/">FPSAC'09</a> </li>
<li class="gap">
<p class="line862">February 22-26, 2010: <a class="http" href="http://wiki.sagemath.org/daysmarseille">Sage days</a>. The thematic month <a class="https" href="https://www.lirmm.fr/arith/wiki/MathInfo2010/MathInfo2010">MathInfo 2010</a> at CIRM, Marseille will include a Sage days session. <a href="http://wiki.sagemath.org/FlorentHivert">FlorentHivert</a>, <a href="http://wiki.sagemath.org/NicolasThi%C3%A9ry">NicolasThi&eacute;ry</a>, and <a class="nonexistent" href="http://wiki.sagemath.org/FrancoSaliola">FrancoSaliola</a> will be among the organizers, there will be a serious combinatorics slant.</p>
</li>
</ul>

{{{id=212|

///
}}}

{{{id=213|

///
}}}