Fast characteristic polynomial algorithm over cyclotomic fields

UPDATE: A first real version of this is now done. See the first two patches at

The timings for this new code on sage.math are given in the table below under "Sage Multimod". Also a theoretical thing to do is give an a priori provable bound on the number of primes needed to get the charpoly via CRT. Right now it just waits until 3 successive charpolys stabilize. Or rather, adding in 3 primes gives the same charpoly. That will in practice always work, but isn't proved. And, it is probably slower than just starting with the right bound.

Craig: I thought for a minute about a bound for the size of coefficients in the charpoly. I came up with a pretty trivial seeming bound (what I would now call a "Hadamard-type" bound), so I looked it up. I found this paper:

I don't know anything about the people in the world of computational linear algebra, but this guy has written papers with Clement, and the FFLAS website is maintained by him, so I figured it was probably the right thing to be looking at. :) So there are two lemmas in the paper that provide a bound for the size of the largest coefficient in the charpoly. I implemented the first of them, and it looked pretty good. Then I implemented the second, and it was wildly worse! It's presented as though it's a sharpening, but I couldn't figure out how, at least in the examples I tried out. I think they may be doing it for a different reason, but in any event, it seemed like we were better off with the first bound. I left the code for both in, with the second in comments, if you want to try it out. I also put in the naive estimates for n = 1,2,3 (the result in the paper is for n > 3): if there's something better, let me know. Maybe we should just directly compute the charpoly in those cases, since it's easy to write down?

Using this code with 23 replaced by 23, 67, and 199 I benchmarked both Sage and Magma on various machines. Here Sage is just using PARI. For 23, Magma is 10 times faster. For 67, Magma is 5-10 times faster. For 199, Magma and Sage are almost the same. So probably they are using exactly the same algorithm, but Magma is more optimized for the small cases.

ModularSymbols_clear_cache()
eps = DirichletGroup(23*3, CyclotomicField(11)).1^2
M = ModularSymbols(eps); M
t = M.hecke_matrix(2)
time t.charpoly()

Magma code to make the matrix and benchmark charpoly:

eps := DirichletGroup(67*3, CyclotomicField(11)).2^2;
M := ModularSymbols(eps,2,0);
t := HeckeOperator(M,2);
time f := CharacteristicPolynomial(t);

The timings themselves are as follows, all on an Opteron 1.8Ghz.

prime

nrows

Sage/PARI

Magma

Sage Multimod

23

16

0.47s

0.03s

0.11s

67

44

10.23s

1.48s

0.11s

199

132

531s

438s

5.34s

331

220

?

8611.50s

19.22s

The 331 matrix is 220x220, and our new algorithm is already 448 times faster than Magma in that case.

By the way, here are similar timings for computing the square of the matrix T_2, i.e., for matrix multiplication:

prime

Sage

PARI

Magma

23

0.05s

0.01s

0.01s

67

0.43s

0.05s

0.01s

199

9.89s

0.64s

0.06s

So Magma's matrix multiply is vastly superior to what is in Sage, but only because Sage's multiply is very stupid and generic.

Regarding charpoly above, the answer isn't very big. Each coefficient looks like this in size:

-13699065951748748504444162373*zeta_11^9 - 30666629423224882851453031398*zeta_11^8 - 
    17759717829637529333530323750*zeta_11^7 + 18956836030606298117040309088*zeta_11^6 - 
    17759717829637529333530323750*zeta_11^5 - 30666629423224882851453031398*zeta_11^4 - 
    13699065951748748504444162373*zeta_11^3 - 21360349014330060044744277916*zeta_11 - 
    21360349014330060044744277916

Probably that should take at most 350/5 = 70 primes to get using a multimodular algorithm. In magma, charpoly of a random 132x132 matrix over GF(10007) done 70 times takes 0.63 seconds. The same in Sage takes 2.46 seconds. This time plus the overhead of CRT should be about the time it takes to do the charpoly. (On OS X Sage takes 1.07 seconds and Magma 0.43 seconds.) So I don't see why Sage shouldn't be able to do this charpoly in about 10 seconds.

Here is a little demo proof-of-concept illustrating computing the 67 charpoly more quickly by doing it over ZZ and using p-adic reconstruction. I do not think this is the way to go -- I think straight multimodular is -- but this already illustrates that both PARI and Magma are pretty stupid right now.

def padic_cyclotomic_reconstruction(K, w, p, prec, phi):
    n = K.degree()
    zeta = K.gen()
    X = [zeta^i for i in range(n)] + [w]
    A = matrix(ZZ, n + 2, n + 2)
    for i in range(len(X)):
         A[i,i] = 1
         A[i,n+1] = Mod(phi(X[i]), p^prec).lift()
    A[n+1, n+1] = p^(prec-1)
    L = A.LLL()
    #print L
    rr = L[1].copy()
    rr[0] -= rr[-1] 
    alpha = -1/rr[-2]
    lin_comb = rr[:-2]*alpha
    return K(lin_comb.list())


def charpoly_cyclo(A, prec=20, pstart=389):
    # Compute charpoly of A using p-adic cyclotomic algorithm.
    K = A.base_ring()
    n = K.number_of_roots_of_unity()
    p = pstart
    while p % n != 1:
        p = next_prime(p)
    print "p = ", p
    f = K.defining_polynomial()
    C = pAdicField(p, prec)
    R = f.roots(C)
    phi = K.hom(QQ(R[0][0].lift()), check=False)
    pp = p^prec
    B = matrix(QQ, A.nrows(), A.ncols(), [phi(w)%pp for w in A.list()])
    time f = B.charpoly()
    return [padic_cyclotomic_reconstruction(K, w, p, prec, phi) 
               for w in f.list()]
///

ModularSymbols_clear_cache()
eps = DirichletGroup(67*3, CyclotomicField(11)).1^2
M = ModularSymbols(eps); M
t = M.hecke_matrix(2)
time f = t.charpoly()
///

Time: CPU 8.30 s, Wall: 8.78 s

time B = charpoly_cyclo(t, 30, next_prime(10000))
///

p =  10099
Time: CPU 2.26 s, Wall: 2.64 s
CPU time: 3.56 s,  Wall time: 4.14 s

B[0]
///

13383*zeta11^9 + 2223*zeta11^8 + 12771*zeta11^7 + 783*zeta11^6 + 783*zeta11^5 + 12771*zeta11^4 + 2223*zeta11^3 + 13383*zeta11^2 + 24129

f[0]
///

13383*zeta11^9 + 2223*zeta11^8 + 12771*zeta11^7 + 783*zeta11^6 + 783*zeta11^5 + 12771*zeta11^4 + 2223*zeta11^3 + 13383*zeta11^2 + 24129

B[10]
///

-588700556*zeta11^9 - 965692897*zeta11^8 + 754459476*zeta11^7 + 1319584698*zeta11^6 + 754459476*zeta11^5 - 965692897*zeta11^4 - 588700556*zeta11^3 - 2199625369*zeta11 - 2199625369

f[10]
///

-588700556*zeta11^9 - 965692897*zeta11^8 + 754459476*zeta11^7 + 1319584698*zeta11^6 + 754459476*zeta11^5 - 965692897*zeta11^4 - 588700556*zeta11^3 - 2199625369*zeta11 - 2199625369

cyclo/charpoly (last edited 2019-11-23 17:39:01 by chapoton)