Speed up fmpz_mat_charpoly #2691
Open
fredrik-johansson wants to merge 4 commits into
Member
A few small typos I noticed in this PR:
Summary:
- We add `fmpz_mat_charpoly_bound` and use this in `fmpz_mat_charpoly_modular`. Instead of the old crude bound based on the height of the matrix, we compute Hadamard bounds for the determinants in the trace sum formula for the charpoly. This still costs only $O(n^2)$, gives a slightly better bound for uniform matrices, and gives dramatically better bounds for sparse and non-uniform matrices.
- We select appropriately between Berkowitz and the modular algorithm for small matrices in `fmpz_mat_charpoly` (previously the modular algorithm was always used for $n \ge 4$), speeding up small matrices and matrices with huge entries.
- The modular algorithm is reimplemented using asymptotically fast modular reduction and CRT, so the complexity scales quasi-linearly with the bit size of the output and not quasi-quadratically.
- We make the modular algorithm multithreaded.
- Add helper function `_fmpz_vec_multi_CRT_ui`.

Part of the code was written with the help of Claude.
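
To illustrate the coefficient bound mentioned in the summary, here is a minimal Python sketch. It is not the PR's C code and makes an assumption about how the bound is formed: each principal $k \times k$ minor is bounded via Hadamard's inequality by the product of the full row norms, so the sum over all such minors is bounded by the elementary symmetric polynomial $e_k$ of the row norms. The function name `charpoly_coeff_bounds` is hypothetical, and the actual `fmpz_mat_charpoly_bound` may use a sharper or differently organized estimate.

```python
from math import isqrt


def charpoly_coeff_bounds(A):
    """Return [b_0, ..., b_n] with |coeff of x^(n-k) in charpoly(A)| <= b_k.

    Assumption (not taken from the PR): every principal k x k minor is
    bounded by the product of the Euclidean norms of the corresponding
    full rows, rounded up to integers.
    """
    n = len(A)

    # Integer upper bounds on the Euclidean row norms: O(n^2) work.
    norms = []
    for row in A:
        s = sum(x * x for x in row)
        r = isqrt(s)
        if r * r < s:  # round up so that r >= sqrt(s)
            r += 1
        norms.append(r)

    # Elementary symmetric polynomials e_0..e_n of the norms, obtained by
    # expanding prod (1 + r_i * t) one factor at a time: O(n^2) work.
    e = [1] + [0] * n
    for i, r in enumerate(norms):
        for k in range(i + 1, 0, -1):
            e[k] += r * e[k - 1]
    return e


if __name__ == "__main__":
    # charpoly of [[1, 2], [3, 4]] is x^2 - 5x - 2
    print(charpoly_coeff_bounds([[1, 2], [3, 4]]))
```

Both loops are quadratic in $n$, which is consistent with the $O(n^2)$ cost quoted above. Rows with few or small entries contribute small factors, which is why a bound of this shape can be much tighter than a height-based bound for sparse or non-uniform input.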
Not done: detecting special cases (other than the zero matrix). One could easily detect triangular and Hessenberg matrices and handle these specially. I think nilpotent and single-eigenvalue (charpoly $(x-c)^n$) matrices could also be detected and certified faster than the general algorithm (with negligible overhead for generic input). Other cases where the characteristic polynomial is very sparse and has much smaller coefficients than the generic bound might also be doable with early termination based on direct verification, though this gets a bit more complicated (especially to ensure that this is actually only done when it is faster and that the generic case doesn't slow down).
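
As a rough idea of what the triangular and Hessenberg checks could look like, here is a small Python sketch. It is not part of this PR; the function names are hypothetical and only the upper triangular and upper Hessenberg shapes are covered. Each check is a single $O(n^2)$ scan, and for a triangular matrix the characteristic polynomial is just the product of $(x - a_{ii})$.

```python
def is_upper_triangular(A):
    """True if all entries strictly below the diagonal are zero."""
    return all(A[i][j] == 0 for i in range(len(A)) for j in range(i))


def is_upper_hessenberg(A):
    """True if all entries below the first subdiagonal are zero."""
    return all(A[i][j] == 0 for i in range(len(A)) for j in range(i - 1))


def charpoly_triangular(A):
    """Coefficients (lowest degree first) of prod (x - A[i][i]).

    Only valid when A is triangular; the caller is expected to check.
    """
    p = [1]
    for i in range(len(A)):
        d = A[i][i]
        q = [0] * (len(p) + 1)
        for k, c in enumerate(p):
            q[k + 1] += c   # contribution of x * p
            q[k] -= d * c   # contribution of -d * p
        p = q
    return p


if __name__ == "__main__":
    A = [[2, 5], [0, 3]]
    if is_upper_triangular(A):
        print(charpoly_triangular(A))  # coefficients of x^2 - 5x + 6
```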
Timings for uniform random matrices with randbits(bits) entries, both single-threaded and multithreaded. Note that "old" includes the preexisting 30% speedup of the modular algorithm from #2684.
Example timings for sparse matrices: the input is the companion matrix of a polynomial with randbits(bits) entries, plus an extra 1 in position (0, 0) of the matrix.
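
For readers who want to reproduce a similar sparse input, the following Python sketch builds such a matrix. The details are assumptions: the companion matrix convention (ones on the subdiagonal, negated coefficients in the last column) may differ from the one used for the timings, and the `randbits` helper is a hypothetical stand-in that only mimics the idea of a signed integer with exactly `bits` bits.

```python
import random


def randbits(bits):
    """Random signed integer whose absolute value has exactly `bits` bits.

    Hypothetical helper (requires bits >= 1); not FLINT's generator.
    """
    v = random.getrandbits(bits - 1) | (1 << (bits - 1))
    return v if random.random() < 0.5 else -v


def companion_plus_one(coeffs):
    """Companion matrix of x^n + c_{n-1} x^(n-1) + ... + c_0, plus 1 at (0, 0).

    coeffs = [c_0, ..., c_{n-1}]. Assumed convention: ones on the
    subdiagonal and -c_i in row i of the last column.
    """
    n = len(coeffs)
    A = [[0] * n for _ in range(n)]
    for i in range(1, n):
        A[i][i - 1] = 1
    for i in range(n):
        A[i][n - 1] = -coeffs[i]
    A[0][0] += 1  # the extra entry that makes the input non-trivial
    return A


if __name__ == "__main__":
    n, bits = 4, 10
    A = companion_plus_one([randbits(bits) for _ in range(n)])
    for row in A:
        print(row)
```

With nonzero coefficients the matrix has only $2n$ nonzero entries, and all the large ones sit in a single column, which is the kind of non-uniform input where a row-norm-based bound differs most from a height-based one.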