Commit 3e60ec2
Optimize INVALID_NUMBER_FOLLOWED_BY_NAME_REGEXP
Claude Code helped optimize this regexp, which could be slow depending on
the document. Backtracking in the original regexp was the main
performance issue.
This change does two things:
1. Adds a simple, fast pre-check that gates whether the main regexp
should run at all.
2. Prevents as much backtracking as possible.
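As a rough illustration of the two ideas above, here is a minimal Python sketch. It is not the code from this commit: the patterns `NAIVE`, `TIGHT`, and `HAS_DIGIT` and the helper `has_invalid_number` are hypothetical stand-ins, and the real INVALID_NUMBER_FOLLOWED_BY_NAME_REGEXP is likely more involved. The sketch assumes the check only needs to know whether a run of digits is immediately followed by a name character.

```python
import re

# Hypothetical stand-in pattern, not the regexp changed in this commit.
# Naive form: on a long run of digits, the engine retries the match from
# every digit in the run, rescanning the rest of the run each time.
NAIVE = re.compile(r"[0-9]+[A-Za-z_]")

# Tighter form: the lookbehind only allows a match to start at the
# beginning of a digit run, so each run is scanned once.
TIGHT = re.compile(r"(?<![0-9])[0-9]+[A-Za-z_]")

# Cheap gate: a document without any digit can never match.
HAS_DIGIT = re.compile(r"[0-9]")


def has_invalid_number(text: str) -> bool:
    """Return True if a number is directly followed by a name character."""
    if HAS_DIGIT.search(text) is None:
        return False  # pre-check failed, skip the main regexp entirely
    return TIGHT.search(text) is not None
```

Both patterns agree on whether a match exists, because any match that starts in the middle of a digit run can also start at the beginning of that run. The gate pays off most on documents with no digits, where the main regexp never runs.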
Benchmarks:
| Document Type | Before | After | Speedup |
|--------------------------------|--------|--------|---------|
| Typical large (125KB) | 0.72s | 0.06s | 12x |
| Colon-heavy (35KB) | 0.24s | 0.007s | 34x |
| Pathological worst-case (26KB) | 1.64s | 0.25s | 6.6x |
| No digits (17KB) | 0.10s | 0.002s | 54x |

1 parent 5065756 commit 3e60ec2
1 file changed
Lines changed: 18 additions & 12 deletions