Pull requests: allenai/dolma

Pull requests list

Fix off-by-one in CodeCopyrightTagger._score span length
#283 opened Apr 12, 2026 by Chessing234
1 of 2 tasks
Fix IGNORE_IP_REGEX_START never matching localhost IPs
#280 opened Apr 9, 2026 by Chessing234
4 tasks done
Fix typos and clean up leftover draft text in docs
#279 opened Apr 7, 2026 by Chessing234
2 tasks
Script to produce dolma 2 ablation config
#275 opened Sep 24, 2025 by soldni Contributor
Improve WARC processing
#260 opened Apr 15, 2025 by soldni Contributor Draft
first
#240 opened Feb 14, 2025 by Whattabatt Contributor Draft
[WIP DO NOT MERGE] Learn2Code Feature Branch
#233 opened Feb 13, 2025 by cmwilhelm Contributor
simpler logic for calculating code taggers
#229 opened Feb 12, 2025 by kyleclo Contributor
Bump openssl from 0.10.66 to 0.10.70 in the cargo group (labels: dependencies, rust)
#228 opened Feb 3, 2025 by dependabot bot
Fixed ignore_existing flag not working as expected.
#224 opened Jan 1, 2025 by soldni Contributor
New language ID
#223 opened Dec 30, 2024 by soldni Contributor
Adding support for Classifiers and Search tools
#219 opened Oct 24, 2024 by soldni Contributor Draft