Timetable Parser

Parses timetable PDFs into structured JSON using Camelot and pdfplumber.

Setup

Install dependencies:

pip install -r requirements.txt

Note: on some platforms, camelot-py requires Ghostscript and additional system packages to be installed separately.
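
A quick way to confirm the prerequisites are in place before running the parser is sketched below. This script is not part of the repository: `find_ghostscript` and `check_environment` are hypothetical helper names, and the Ghostscript executable names it searches for (`gs`, `gswin64c`, `gswin32c`) are the usual ones but may differ on your system.

```python
import importlib.util
import shutil
from typing import Dict, Optional

# Usual Ghostscript executable names (an assumption): `gs` on Linux/macOS,
# `gswin64c` / `gswin32c` on Windows.
GHOSTSCRIPT_NAMES = ("gs", "gswin64c", "gswin32c")


def find_ghostscript() -> Optional[str]:
    """Return the path of the first Ghostscript executable found on PATH, or None."""
    for name in GHOSTSCRIPT_NAMES:
        path = shutil.which(name)
        if path:
            return path
    return None


def check_environment() -> Dict[str, bool]:
    """Report whether the Python packages and Ghostscript can be located."""
    return {
        # camelot-py is imported as `camelot`.
        "camelot": importlib.util.find_spec("camelot") is not None,
        "pdfplumber": importlib.util.find_spec("pdfplumber") is not None,
        "ghostscript": find_ghostscript() is not None,
    }


if __name__ == "__main__":
    for name, ok in check_environment().items():
        print(f"{name}: {'found' if ok else 'missing'}")
```

A "missing" entry for Ghostscript does not necessarily mean parsing will fail; whether it is needed depends on the camelot-py version and platform.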

Usage

python parse_timetable.py samples/timetable.pdf -o samples/timetable.parsed.json

To keep truncated program lines (useful for matching samples/week1.json):

python parse_timetable.py samples/timetable.pdf -o samples/timetable.parsed.json --keep-truncated

To disable resolving truncated program lines from the first page:

python parse_timetable.py samples/timetable.pdf --no-resolve-truncated
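
To run the parser over several PDFs, the commands above can be assembled from Python. The following is a minimal sketch, assuming only the flags shown in this section (`-o`, `--keep-truncated`, `--no-resolve-truncated`); `build_command` and `parse_all` are hypothetical helpers, not part of this repository, and whether the two truncation flags can be combined is not documented here.

```python
import subprocess
import sys
from pathlib import Path
from typing import List, Optional, Union

PathLike = Union[str, Path]


def build_command(
    pdf: PathLike,
    output: Optional[PathLike] = None,
    keep_truncated: bool = False,
    resolve_truncated: bool = True,
) -> List[str]:
    """Build the argument list for one parse_timetable.py invocation."""
    cmd = [sys.executable, "parse_timetable.py", str(pdf)]
    if output is not None:
        cmd += ["-o", str(output)]
    if keep_truncated:
        cmd.append("--keep-truncated")
    if not resolve_truncated:
        cmd.append("--no-resolve-truncated")
    return cmd


def parse_all(folder: Path) -> None:
    """Parse every PDF in `folder`, writing <name>.parsed.json next to each one."""
    for pdf in sorted(folder.glob("*.pdf")):
        output = pdf.with_suffix(".parsed.json")
        # check=True raises CalledProcessError if the parser exits non-zero.
        subprocess.run(build_command(pdf, output), check=True)
```

Calling `parse_all(Path("samples"))` from the repository root would then process each PDF in samples/ in turn.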