GitHub - laluxx/monadc: The Monad programming language compiler

A Pure Functional Language from High-Level Abstraction to Bare Metal

Mathematical purity. Lisp soul. Zero-overhead hardware control.

Editor Support: For the best experience, use monad-mode — a dedicated Emacs major mode providing syntax highlighting, REPL integration, inline assembly support, linting and much more!

1. Repository Map

.
├── *.c / *.h          — Compiler source (22 C files + 21 headers)
│   reader.c/h         │   Lexer, parser, AST, pattern-match desugar
│   wisp.c/h           │   Wisp syntax → S-expression expander
│   dep.c/h            │   Dependent type checker (ITT kernel, NbE, bidir)
│   infer.c/h          │   Hindley-Milner type inference
│   codegen.c/h        │   LLVM IR codegen (83+ dispatch symbols)
│   types.c/h          │   Type representation and operations
│   env.c/h            │   Environment (symbol table)
│   runtime.c/h        │   Runtime support (lists, sets, maps, GC)
│   arena.c/h          │   Arena allocator
│   asm.c/h            │   Inline assembly translator
│   ffi.c/h            │   C FFI (libclang header parsing)
│   pmatch.c/h         │   Pattern-match compilation
│   macro.c/h          │   Macro expansion
│   module.c/h         │   Module system
│   module_export.c/h  │   Module export resolution
│   cli.c/h            │   CLI argument parsing
│   repl.c/h           │   Interactive REPL
│   features.c/h       │   Conditional compilation (*features*)
│   buildsystem.c/h    │   Package build system (package.yaml)
│   typeclass.c/h      │   Type class support
│   typst_emit.c/h     │   Typst math document emission
│   main.c             │   Entry point, pipeline orchestration
│
├── core/              — Standard library (Monad source)
│   ├── prelude/       │   Coll.mon, Test.mon
│   ├── text/          │   Ascii.mon, String.mon
│   ├── List.mon       │
│   ├── Set.mon        │
│   ├── Math.mon       │
│   ├── Random.mon     │
│   ├── Assert.mon     │
│   ├── json.mon       │
│   └── TODO.mon       │
│
├── tests/             — Test suite (~1400+ test cases)
│   ├── run.py         │   Formatted test runner
│   ├── codegen/       │   200+ compile/run .mon + .stdout pairs (includes wisp tests)
│   ├── language/      │   600+ language feature tests
│   └── sugar/         │   100+ sugar desugar tests
│
├── how_to/            — How-to guides (executable .mon files)
│   ├── AlgebraicDataTypes.mon
│   ├── Layouts.mon
│   ├── Macros.mon
│   ├── RefinementType.mon
│   ├── Syntax.mon
│   ├── Test.mon
│   ├── Iter.mon
│   ├── RaylibSpeedrun.mon, GlfwSpeedrun.mon, SdlSpeedrun.mon
│   └── Term/           — Dependent type examples
│
├── context/           — Context system (LLM external memory)
│   ├── context.org    │   Root entry
│   ├── index.org      │   Schema, node index
│   ├── reader.org     │   Reader/parser context
│   ├── language.org   │   Language formalization
│   ├── build.org      │   Build/install
│   ├── tests.org      │   Test metadata, coverage
│   ├── workflow.org   │   User preferences
│   ├── rules.org      │   Repository rules
│   ├── todo.org       │   TODO items
│   ├── visualizer.org │   Context visualizer docs
│   ├── opinions.org   │   LLM critique
│   ├── philosophy.org │   Language philosophy
│   source-map.org │   Every source file's purpose and connections
│   ├── style.org       │   Code style taxonomy
│   ├── navigation.org  │   ID prefix → file routing for quick LLM nav
│   ├── references.org  │   Research paper bibliography & cross-links
│   ├── wisp.org       │   Wisp expander context (architecture, arity, sugars)
│   └── commit-format.org
│
├── etc/               — Assets (logo, screenshots, research PDFs)
│   └── pdfs/          │   Research papers informing the design
├── Makefile           — Build (gcc + LLVM)
├── monad              — Compiled compiler binary
└── libmonad.a         — Static runtime library

2. Philosophy

Note on naming: The codebase uses "Monad" (without the C suffix) for the language name throughout source files, type constructors, and documentation. The repository is named monadc (compiler) but the language itself is "Monad." This README follows the codebase convention.

Monad is built on a conviction: the choice between high-level mathematical safety and low-level hardware control is a false dichotomy. A language should give you dependent types AND inline assembly, refinement types AND naked functions, lazy infinite lists AND C-compatible arrays — without binding generators, FFI wrappers, or runtime overhead.

See context/philosophy.org for the full witnessed philosophy, and context/opinions.org for LLM critique.

Core principles:

No trade-off between safety and control — The dual type system (HM + ITT), the TERM_EMBED bridge, naked functions, and inline assembly serve a single goal: you should not have to choose between dependent types and hardware access.
Purity is practical — Functions are relations, collections have set-theoretic semantics, and types are first-class. The compiler can reason about programs in ways imperative languages cannot, enabling compile-time predicate checking and zero-cost monomorphization.
Two syntaxes are one language — S-expressions and Wisp are interchangeable notations for the same semantics. They mix freely within a single file.
Sets are the model — Types are sets. Functions are relations. Refinement types are subsets. Universes are sets of sets.
Zero to metal — Start with show "Hello, World!", add type annotations, introduce polymorphism, prove invariants with dependent types, drop to layout-based structs, and write inline assembly — all within the same function.

3. Compiler Pipeline (11 phases)

The compiler runs 11 sequential phases (plus a Phase 0 FFI pre-pass), orchestrated by main.c:compile_one(). Detailed OBS records for each phase are in context/language.org::* Compilation Pipeline.

source.mon  ──►  Phase 0: FFI Pre-pass          ──►  Phase 1: Wisp Expansion + Parsing
                       (libclang C header parse)             (wisp_parse_all → desugared AST)
                                                             (optional JSON/Typst emit)

Phase 1  ──►  Phase 2: Module/Import Decls  ──►  Phase 3: LLVM Setup + Builtins
                   (extract module/import forms)          (LLVM init, 94 codegen + dep builtins,
                   (recursive compile dependencies)        declare_runtime_functions)

Phase 3  ──►  Phase 4: External Declarations  ──►  Phase 5: Init/Main Function
                   (declare import externals)                 (create LLVM fn, position builder)

Phase 5  ──►  Phase 6: *features* Global  ──►  Phase 6.5: Dependent Type Checking
                   (feature-conditional compilation)        (dep_toplevel, bidir + NbE, fatal)

Phase 6.5  ──►  Phase 7: Codegen  ──►  Phase 8: Function Termination
                   (codegen_expr per AST node,                  (main → return 0, lib → ret void)
                    83+ strcmp dispatch, 94 builtins)

Phase 8  ──►  Phase 9: Registry + Name Mangling  ──►  Phase 10: Verify + Optional Emit
                   (CompiledModule, mangle)                     (LLVMVerifyModule → .ll/.s/.bc/.o)

Phase 10  ──►  Phase 11: Object File Emit  ──►  executable (main)  or  .o library
                   (emit_object, skip if up-to-date)

Phases are timed with PHASE_START()/PHASE_END() macros and logged to stderr. See main.c:525-1290 for the full pipeline implementation.

4. Source File Guide

4.1 Reader & Parser (`reader.c/h`, `wisp.c/h`)

File	Purpose
`reader.c`	Lexer, S-expression parser, AST construction (20 AST node types), pattern-match desugar, comment maps
`wisp.c`	Wisp syntax expander (indentation-driven → S-expressions), arity table, quote/backtick desugar

AST node types (reader.h:39-63):

Node	Example	Purpose
`AST_NUMBER`	`42`, `3.14`	Numeric literal
`AST_SYMBOL`	`x`, `+`	Identifier
`AST_STRING`	`"hello"`	String literal
`AST_PATH`	`./file`, `~/path`	Filesystem path literal
`AST_CHAR`	`'c'`, `#\A`	Character literal
`AST_LIST`	`(f x y)`, `'(1 2 3)`	S-expression list
`AST_KEYWORD`	`:keyword`	Keyword literal
`AST_RATIO`	`1/3`	Rational literal
`AST_ARRAY`	`[1 2 3]`	Array literal
`AST_SET`	`{1 2 3}`	Set literal
`AST_MAP`	`#{"k" v}`	Map literal
`AST_LAMBDA`	`(lambda ...)`	Lambda expression
`AST_ASM`	`(asm ...)`	Inline assembly
`AST_REFINEMENT`	`(type Positive ...)`	Refinement type
`AST_TESTS`	`(tests ...)`	Test block
`AST_ADDRESS_OF`	`&x`	Address-of operator
`AST_RANGE`	`(0..10)`	Range literal
`AST_LAYOUT`	`(layout Point ...)`	Struct layout
`AST_PMATCH`	pattern clauses	Pattern matching
`AST_DATA`	`(data Color ...)`	Algebraic data type
`AST_CLASS`	`(class Eq a ...)`	Type class declaration
`AST_INSTANCE`	`(instance Eq Int ...)`	Type class instance
`TOK_LAMBDA_LIT`	`λx. body`	Pure lambda literal

Path literals are values: define path ~/xos/projects/c/monadc/context/ defines path with type Path, and (show path) prints the path text.

4.2 Type System

The type system has two interlocking layers, implemented across 4 source files:

File	Responsibility
`types.c/h`	Type representation, kind system (31 type kinds), type operations (clone, free, unify, subst, to_llvm)
`infer.c/h`	Hindley-Milner inference: constraint gen → unification → generalization → instantiation → zonking
`dep.c/h`	Dependent type checker: ITT kernel with NbE, bidir checking, metavariables, elaboration
`typeclass.c/h`	Type class resolution and instance derivation

Hindley-Milner layer (infer.c) — automatic inference with polymorphic schemes:

(define (id x) x)          ; => ∀a. a -> a
(define (const x y) x)     ; => ∀a b. a -> b -> a

(id 42)                    ; Int -> Int
(id "hi")                  ; String -> String

Polymorphism is compiled via monomorphization — zero-cost abstraction.

ITT kernel (dep.c) — full Intensional Type Theory with:

Concept	Implementation	Detail
Universe hierarchy	`Type u : Type (u+1)`	Predicative, cumulative
Dependent functions	`Π(x:A). B`	Bidirectional checking
Dependent pairs	`Σ(x:A). B`	With `fst`/`snd` projections
Equality	`t ≡ s : A`	`refl` + `subst`
Natural numbers	`Nat`, `zero`, `succ`, `Nat-elim`	Primitive recursion
Definitional equality	Normalization by Evaluation	Fuel: 100000 steps
Representation	Locally nameless	De Bruijn indices + global names
Elaboration	Metavariables + implicit args	`dep_conv` solver
Ground bridge	`TERM_EMBED`	HM ↔ ITT interop

Ground type hierarchy (types.h:8-49):

Category	Types
Primitives	`Int`, `Float`, `Char`, `String`, `Symbol`, `Bool`
Numeric	`Hex`, `Bin`, `Oct`, `Ratio`
Fixed-width	`I8`, `U8`, `I16`, `U16`, `I32`, `U32`, `I64`, `U64`, `I128`, `U128`
Extended	`F32`, `F80`
Compound	`Arr`, `List`, `Set`, `Map`, `Fn`, `Layout`
Pointer	`Ptr :: T`
Optional	`T?` (sum-type encoding)
Application	`Maybe Float`, `Arr :: Int :: 3`
Refinement	`Natural { x ∈ Int

Refinement types attach predicates to ground types as compile-time-checked subtypes, automatically generating predicate functions and materialized Set values.

type Natural { x ∈ Int | x >= 0 }
  :alias ℕ

(define (isNatural [x :: ℕ] -> Bool)
  (Natural? x))

show (isNatural 1)   ; => True

Type classes and ADTs:

(class Eq a where
  (=)  :: a -> a -> Bool
  (!=) :: a -> a -> Bool
  (x != y) => (not (x = y)))

data TrafficLight Red | Yellow | Green

data Shape
  Circle    Float
  Rectangle Float Float
  Triangle  Point Point Point

4.3 Codegen (`codegen.c/h`)

The codegen module translates desugared AST into LLVM IR. It is the largest file in the compiler (~13,000 lines).

Dispatch architecture:

codegen_expr() switches on AST type, then for AST_LIST heads dispatches via 83+ strcmp calls (lines 3889-12492)
17 type predicates handled at codegen.c:5458-5488
94 builtins registered via register_builtins() at codegen.c:12960-13119
2nd registration layer: infer_register_builtins() adds type-annotated variants

Callability rule: A function is callable from paren-mode code ONLY if it has a strcmp handler. Functions that only appear in register_builtins() (like concat, substring, reverse, nth, length, reduce, fn/lambda, cond, let) are NOT directly callable — they lack the strcmp dispatch entry despite being registered in the env.

Working dispatch categories (all now covered by tests):

Category	Count	Key symbols
Arithmetic	10	`+`, `-`, `*`, `/`, `%`, `mod`, `1/4` ratio-as-function
Bitwise	6	`&`, `~`, `bit-xor`, `<<`, `>>`, `>>>`
Comparison	7	`=`, `!=`, `<`, `>`, `<=`, `>=`, `equal?`
Logic	3	`and`, `or`, `not`
Control	4	`if`, `begin`, `unless`, `while`
Type preds	18	`Int?`, `Float?`, `Number?`, `String?`, `nil?`, etc.
Type convs	8	`Int`, `Float`, `Char`, `Hex`, `Oct`, `Bin`, `Arr`, etc.
String	6	`make-string`, `count`, `starts-with?`, `ends-with?`, `contains?`, indexing
List	10	`list`, `car`, `cdr`, `cons`, `.`, `++`, `empty?`, `nil?`, `pair?`, `make-list`
Set	8	`conj`, `disj`, `conj!`, `disj!`, `contains?`, `Set?`, `collection?`, set-as-function
Other	10	`quote`, `define`, `set!`, `show`, `??`, `address-of`, array-index, etc.

4.4 Runtime (`runtime.c/h`, `arena.c/h`)

The Runtime Value System — RuntimeValue tagged unions, all 14 type tags, value construction/unboxing API, thunk/lazy evaluation, cons-cell lists, open-addressing hash sets/maps, and the closure ABI — is documented in detail in context/language.org::* Runtime Value System.

File	Purpose
`runtime.c`	Cons cells, list ops, set/map ops, GC (ref-count + cycle detection), `show` printer, keyword/value/ratio constructors
`arena.c`	Bump allocator for compiler-internal allocations

Known runtime limitations (test suite verifies these):

empty? on set literal {} returns False (should be True)
tail on lists segfaults
Map literal #{"k" v} and set literal {1 2 3} produce garbage values without type annotation (:: Map / :: Set)
assoc, dissoc, merge, keys, vals, find dispatch correctly but produce garbage due to map runtime issues
List length and reverse are registered builtins but lack strcmp dispatch — unreachable from paren-mode

4.5 FFI and Inline Assembly (`ffi.c/h`, `asm.c/h`)

File	Purpose
`ffi.c`	C header parsing via libclang, type mapping (C → Monad), arity registration
`asm.c`	Inline assembly IR generation, register allocation, naked function support

Direct C header inclusion — no bindings needed:

include <raylib.h>

InitWindow 1920 1080 "No Bindings!?"

define color Color 33 33 33 0

until WindowShouldClose
  BeginDrawing
  ClearBackground color
  DrawFPS 600 600
  EndDrawing

Inline assembly:

(define (rdtsc -> Int)
  "Read the CPU timestamp counter."
  (asm rdtsc
       shl 32   %rdx
       or  %rdx %rax))

Naked functions (:naked True) remove prologue/epilogue for full control.

4.6 Module System (`module.c/h`, `module_export.c/h`, `buildsystem.c/h`)

File	Purpose
`module.c`	Module/import declaration parsing, dependency compilation, registry
`module_export.c`	Export resolution, visibility rules
`buildsystem.c`	`package.yaml` parsing, `monad new/build/run/install/clean`

Import variants:

(import Ascii)                           ; everything
(import Math hiding [cos sin])           ; except
(import Math [sqrt log])                 ; only
(import qualified Math :as M)            ; qualified as M.sqrt

Package format (package.yaml):

name:                my-package
version:             0.1.0.0
dependencies:
- core >= 0.1 && < 1.0
executables:
  my-package:
    main: Main.mon
    source-dirs: src

4.7 Other Compiler Modules

File	Purpose
`env.c/h`	Symbol table (hash map), entry kinds (VAR, FUNC, BUILTIN, LAYOUT, MACRO, MODULE)
`pmatch.c/h`	Pattern-match compilation: coverage checking, clause desugaring to if/case chains
`macro.c/h`	Macro expansion (define-macro, syntax-rules)
`features.c/h`	Conditional compilation via `#+TAG` / `#---` directives, `features` global
`cli.c/h`	CLI argument parsing, help text
`repl.c/h`	Interactive REPL with readline support
`typst_emit.c/h`	AST → Typst math document → PDF pipeline
`main.c`	Entry point, pipeline orchestration, LLVM module management

5. Syntax: S-Expressions and Wisp

Every construct can be written in two equivalent forms:

;; S-expression form
(define [var :: Int] 3)
(define (square [x :: Int] -> Int) (* x x))

;; Wisp form (identical semantics)
define var :: Int 3
define square :: Int -> Int
  x -> x * x

;; Parenthesized headers are also valid in bare Wisp define form.
define (firstItem [n : Int] [xs : Arr] -> Int)
  _ []     -> 0
  n [x|xs] -> x

See how_to/Syntax.mon for comprehensive examples.

Wisp Sugar Notes

Typed value define sugar: define x :: Int 42 is Wisp sugar for the Lisp-layer (define [x :: Int] 42). Bracketed define [x :: Int] 42 and define [x : Int] 42 remain valid.
Signature clause sugar: define square :: Int -> Int followed by x -> x * x binds x as the parameter and expands to a normal named-parameter function. Pattern clauses with _, literals, [], [x|xs], or guards remain pattern clauses.
Parenthesized Wisp define headers: define (f [x : Int] -> Int) is accepted without wrapping the whole form in parentheses, so Wisp can reuse Lisp-style function headers directly.
Path values: valid Unix path tokens such as ~/xos/projects/c/monadc/context/ are first-class Path values in Wisp and Lisp forms.
?-sugar (method-call): (contains? var arg) transforms to (var contains? arg) when var is an untyped identifier. Workaround: annotate the variable (define [s :: Set] ...) or use a literal first arg.
!-sugar (mutation): (conj! var arg) → (var conj! arg). Same constraints and workaround as ?-sugar.
Pipe sugar: x |> f |> g chains as (g (f x)). Wisp uses the arity table for grouping.
Inline if/then/else: if expr then stuff else stuff (Form 3) and multiline if expr then stuff else stuff (Form 2) both work.

6. Type System (Reference)

6.1 Hindley-Milner Inference

The HM layer (infer.c) provides automatic type inference with polymorphic schemes via:

Constraint generation — traverse AST, emit type equations
Unification — Robinson's algorithm
Generalization — quantify free type variables
Instantiation — fresh variables at call sites
Zonking — propagate substitution onto AST

6.2 Dependent Types (ITT)

The dependent type layer (dep.c) implements a full ITT kernel:

Bidirectional checking (Löh, McBride, Swierstra 2010):
- Inference (Γ ⊢ t ⇒ A): variables, annotations, applications, type constants
- Checking (Γ ⊢ t ⇐ A): lambda bodies, pair introductions, refl
Definitional equality via NbE with fuel limit (DEP_CONV_FUEL = 100000)
Locally nameless representation — De Bruijn indices + global names
Metavariables for elaboration — holes (_) and implicit args ({x:A})
Ground type embedding via TERM_EMBED / dep_ground_of_value

;; Dependent function
(define (append [xs :: List a] [ys :: List a] -> List a)
  ...)

;; Implicit arguments
(define (id {A :: Type} [x :: A] -> A) x)

6.3 Refinement Types

type Positive { x ∈ Int | x > 0 }
type Even     { x ∈ Int | x % 2 = 0 }

(type PositiveEven
  { n ∈ Int
  | ((Positive? n) and (Even? n)) })

Each refinement generates a predicate (Positive?) and a materialized Set (visualized as {1 2 3 4 ...} in the REPL).

6.4 Type Classes

See typeclass.c for the compiler implementation and how_to/AlgebraicDataTypes.mon for examples.

7. Quick Start

Installation:

git clone https://github.com/laluxx/monadc.git
cd monadc
sudo make install

First program (Hello.mon):

show "Hello, World!"

monad Hello.mon

REPL:

monad -i

Run tests:

python3 tests/run.py

Context visualizer dashboard:

python3 context-visualizer.py

Package management:

monad new my-project
monad build
monad run
monad install

8. CLI Reference

All commands and flags from cli.h / cli.c:

Modes

Command	Description
`monad <file.mon>`	Compile source file to executable
`monad -i`	Start interactive REPL
`monad new <name>`	Scaffold a new package
`monad build`	Build package (requires package.yaml)
`monad run`	Build and run package
`monad clean`	Remove build/ + .o .ll *.s
`monad install`	Install to `~/.local/bin/`
`monad test <file.mon>`	Compile with tests, run, delete test binary
`monad --test <file>`	Compile with tests embedded, keep binary

Flags

Flag	Effect
`-o <file>`	Output file name
`--emit-ir`	Emit LLVM IR (`.ll`)
`--emit-bc`	Emit LLVM bitcode (`.bc`)
`--emit-asm`	Emit assembly (`.s`)
`--emit-obj`	Emit object file (`.o`)
`--emit-json`	Emit AST as JSON (`.json`)
`--emit-typst`	Emit Typst math document (`.typ`) and compile to PDF
`--test`	Compile with test blocks embedded
`-Wall`	Accepted (no-op)
`-Wextra`	Accepted (no-op)
`-h`, `--help`	Show help message

9. Language Tour

Variables and Built-in Types

(define i    20)      ; Int
(define f    40.0)    ; Float
(define c    'c')     ; Char
(define s    "hello") ; String
(define frac 1/3)     ; Ratio — true rational arithmetic
(define h    0xFF)    ; Hex   (255)
(define b    0b1010)  ; Bin   (10)
(define o    0o755)   ; Oct   (493)

Types are first-class casting functions:

(Int   40.3)   ; => 40
(Float 20)     ; => 20.0
(Char  65)     ; => 'A'
(Hex   65)     ; => 0x41
(Arr   20)     ; => [20]

Functions

;; Required parameters (→)
(define (multf [x :: Float] → [y :: Float] → Int)
  "Multiply X and Y."
  (* x y))

;; Optional parameters (⇒)
(define (multh [x :: Hex] ⇒ [y :: Hex] → Int)
  (if (unspecified? y) x (* x y)))

;; Automatic infix — no backticks needed
(newcomers into players)   ; => (into players newcomers)
(3 + 4)                    ; => (+ 3 4)

;; Function metadata
define pow :: Float → Float → Float
  :doc   "Raise BASE to the power EXP."
  :alias ^
  base exp → if exp = 0.0 1.0 else base * base ^ (exp - 1.0)

Pattern Matching

(define (last List → a)
  []     → nil
  [x]    → x
  [_|xs] → (last xs))

Guards, wildcards, variable binding, list destructuring, constructor patterns — all compiled via pmatch_desugar with exhaustive coverage verification.

Collections

Type	Syntax	Evaluation	Memory
Lists	`'(1 2 3)`	Lazy	Thunks → memoized
Arrays	`[1 2 3]`	Strict	C-compatible
Vectors	`#[...]`	SIMD-aligned	Homogeneous

define naturals (0..)     ; infinite lazy list — O(1) memory
take 5 naturals           ; => '(0 1 2 3 4)

define [arr :: Arr :: Int :: 3] [1 2 3]

;; Boolean masking
(define v #[10 20 30 40 50])
(define mask (> v 25))    ; => #[#f #f #t #t #t]
(v mask)                  ; => #[30 40 50]

Maps, Sets, and Relations

(define scores #{"Fred" 1400  "Bob" 1240})
(assoc  scores "Sally" 0)   ; add
(dissoc scores "Bob")       ; remove
(scores "Fred")             ; => 1400

;; Relational algebra
(define ages {("Alice", 30) ("Bob", 25)})
(domain   ages)          ; => {"Alice" "Bob"}
(image    ages "Alice")  ; => {30}
(ages ∘ life-stages)      ; Relational composition

Layouts (Structs)

(layout Point
  [x :: Float]
  [y :: Float])

(define p (Point 1.0 2.0))
p.x  ; => 1.0
p.y  ; => 2.0

Layouts support pointer fields, inline nesting, packed alignment, and array-of-layout fields. See how_to/Layouts.mon for details.

Conditional Compilation

#+WINDOWS
(show "Compiled for Windows")
#---

#+LINUX
#+X86-64
(show "Compiled for Linux x86-64")
#---

Per-Module Test Suites

(tests
  (assert-eq (inc 0)   1  "inc 0")
  (assert-eq (sum (filter even? '(1..20))) 110 "sum of evens 1..20"))

monad test Module.mon

See how_to/Test.mon for guide.

10. Context System

The context system is a file-system-based hippocampus — an external, persistent memory that survives across LLM instantiations. It is a corpus of Org-mode files recording every durable fact, decision, observation, inference, and TODO about the compiler, structured as a category-theoretic graph.

10.1 Category Theory Foundation

Category concept	Repository artifact	Example
Object	context heading, TODO, test, source anchor	`tests.reader.path-heap-literals`
Morphism	typed relation between objects	`verifies`, `supports`, `evidenced by`
Composition	chained meaning preservation	test verifies context, context links source
Functor	executable test suite view from semantic context	context contract → automated check

10.2 Context Visualizer

A live browser-based dashboard renders the entire graph:

python3 context-visualizer.py

Features: animated force graph, fuzzy search (C-n/C-p), Hoogle method browser, TODO dashboard, editor integration, morphism arrows.

The visualizer supports category-based filtering — different views into the graph's subcategories. Three key views are captured below:

Language Explorer — the decision subset of the context category, filtering to type-system and language-primitive decision nodes. Each node is a design decision with its source anchors, cross-links, and philosophical grounding. The left panel shows the Hoogle-like method browser: every core and prelude function is queryable by name or type signature, with directed call/caller trees projected onto the graph. This replaces traditional API docs with a live, navigable knowledge graph.

Full Category — the entire context graph with all categories unfiltered. Every object (context heading, test fixture, TODO, source anchor) and every morphism (verifies, supports, evidenced-by, contains) is visible. The force-directed layout clusters related nodes: decisions form one region, tests form another, observations orbit their sources, and cross-cutting edges span between them. This is the live map of the project's memory.

Tests Subset — the test nodes of the category filtered to show the test–context connection graph. Each test fixture (from codegen `.mon` files, reader-atoms.tsv, reader-sugars.tsv, or language-corpus.tsv) is an object, and the `verifies` morphism connects each test to the context nodes that explain what it tests and why it matters. Yellow nodes mark TODOs; green nodes mark passing tests; red nodes mark known failures documented in the context. This is the functor from the context category into the category of executable verification.

10.3 Test Metadata

Every test fixture carries metadata linking it to context nodes:

;; TEST-ID: tests.codegen.rt-hm-polymorphic-identity
;; TEST-CONTEXT: monadc.context.language.type-system-surface
;; TEST-PURPOSE: polymorphic identity instantiates at Int and String
;; TEST-ATOM: (define (id x) x) works at Int and String call sites
;; TEST-EXPECT: compile, run

10.4 Context File Index

See context/index.org for the full schema specification.

File	Purpose
`context.org`	Root entry point, includes full index
`index.org`	Schema, format specification, node index
`reader.org`	Reader/parser context (AST, tokens, Wisp)
`language.org`	Language formalization and type system
`build.org`	Build/install commands and verification
`tests.org`	Test metadata, philosophy, context links
`workflow.org`	Standing user preferences and workflow
`rules.org`	Standing repository rules
`todo.org`	Project TODO notes with confidence levels
`visualizer.org`	Visualizer implementation docs
`opinions.org`	LLM critique and suggestions for context
`philosophy.org`	Language and project philosophy (witnessed)
`wisp.org`	Wisp expander architecture, arity, sugars
`references.org`	Research paper bibliography & cross-links
`commit-format.org`	Commit message format conventions

10.5 Context Node Documentation

Every context section (node) carries a CONTEXT_DESCRIPTION property in its Org drawer explaining what that node IS — a human-readable summary of the concept, its role in the compiler, and why it exists. This description is rendered by the visualizer when a node is selected, replacing traditional API docs with a live navigable knowledge graph.

Example from context/wisp.org:

:CUSTOM_ID: wisp-architecture
:CONTEXT_KIND: component
:CONTEXT_DESCRIPTION: The Wisp syntax expander architecture: a two-pass process
  (tokenization then layout expansion) that converts indentation-based Wisp into
  standard S-expressions before the reader sees them.

Current corpus: ~1400 nodes, ~1500 edges — all connected through verified morphisms.

11. Test Infrastructure

Layer	Artifact	Size	Role
Atoms	`tests/reader-atoms.tsv`	772	Parser conformance atoms
Sugar	`tests/reader-sugars.tsv`	192	Sugar desugar tests
Codegen	`tests/codegen/*.mon`	200+	Compile/run fixtures (incl. wisp)
Language	`tests/language/*.mon`	600+	Language feature tests
Runner	`tests/run.py`	—	Formatted execution engine

Current status: ~1400+ tests across TSV-based parser coverage, codegen compile/run pairs, and language feature tests. Pass/fail counts fluctuate as coverage expands; see context/tests.org for the current state.

Wisp tests are consolidated under tests/codegen/ (formerly in tests/wisp/). Wisp sugar test fixtures for ?-sugar, !-sugar, pipe, and inline/multiline if/then/else exercise both compilation and runtime output.

Run the suite:

python3 tests/run.py

See context/tests.org for coverage analysis and known failures.

How-to guides in how_to/ are also executable .mon files that serve as both documentation and integration tests:

AlgebraicDataTypes.mon — ADT definition and pattern matching
Layouts.mon — Struct layout and field access
Macros.mon — Macro system walkthrough
RefinementType.mon — Refinement type usage
Syntax.mon — Syntax comparison (S-expr vs Wisp)
Test.mon — Test framework usage
RaylibSpeedrun.mon — Real-world raylib application
Term/ — Dependent type examples

12. Fuzzing System

A deterministic QuickCheck-style fuzzing harness for the compiler, located at tests/fuzzing/. The runner (tests/fuzzing/fuzz_codegen.py, 2369 lines) loads structured property files from a recursive property corpus, generates typed expressions, compiles them with monad, runs the resulting binary, and checks stdout against expected values.

Property corpus (tests/fuzzing/properties/): 1933+ active .fuzz properties organized in a hierarchical directory taxonomy across 15 top-level categories:

Directory	Focus	Count
`00-core/`	Arithmetic, comparison, order, lattice	~300
`10-control/`	if/cond/while/for, phi lowering, loop mutation	~100
`20-binding/`	Env, `with`, `set!`, shadowing, SSA	~100
`20-functions/`	Function dispatch, arity	~80
`30-codegen-dispatch/`	Operator dispatch, strategy matrix	~250
`35-compiler-path-v11/`	Compiler-path targeted laws	~150
`40-runtime-collections/`	Arrays, lists, sets, mutation	~200
`40-runtime-values/`	Runtime value construction	~80
`50-quote-null-path/`	Quote, null, path, predicate, cast	~100
`60-cross-feature/`	Cross-feature compiler paths	~150
`70-oracle/`	Python oracles (future)	~50
`75-experimental/`	Unstable feature probes	~50
`80-stress/`	Large/many-form programs	~50
`90-negative/`	Expected-compile-fail invariants	~200
`99-misc/`	Everything else	~50

Each .fuzz file is a self-contained property specification with optional metadata:

name: int_add_commutative
section: arithmetic
tier: stable
kind: law
args: a:Int b:Int
features: arithmetic, equality, logic
compiler-path: codegen.arithmetic_dispatch, codegen.equality_dispatch
type: Bool
expect: True
description: Addition must be commutative for generated Int expressions.
law: (= (fuzz-add {a} {b}) (fuzz-add {b} {a}))

Properties support 5 tiers (stable, experimental, oracle, compile-fail, stress), 3 kinds (law, program, compile-fail), strategy-annotated generators (x:Int@edge+if), Python oracles (expect-python:), and variadic = lowering via the built-in rewriter.

Type-directed generators produce typed Int, Bool, String, Char, Float, Arr, List, and dependent types (IndexOf, SubstrOf, ElemOf, DifferentFrom) with strategy modifiers:

Strategy	Effect
`tiny`	Small magnitude values
`nat`/`index`	Non-negative small values
`edge`	Edge-case values (-128, -1, 0, 1, 127, etc.)
`bitwise`	Include bitwise operators
`if`/`phi`	Bias toward `if` expressions
`with`/`env`	Bias toward `with` bindings
`array`	Include array-index expressions

Runner features:

Recursive .fuzz discovery (1933+ active, 23 disabled)
Coverage mode: one generated instance per property
Random batch mode: mixes random laws into programs
Shrink-on-failure: isolates minimal counterexample via simpler_exprs
Failure isolation: breaks failed batches into individual law runs
Regression tracking: persists history to .fuzz-history.json
Emacs-compatible output: path:line:col: for compilation-mode navigation
Styled output matching tests/run.py (Unicode dividers, colorized durations)

Fuzzer self-tests (tests/fuzzing/fuzz_fuzzing.py) exercise the generation and rewrite logic in isolation with no compiler binary required:

python3 -m unittest tests.fuzzing.fuzz_fuzzing -v

Run the fuzzer via Make:

make fuzzing

Or directly with various options:

python3 tests/fuzzing/fuzz_codegen.py --seed 41 --cases 1 --keep
python3 tests/fuzzing/fuzz_codegen.py --properties int_add_commutative
python3 tests/fuzzing/fuzz_codegen.py --tiers all
python3 tests/fuzzing/fuzz_codegen.py --list-properties

See tests/fuzzing/README.md for the complete reference.

13. Build System

make              # Debug build (default)
make release      # Release build (-O2)
make asan         # Address sanitizer build
make install      # Install to /usr/local
make uninstall    # Remove installed files
make clean        # Remove build artifacts

Dependencies: gcc, llvm-dev, libclang-dev, libreadline-dev, libgmp-dev, pkg-config.

The static runtime libmonad.a (archive of runtime.o + arena.o) is linked into every binary. No shared library or ldconfig needed.

14. Known Limitations

These are documented architectural gaps — not bugs in the test suite:

Area	Issue	Root Cause
Closures	`fn`/`lambda`/`defn` not callable	No strcmp dispatch handler; no closure allocation in codegen
Lists	`tail` segfaults	Runtime list implementation bug
Sets	`empty?` returns False on `{}`	Runtime set predicate bug
Maps	`#{"k" v}` produces garbage without `:: Map`	Runtime map construction bug
Map ops	`assoc`/`dissoc`/`merge`/`keys`/`vals`/`find`	Produce garbage via map runtime
Dep phase	`concat`/`substring`/`reverse`/`nth`/`length`/`reduce`	Register_builtins-only — lack strcmp dispatch
Higher-order	`map`/`filter`/`apply`/`partial`	Require closures
Cond	`cond` form not recognized	Parser treats as ordinary variable
Let	`let`/`let*`/`letrec` not dispatched	No strcmp handler
Wisp cond	`cond` in wisp compiles but produces no output	No strcmp dispatch for `cond`
Wisp let	`let`/`let*` in wisp — body nesting breaks	Wisp expansion doesn't preserve body scoping
Wisp lambda	`lambda`/`fn` at wisp line level — param binding fails	Wisp layout can't nest lambda params
Wisp for	`for` keyword not recognized by wisp expander	Missing from wisp keyword list
Wisp begin	`begin` at wisp line level not callable	Treated as ordinary identifier
Wisp @-sugar	`@`-sugar (indexing shorthand) not implemented	Reader rejects `@` in identifiers
Garbage (wide)	Map, Set, List, String, Symbol, Keyword, Layout all leak through closures	`wrap_func_as_closure` passes ALL raw pointer types through without boxing

15. TODO

See context/todo.org for the full priority-ranked list with confidence levels. Highlights:

Full dependent type elaboration pipeline (surface → core ITT → codegen)
Refinement type static predicate evaluation integration
Type class instance derivation (Eq, Enum, Ord, Bounded)
Closure support (fn/lambda/defn)
Wisp syntax parity with s-expression forms
Context visualizer performance optimization (60fps with 1258+ nodes)
THINK record type for speculative reasoning

16. Contributing & License

Every feature, bug fix, or test addition should be accompanied by context records explaining why it exists and what it verifies.

Key principles:

Smallest atoms — Each test proves exactly one behavior.
Context-linked — Every test links to the context nodes that explain it.
No fix without coverage — Bug fixes require a test atom first.
Append, don't rewrite — Use superseding records instead of silently changing history.

This project is licensed under the MIT License — see the LICENSE file for details.

Name		Name	Last commit message	Last commit date
Latest commit History 139 Commits
context		context
core-frozen		core-frozen
core		core
etc		etc
how_to		how_to
old-core		old-core
tests		tests
.gitignore		.gitignore
Main.mon		Main.mon
Makefile		Makefile
README.md		README.md
TODO.org		TODO.org
arena.c		arena.c
arena.h		arena.h
asm.c		asm.c
asm.h		asm.h
buildsystem.c		buildsystem.c
buildsystem.h		buildsystem.h
bytecode.c		bytecode.c
bytecode.h		bytecode.h
cli.c		cli.c
cli.h		cli.h
codegen.c		codegen.c
codegen.h		codegen.h
completion.c		completion.c
completion.h		completion.h
config.c		config.c
config.h		config.h
context-visualizer.py		context-visualizer.py
create_asm_tests.py		create_asm_tests.py
create_asm_tests_extra.py		create_asm_tests_extra.py
debugger.c		debugger.c
debugger.h		debugger.h
dep.c		dep.c
dep.h		dep.h
env.c		env.c
env.h		env.h
features.c		features.c
features.h		features.h
ffi.c		ffi.c
ffi.h		ffi.h
infer.c		infer.c
infer.h		infer.h
lsp.c		lsp.c
lsp.h		lsp.h
lsp_repl.c		lsp_repl.c
lsp_repl.h		lsp_repl.h
macro.c		macro.c
macro.h		macro.h
main.c		main.c
make		make
module.c		module.c
module.h		module.h
module_export.c		module_export.c
module_export.h		module_export.h
optimizations.c		optimizations.c
optimizations.h		optimizations.h
pmatch.c		pmatch.c
pmatch.h		pmatch.h
reader.c		reader.c
reader.h		reader.h
repl.c		repl.c
repl.h		repl.h
runtime.c		runtime.c
runtime.h		runtime.h
typeclass.c		typeclass.c
typeclass.h		typeclass.h
types.c		types.c
types.h		types.h
typst_emit.c		typst_emit.c
typst_emit.h		typst_emit.h
wisp.c		wisp.c
wisp.h		wisp.h

Folders and files

Latest commit

History

Repository files navigation

A Pure Functional Language from High-Level Abstraction to Bare Metal

1. Repository Map

2. Philosophy

3. Compiler Pipeline (11 phases)

4. Source File Guide

4.1 Reader & Parser (reader.c/h, wisp.c/h)

4.2 Type System

4.3 Codegen (codegen.c/h)

4.4 Runtime (runtime.c/h, arena.c/h)

4.5 FFI and Inline Assembly (ffi.c/h, asm.c/h)

4.6 Module System (module.c/h, module_export.c/h, buildsystem.c/h)

4.7 Other Compiler Modules

5. Syntax: S-Expressions and Wisp

Wisp Sugar Notes

6. Type System (Reference)

6.1 Hindley-Milner Inference

6.2 Dependent Types (ITT)

6.3 Refinement Types

6.4 Type Classes

7. Quick Start

8. CLI Reference

Modes

Flags

9. Language Tour

Variables and Built-in Types

Functions

Pattern Matching

Collections

Maps, Sets, and Relations

Layouts (Structs)

Conditional Compilation

Per-Module Test Suites

10. Context System

10.1 Category Theory Foundation

10.2 Context Visualizer

10.3 Test Metadata

10.4 Context File Index

10.5 Context Node Documentation

11. Test Infrastructure

12. Fuzzing System

13. Build System

14. Known Limitations

15. TODO

16. Contributing & License

About

Topics

Resources

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

4.1 Reader & Parser (`reader.c/h`, `wisp.c/h`)

4.3 Codegen (`codegen.c/h`)

4.4 Runtime (`runtime.c/h`, `arena.c/h`)

4.5 FFI and Inline Assembly (`ffi.c/h`, `asm.c/h`)

4.6 Module System (`module.c/h`, `module_export.c/h`, `buildsystem.c/h`)

Packages