Commit 4ba0ccd

TimDettmers and claude committed
feat: Add coordinator agent workflow and GitHub tools guide
Coordinator guide defines the full workflow for analyzing open issues, generating thorough prompt files for worker agents, and launching parallel sessions via worktrees. Includes standard completion workflow (commit, push, PR, Slack notification) for worker agents.

GitHub tools guide covers bitsandbytes-specific usage of the lab_tools issue tracker scripts for finding, filtering, and analyzing issues.

Co-Authored-By: Claude Opus 4.6 <[email protected]>
1 parent c1666aa commit 4ba0ccd

3 files changed

Lines changed: 437 additions & 0 deletions

CLAUDE.md

Lines changed: 4 additions & 0 deletions
@@ -1,3 +1,7 @@
# Coordinating agent work on GitHub issues

To analyze open issues, generate prompts, and launch parallel worker agents, follow `agents/coordinator_guide.md`. This uses the GitHub issue tools in `~/git/lab_tools/github/`; see `agents/github_tools_guide.md` for the bitsandbytes-specific reference.

# Parallel sessions

To work on multiple branches at once, use git worktrees:

agents/coordinator_guide.md

Lines changed: 286 additions & 0 deletions
@@ -0,0 +1,286 @@
# Coordinator Agent Guide

You are a coordinator agent. Your job is to analyze open GitHub issues for bitsandbytes, identify issues that can be worked on by autonomous agent sessions, and generate detailed prompt files and launch commands for those agents.

## Prerequisites

Before starting, refresh the issue data:

```bash
python3 ~/git/lab_tools/github/fetch_issues.py
```

Read `agents/github_tools_guide.md` for the full reference on how to use the query tools.

## Step 1: Find Candidate Issues

Start by surveying the open issues:

```bash
python3 ~/git/lab_tools/github/query_issues.py list
python3 ~/git/lab_tools/github/query_issues.py list --sort reactions
```

Look for issues that are actionable (see the "Identifying Actionable Issues" section of `agents/github_tools_guide.md`). Good candidates have:

- Clear reproduction steps or error messages
- A pointer to specific code
- A well-scoped fix (not requiring design decisions)
- No hardware requirements you can't meet

Also check for low-hanging fruit:

```bash
# Issues with open PRs that may just need review/testing/completion
python3 ~/git/lab_tools/github/query_issues.py search "PR" --state open

# Issues already labeled for external contribution
python3 ~/git/lab_tools/github/query_issues.py list --label "Contributions Welcome"

# Issues proposed for closing (may just need verification)
python3 ~/git/lab_tools/github/query_issues.py list --label "Proposing to Close"
```
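The four candidate criteria in this step can be written down as an explicit checklist. The following is a minimal sketch, not part of the lab_tools scripts: the `Candidate` record and its field names are hypothetical, and treating all four criteria as mandatory is an assumption made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    """Hypothetical record of what the coordinator noted about one issue."""
    number: int
    has_repro_or_error: bool   # clear reproduction steps or error messages
    points_to_code: bool       # a pointer to specific code
    well_scoped: bool          # the fix needs no design decisions
    hardware_available: bool   # no hardware requirement we can't meet


def is_actionable(c: Candidate) -> bool:
    """Assumption: an issue qualifies only if all four criteria hold."""
    return (c.has_repro_or_error and c.points_to_code
            and c.well_scoped and c.hardware_available)


def shortlist(candidates: list[Candidate]) -> list[int]:
    """Return the numbers of issues that pass every criterion, in input order."""
    return [c.number for c in candidates if is_actionable(c)]
```

In practice the flags would be filled in by hand while reading the `list` output, so the sketch mainly serves as a reminder of what to check before moving an issue on to Step 2.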

## Step 2: Deep-Dive Each Candidate

For each candidate issue, gather full context. This step is critical: the quality of the prompt file depends on how thoroughly you understand the issue.

```bash
# Full issue with all comments
python3 ~/git/lab_tools/github/query_issues.py show <NUMBER>

# Find related/duplicate issues (with body previews and last comments)
python3 ~/git/lab_tools/github/query_issues.py related <NUMBER> -v

# Check if it was already resolved
python3 ~/git/lab_tools/github/query_issues.py related <NUMBER> --state closed -v

# Targeted searches for specific error messages or terms from the issue
python3 ~/git/lab_tools/github/query_issues.py search "specific error text"
```
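When several candidates need the same deep-dive, the three per-issue commands above can be generated and run in a loop. This is a rough sketch under the assumption that the query tool lives at the path used throughout this guide; the helper names `deep_dive_commands` and `run_deep_dive` are made up for illustration, and the free-text `search` command is left out because its argument depends on the issue.

```python
import os
import subprocess

# Path to the query tool as used throughout this guide.
QUERY_TOOL = os.path.expanduser("~/git/lab_tools/github/query_issues.py")


def deep_dive_commands(number: int) -> list[list[str]]:
    """Build the argv lists for show, related, and related --state closed."""
    base = ["python3", QUERY_TOOL]
    n = str(number)
    return [
        base + ["show", n],
        base + ["related", n, "-v"],
        base + ["related", n, "--state", "closed", "-v"],
    ]


def run_deep_dive(number: int) -> list[str]:
    """Run each command and return its captured stdout (raises on failure)."""
    return [
        subprocess.run(cmd, capture_output=True, text=True, check=True).stdout
        for cmd in deep_dive_commands(number)
    ]
```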

For each promising related issue that shows up, run `show` on it to get the full context. Don't stop at the `related` output; read the full body and comments of related issues, especially closed ones where the resolution may be documented.

For each issue, determine:

1. **What is the root cause?** Read the full body, comments, and tracebacks.
2. **Has this been fixed before?** Check related closed issues for prior fixes.
3. **Is there an existing PR?** Check cross-references in the `show` output.
4. **What files need to change?** Look for code pointers in the issue body and comments. If possible, read the actual source files in the bitsandbytes repo to verify.
5. **How do we verify the fix?** Is there a reproduction script? What tests apply?
6. **What patterns or context from other issues are relevant?** Maybe three other issues report the same error with different trigger conditions. Maybe a closed issue's fix didn't fully address the problem. This broader context is valuable for the worker agent.

## Step 3: Generate Prompt Files

For each issue you decide to assign to a worker agent, write a prompt file to `/tmp/bnb-agents/`. Create the directory first:

```bash
mkdir -p /tmp/bnb-agents
```

Write each prompt file using the Write tool. The file name should be `issue-<NUMBER>.md`.

### Prompt File Principles

**Thorough and self-contained.** The worker agent starts with zero context. Everything it needs must be in this file. Err on the side of including too much rather than too little.

**Include raw data, don't summarize it.** The worker agent needs to see the exact error messages, tracebacks, reproduction code, and comment discussions, not your summary of them. Include the full `show` output for the target issue and for key related issues. The worker agent may notice details that you didn't.

**Add your own analysis on top of the raw data.** After the raw data sections, include your synthesis: what you think the root cause is, how the issues relate to each other, which files need to change, what approach makes sense, what pitfalls to avoid. This is the value you add as coordinator: the worker gets both the primary sources and your analysis.

**Include all context you gathered, even tangential findings.** If you discovered during your deep-dive that a related closed issue was fixed by a specific commit, or that five other open issues are symptoms of the same root cause, or that a maintainer commented on a related issue with a relevant technical detail, include that. The worker agent benefits from the full picture, not just the narrow scope of the single issue.

### Prompt File Structure

Every prompt file should have these sections:

**1. Setup instructions.** The exact commands to create a worktree, plus a pointer to build/test docs:

```markdown
## Setup

Create your working environment by running these commands:

    cd ~/git/bitsandbytes
    git worktree add ~/git/bnb-fix-<NUMBER> -b fix/issue-<NUMBER>
    cd ~/git/bnb-fix-<NUMBER>

Read agents/testing_guide.md for build and test instructions. Build the
project before making changes so you can verify your setup works.
```
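Filling in `<NUMBER>` by hand is easy to get wrong when several prompt files are written in one session. A small sketch of how the three setup commands could be generated from an issue number; the function name `setup_commands` is hypothetical, while the paths and branch naming follow the template above.

```python
def setup_commands(number: int) -> list[str]:
    """Return the worktree setup commands with the issue number filled in."""
    worktree = f"~/git/bnb-fix-{number}"   # one worktree per issue
    branch = f"fix/issue-{number}"         # branch name used later for the push
    return [
        "cd ~/git/bitsandbytes",
        f"git worktree add {worktree} -b {branch}",
        f"cd {worktree}",
    ]
```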

**2. The target issue: full context.** Include the complete output from `show <NUMBER>`. This means the full issue body (with all error messages, code blocks, tracebacks), all comments (with author and date), cross-references, labels, and reactions. Do not truncate or summarize.

**3. Related issues: full context.** For each related issue that you identified during your deep-dive, include the full `show` output or a thorough excerpt. For closed issues, the comments often contain the resolution, so make sure those are included. Explain how each related issue connects to the target issue.

**4. Additional context from your analysis.** This is where you include everything else you discovered:

- Patterns across multiple issues (e.g. "Issues #933, #966, #1190, #1394, and #1434 all report the same CUDA Setup failure with different CUDA versions; the root cause appears to be X")
- Relevant technical details from maintainer comments on other issues
- Source code observations if you read the bitsandbytes source
- Information about existing PRs: what they change, whether they look correct
- Anything else the worker agent should know

**5. Your recommended approach.** What you think the fix should look like. Be specific: name files, functions, line numbers. Frame it as guidance, not commands; the worker agent may find things you didn't and should use its own judgment.

**6. Completion workflow.** Every prompt file must include this section verbatim, with the issue number filled in:

```markdown
## When You Are Done

After implementing and verifying the fix:

1. **Commit** your changes with a message referencing the issue:

       git add <files>
       git commit -m "Fix <brief description> (#<NUMBER>)"

2. **Push** the branch:

       git push -u origin fix/issue-<NUMBER>

3. **Create a pull request** with `gh pr create`. The PR body must
   include "Fixes #<NUMBER>" so GitHub auto-links and auto-closes the
   issue on merge. Describe what the fix does and how you verified it.

4. **Post to the bitsandbytes Slack channel** to notify the team.
   Write a temporary Python script to `/tmp/slack_notify.py` and run it:

       import json, urllib.request, sys

       # Bot token is read from a local file so it never appears in the script.
       TOKEN = open("/home/tim/Dropbox/Cloud/api_keys/slack_bot.txt").read().strip()
       data = {"channel": "C0AF43L9BT6", "text": "<your message>"}
       req = urllib.request.Request(
           "https://slack.com/api/chat.postMessage",
           data=json.dumps(data).encode(),
           headers={"Authorization": f"Bearer {TOKEN}", "Content-Type": "application/json"},
       )
       # Slack reports API-level failures in the JSON body via "ok": false.
       resp = json.loads(urllib.request.urlopen(req).read())
       if not resp.get("ok"):
           print(f"ERROR: {resp.get('error')}", file=sys.stderr)

   The message should include: which issue you fixed, a one-line
   description of the fix, and the PR URL. Keep it concise.

   Then delete the script: `rm /tmp/slack_notify.py`

If tests are failing and you cannot resolve the failures, still commit,
push, and create the PR, but note the failures in the PR description
and explain what you tried. Do not silently abandon work.
```
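The `<your message>` placeholder in the notification script is meant to carry three things: the issue, a one-line description of the fix, and the PR URL. One way to build that payload consistently is sketched below. The function name `build_slack_payload` and the exact message wording are assumptions for illustration; only the payload shape (`channel` and `text`, JSON-encoded) comes from the script above.

```python
import json


def build_slack_payload(channel: str, issue: int, summary: str, pr_url: str) -> bytes:
    """Encode a chat.postMessage body naming the issue, the fix, and the PR."""
    # Assumed wording; the guide only requires these three pieces of information.
    text = f"Fixed #{issue}: {summary}\nPR: {pr_url}"
    return json.dumps({"channel": channel, "text": text}).encode()
```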

**7. What NOT to do.** If there are traps, scope boundaries, or things that look tempting but are wrong, list them explicitly. For example: "Don't change the 8bit_blockwise dispatch; only the 32bit dispatch is affected."

### Example Prompt File

Below is an abbreviated example showing the structure and level of detail. A real prompt file will be longer because it includes the full raw data from `show` outputs.

```markdown
## Setup

Create your working environment:

    cd ~/git/bitsandbytes
    git worktree add ~/git/bnb-fix-1810 -b fix/issue-1810
    cd ~/git/bnb-fix-1810

Read agents/testing_guide.md for build and test instructions.

## Issue #1810: LARS missing in str2optimizer32bit

Author: RasmusHoier | Created: 2025-11-18 | Labels: Optimizers
Cross-references: PR #1855 [OPEN]: Add LARS to str2optimizer32bit dictionary

### Full Issue Body

[the entire body from `show 1810`, including the System Info section,
the full error traceback, the user's analysis pointing to
bitsandbytes/backends/cuda/ops.py, the reproduction script, and the
related issues the user linked]

### Comments

[1] @matthewdouglas (2025-11-18) | THUMBS_UP:1:
[the full comment text about LARS reusing Momentum kernels and
LAMB reusing Adam kernels, and the note about 8bit blockwise
also being missing]

## Related Issues

### #1281 (CLOSED): NameError: name 'str2optimizer32bit' is not defined

This was a different problem: the diagnostic script `python -m bitsandbytes`
was failing because `str2optimizer32bit` was not imported in the diagnostics
module. Not the same issue as #1810, but the name overlap means keyword
search will surface it.

[full show output for #1281]

### #1403 (OPEN, Duplicate): unable to run FSDP2 with low bit optimizers

Labeled as Duplicate. Reports a traceback when using Adam 8-bit with FSDP2.
Different root cause from #1810 but same area of the codebase.

## Additional Context

The maintainer @matthewdouglas confirmed in the comment on #1810 that:
- LARS should reuse the Momentum kernel implementations
- LAMB already maps to Adam kernels (this is the pattern to follow)
- Both LARS and LAMB are missing 8bit blockwise implementations, but that
  is out of scope for this fix

PR #1855 already exists and claims to add LARS to the dictionary. Check
whether it is correct and complete before implementing from scratch.

## Recommended Approach

1. Open `bitsandbytes/backends/cuda/ops.py` and find the `str2optimizer32bit`
   dictionary (around lines 543-577 based on the version the reporter linked).
2. Add a `"lars"` entry mapping to the momentum kernel functions, following
   the pattern of how `"lamb"` maps to the adam kernels.
3. Fix the error message at ~line 635 that incorrectly displays
   `str2optimizer8bit_blockwise` keys instead of `str2optimizer32bit` keys.
4. Check PR #1855 first; if it already does this correctly, you can verify
   and build on it rather than reimplementing.

## When You Are Done

[the standard completion workflow section with issue number 1810 filled in]

## What NOT to Do

- Don't modify the 8bit_blockwise dispatch; that's a separate issue.
- Don't add LARS to 8bit blockwise even though it's also missing there.
  The maintainer acknowledged this but it's out of scope for #1810.
- Don't change test files unless the existing tests are actually wrong.
```
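The seven-section layout lends itself to a simple completeness check before a prompt file is written out. The sketch below is an illustration only: the helper names are hypothetical, and the heading "Target Issue" is a stand-in for the per-issue heading (the example above uses "Issue #1810: ..." instead).

```python
# Section headings in the order the guide prescribes; "Target Issue" is a
# placeholder for the real "Issue #<NUMBER>: <title>" heading.
SECTION_ORDER = [
    "Setup",
    "Target Issue",
    "Related Issues",
    "Additional Context",
    "Recommended Approach",
    "When You Are Done",
    "What NOT to Do",
]


def assemble_prompt(sections: dict[str, str]) -> str:
    """Join the sections in the prescribed order; refuse if any is empty."""
    missing = [name for name in SECTION_ORDER if not sections.get(name, "").strip()]
    if missing:
        raise ValueError("missing sections: " + ", ".join(missing))
    return "\n".join(
        f"## {name}\n\n{sections[name].strip()}\n" for name in SECTION_ORDER
    )


def prompt_path(number: int, root: str = "/tmp/bnb-agents") -> str:
    """File name convention from Step 3: issue-<NUMBER>.md under the agents dir."""
    return f"{root}/issue-{number}.md"
```

Raising on a missing section is a deliberate choice here: a worker agent cannot ask for what was left out, so an incomplete prompt file is better caught before launch.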

## Step 4: Output Launch Commands

After writing all prompt files, output the launch commands. Each command tells the human which issue it's for and gives the exact `claude` command to run:

```
## Launch Commands

Issue #1810 (LARS missing in str2optimizer32bit):
claude "Please read /tmp/bnb-agents/issue-1810.md and follow the instructions."

Issue #919 (Noisy logs):
claude "Please read /tmp/bnb-agents/issue-919.md and follow the instructions."
```
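Because every launch command differs only in the issue number, the block above can be produced mechanically from the list of assigned issues. A minimal sketch, with hypothetical helper names and one possible layout for the per-issue label line:

```python
def launch_command(number: int, root: str = "/tmp/bnb-agents") -> str:
    """Return the claude invocation pointing at the issue's prompt file."""
    prompt = f"{root}/issue-{number}.md"
    return f'claude "Please read {prompt} and follow the instructions."'


def launch_block(issues: dict[int, str]) -> str:
    """Render a label line and a command for each issue number and title."""
    lines = ["## Launch Commands", ""]
    for number, title in issues.items():
        lines += [f"Issue #{number} ({title}):", launch_command(number), ""]
    return "\n".join(lines).rstrip() + "\n"
```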

The human will run each command in a separate terminal. The worker agent will read the prompt file, create its own worktree, and begin work autonomously.

## Guidelines

- **Be selective.** Don't generate prompts for every open issue. Focus on issues where an agent can realistically make progress without human guidance. 3-5 well-chosen issues are better than 15 marginal ones.

- **Prioritize impact.** Prefer issues with more community demand (reactions, comments), maintainer priority labels, or those blocking other work.

- **Check for existing PRs.** If a PR already exists, the worker agent's job might be to review, test, and complete it rather than starting from scratch. Say this explicitly in the prompt.

- **Don't assign hardware-specific issues** unless you know the hardware is available. ROCm issues need an AMD GPU, Ascend issues need Huawei hardware, etc.

- **Each prompt must be self-contained.** The worker agent has no knowledge of your analysis session. Everything it needs must be in the prompt file.

- **More context is better.** When in doubt, include it. The worker agent can skip what it doesn't need, but it can't recover information you left out.

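The "be selective" and "prioritize impact" guidelines can be combined into a rough ranking pass. The sketch below is only one possible heuristic: the `IssueStats` record is hypothetical, and scoring by the plain sum of reactions and comments is an assumption, since the guide does not prescribe weights and also asks you to consider priority labels and blocking relationships.

```python
from dataclasses import dataclass


@dataclass
class IssueStats:
    """Hypothetical per-issue numbers gathered while reading the issue list."""
    number: int
    reactions: int
    comments: int
    needs_special_hardware: bool = False  # e.g. ROCm or Ascend issues


def pick_issues(stats: list[IssueStats], limit: int = 5) -> list[int]:
    """Drop hardware-bound issues, rank by community demand, keep the top few."""
    eligible = [s for s in stats if not s.needs_special_hardware]
    ranked = sorted(eligible, key=lambda s: s.reactions + s.comments, reverse=True)
    return [s.number for s in ranked[:limit]]
```

The default `limit` of 5 mirrors the upper end of the "3-5 well-chosen issues" guidance; the final selection still needs the judgment calls described in Steps 1 and 2.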