
Commit 84219c5

TimPietrusky, RunPod, cursoragent, and max4c authored
feat: expand runpodctl coverage with legacy-safe cli (#229)
* refactor(cli): restructure commands to noun-verb pattern and add resource groups

- rename module from runpodctl to runpod and binary to runpod
- change command structure from verb-noun to noun-verb pattern
- add new resource command groups: serverless, template, volume, registry
- add legacy command support for backward compatibility
- create internal api layer with rest client and resource methods
- add output formatting support (json/yaml/table)
- enhance ssh key management and commands
- update dependencies (cobra v1.8.1, viper v1.19.0)
- update documentation and add agents.md for ai tooling
- remove deprecated test files and consolidate version handling

* refactor(cli): rename volume to network-volume and remove table output

- rename volume command to network-volume with alias 'nv'
- remove table output format, keep only json and yaml
- hide deprecated project command
- update help text and command descriptions
- simplify api test responses to use direct arrays
- update all tests to reflect command rename and output changes

* feat(cli): add info commands and pod restart/reset operations

- add user command to show account info and balance (alias: me, account)
- add gpu command to list available gpu types
- add datacenter command to list datacenters (alias: dc)
- add billing command with subcommands for pods, serverless, network-volume
- add doctor command to diagnose and fix cli configuration issues
- add pod restart and reset commands
- refactor legacy commands to use actual commands with deprecation warnings
- improve config handling with explicit viper value setting
- update error messages to guide users to doctor command
- deprecate config command in favor of doctor
- update default ssh key name to RunPod-Key-Go
- add comprehensive e2e tests for all new commands
- update root help text with getting started instructions

* feat(cli): add shell completion with auto-detection

- add completion command that auto-detects shell (bash, zsh, fish, powershell)
- automatically install completion to appropriate config files
- disable default cobra completion command
- add hidden generate subcommand for advanced usage
- register completion and update commands in root
- update comment formatting in update command

* docs(cli): update help text and version to clarify runpod v2

- update root help text to mention runpod v2 (formerly runpodctl)
- update version template to include formerly runpodctl note
- update version command output to clarify migration from runpodctl

* feat(template): add search command and enhance list filtering

- add template search command to search by name or image
- add --type flag to filter templates (official, community, user)
- add --limit and --offset flags for pagination
- add --all flag to include user templates
- update list command with smart defaults (show all for official/user, limit for community)
- add graphql support for fetching official and community templates
- enhance gettemplate to try rest api first, fallback to graphql
- add template type constants and list options struct
- add comprehensive e2e tests for filtering and pagination

* feat(cli): expand pod, template, model, and ssh support

- add cpu pod creation with compute-type and gpu-type-id naming
- include ssh info and lifecycle fields in pod get output
- return template readme/env/ports on get and parse graphql ports
- promote model repo commands and add legacy get models mapping
- update gpu list output to gpu type id and expand e2e coverage

* feat(legacy): deprecate exec and restore legacy commands

keep exec as a hidden legacy command that points users to ssh, and restore missing legacy commands for cloud and multi-pod operations.

* fix(legacy): keep exec runnable and add project tests

preserve backward compatibility for legacy exec while pointing users to ssh. add project create/build test coverage and remove resolved issue notes.

* feat(pod): add global networking flag

add global-networking to pod create with validation and error hints, wire the rest request, add e2e coverage, and update the issue note.

* feat(pod): add public ip filter

add --public-ip for community cloud pods, wire supportPublicIp, add unit and e2e coverage, and document testing expectations.

* style(help): normalize help text

standardize command help text casing and plurals, and add unit + e2e checks to prevent regressions.

* fix(cli): keep runpodctl as primary binary

avoid breaking existing users by reverting the cli name to runpodctl. align docs, tooling, and tests so install and update paths keep the same binary.

* test(e2e): cover legacy and cli entrypoints

add legacy command assertions and help coverage for remaining cli entrypoints. include ssh, completion, and send/receive checks to catch regressions.

* ci(workflow): install govulncheck

install govulncheck in ci and call it directly to keep the build green.

* docs(agents): add cli pitfalls

document non-obvious pitfalls to avoid regressions in templates, legacy, and doctor.

* feat(pod): require template-id flag

use --template-id for pod creation to align with serverless and clarify intent. update help text and e2e coverage accordingly.

* feat(cli): rename gpu id fields

rename gpu-type-id flag to gpu-id and normalize gpuTypeId output to gpuId. align billing filters/grouping and update tests/docs; remove resolved docs/issues.

* feat(cli): refresh help and docs

simplify root help messaging and regenerate markdown docs for the new cli. update pr template to match the new summary/testing guidance.

* refactor(module): align module path with repo

update go.mod module path to github.com/runpod/runpodctl and rewrite imports to match; includes gofmt cleanup after the path change.

* chore(docs): untrack cli restructure justification

remove the cli restructure justification from the repo while keeping it locally.

Co-authored-by: Cursor <[email protected]>

* feat(pod): add --ssh flag to pod create

use graphql api for gpu pod creation to support startSsh field, which controls whether ssh keys are injected into the pod. cpu pods fall back to rest api since graphql requires gpuTypeId.

* fix(legacy): add missing model commands to legacy wrappers

create model and remove model were not registered in the legacy command layer, causing flags like --name and --owner to be unknown.

* fix(model): remove unnecessary hash from model repo uploads

model repo uploads do not accept a hash, causing errors when specified. also bumps go version to 1.25.7. cherry-picked from #230.

* fix(cli): restore model upload timeout fallback and prevent legacy create panic

---------

Co-authored-by: Cursor <[email protected]>
Co-authored-by: max4c <[email protected]>
1 parent 5ea18ac commit 84219c5

176 files changed

Lines changed: 13490 additions & 700 deletions

.github/workflows/ci.yml

Lines changed: 5 additions & 1 deletion
@@ -12,11 +12,15 @@ jobs:
       # You can test your matrix by printing the current Go version
       - name: Display Go version
         run: go version
+      - name: Install govulncheck
+        run: |
+          go install golang.org/x/vuln/cmd/govulncheck@latest
+          echo "$(go env GOPATH)/bin" >> "$GITHUB_PATH"
       - name: run vet
         run: go vet ./...
       - name: build all packages
         run: go build ./...
       - name: check for vulnerabilities
-        run: go tool govulncheck ./...
+        run: govulncheck ./...
       - name: Run tests
         run: go test -vet=off --cover ./...

AGENTS.md

Lines changed: 169 additions & 0 deletions
@@ -0,0 +1,169 @@
# AGENTS.md

runpodctl cli: command-line tool for managing gpu pods, serverless endpoints, and developing serverless applications on runpod.

## codebase structure

```
runpod/
├── main.go                       # entry point, version injection
├── cmd/                          # cli commands (cobra)
│   ├── root.go                   # root command, config init
│   ├── config.go                 # api key & ssh config
│   ├── ssh.go                    # ssh key management & connections
│   ├── pod/                      # pod commands
│   │   ├── pod.go                # parent command
│   │   ├── list.go               # list pods
│   │   ├── get.go                # get pod by id
│   │   ├── create.go             # create pod
│   │   ├── update.go             # update pod
│   │   ├── start.go              # start pod
│   │   ├── stop.go               # stop pod
│   │   └── delete.go             # delete pod
│   ├── serverless/               # serverless endpoint commands (alias: sls)
│   │   ├── serverless.go         # parent command
│   │   ├── list.go               # list endpoints
│   │   ├── get.go                # get endpoint
│   │   ├── create.go             # create endpoint
│   │   ├── update.go             # update endpoint
│   │   └── delete.go             # delete endpoint
│   ├── template/                 # template commands (alias: tpl)
│   │   └── ...
│   ├── volume/                   # network volume commands (alias: vol)
│   │   └── ...
│   ├── registry/                 # container registry auth (alias: reg)
│   │   └── ...
│   ├── transfer/                 # file transfer (croc)
│   │   ├── transfer.go           # send/receive commands
│   │   ├── croc.go               # croc implementation
│   │   └── rtt.go                # relay rtt testing
│   ├── project/                  # serverless project workflow
│   │   ├── project.go            # create, dev, deploy, build
│   │   ├── functions.go          # project lifecycle logic
│   │   └── starter_examples/     # template projects
│   └── legacy/                   # deprecated command aliases
│       └── legacy.go             # backwards compatibility
├── internal/
│   ├── api/                      # api clients
│   │   ├── client.go             # rest client
│   │   ├── pods.go               # pod api methods
│   │   ├── endpoints.go          # endpoint api methods
│   │   ├── templates.go          # template api methods
│   │   ├── volumes.go            # volume api methods
│   │   ├── registry.go           # registry auth methods
│   │   └── graphql.go            # graphql client (fallback)
│   └── output/                   # output formatting
│       └── output.go             # json/yaml/table output
├── docs/                         # generated documentation
└── .goreleaser.yml               # release configuration
```

## key technologies

- **go 1.24** with modules
- **cobra** — cli framework
- **viper** — configuration management
- **croc** — peer-to-peer file transfer (no api key required)
- **rest api** — primary api (https://rest.runpod.io/v1)
- **graphql** — fallback for features rest lacks

## configuration

- config file: `~/.runpod/config.toml`
- api key via: `runpodctl config --apiKey=xxx`
- environment override: `RUNPOD_API_KEY`, `RUNPOD_API_URL`

## build commands

```bash
# local development build
make local
# output: bin/runpod

# cross-platform release builds
make release
# outputs: bin/runpod-{os}-{arch}

# run tests
go test ./...
```

## command structure

commands follow noun-verb pattern: `runpodctl <resource> <action>`

| command | description |
|---------|-------------|
| `runpodctl pod list` | list all pods |
| `runpodctl pod get <id>` | get pod by id |
| `runpodctl pod create --image=<img>` | create a pod |
| `runpodctl pod start <id>` | start a stopped pod |
| `runpodctl pod stop <id>` | stop a running pod |
| `runpodctl pod delete <id>` | delete a pod |
| `runpodctl serverless list` | list endpoints (alias: sls) |
| `runpodctl serverless get <id>` | get endpoint details |
| `runpodctl template list` | list templates (alias: tpl) |
| `runpodctl volume list` | list network volumes (alias: vol) |
| `runpodctl registry list` | list registry auths (alias: reg) |
| `runpodctl send <file>` | send file via croc |
| `runpodctl receive <code>` | receive file via croc |
| `runpodctl ssh list-keys` | list account ssh keys |
| `runpodctl ssh connect <pod>` | show ssh connect command |
| `runpodctl project create` | create serverless project |
| `runpodctl project dev` | start dev session |
| `runpodctl project deploy` | deploy as endpoint |
| `runpodctl config --apiKey=xxx` | configure api key |
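
the noun-verb lookup with aliases can be sketched in plain go. this is an illustration only: the cli builds its command tree with cobra, and the `aliases`, `actions`, and `resolve` names below are hypothetical. the sketch covers just the resource commands from the table, not `send`, `receive`, or `config`.

```go
package main

import "fmt"

// aliases maps short resource names to their canonical form, mirroring the
// aliases in the table above. The lookup itself is illustrative only.
var aliases = map[string]string{
	"sls": "serverless",
	"tpl": "template",
	"vol": "volume",
	"reg": "registry",
}

// actions lists the verbs each resource accepts in this sketch.
var actions = map[string][]string{
	"pod":        {"list", "get", "create", "start", "stop", "delete"},
	"serverless": {"list", "get"},
	"template":   {"list"},
	"volume":     {"list"},
	"registry":   {"list"},
}

// resolve turns the arguments after the binary name into a canonical
// (resource, action) pair, expanding aliases first.
func resolve(args []string) (string, string, error) {
	if len(args) < 2 {
		return "", "", fmt.Errorf("usage: runpodctl <resource> <action>")
	}
	resource := args[0]
	if canonical, ok := aliases[resource]; ok {
		resource = canonical
	}
	verbs, ok := actions[resource]
	if !ok {
		return "", "", fmt.Errorf("unknown resource %q", args[0])
	}
	for _, verb := range verbs {
		if verb == args[1] {
			return resource, verb, nil
		}
	}
	return "", "", fmt.Errorf("unknown action %q for %s", args[1], resource)
}

func main() {
	resource, action, err := resolve([]string{"sls", "list"})
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(resource, action) // prints "serverless list"
}
```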

## output format

default output is json (for agents). use `--output=table` for human-readable format.

```bash
runpodctl pod list                    # json output
runpodctl pod list --output=table     # table output
runpodctl pod list --output=yaml      # yaml output
```

## where to make changes

| task | location |
|------|----------|
| add new rest api operation | `internal/api/` |
| add new cli command | `cmd/<resource>/` |
| modify pod commands | `cmd/pod/` |
| modify serverless commands | `cmd/serverless/` |
| add project template | `cmd/project/starter_examples/` |
| change file transfer | `cmd/transfer/` |
| update ssh logic | `cmd/ssh.go` |
| modify build/release | `makefile`, `.goreleaser.yml` |

## api layer pattern

rest api operations in `internal/api/`:
1. define request/response structs
2. call appropriate http method (Get, Post, Patch, Delete)
3. parse json response
4. return typed result or error

graphql fallback in `internal/api/graphql.go` for features rest doesn't support (ssh keys, detailed pod info).

## pitfalls (non-obvious)

- templates are dual-source: official/community via graphql, user via rest; list/search merge results and apply search/pagination client-side; graphql failures are intentionally best-effort.
- graphql template shapes are inconsistent: `ports` may be string or array, `env` is key/value pairs; normalize before output and only return `readme/env/ports` on `template get`.
- `doctor` is the only mutating setup path (api key + ssh sync); onboarding/ssh changes must update both `cmd/doctor` and `internal/sshconnect` hints.
- legacy commands must preserve stdout and behavior exactly; deprecation warnings go to stderr only (exec is the most common regression).
- `cmd/project.go` is not wired into the cli; the hidden `project` command is created in `cmd/root.go` and wraps `cmd/project/*`.
- api accepts `gpuTypeIds` arrays, but the cli is intentionally singular (`--gpu-id`); multi-id fallback must be an explicit new flag.

## important notes

- **never start/stop servers** — user handles that
- file transfer (`send`/`receive`) works without api key
- version is injected at build time via `-ldflags`
- config auto-migrates from `~/.runpod.yaml` to `~/.runpod/config.toml`
- ssh keys are auto-generated and synced to account on `config` command
- all text output is lowercase and concise
- default output format is json for agent consumption
- always add unit + e2e tests for new behavior
- e2e tests must clean up resources they create (use `t.Cleanup`)
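
the `-ldflags` version note above relies on the go linker's `-X` option, which overwrites a package-level string variable at link time. a minimal sketch follows, assuming a variable named `Version` in package `main`; the variable name and the `versionString` helper are hypothetical, and the project's makefile may target a different symbol.

```go
package main

import "fmt"

// Version is a hypothetical package-level variable. The linker can overwrite
// it at build time, for example:
//
//	go build -ldflags "-X main.Version=v1.2.3" .
//
// Without the flag, the default below is used.
var Version = "dev"

// versionString formats the value shown to the user.
func versionString() string {
	return "runpodctl " + Version
}

func main() {
	fmt.Println(versionString())
}
```

`-X` only works on string variables, so the version has to stay a plain `string` rather than a constant or struct.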

CLAUDE.md

Lines changed: 1 addition & 0 deletions
@@ -0,0 +1 @@
See [AGENTS.md](./AGENTS.md) for project documentation.
