Merged
112 commits
86a6b1b
feat(config): expose TUI editability audit
Hmbown Jun 20, 2026
3441a85
test(subagent): wait for launch gate acquisition
Hmbown Jun 21, 2026
6c0b71f
benchmarks: harden terminal bench environments
Hmbown Jun 21, 2026
7b4d291
chore: clean public release surfaces
Hmbown Jun 21, 2026
e8905f7
refactor(config): move inline tests to module
cyq1017 Jun 21, 2026
4e5ab73
refactor(tui): move config inline tests to module
Hmbown Jun 21, 2026
e0a2e28
refactor(tui): move runtime api inline tests
Hmbown Jun 21, 2026
312ac97
refactor(tui): move runtime thread inline tests
Hmbown Jun 21, 2026
5583cbd
refactor(tui): move history inline tests
Hmbown Jun 21, 2026
1f00e08
refactor(tui): move app inline tests
Hmbown Jun 21, 2026
5e22777
refactor(tui): move mcp inline tests
Hmbown Jun 21, 2026
8f224d6
refactor(tui): move history tool run grouping
Hmbown Jun 21, 2026
29eb468
refactor(tui): move archived context rendering
Hmbown Jun 21, 2026
5113feb
refactor(tui): move plan history renderer
Hmbown Jun 21, 2026
118b957
refactor(tui): move checklist history renderer
Hmbown Jun 21, 2026
e8a43ab
refactor(tui): move thinking history renderer
Hmbown Jun 21, 2026
cea2ee0
refactor(tui): move agent history activity helpers
Hmbown Jun 21, 2026
e4bb504
refactor(tui): move history render constants
Hmbown Jun 21, 2026
529d4ad
fix(tui): allow worktree git metadata writes in sandbox
cyq1017 Jun 21, 2026
f7f3488
chore(tui): gate history test helper imports
Hmbown Jun 21, 2026
b584aa4
refactor(tui): move message history renderer
Hmbown Jun 21, 2026
dbe655f
refactor(tui): move tool output history renderer
Hmbown Jun 21, 2026
6a5b25e
refactor(tui): move tool output summaries
Hmbown Jun 21, 2026
6fb5792
refactor(tui): split MCP header helpers
cyq1017 Jun 21, 2026
b292cd1
style(tui): apply clippy format cleanups
hongqitai Jun 21, 2026
7be9388
chore(tui): move benchmark harnesses out of repo
Hmbown Jun 21, 2026
fd35e1c
fix(cli): add Linux parent-death cleanup for delegated servers
Hmbown Jun 21, 2026
2a73f11
fix(tui): require fresh reads before file edits
Hmbown Jun 21, 2026
4356335
fix(tui): keep project overlays from loosening policy
Hmbown Jun 21, 2026
9a34b50
fix(tui): validate git history revisions
Hmbown Jun 21, 2026
57f3c89
fix(tui): require approval for interactive execution tools
Hmbown Jun 21, 2026
4a29f83
fix(tui): keep runtime auth tokens out of mobile URLs
Hmbown Jun 21, 2026
3dfd7e8
fix(deps): refresh web tooling security locks
Hmbown Jun 21, 2026
26de44a
fix(tui): harden local tool trust boundaries
Hmbown Jun 21, 2026
e2bd334
fix(ci): harden release target setup and tagging
Hmbown Jun 21, 2026
63c9e9c
fix(tui): resolve worktree metadata from subdirectories
Hmbown Jun 21, 2026
7985f6f
fix(config): harden config sibling paths
Hmbown Jun 21, 2026
daf199a
fix(tui): validate subagent state paths
Hmbown Jun 21, 2026
be83a66
fix(tui): reject symlinked runtime store files
Hmbown Jun 21, 2026
4becde9
fix(tui): canonicalize project mcp cwd
Hmbown Jun 21, 2026
d72a74a
fix(tui): require https for fleet alerts
Hmbown Jun 21, 2026
9803bb8
fix(tui): default auto-compaction for known windows
Hmbown Jun 21, 2026
e97462e
fix(tui): stop printing generated runtime tokens
Hmbown Jun 21, 2026
ac33b20
fix(security): reject disabled TLS verification
Hmbown Jun 21, 2026
5b19e84
fix(security): redact provider registry drift output
Hmbown Jun 21, 2026
b4d6604
fix(tui): make review-only turns read-only
Hmbown Jun 21, 2026
8ecc185
feat(tui): add auto-review policy evaluator
Hmbown Jun 21, 2026
45923d7
fix(tui): hold publish-like tool calls for review
Hmbown Jun 21, 2026
c09d366
fix(tui): enforce auto-review for background tools
Hmbown Jun 21, 2026
5c7d48b
fix(tui): load configured auto-review policy
Hmbown Jun 21, 2026
a6d5de8
fix(tui): hold shell publish commands for review
Hmbown Jun 21, 2026
4f7a77d
feat(tui): write pre-push review receipts
Hmbown Jun 21, 2026
8bc3ec0
feat(tui): validate pre-push review receipts
Hmbown Jun 21, 2026
9f68622
docs(config): document auto-review policy
Hmbown Jun 21, 2026
d2c7b03
fix(security): revalidate config and subagent state paths
Hmbown Jun 21, 2026
b331eab
fix(ci): install pinned Rust targets in release workflows
Hmbown Jun 21, 2026
6084ffa
fix(security): validate worktree metadata reciprocity
Hmbown Jun 22, 2026
6f2a14f
fix(security): gate interpreter tools before execution
Hmbown Jun 22, 2026
4a82f72
fix(security): keep rlm eval behind approval
Hmbown Jun 22, 2026
409673a
test(security): reject whitespace git revisions
Hmbown Jun 22, 2026
5d522e0
fix(security): require attached review checks to pass
Hmbown Jun 22, 2026
a97ed1a
fix(security): revalidate config path before save
Hmbown Jun 22, 2026
3699ef3
fix(security): validate MCP init config paths first
Hmbown Jun 22, 2026
007febe
fix(security): validate project state subdirs
Hmbown Jun 22, 2026
3cfc4b8
fix(security): reject symlinked runtime store dirs
Hmbown Jun 22, 2026
fa5108f
fix(security): reject symlinked project context
Hmbown Jun 22, 2026
bcfa750
fix(security): reject symlinked project config
Hmbown Jun 22, 2026
a3b49ee
docs(security): align project config boundaries
Hmbown Jun 22, 2026
a4ee90f
fix(security): harden MCP and subagent state reads
Hmbown Jun 22, 2026
0c9fee4
fix(security): no-follow config permission reads
Hmbown Jun 22, 2026
9eec2a8
fix(security): redact sensitive base URL display
Hmbown Jun 22, 2026
41e4b5c
test(security): avoid cleartext session id flow in runtime test
Hmbown Jun 22, 2026
604ea3e
fix(security): compact exec session breadcrumbs
Hmbown Jun 22, 2026
2c3af11
fix(security): redact session identifiers in diagnostics
Hmbown Jun 22, 2026
eaab63b
test(security): pin fake approval provenance gate
Hmbown Jun 22, 2026
c828b5e
fix(security): compact context inspector session id
Hmbown Jun 22, 2026
137b4ce
fix(ci): remove invalid release parity matrix guard
Hmbown Jun 22, 2026
f8601af
fix(security): make apply_patch rollback directory setup failures
Hmbown Jun 22, 2026
0e634fd
fix(ci): repair v0.8.64 platform checks
Hmbown Jun 22, 2026
22b8411
fix(ci): keep config ui apply hermetic
Hmbown Jun 22, 2026
e2a7f95
fix(ci): align mobile smoke auth contract
Hmbown Jun 22, 2026
cbbe754
fix(security): harden project state path roots
Hmbown Jun 22, 2026
207b404
fix(security): redact sensitive runtime diagnostics
Hmbown Jun 22, 2026
0245412
docs(security): update vulnerability contact
Hmbown Jun 22, 2026
6e71bad
fix(tui): suppress idle timeout countdown in provider-wait footer (#3…
idling11 Jun 22, 2026
bda40b1
test(tui): update stall_reason test for #3189 idle threshold
idling11 Jun 22, 2026
57148e2
fix(tui): keep provider wait timeout warnings visible
Hmbown Jun 22, 2026
9d2950f
fix: maintain conversation history across ACP session/prompt turns
Jun 22, 2026
37c5d19
refactor: combine two session map lookups into one per review feedback
Jun 22, 2026
9b8a7df
feat(tui): add dev server readiness tool
cyq1017 Jun 22, 2026
d254d2a
fix(tui): decouple readiness probe timeout
cyq1017 Jun 22, 2026
5457176
fix(ci): harden nightly and auto-tag retries
Hmbown Jun 22, 2026
1f8408b
fix(release): harden branch hygiene checks
nightt5879 Jun 20, 2026
a368a63
fix(release): qualify branch hygiene remote main
nightt5879 Jun 20, 2026
21f30b1
fix(tui): lower sidebar visibility threshold
donglovejava Jun 22, 2026
3315cce
fix(security): redact headless session diagnostics
Hmbown Jun 22, 2026
c9e887e
fix(cli): kill delegated server child on uncatchable dispatcher death…
wuisabel-gif Jun 22, 2026
ad1c5a6
feat(tui): apply file ask rules at runtime
greyfreedom Jun 22, 2026
3bf50fa
feat(integrations): add WeCom (企业微信) intelligent robot bridge
pkeging Jun 22, 2026
faddf48
fix(integrations): harden WeCom bridge state handling
Hmbown Jun 22, 2026
9b78097
fix(ci): align security contact version guard
Hmbown Jun 22, 2026
36a56b5
fix(cli): use Tokio raw handle for Windows job guard
Hmbown Jun 22, 2026
4d8f697
chore(deps): bump windows from 0.60.0 to 0.62.2
dependabot[bot] Jun 21, 2026
9db2b7a
chore(deps): bump toml from 0.9.11+spec-1.1.0 to 1.0.6+spec-1.1.0
dependabot[bot] Jun 21, 2026
169b649
chore(deps): bump tokio from 1.50.0 to 1.52.3
dependabot[bot] Jun 21, 2026
ae5a7f9
chore(deps): align CLI Windows dependency
Hmbown Jun 22, 2026
7bfc870
feat(models): add pro and flash shortcuts
Hmbown Jun 22, 2026
03d9810
chore(deps): bump lru from 0.16.4 to 0.18.0
dependabot[bot] Jun 19, 2026
4257832
chore(deps): bump similar from 2.7.0 to 3.1.1
dependabot[bot] Jun 19, 2026
c7cc1ec
test(tui): align sidebar width gate boundary
Hmbown Jun 22, 2026
77c4e54
release: prepare v0.8.64 candidate
Hmbown Jun 22, 2026
21aa236
fix(security): correct vulnerability report contact to security@codew…
Hmbown Jun 22, 2026
4 changes: 4 additions & 0 deletions .gitattributes
@@ -6,6 +6,10 @@ crates/tui/src/prompts/*.md text eol=lf
# Rustfmt writes LF; keep Rust sources stable across Windows/Linux/macOS.
*.rs text eol=lf

+# Branch hygiene release scripts are invoked directly by bash on Windows
+# checkouts; CRLF turns `set -euo pipefail` into an invalid option.
+scripts/release/branch-hygiene*.sh text eol=lf

# Keep repository attributes themselves stable on every platform.
.gitattributes text eol=lf
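
The comment in this hunk says CRLF line endings break `set -euo pipefail` when bash runs the script directly. The following is a minimal sketch of a local check for that condition. The `has_crlf` helper and the throwaway files are illustrative assumptions, not part of the repository.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not from the repository): succeeds if the file
# contains at least one carriage return, i.e. a CRLF-style line ending.
has_crlf() {
  grep -q $'\r' "$1"
}

# Demonstrate on throwaway files rather than the real release scripts.
tmpdir="$(mktemp -d)"
printf 'set -euo pipefail\r\n' > "${tmpdir}/crlf.sh"
printf 'set -euo pipefail\n' > "${tmpdir}/lf.sh"

for f in "${tmpdir}/crlf.sh" "${tmpdir}/lf.sh"; do
  if has_crlf "$f"; then
    echo "$(basename "$f"): CRLF"
  else
    echo "$(basename "$f"): LF"
  fi
done
rm -rf "${tmpdir}"
```

In a real checkout, `git ls-files --eol` reports the index and working-tree line endings per file, which is a quicker way to confirm that an `eol=lf` attribute has taken effect.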

4 changes: 4 additions & 0 deletions .github/AUTHOR_MAP
@@ -110,3 +110,7 @@ greyfreedom@163.com = greyfreedom <11493871+greyfreedom@users.noreply.github.com
puneetdixit200 = puneetdixit200 <236133619+puneetdixit200@users.noreply.github.com>
yekern = Stime <13691766+yekern@users.noreply.github.com>
Stime = Stime <13691766+yekern@users.noreply.github.com>
+pkeging = pkeging <237035657+pkeging@users.noreply.github.com>
+147567034@qq.com = pkeging <237035657+pkeging@users.noreply.github.com>
+KUK4 = KUK4 <246008043+KUK4@users.noreply.github.com>
+LLL@users.noreply.github.com = KUK4 <246008043+KUK4@users.noreply.github.com>
52 changes: 47 additions & 5 deletions .github/workflows/auto-tag.yml
@@ -24,6 +24,10 @@ on:
permissions:
contents: write

+concurrency:
+group: auto-tag-${{ github.ref_name }}
+cancel-in-progress: false

jobs:
tag:
runs-on: ubuntu-latest
@@ -43,6 +47,10 @@ jobs:
echo "::error::Could not parse workspace version from Cargo.toml" >&2
exit 1
fi
+if ! echo "$v" | grep -qE '^[0-9]+\.[0-9]+\.[0-9]+$'; then
+echo "::error::Workspace version '$v' is not valid semver (expected X.Y.Z)" >&2
+exit 1
+fi
echo "version=$v" >> "$GITHUB_OUTPUT"
echo "tag=v$v" >> "$GITHUB_OUTPUT"
echo "Workspace version: $v"
@@ -64,22 +72,56 @@ jobs:

- name: Verify version consistency
if: steps.check.outputs.exists == 'false'
-run: ./scripts/release/check-versions.sh
+run: |
+./scripts/release/check-versions.sh || {
+echo "::error::Version consistency check failed. Aborting tag creation." >&2
+exit 1
+}

- name: Create and push tag
+id: create
if: steps.check.outputs.exists == 'false'
env:
TAG: ${{ steps.ver.outputs.tag }}
run: |
git config user.name "github-actions[bot]"
git config user.email "41898282+github-actions[bot]@users.noreply.github.com"
+git fetch --tags --quiet
+if git rev-parse -q --verify "refs/tags/${TAG}" >/dev/null \
+|| git ls-remote --tags origin "refs/tags/${TAG}" | grep -q .; then
+echo "pushed=false" >> "$GITHUB_OUTPUT"
+echo "Tag ${TAG} already exists after refresh; nothing to do."
+exit 0
+fi
git tag "${TAG}"
-git push origin "${TAG}"
-echo "Pushed ${TAG}. release.yml should now run (requires RELEASE_TAG_PAT for trigger)."
+max_retries=3
+retry_count=0
+while [ "${retry_count}" -lt "${max_retries}" ]; do
+if git push origin "${TAG}"; then
+echo "pushed=true" >> "$GITHUB_OUTPUT"
+echo "Pushed ${TAG}. release.yml should now run (requires RELEASE_TAG_PAT for trigger)."
+exit 0
+fi
+if git ls-remote --tags origin "refs/tags/${TAG}" | grep -q .; then
+echo "pushed=false" >> "$GITHUB_OUTPUT"
+echo "Tag ${TAG} appeared during push; treating as already handled."
+exit 0
+fi
+retry_count=$((retry_count + 1))
+if [ "${retry_count}" -lt "${max_retries}" ]; then
+echo "Push attempt ${retry_count} failed; retrying in 10s..."
+sleep 10
+fi
+done
+
+echo "::error::Failed to push tag ${TAG} after ${max_retries} attempts." >&2
+exit 1

- name: Warn if PAT missing
-if: steps.check.outputs.exists == 'false' && env.HAS_PAT != 'true'
+if: steps.create.outputs.pushed == 'true'
env:
HAS_PAT: ${{ secrets.RELEASE_TAG_PAT != '' }}
run: |
-echo "::warning::RELEASE_TAG_PAT secret is not set. The tag was pushed using GITHUB_TOKEN, which does NOT trigger release.yml. Manually re-push the tag from a developer machine, or run 'gh workflow run release.yml --ref ${{ steps.ver.outputs.tag }}'."
+if [ "${HAS_PAT}" != "true" ]; then
+echo "::warning::RELEASE_TAG_PAT secret is not set. The tag was pushed using GITHUB_TOKEN, which does NOT trigger release.yml. Manually re-push the tag from a developer machine, or run 'gh workflow run release.yml --ref ${{ steps.ver.outputs.tag }}'."
+fi
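
The version guard added to this workflow can be tried outside CI. The sketch below reuses the same `grep -qE` pattern from the diff. The `is_plain_version` function name and the sample inputs are assumptions for illustration. Note that the pattern accepts only a bare `X.Y.Z`, so pre-release suffixes that full Semantic Versioning allows are rejected.

```shell
#!/usr/bin/env bash
# Same pattern as the workflow's grep check: exactly three dot-separated
# numeric components. Pre-release or build suffixes are rejected.
# (Function name is hypothetical; only the regex comes from the diff.)
is_plain_version() {
  printf '%s\n' "$1" | grep -qE '^[0-9]+\.[0-9]+\.[0-9]+$'
}

for v in 0.8.64 0.8 0.8.64-rc.1 v0.8.64; do
  if is_plain_version "$v"; then
    echo "accept: $v"
  else
    echo "reject: $v"
  fi
done
```

Only `0.8.64` passes. The leading `v` is added later by the workflow itself when it writes `tag=v$v`, so the workspace version is expected to arrive without it.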
18 changes: 16 additions & 2 deletions .github/workflows/nightly.yml
@@ -77,9 +77,12 @@ jobs:
runs-on: ${{ matrix.os }}
steps:
- uses: actions/checkout@v7
-- uses: dtolnay/rust-toolchain@stable
+- uses: dtolnay/rust-toolchain@master
with:
+toolchain: '1.88'
targets: ${{ matrix.target }}
+- name: Install Rust target
+run: rustup target add --toolchain 1.88 ${{ matrix.target }}
- uses: Swatinem/rust-cache@v2
with:
cache-bin: false
@@ -119,7 +122,18 @@ jobs:
CARGO_TARGET_RISCV64GC_UNKNOWN_LINUX_GNU_LINKER: riscv64-linux-gnu-gcc
PKG_CONFIG_ALLOW_CROSS: 1
PKG_CONFIG_LIBDIR_riscv64gc_unknown_linux_gnu: /usr/lib/riscv64-linux-gnu/pkgconfig
-run: cargo build --release --locked --target ${{ matrix.target }}
+run: |
+for attempt in 1 2 3; do
+if cargo build --release --locked --target ${{ matrix.target }}; then
+exit 0
+fi
+if [ "${attempt}" -lt 3 ]; then
+echo "Build attempt ${attempt} failed; retrying in 30s..."
+sleep 30
+fi
+done
+echo "Build failed after 3 attempts" >&2
+exit 1
- name: Stage artifact
id: stage
shell: bash
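
The build step in this hunk inlines a fixed three-attempt loop. A reusable form of the same bounded-retry pattern might look like the sketch below. The `retry` and `flaky` names are hypothetical and do not appear in the workflow, and the stand-in command replaces `cargo build` so the example runs anywhere.

```shell
#!/usr/bin/env bash
# Hypothetical helper: run a command up to MAX times, sleeping DELAY seconds
# between failed attempts. Returns 0 on the first success, 1 if all fail.
retry() {
  local max="$1" delay="$2" attempt
  shift 2
  for ((attempt = 1; attempt <= max; attempt++)); do
    if "$@"; then
      return 0
    fi
    if ((attempt < max)); then
      echo "Attempt ${attempt} failed; retrying in ${delay}s..." >&2
      sleep "${delay}"
    fi
  done
  echo "Command failed after ${max} attempts" >&2
  return 1
}

# Stand-in for a flaky build step: fails twice, then succeeds.
calls=0
flaky() {
  calls=$((calls + 1))
  [ "${calls}" -ge 3 ]
}

retry 3 0 flaky && echo "succeeded after ${calls} calls"
```

As in the workflow, there is no sleep after the final attempt, so a persistent failure surfaces without an extra delay.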
6 changes: 4 additions & 2 deletions .github/workflows/release.yml
@@ -28,7 +28,7 @@ jobs:
toolchain: '1.88'
components: clippy, rustfmt
- name: Install Linux system dependencies
-if: runner.os == 'Linux' && matrix.target != 'x86_64-unknown-linux-musl'
+if: runner.os == 'Linux'
run: |
for i in 1 2 3 4 5; do
sudo apt-get update && break
@@ -173,6 +173,8 @@ jobs:
with:
toolchain: '1.88'
targets: ${{ matrix.target }}
+- name: Install Rust target
+run: rustup target add --toolchain 1.88 ${{ matrix.target }}
- uses: Swatinem/rust-cache@v2
with:
cache-bin: false
@@ -191,7 +193,7 @@ jobs:
run: |
sudo apt-get update
sudo apt-get install -y musl-tools
-rustup target add x86_64-unknown-linux-musl
+rustup target add --toolchain 1.88 x86_64-unknown-linux-musl
cargo build --release --locked --target x86_64-unknown-linux-musl
- name: Install RISC-V cross-compilation toolchain
if: matrix.target == 'riscv64gc-unknown-linux-gnu'
17 changes: 2 additions & 15 deletions .gitignore
@@ -104,12 +104,6 @@ apps/
# Maintainer-internal design notes (trade-secret material, never published)
.private/

-# Maintainer-local SWE-bench scratch (instance workspaces, venvs, predictions,
-# Docker harness logs). Never published.
-.swebench/
-deep-swe/
-all_preds.jsonl

# Agent handoffs and version-specific setup plans are working-state notes, not
# public docs. Keep durable setup guidance in docs/runbooks instead.
docs/*HANDOFF*.md
@@ -123,21 +117,14 @@ docs/*_PLAN.md
scripts/run_deep_swe.py
.claude/

-# Benchmark artifacts and caches re-included by !scripts/**
+# Local run artifacts and caches re-included by !scripts/**
results/
benchmark_results/*
!benchmark_results/.gitkeep
scripts/**/__pycache__/

-# Maintainer-local verification artifacts and benchmark corpora
-.harbor-datasets/
-.pinchbench-skill/
-.terminal-bench-datasets/
-.venv-bench/
+# Maintainer-local verification artifacts
.uv-bin/
.uv-cache/
.uv-tools/
codewhale__*.json
issues/
logs/
notes/
76 changes: 69 additions & 7 deletions CHANGELOG.md
@@ -7,6 +7,70 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0

## [Unreleased]

## [0.8.64] - 2026-06-22

### Added

- **Seamless auto-compaction defaults.** Known large-context routes now keep
automatic compaction on by default while carrying summaries forward through
the stable prompt path, reducing surprise context loss without changing
explicit opt-out behavior.
- **Runtime web automation readiness.** Local app automation gains a
loopback-only dev-server readiness primitive so agents can wait for TCP and
optional HTTP health checks before browser verification. Harvested from
#3376 by @cyq1017.
- **Model and integration polish.** `/model pro` and `/model flash` shortcuts
now resolve to the current DeepSeek V4 routes while preserving existing model
IDs. Harvested from #3350 by @KUK4. The WeCom bridge landed with
maintainer follow-up hardening for state permissions and chat-facing error
reporting, from #3370 by @pkeging.

### Fixed

- **Security and trust-boundary hardening.** Project-local config can no longer
loosen user-owned shell or instruction-file policy, file edits now require a
fresh read of the target file, git history inputs reject option-shaped or
control-character revisions, interactive execution surfaces require approval,
and local tool paths are narrowed through workspace/root validation.
- **Runtime and diagnostics redaction.** Generated runtime/app-server tokens,
raw session lineage identifiers, provider registry drift values, review
receipt internals, and webhook URLs are no longer echoed into human-facing
logs or diagnostics.
- **Network and alert safety.** Provider TLS verification bypass requests now
fail closed, fleet alert webhooks require HTTPS, fetch URL hostnames are
resolved before requests, and runtime mobile auth no longer relies on
token-bearing URLs.
- **Path-state hardening.** Config sibling files, project MCP cwd values,
runtime thread store files, sub-agent state, project-local state roots, and
app-server sidecar config paths now resolve through checked roots before
reads/writes.
- **Release CI repair.** Nightly cross-target builds install Rust targets
explicitly and retry transient cargo failures; auto-tag runs are serialized
and treat an already-created remote tag as a no-op. Safe slices harvested
from #3374 by @donglovejava.
- **Provider wait and sidebar regressions.** Provider-wait footers suppress
noisy countdowns until useful while keeping timeout warnings visible,
harvested from #3375 by @idling11. The pinned sidebar can render at a
narrower 64-column boundary, harvested from #3371 by @donglovejava.
- **Delegated server cleanup.** Delegated `serve` / `app-server` children gain
OS-level parent-death cleanup on supported platforms, completing the #3259
follow-up from #3378 and #3317 by @wuisabel-gif.
- **ACP and sandbox correctness.** ACP sessions preserve multi-turn
conversation history across prompt turns, harvested from #3372 by @xulongzhe.
Worktree Git metadata writes are allowed through sandbox policy without
broad trust-mode escalation, from #3356 by @cyq1017 and the #3355 report by
@linletian.
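
The first entry in this list says git history inputs now reject option-shaped or control-character revisions. The sketch below shows that kind of check under stated assumptions: the project itself is written in Rust, so this bash function, its name `is_safe_revision`, and the sample inputs are illustrative only and are not the project's implementation. Whitespace rejection is included because a commit in this PR tests for it.

```shell
#!/usr/bin/env bash
# Illustrative only; not the project's implementation.
# Reject revisions that are empty, look like command-line options, or
# contain whitespace or control characters, before they reach git.
is_safe_revision() {
  local rev="$1"
  local bad='[[:cntrl:][:space:]]'
  [ -n "${rev}" ] || return 1
  case "${rev}" in
    -*) return 1 ;;   # option-shaped, e.g. --output=/tmp/x
  esac
  if [[ "${rev}" =~ ${bad} ]]; then
    return 1          # whitespace or control character present
  fi
  return 0
}

for rev in 'HEAD~2' 'v0.8.63' '--output=/tmp/x' 'main branch'; do
  if is_safe_revision "${rev}"; then
    echo "ok: ${rev}"
  else
    echo "rejected: ${rev}"
  fi
done
```

The first two inputs are accepted and the last two are rejected. Validating before the value is placed on a command line matters because git would otherwise parse a leading `-` as an option rather than as a revision.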

### Changed

- **Community and dependency harvests.** The release train carries focused
community-credit slices from #3379 by @greyfreedom, #3348 by @nightt5879,
#3346 by @hongqitai, #3345/#3333 by @cyq1017, and Dependabot updates for
`windows`, `toml`, `tokio`, `lru`, `similar`, and web tooling security locks.
- **Public release surface cleanup.** Benchmark-specific materials were kept
out of the public release repo; benchmark source fragments belong in the
separate `codewhale-bench` lane.

## [0.8.63] - 2026-06-19

### Added
@@ -55,7 +119,7 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
while Ctrl-X is scoped to Tasks-sidebar background shell cancellation. Shell
jobs launched by sub-agents now render with their child-agent owner in the
Tasks sidebar and transcript.
-- **Benchmark-turn recovery and context economy.** Repeated read-only search
+- **Long-turn recovery and context economy.** Repeated read-only search
loop blocks now return guidance instead of fatal tool failures, Python build
failures that are missing `setuptools` include an install/retry hint, long
foreground shell timeouts steer models toward background execution, and noisy
@@ -123,7 +187,7 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
unchanged.
- **Base prompt / delegate skill guidance** updated to encourage parallel
read-only exploration (2-4 `type: "explore"` sub-agents) for broad repo,
-version, branch, benchmark, and API-surface investigations, while keeping
+version, branch, release, and API-surface investigations, while keeping
architecture, integration, and final verification in the parent. The
delegate skill examples now use provider-neutral `model_strength` instead of
hardcoded DeepSeek model ids.
@@ -297,7 +361,7 @@ folds in several community contributions.
- Work sidebar no longer shows stale `phase now:` / `phase next:` strategy rows once the checklist
is 100% complete.
- Plan mode no longer shortcuts investigation for requests that name a repository, URL, version,
-release, build state, benchmark, bug, PR, issue, API surface, or local code path.
+release, build state, bug, PR, issue, API surface, or local code path.
- Oversized pasted text stays editable in the composer, with a file backup appended at submit
time for model access; thanks @idling11 (#3267, closes #3263).
- Bare digit keys `1`-`8` now insert text instead of firing hotbar slots; use `Alt+digit` for
@@ -796,8 +860,6 @@ folds in several community contributions.

### Added

-- **Benchmark harness runners.** Added CodeWhale-native benchmark entry points for SWE-bench, Terminal-Bench, and PinchBench, plus a local PinchBench runner that can grade tool-use traces with an LLM judge.
-- **Direct MiMo benchmark routing.** The benchmark runner now defaults to direct Xiaomi MiMo v2.5 Pro routing when configured, while keeping provider/model selection explicit.
- Added `/restore list [N]` so users can inspect more side-git rollback
snapshots with UTC timestamps before choosing a restore point. Plain
`/restore` now shows the 20 most recent snapshots, numeric restore targets can
@@ -1138,7 +1200,6 @@ folds in several community contributions.

### Fixed

-- **Benchmark workspace copying.** Fixed benchmark workspace file copying so local benchmark tasks can preserve their intended file layout during agent runs.
- **MiMo default tests.** Guarded Xiaomi MiMo default-model tests against ambient CI provider environment variables.
- Stream/body decode failures such as `Stream read error: error decoding
response body` are now classified as recoverable network interruptions
@@ -2284,7 +2345,8 @@ overflow report and `/theme` picker edge-wrapping patch in #1814.

Older releases (v0.8.39 and earlier) are archived in [docs/CHANGELOG_ARCHIVE.md](docs/CHANGELOG_ARCHIVE.md).

-[Unreleased]: https://github.com/Hmbown/CodeWhale/compare/v0.8.63...HEAD
+[Unreleased]: https://github.com/Hmbown/CodeWhale/compare/v0.8.64...HEAD
+[0.8.64]: https://github.com/Hmbown/CodeWhale/compare/v0.8.63...v0.8.64
[0.8.63]: https://github.com/Hmbown/CodeWhale/compare/v0.8.62...v0.8.63
[0.8.62]: https://github.com/Hmbown/CodeWhale/compare/v0.8.61...v0.8.62
[0.8.61]: https://github.com/Hmbown/CodeWhale/compare/v0.8.60...v0.8.61