01 #3256

Open
Guajir0-code wants to merge 6 commits into ultraworkers:main from Guajir0-code:main

@Guajir0-code

Summary

  • TBD

Anti-slop triage

  • Classification:
  • Evidence:
  • Non-destructive review result:

Verification

  • Targeted tests/docs checks ran, or the gap is explicitly recorded.
  • git diff --check passes.
  • No live secrets, tokens, private logs, or unrelated generated churn are included.

Resolution gate

  • If this PR resolves an issue, the issue number and fix evidence are linked.
  • If this PR should not merge, the rejection/defer rationale is evidence-backed and does not rely on vibes.
  • I did not merge/close remote PRs or issues from an automation lane without owner approval.

Guajir0-code and others added 6 commits April 20, 2026 11:00
After the model edits a .rs file via edit_file/write_file, the runtime
now automatically runs cargo check, clippy, fmt --check, and test on the
owning crate and folds the result back into the tool_result. If any step
fails, is_error=true forces the model to correct on the next iteration
instead of waiting for the user to notice.

- New verifier module with Verifier trait and CargoVerifier impl
  (manifest discovery, subprocess timeout, output truncation preserving
  error/warning lines, early-exit after first failure).
- RuntimeVerifierConfig wired through settings.json with nested schema
  validation, precedence User/Project/Local.
- ConversationRuntime integrates the verifier between post-hook and the
  tool_result, with record_verifier_ran telemetry.
- CLI wires CargoVerifier from config.
- 12 e2e tests spawn real cargo against temp crates to cover passing
  code, type errors, fmt violations, timeouts, step skipping after
  failure, nested files, alternate path keys, and malformed input.
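The "early-exit after first failure" behaviour described above can be sketched roughly as follows. This is a minimal illustration, not the PR's `Verifier` trait or `CargoVerifier`: `StepOutcome`, `run_steps`, and the step names are hypothetical, and the actual subprocess call is abstracted behind a closure so the control flow is visible on its own.

```rust
/// Outcome of one verification step (hypothetical type, not taken from the PR).
#[derive(Debug, Clone, PartialEq)]
pub enum StepOutcome {
    Passed,
    Failed(String),
    Skipped,
}

/// Runs the named steps in order. After the first failure, every remaining
/// step is recorded as `Skipped` instead of being executed.
/// Returns the `is_error` flag plus the per-step outcomes.
pub fn run_steps<F>(steps: &[&str], mut run: F) -> (bool, Vec<(String, StepOutcome)>)
where
    F: FnMut(&str) -> Result<(), String>,
{
    let mut is_error = false;
    let mut results = Vec::with_capacity(steps.len());
    for &step in steps {
        let outcome = if is_error {
            // A previous step already failed: do not spend time on this one.
            StepOutcome::Skipped
        } else {
            match run(step) {
                Ok(()) => StepOutcome::Passed,
                Err(message) => {
                    is_error = true;
                    StepOutcome::Failed(message)
                }
            }
        };
        results.push((step.to_string(), outcome));
    }
    (is_error, results)
}
```

In a real verifier the closure would presumably spawn `cargo check`, `cargo clippy`, `cargo fmt --check`, and `cargo test` with a timeout. The returned flag corresponds to the `is_error=true` signal that is folded into the tool_result.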

Also clears pre-existing clippy/compile errors in unrelated crates
(ApiError missing suggested_action in 4 CLI tests, map_unwrap_or,
duration_suboptimal_units, trailing commas, result_large_err) so the
workspace passes cargo clippy --workspace --all-targets -D warnings
and cargo test --workspace end-to-end.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

Bring in the staged verifier rework from upstream (Rust / Node-TS /
Python adapters, quick+final phases, structured VerificationReport,
final-gate loop), the new Verification message role, getrandom-based
OAuth PKCE generation, and the Windows-compatible hook/MCP test
infrastructure.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

- verifier: emit Unavailable reports on Node package.json IO/parse errors
  instead of silent None (Bug 2)
- verifier: drop dead final_phase param from verify_node/verify_python
- verifier: scope CARGO_TERM_COLOR=never to cargo invocations only
- verifier: remove dead VerificationReport::target()
- conversation: make_final_gate_reminder report_id now includes adapter_id
  and mutation_sequence (Bug 3 — prevents collisions across adapters)
- conversation: when run_final_verification returns None, emit synthetic
  Unavailable report and advance ledger instead of silent continue (Bug 1)
- conversation: cap final-gate attempts at MAX_FINAL_GATE_ATTEMPTS=5 per
  (adapter, root); emit aborted Unavailable report on overflow (Bug 4)

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

Previous fix keyed the counter only by (adapter, project_root), so when
the model edited code mid-turn and mutation_sequence advanced, the prior
attempts carried over and prematurely tripped the cap on otherwise-valid
work. Key the counter by (adapter, root, mutation_sequence) so each
snapshot gets its own budget.
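A rough sketch of the per-snapshot budget described here, assuming a simple in-memory map. `FinalGateBudget` and `try_attempt` are hypothetical names; only `MAX_FINAL_GATE_ATTEMPTS = 5` and the `(adapter, root, mutation_sequence)` key come from the commit messages.

```rust
use std::collections::HashMap;
use std::path::PathBuf;

/// Cap value taken from the commit message.
pub const MAX_FINAL_GATE_ATTEMPTS: u32 = 5;

/// Hypothetical attempt counter; the PR's real ledger type is not shown.
#[derive(Default)]
pub struct FinalGateBudget {
    attempts: HashMap<(String, PathBuf, u64), u32>,
}

impl FinalGateBudget {
    /// Records one attempt for the given snapshot and returns `true` while the
    /// snapshot is still within budget, `false` once the cap has been reached.
    pub fn try_attempt(&mut self, adapter: &str, root: &str, mutation_sequence: u64) -> bool {
        // Including mutation_sequence in the key gives each edited snapshot a
        // fresh budget, so earlier attempts do not carry over after an edit.
        let key = (adapter.to_string(), PathBuf::from(root), mutation_sequence);
        let count = self.attempts.entry(key).or_insert(0);
        if *count >= MAX_FINAL_GATE_ATTEMPTS {
            return false;
        }
        *count += 1;
        true
    }
}
```

With the earlier `(adapter, root)` key, the sixth call would fail even if the model had changed the code in between. With the three-part key it succeeds as soon as `mutation_sequence` advances.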

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

Merges complete edit→verify→fix pipeline into the repo:

Trunk (phases 1-5):
- Rich StepDiagnostics across Rust/Node/Python adapters
- Change-scoped verification via nearest manifest walk
- VerificationReport content block with shadow/text/typed report modes
- RuntimeVerifierMode::Auto + CLAUDE_CODE_VERIFIER_AUTO
- Parallel bash validation wiring (permission_enforcer + tools)
- verifier_ran telemetry with adapter/phase/failure_kind/mutation_sequence
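The "nearest manifest walk" in the list above might look something like the following. This is only a sketch of the general technique (walk from the changed file toward the filesystem root and stop at the first manifest); `nearest_manifest` is a hypothetical helper, not a function from the PR.

```rust
use std::path::{Path, PathBuf};

/// Returns the closest manifest (for example `Cargo.toml` or `package.json`)
/// at or above the directory that contains `changed_file`, if any exists.
pub fn nearest_manifest(changed_file: &Path, manifest_name: &str) -> Option<PathBuf> {
    changed_file
        // Start from the directory holding the changed file.
        .parent()?
        // `ancestors` yields that directory first, then each parent in turn.
        .ancestors()
        .map(|dir| dir.join(manifest_name))
        // The first hit is the innermost owning project.
        .find(|candidate| candidate.is_file())
}
```

Scoping verification to the directory of the returned manifest keeps a change in one nested crate or package from triggering checks across the whole workspace.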

Post-trunk modules:
- runtime::critic — CriticPlanner with subagent_depth guard, diff
  thresholds (>=4 files OR >=200 lines OR >1 root), per-mutation
  dedup; wired into LiveCli post-turn pipeline
- runtime::rollout_metrics — aggregate(), evaluate_budget_gates(),
  samples_from_traces() with 1pp/5%/10%/15% regression limits
- promote_auto_skill + run_promote_auto_skill_cli — 3-fixture replay
  + human approval + 10% token budget gate for auto-generated skills
- Explicit marker-based adapter detector for auto-mode verification
- Dedicated bash permission parity tests
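As an illustration of the CriticPlanner thresholds listed above (>=4 files OR >=200 lines OR >1 root), here is a hedged sketch. `DiffStats` and `CriticGate` are hypothetical names. Treating the `subagent_depth` guard as "skip when already inside a subagent" is an assumption, as is modelling the per-mutation dedup with a set of already-reviewed mutation sequences.

```rust
use std::collections::HashSet;

/// Hypothetical summary of a turn's diff.
pub struct DiffStats {
    pub files_changed: usize,
    pub lines_changed: usize,
    pub roots_touched: usize,
}

/// Hypothetical stand-in for the planner's gating logic.
#[derive(Default)]
pub struct CriticGate {
    reviewed: HashSet<u64>,
}

impl CriticGate {
    /// Decides whether a critic pass should run for this mutation.
    pub fn should_review(
        &mut self,
        stats: &DiffStats,
        subagent_depth: u32,
        mutation_sequence: u64,
    ) -> bool {
        // Assumption: the depth guard prevents critics from spawning critics.
        if subagent_depth > 0 {
            return false;
        }
        // Thresholds as stated in the commit message.
        let large_diff = stats.files_changed >= 4
            || stats.lines_changed >= 200
            || stats.roots_touched > 1;
        // `insert` returns false when the mutation was already reviewed,
        // which gives the per-mutation dedup.
        large_diff && self.reviewed.insert(mutation_sequence)
    }
}
```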

Telemetry: turn_completed now emits turn_latency_ms + tokens_total.

Validation: cargo fmt, clippy -D warnings, cargo test --workspace all
green (+27 new tests across runtime, tools, cli).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

Latest main already contains the functional BrokenPipe tolerance in
plugins::hooks::CommandWithStdin::output_with_stdin, but the only
coverage for the original CI failure was the higher-level plugin hook
test. Add a deterministic regression that exercises the exact low-level
EPIPE path by spawning a hook child that closes stdin immediately while
the parent writes an oversized payload.

This keeps the real root cause explicit: Linux surfaced BrokenPipe from
the parent's stdin write after the hook child closed fd 0 early. Missing
execute bits were not the primary bug.
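The tolerance being described can be sketched as below. This is not the body of `CommandWithStdin::output_with_stdin`, which is not shown here; `write_tolerating_broken_pipe` is a hypothetical helper that is generic over `Write`, so the same logic applies to a child's stdin handle or to a fake writer in a test.

```rust
use std::io::{self, ErrorKind, Write};

/// Writes `payload` to `writer`, treating an early-closed pipe as non-fatal.
/// Returns `Ok(true)` when the full payload was written, `Ok(false)` when the
/// reader closed its end first (EPIPE), and `Err` for any other I/O failure.
pub fn write_tolerating_broken_pipe<W: Write>(writer: &mut W, payload: &[u8]) -> io::Result<bool> {
    match writer.write_all(payload) {
        Ok(()) => Ok(true),
        // The child closed stdin before reading everything. Its exit status
        // and output are still meaningful, so this is not reported as an error.
        Err(err) if err.kind() == ErrorKind::BrokenPipe => Ok(false),
        Err(err) => Err(err),
    }
}
```

A caller would typically take the `ChildStdin` from a spawned `std::process::Child`, pass it to this helper, drop it to close the pipe, and then wait for the child's output as usual.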

Constraint: Keep the change surgical on top of latest main
Rejected: Re-open the production code path | latest main already contains the runtime fix
Rejected: Inflate HookRunner payloads in the regression | HOOK_* env injection hit ARG_MAX before the pipe path
Confidence: high
Scope-risk: narrow
Reversibility: clean
Directive: Keep BrokenPipe coverage near CommandWithStdin so future refactors do not regress the Linux EPIPE path
Tested: cargo test -p plugins hooks::tests::collects_and_runs_hooks_from_enabled_plugins -- --exact (10x)
Tested: cargo test -p plugins hooks::tests::output_with_stdin_tolerates_broken_pipe_when_child_closes_stdin_early -- --exact (10x)
Tested: cargo test --workspace
Not-tested: GitHub Actions rerun on the PR branch

@DereC4

DereC4 commented Jun 23, 2026

This isn't even a PR what 😭 you just used ai to generate and refactor a bunch of files

3 participants