feat: unified AiGateway - service + client + security + autonomy + feedback + quota #43
Merged
Three new endpoints replacing the old single-purpose codex_audit_service:
- POST /v1/ai/analyze - sync LLM via Claude/GPT API (LlmAdapter)
- POST /v1/ai/execute/jobs - async codex exec (CodexAdapter) + polling
- POST /v1/ai/review - multi-model parallel review + consensus

Backward-compatible aliases for /v1/codex-audit* maintained.

New service architecture:
service/
├── ai_gateway_service.py - HTTP server with 3 endpoints
├── adapters/
│   ├── llm_adapter.py - Claude/GPT API (OpenAI + Anthropic)
│   └── codex_adapter.py - codex exec subprocess
├── auth/
│   └── github_oidc.py - extracted OIDC/JWT verification
└── contracts.py - request/response schemas

New client library:
client/
├── gateway_client.py - AiGatewayClient (analyze/execute/review)
├── config.py - GatewayConfig (replaces 3 config systems)
└── errors.py - unified error hierarchy

Fixes:
- LlmAdapter now actually exists on the service side (was missing).
- analyze tasks no longer waste Codex agent resources.

Co-Authored-By: Claude <noreply@anthropic.com>
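For readers unfamiliar with the "multi-model parallel review + consensus" idea behind /v1/ai/review, here is a minimal sketch. The PR text does not state the consensus rule, so the majority vote, the `(verdict, confidence)` reviewer return shape, and every name below are assumptions for illustration, not the service's actual code.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, Tuple

# Hypothetical reviewer signature: takes a diff, returns (verdict, confidence).
Reviewer = Callable[[str], Tuple[str, float]]


def parallel_review(diff: str, reviewers: Dict[str, Reviewer]) -> dict:
    """Run every reviewer concurrently and reduce the results by majority vote."""
    if not reviewers:
        raise ValueError("at least one reviewer is required")

    # Submit all reviews at once so slow models overlap instead of queueing.
    with ThreadPoolExecutor(max_workers=len(reviewers)) as pool:
        futures = {name: pool.submit(fn, diff) for name, fn in reviewers.items()}
        results = {name: future.result() for name, future in futures.items()}

    # Assumed consensus rule: the most common verdict wins
    # (on a tie, the verdict seen first wins).
    votes = Counter(verdict for verdict, _ in results.values())
    verdict, count = votes.most_common(1)[0]
    agreeing = [conf for v, conf in results.values() if v == verdict]

    return {
        "verdict": verdict,
        "agreement": count / len(results),              # share of models that agree
        "confidence": sum(agreeing) / len(agreeing),    # mean over agreeing models
        "per_model": results,
    }
```

In a real service each reviewer would wrap an adapter call; plain functions or lambdas are enough to exercise the reduction logic.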
1. Rate limiting: sliding-window 30 req/60s for analyze & review endpoints
2. FAKE_OUTPUT backdoor: gated behind CODEX_AUDIT_SERVICE_ENV=production
3. source_repository allowlist: validate against CODEX_AUDIT_SERVICE_ALLOWED_SOURCE_REPOSITORIES
4. Cross-org prevention: OIDC claims.repository org must match source_repository org
5. Structured audit logging: JSON audit events (request, job_submitted, analyze_completed, etc.)
6. Static token complexity: CODEX_AUDIT_SERVICE_TOKEN must be ≥32 chars at startup
7. Sandbox allowlist: CODEX_AUDIT_SERVICE_ALLOWED_SANDBOXES gates caller-requested sandbox
8. Active job cap: CODEX_AUDIT_SERVICE_MAX_ACTIVE_JOBS prevents job pileup (default 10)

Co-Authored-By: Claude <noreply@anthropic.com>
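As an illustration of item 1, a sliding-window limiter with the stated 30 req/60s budget could look roughly like this. The class name, the per-key bucketing, and the injectable clock are assumptions made for the sketch; the PR does not show how the service keys or stores its windows.

```python
import time
from collections import defaultdict, deque
from typing import Callable, Deque, Dict


class SlidingWindowLimiter:
    """Allow at most `limit` requests per `window_s` seconds for each key."""

    def __init__(
        self,
        limit: int = 30,
        window_s: float = 60.0,
        clock: Callable[[], float] = time.monotonic,  # injectable for tests
    ) -> None:
        self.limit = limit
        self.window_s = window_s
        self._clock = clock
        self._hits: Dict[str, Deque[float]] = defaultdict(deque)

    def allow(self, key: str) -> bool:
        """Record a request for `key` and report whether it fits the budget."""
        now = self._clock()
        hits = self._hits[key]
        # Drop timestamps that have slid out of the window.
        while hits and now - hits[0] >= self.window_s:
            hits.popleft()
        if len(hits) >= self.limit:
            return False  # caller would typically answer with HTTP 429
        hits.append(now)
        return True
```

Keying by caller identity (here an arbitrary string) keeps one noisy repository from consuming another's budget; whether the real service does this is not stated.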
Autonomy decision engine that combines AI confidence scores with file
risk tiers to recommend autonomous actions.
New module:
service/autonomy.py — AutonomyConfig, classify_file_risk(),
classify_changes_risk(), decide_action(), recommended_action()
Confidence × Risk decision matrix:
  risk      <0.60     0.60-0.79   0.80-0.94   ≥0.95
  low       auto_pr   auto_merge  auto_merge  auto_merge
  medium    escalate  auto_pr     auto_merge  auto_merge
  high      escalate  escalate    auto_pr     auto_merge
  critical  escalate  escalate    escalate    escalate
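The matrix above can be expressed as a small table lookup. This is a sketch only: the function name `lookup_action` and its signature are hypothetical (the real decide_action() in service/autonomy.py is not shown in this PR text), and the band edges are assumed to be half-open, so 0.80 falls in the 0.80-0.94 column.

```python
import bisect

# Lower bounds of the second, third, and fourth confidence columns.
THRESHOLDS = (0.60, 0.80, 0.95)

# One row per risk tier, one entry per confidence column, copied from the matrix.
ACTIONS = {
    "low":      ("auto_pr",  "auto_merge", "auto_merge", "auto_merge"),
    "medium":   ("escalate", "auto_pr",    "auto_merge", "auto_merge"),
    "high":     ("escalate", "escalate",   "auto_pr",    "auto_merge"),
    "critical": ("escalate", "escalate",   "escalate",   "escalate"),
}


def lookup_action(confidence: float, risk: str) -> str:
    """Map a confidence score in [0, 1] and a risk tier to a recommended action."""
    if not 0.0 <= confidence <= 1.0:
        raise ValueError(f"confidence out of range: {confidence}")
    if risk not in ACTIONS:
        raise ValueError(f"unknown risk tier: {risk}")
    # bisect_right counts how many thresholds are <= confidence,
    # which is exactly the column index.
    column = bisect.bisect_right(THRESHOLDS, confidence)
    return ACTIONS[risk][column]
```

Keeping the matrix as data rather than nested conditionals makes it easy to diff against the table in this description when thresholds change.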
Service changes:
- _handle_review() accepts changed_paths for risk classification
- Response now includes recommended_action with confidence/risk/reason
- _extract_confidence_from_output() parses confidence from LLM JSON
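To make the confidence-parsing step concrete, here is one way such a helper might work. It is a sketch under assumptions: the top-level "confidence" key, the fallback to the outermost {...} span, and the clamping to [0, 1] are all guesses, and `extract_confidence` is a hypothetical stand-in rather than the actual _extract_confidence_from_output().

```python
import json
from typing import Optional


def extract_confidence(output: str) -> Optional[float]:
    """Return the confidence found in LLM output, clamped to [0, 1], or None."""
    # Try the whole string first, then the outermost {...} span, since models
    # often wrap their JSON in explanatory prose.
    candidates = [output]
    start, end = output.find("{"), output.rfind("}")
    if 0 <= start < end:
        candidates.append(output[start:end + 1])

    for text in candidates:
        try:
            data = json.loads(text)
        except json.JSONDecodeError:
            continue
        if not isinstance(data, dict):
            continue
        value = data.get("confidence")
        # bool is a subclass of int, so reject it explicitly.
        if isinstance(value, bool) or not isinstance(value, (int, float)):
            continue
        return min(1.0, max(0.0, float(value)))
    return None
```

Returning None instead of a default lets the caller decide how to treat a missing score, for example by escalating.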
Client changes:
- ReviewResult includes recommended_action dict
- review() accepts changed_paths parameter
Co-Authored-By: Claude <noreply@anthropic.com>
Change registry, effect tracking, shadow escalation, effectiveness reports.

New endpoints: /v1/ai/feedback/*, /v1/ai/changes/*

Co-Authored-By: Claude <noreply@anthropic.com>
Quota Manager:
- Per-repo daily/weekly budget tracking (configurable via JSON)
- Model cost estimation (token-based for LLMs, flat for codex)
- Budget exhaustion → 429 with recommended cheaper model
- Model tier escalation: gpt-mini → claude → fable-5 → codex

Health Monitor:
- Per-endpoint error rates, latency percentiles (p50/p95/p99)
- Sliding 5-min windows for degradation detection
- Three states: healthy / degraded / unhealthy
- Auto-degraded at >10% errors or p95 >30s
- Enhanced /healthz now returns status + uptime

New endpoints:
- GET /v1/ai/health - full health snapshot
- GET /v1/ai/quota?repo= - quota status

Client methods: get_quota(repo), get_health()

Co-Authored-By: Claude <noreply@anthropic.com>
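The health-state rules can be sketched as follows. The 5-minute window and the degraded thresholds (>10% errors or p95 >30s) come from the description above; the unhealthy threshold is not stated anywhere in this PR text, so the 50% error rate used here is purely an assumption, as are the class name, the nearest-rank percentile, and the injectable clock.

```python
import time
from collections import deque
from typing import Callable, Deque, List, Tuple

DEGRADED_ERROR_RATE = 0.10   # from the PR description
DEGRADED_P95_S = 30.0        # from the PR description
UNHEALTHY_ERROR_RATE = 0.50  # ASSUMPTION: not specified in the PR


def percentile(sorted_values: List[float], p: int) -> float:
    """Nearest-rank percentile of a non-empty, ascending list (p in 1..100)."""
    rank = (p * len(sorted_values) + 99) // 100  # integer ceil(p * n / 100)
    return sorted_values[rank - 1]


class EndpointHealth:
    """Track (timestamp, latency, ok) samples for one endpoint in a sliding window."""

    def __init__(self, window_s: float = 300.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.window_s = window_s
        self._clock = clock
        self._samples: Deque[Tuple[float, float, bool]] = deque()

    def record(self, latency_s: float, ok: bool) -> None:
        self._samples.append((self._clock(), latency_s, ok))

    def status(self) -> str:
        now = self._clock()
        # Discard samples older than the window before evaluating.
        while self._samples and now - self._samples[0][0] > self.window_s:
            self._samples.popleft()
        if not self._samples:
            return "healthy"  # no recent traffic is treated as healthy here

        errors = sum(1 for _, _, ok in self._samples if not ok)
        error_rate = errors / len(self._samples)
        p95 = percentile(sorted(lat for _, lat, _ in self._samples), 95)

        if error_rate > UNHEALTHY_ERROR_RATE:
            return "unhealthy"
        if error_rate > DEGRADED_ERROR_RATE or p95 > DEGRADED_P95_S:
            return "degraded"
        return "healthy"
```

The same `percentile` helper would serve p50 and p99 for the full snapshot.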
🤖 Codex PR Review
Please ensure a human reviewer checks this PR before merging.
Unified AI Gateway Architecture
Changes
Phases
🤖 Generated with Claude Code