
feat: unified AiGateway — service + client + security + autonomy + feedback + quota #43

Merged
Pigbibi merged 6 commits into main from feat/unified-ai-gateway
Jun 29, 2026

Conversation

@Pigbibi

@Pigbibi Pigbibi commented Jun 29, 2026

Contributor

Unified AI Gateway Architecture

Changes

  • service/: 10 endpoints with LlmAdapter + CodexAdapter + security hardening + autonomy engine + feedback loop + quota management
  • client/: pip-installable AiGatewayClient library
  • .github/workflows/: merged monthly-orchestrator.yml from archived AuditOrchestrator

Phases

  1. Base: 3 adapter interfaces (analyze/execute/review), LlmAdapter + CodexAdapter, OIDC extraction
  2. Security: rate limiting, sandbox allowlist, source_repo validation, org correlation, audit logging, token complexity, FAKE_OUTPUT hardening
  3. Autonomy: confidence×risk decision matrix, graduated actions (auto_merge/auto_pr/escalate)
  4. Feedback: change registry, before/after effect tracking, shadow audit escalation
  5. Quota+Health: per-repo budgets, cost estimation, health monitoring with degradation detection

🤖 Generated with Claude Code

Pigbibi and others added 6 commits June 29, 2026 16:29
Three new endpoints replacing the old single-purpose codex_audit_service:

- POST /v1/ai/analyze  — sync LLM via Claude/GPT API (LlmAdapter)
- POST /v1/ai/execute/jobs — async codex exec (CodexAdapter) + polling
- POST /v1/ai/review — multi-model parallel review + consensus

Backward-compatible aliases for /v1/codex-audit* maintained.

New service architecture:
  service/
  ├── ai_gateway_service.py  — HTTP server with 3 endpoints
  ├── adapters/
  │   ├── llm_adapter.py     — Claude/GPT API (OpenAI + Anthropic)
  │   └── codex_adapter.py   — codex exec subprocess
  ├── auth/
  │   └── github_oidc.py     — extracted OIDC/JWT verification
  └── contracts.py           — request/response schemas

New client library:
  client/
  ├── gateway_client.py      — AiGatewayClient (analyze/execute/review)
  ├── config.py              — GatewayConfig (replaces 3 config systems)
  └── errors.py              — unified error hierarchy

Fixes: LlmAdapter now actually exists on the service side (was missing).
analyze tasks no longer waste Codex agent resources.

Co-Authored-By: Claude <noreply@anthropic.com>
1. Rate limiting: sliding-window 30 req/60s for analyze & review endpoints
2. FAKE_OUTPUT backdoor: gated behind CODEX_AUDIT_SERVICE_ENV=production
3. source_repository allowlist: validate against CODEX_AUDIT_SERVICE_ALLOWED_SOURCE_REPOSITORIES
4. Cross-org prevention: OIDC claims.repository org must match source_repository org
5. Structured audit logging: JSON audit events (request, job_submitted, analyze_completed, etc.)
6. Static token complexity: CODEX_AUDIT_SERVICE_TOKEN must be ≥32 chars at startup
7. Sandbox allowlist: CODEX_AUDIT_SERVICE_ALLOWED_SANDBOXES gates caller-requested sandbox
8. Active job cap: CODEX_AUDIT_SERVICE_MAX_ACTIVE_JOBS prevents job pileup (default 10)
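
The sliding-window limit in item 1 could look roughly like the sketch below. This is an illustration under assumptions, not the service's code: the class name `SlidingWindowLimiter`, the per-key bookkeeping, and the injectable `clock` are all hypothetical. Only the 30 requests per 60 seconds figure comes from the commit message.

```python
import time
from collections import deque


class SlidingWindowLimiter:
    """Allow at most `limit` requests per `window` seconds for each key.

    Hypothetical helper; the key might be a caller identity or a repository.
    """

    def __init__(self, limit=30, window=60.0, clock=time.monotonic):
        self.limit = limit
        self.window = window
        self.clock = clock  # injectable so the limiter can be tested without sleeping
        self._hits = {}  # key -> deque of accepted request timestamps

    def allow(self, key):
        now = self.clock()
        hits = self._hits.setdefault(key, deque())
        # Drop timestamps that have slid out of the window.
        while hits and now - hits[0] >= self.window:
            hits.popleft()
        if len(hits) >= self.limit:
            return False  # the caller would typically answer with HTTP 429
        hits.append(now)
        return True
```

A handler would call `allow(caller_id)` before doing any work and reject the request when it returns `False`. Rejected requests are not recorded, so a caller that keeps retrying does not extend its own lockout.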

Co-Authored-By: Claude <noreply@anthropic.com>
Autonomy decision engine that combines AI confidence scores with file
risk tiers to recommend autonomous actions.

New module:
  service/autonomy.py — AutonomyConfig, classify_file_risk(),
  classify_changes_risk(), decide_action(), recommended_action()

Confidence × Risk decision matrix:
            <0.60       0.60-0.79    0.80-0.94    ≥0.95
  low       auto_pr     auto_merge   auto_merge   auto_merge
  medium    escalate    auto_pr      auto_merge   auto_merge
  high      escalate    escalate     auto_pr      auto_merge
  critical  escalate    escalate     escalate     escalate
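
The matrix above can be read as a plain table lookup. The sketch below is one possible encoding, not the contents of `service/autonomy.py`: `lookup_action` is a hypothetical stand-in for `decide_action()`, whose real signature is not shown in this PR, and the confidence bands are assumed to be half-open (for example, 0.795 falls in the 0.60-0.79 column).

```python
from bisect import bisect_right

# Lower bounds of the second, third, and fourth confidence columns.
CONFIDENCE_BOUNDS = (0.60, 0.80, 0.95)

# Rows copied from the matrix; columns are <0.60, 0.60-0.79, 0.80-0.94, >=0.95.
MATRIX = {
    "low": ("auto_pr", "auto_merge", "auto_merge", "auto_merge"),
    "medium": ("escalate", "auto_pr", "auto_merge", "auto_merge"),
    "high": ("escalate", "escalate", "auto_pr", "auto_merge"),
    "critical": ("escalate", "escalate", "escalate", "escalate"),
}


def lookup_action(confidence, risk):
    """Return the action for a confidence score in [0, 1] and a risk tier."""
    if not 0.0 <= confidence <= 1.0:
        raise ValueError(f"confidence out of range: {confidence}")
    if risk not in MATRIX:
        raise ValueError(f"unknown risk tier: {risk}")
    # bisect_right counts how many bounds are <= confidence, which is the column index.
    column = bisect_right(CONFIDENCE_BOUNDS, confidence)
    return MATRIX[risk][column]
```

For instance, `lookup_action(0.85, "high")` selects the third column of the `high` row and yields `auto_pr`, while any confidence paired with `critical` yields `escalate`.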

Service changes:
  - _handle_review() accepts changed_paths for risk classification
  - Response now includes recommended_action with confidence/risk/reason
  - _extract_confidence_from_output() parses confidence from LLM JSON

Client changes:
  - ReviewResult includes recommended_action dict
  - review() accepts changed_paths parameter

Co-Authored-By: Claude <noreply@anthropic.com>
Change registry, effect tracking, shadow escalation, effectiveness reports.
New endpoints: /v1/ai/feedback/*, /v1/ai/changes/*

Co-Authored-By: Claude <noreply@anthropic.com>
Quota Manager:
  - Per-repo daily/weekly budget tracking (configurable via JSON)
  - Model cost estimation (token-based for LLMs, flat for codex)
  - Budget exhaustion → 429 with recommended cheaper model
  - Model tier escalation: gpt-mini → claude → fable-5 → codex
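
A rough sketch of the budget check and cheaper-model recommendation follows. It makes several assumptions: the tier order is read as cheapest to most expensive, the cost figures in `ASSUMED_COST` are invented placeholders, and both `check_quota` and the response fields (`recommended_model` in particular) are hypothetical rather than taken from the service.

```python
# Tier order as listed in the commit message, assumed to run cheapest first.
MODEL_TIERS = ["gpt-mini", "claude", "fable-5", "codex"]

# Hypothetical flat per-request cost units; real pricing is not part of this PR.
ASSUMED_COST = {"gpt-mini": 1, "claude": 5, "fable-5": 10, "codex": 25}


def check_quota(spent, budget, model):
    """Return (http_status, body) for a request against a repo's budget."""
    remaining = budget - spent
    if ASSUMED_COST[model] <= remaining:
        return 200, {"status": "ok", "model": model}
    # Budget exhausted for this model: walk down the tiers and suggest
    # the most capable cheaper model that still fits, if any.
    index = MODEL_TIERS.index(model)
    cheaper = next(
        (m for m in reversed(MODEL_TIERS[:index]) if ASSUMED_COST[m] <= remaining),
        None,
    )
    return 429, {
        "status": "error",
        "error": "quota exceeded",
        "recommended_model": cheaper,
    }
```

When nothing fits the remaining budget, the recommendation is `None` and the caller would have to wait for the next daily or weekly window.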

Health Monitor:
  - Per-endpoint error rates, latency percentiles (p50/p95/p99)
  - Sliding 5-min windows for degradation detection
  - Three states: healthy / degraded / unhealthy
  - Auto-degraded at >10% errors or p95 >30s
  - Enhanced /healthz now returns status + uptime
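
The degradation rule can be sketched as a function over recent samples. The 5-minute window and the degraded thresholds (more than 10% errors, or p95 above 30 s) come from the list above. The `unhealthy` threshold is not stated in this PR, so `ASSUMED_UNHEALTHY_ERROR_RATE` is a placeholder, and `classify`, `percentile`, and the sample tuple layout are likewise hypothetical.

```python
import math

WINDOW_SECONDS = 300.0  # sliding 5-minute window
DEGRADED_ERROR_RATE = 0.10
DEGRADED_P95_SECONDS = 30.0
ASSUMED_UNHEALTHY_ERROR_RATE = 0.50  # placeholder; not specified in the PR


def percentile(sorted_values, pct):
    """Nearest-rank percentile of a sorted, non-empty list."""
    rank = max(1, math.ceil(pct * len(sorted_values) / 100))
    return sorted_values[rank - 1]


def classify(samples, now):
    """Classify an endpoint from (timestamp, latency_seconds, is_error) samples."""
    recent = [(lat, err) for ts, lat, err in samples if now - ts < WINDOW_SECONDS]
    if not recent:
        return "healthy"  # no traffic in the window, nothing to flag
    error_rate = sum(1 for _, err in recent if err) / len(recent)
    p95 = percentile(sorted(lat for lat, _ in recent), 95)
    if error_rate > ASSUMED_UNHEALTHY_ERROR_RATE:
        return "unhealthy"
    if error_rate > DEGRADED_ERROR_RATE or p95 > DEGRADED_P95_SECONDS:
        return "degraded"
    return "healthy"
```

Samples older than the window are ignored, so an endpoint returns to `healthy` on its own once the errors age out.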

New endpoints:
  GET /v1/ai/health        — full health snapshot
  GET /v1/ai/quota?repo=    — quota status

Client methods:
  get_quota(repo), get_health()

Co-Authored-By: Claude <noreply@anthropic.com>
@github-actions


🤖 Codex PR Review

⚠️ Review skipped: The Codex review could not be completed.

Codex service request failed: 401 {"status": "error", "error": "OIDC workflow_ref is not allowed"}

Please ensure a human reviewer checks this PR before merging.

@Pigbibi Pigbibi merged commit e0ff394 into main Jun 29, 2026
1 of 3 checks passed
@Pigbibi Pigbibi deleted the feat/unified-ai-gateway branch June 29, 2026 11:44