From 7d72efc27a949746c7186ad845d1dc3f14e68262 Mon Sep 17 00:00:00 2001 From: John Myers Date: Thu, 21 May 2026 16:10:58 -0700 Subject: [PATCH 1/3] docs(rfc): propose sandbox proxy egress adapter model Signed-off-by: John Myers --- .../README.md | 420 ++++++++++++++++++ .../current-shape.md | 167 +++++++ .../implementation-plan.md | 127 ++++++ .../technical-design.md | 259 +++++++++++ 4 files changed, 973 insertions(+) create mode 100644 rfc/0004-sandbox-proxy-egress-adapter/README.md create mode 100644 rfc/0004-sandbox-proxy-egress-adapter/current-shape.md create mode 100644 rfc/0004-sandbox-proxy-egress-adapter/implementation-plan.md create mode 100644 rfc/0004-sandbox-proxy-egress-adapter/technical-design.md diff --git a/rfc/0004-sandbox-proxy-egress-adapter/README.md b/rfc/0004-sandbox-proxy-egress-adapter/README.md new file mode 100644 index 000000000..01001ca15 --- /dev/null +++ b/rfc/0004-sandbox-proxy-egress-adapter/README.md @@ -0,0 +1,420 @@ +--- +authors: + - "@johntmyers" +state: draft +links: + - https://github.com/NVIDIA/OpenShell/issues/1107 + - https://github.com/NVIDIA/OpenShell/pull/1083 + - https://github.com/NVIDIA/OpenShell/pull/1151 +--- + +# RFC 0004 - Sandbox Proxy Egress Adapter Model + + + +## Summary + +Refactor sandbox egress around one shared authorization and relay pipeline. +CONNECT, forward HTTP proxy, transparent native TCP, policy DNS, +`inference.local`, and `policy.local` should become adapters that translate +userland entry points into a common egress intent. Policy evaluation, +destination validation, credential injection, request-body rewrite, +WebSocket upgrade handling, protocol parsing, and relay ownership should happen +behind shared boundaries. + +This RFC keeps the main direction in this document. 
Supporting detail lives in: + +- [Current shape appendix](current-shape.md) +- [Technical design appendix](technical-design.md) +- [Implementation plan](implementation-plan.md) + +## Motivation + +The sandbox proxy has accumulated separate egress paths for CONNECT, forward +HTTP, local services, inference routing, endpoint metadata, credential +injection, and L7 policy. That makes security changes easy to apply to one path +and miss in another. + +The target shape separates three concerns: + +- **Adapters** describe how userland reached the proxy. +- **Authorization** decides whether that egress is allowed and what endpoint + behavior applies. +- **Relays** own bytes, credentials, protocol parsing, and upstream dialing. + +## Non-goals + +- Replace CONNECT with forward proxy as the only explicit proxy mode. +- Add SOCKS support. +- Add HTTP/2 L7 parsing in this refactor. +- Redesign provider credential storage. +- Reintroduce iptables as the sandbox packet filtering backend. +- Use eBPF connect hooks for transparent capture. Native TCP capture needs a + userland proxy in the byte stream for TLS termination and protocol parsing. + +## Proposal + +### Migration Big Rocks + +1. **Transport adapters.** CONNECT, forward HTTP, transparent TCP, policy DNS, + and local service routes become small entry adapters. They parse their + surface and produce either an egress intent, a local response, or a DNS + answer. They do not duplicate policy evaluation. +2. **Egress intent and decision.** The shared authorization boundary evaluates + L4 policy once per connection intent and returns one decision containing the + matched policy, matched endpoint, process identity, allowed IP metadata, TLS + behavior, and protocol enforcement. +3. **Relays.** Relays receive an authorized destination connector, not an + already-open upstream socket. 
HTTP relays evaluate every request before + dialing, own REST request-body credential rewrite, and hand allowed + WebSocket upgrades to the WebSocket relay. TCP application parsers own their + protocol loop and decide when a validated upstream connection is needed. + +### Unified Adapter Flow + +```mermaid +flowchart TD + User["Userland payload / harness"] + + subgraph ExplicitProxy["Explicit proxy listener"] + ProxyBytes["HTTP proxy bytes"] + IsConnect{"CONNECT request?"} + Connect["CONNECT adapter"] + Forward["Forward HTTP adapter"] + ProxyBytes --> IsConnect + IsConnect -- Yes --> Connect + IsConnect -- No --> Forward + end + + subgraph NativeTcp["Policy DNS + native TCP"] + NameLookup["Userland DNS lookup"] + PolicyDns["Policy DNS adapter"] + DnsAnswer["DNS answer"] + NativeConnect["Userland connect(ip:port)"] + TcpAdapter["Transparent TCP adapter"] + NameLookup --> PolicyDns + PolicyDns --> DnsAnswer + DnsAnswer --> NativeConnect + NativeConnect --> TcpAdapter + end + + subgraph LocalApis["Sandbox-local services"] + InferenceReq["Request to inference.local"] + PolicyReq["Request to policy.local"] + InferenceAdapter["Inference local adapter"] + PolicyAdapter["Policy local adapter"] + InferenceReq --> InferenceAdapter + PolicyReq --> PolicyAdapter + end + + subgraph Shared["Shared egress pipeline"] + Intent["Egress intent"] + Auth["Authorize and select endpoint"] + Decision["Egress decision"] + Validate["Resolve and validate destination"] + Relay["Relay"] + Deny["Adapter-specific deny response"] + Intent --> Auth + Auth --> Allowed{"Allowed?"} + Allowed -- No --> Deny + Allowed -- Yes --> Decision + Decision --> Validate + Validate --> Relay + end + + User --> ProxyBytes + User --> NameLookup + User --> NativeConnect + User --> InferenceReq + User --> PolicyReq + + Connect --> Intent + Forward --> Intent + TcpAdapter --> Intent + InferenceAdapter --> InferenceResp["Local inference response"] + PolicyAdapter --> PolicyResp["Local policy response"] +``` + +### 
Relay Flow + +```mermaid +flowchart TD + Start["Authorized egress + destination connector"] + Start --> HasFirst{"First HTTP request already parsed?"} + + HasFirst -- Yes --> ForwardMode{"Selected enforcement"} + ForwardMode -- "L4 only" --> HttpCred["HTTP relay<br/>credential injection only"] + ForwardMode -- "HTTP rules" --> HttpL7["HTTP relay<br/>REST/GraphQL/WebSocket policy"] + ForwardMode -- "TCP app rules" --> BadForward["Deny: HTTP request for TCP app endpoint"] + + HasFirst -- No --> Inspect["Inspect tunnel or native stream bytes"] + Inspect --> SkipTls{"Endpoint says skip TLS handling?"} + SkipTls -- Yes --> TcpBytes["TCP relay<br/>byte copy"] + SkipTls -- No --> Peek["Peek client bytes"] + Peek --> IsTls{"TLS ClientHello?"} + IsTls -- Yes --> Tls["Shared TLS terminator"] + IsTls -- No --> Readable["Readable client stream"] + Tls --> Readable + + Readable --> Mode{"Selected enforcement"} + Mode -- "L4 only" --> SniffHttp{"Looks like HTTP?"} + SniffHttp -- Yes --> HttpCred + SniffHttp -- No --> TcpBytes + + Mode -- "HTTP rules" --> MustHttp{"Looks like HTTP?"} + MustHttp -- Yes --> HttpL7 + MustHttp -- No --> DenyHttp["Deny: expected HTTP"] + + Mode -- "TCP app rules" --> TcpParser["TCP relay<br/>application parser owns loop"] + + HttpCred --> Creds["Resolve and redact credentials"] + HttpL7 --> CredsL7["Resolve and redact credentials"] + CredsL7 --> ParseHttp["Parse and evaluate each HTTP request"] + ParseHttp --> HttpAllowed{"Request allowed?"} + HttpAllowed -- No --> HttpDeny["Local HTTP deny<br/>no upstream connect"] + HttpAllowed -- Yes --> Rewrite["Rewrite configured credential slots"] + Creds --> Rewrite + Rewrite --> HttpDial["Connect or reuse upstream"] + HttpDial --> HttpResponse["Write request and relay response"] + HttpResponse --> Upgrade{"101 WebSocket upgrade?"} + Upgrade -- No --> NextHttp["Continue HTTP request loop"] + Upgrade -- Yes --> WsMode{"WebSocket inspection needed?"} + WsMode -- No --> RawUpgrade["Raw upgraded stream"] + WsMode -- Yes --> WsRelay["WebSocket relay<br/>text-frame rewrite / message policy"] + NextHttp --> ParseHttp + + TcpParser --> ParserDial["Parser dials upstream when protocol allows"] + TcpBytes --> TcpDial["Connect upstream"] + TcpDial --> ByteCopy["Copy bytes"] +``` + +Relay rules: + +- HTTP credential injection happens in both HTTP modes: L4-only HTTP and + HTTP-inspected. +- Credential injection includes request target, query, headers, opt-in REST + request bodies, and opt-in client-to-server WebSocket text frames. +- HTTP L7 policy is evaluated before upstream dial for each request. +- WebSocket upgrade policy is evaluated as HTTP first. After an allowed `101` + upgrade, the WebSocket relay owns frame parsing when text-frame credential + rewrite, WebSocket transport policy, GraphQL-over-WebSocket policy, or safe + compression handling is configured. Other upgraded streams remain raw. +- Forward HTTP must stay in the shared HTTP relay loop. It must not evaluate + one request and then switch to raw bidirectional copy. Keeping forward HTTP + single-request with `Connection: close` is also acceptable, but the invariant + is that no follow-on request bytes reach upstream unevaluated. +- `protocol: tcp` means L4 authorization plus byte copy unless HTTP is detected + for credential injection. +- Future TCP application parsers, such as Redis or Postgres, own the full + message loop and can parse multiple commands over one TCP session. + +### CONNECT Adapter + +CONNECT remains the standard explicit proxy tunnel for HTTPS and arbitrary TCP. +It parses the CONNECT line into an egress intent, then waits for the shared +relay to decide if and when an upstream connection should be opened.
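The CONNECT parsing step described above could be sketched roughly as follows. This is an illustration only: `ConnectTarget` and `parse_connect_line` are hypothetical names that do not appear in the RFC, and the sketch covers just the request line, not headers, authorization, or dialing.

```rust
/// Hypothetical parsed CONNECT target; the RFC does not name this type.
#[derive(Debug, PartialEq, Eq)]
pub struct ConnectTarget {
    pub host: String,
    pub port: u16,
}

/// Parse a request line such as `CONNECT example.com:443 HTTP/1.1`.
/// Returns `None` for anything that is not a well-formed CONNECT line.
pub fn parse_connect_line(line: &str) -> Option<ConnectTarget> {
    let mut parts = line.split_whitespace();
    // HTTP methods are case-sensitive, so only the exact token matches.
    if parts.next()? != "CONNECT" {
        return None;
    }
    let authority = parts.next()?;
    let version = parts.next()?;
    if !version.starts_with("HTTP/1.") || parts.next().is_some() {
        return None;
    }
    // Split on the last ':' so bracketed IPv6 literals keep their colons.
    let (host, port) = authority.rsplit_once(':')?;
    let host = host
        .strip_prefix('[')
        .and_then(|h| h.strip_suffix(']'))
        .unwrap_or(host);
    if host.is_empty() {
        return None;
    }
    Some(ConnectTarget {
        host: host.to_ascii_lowercase(),
        // Out-of-range or non-numeric ports fail the parse.
        port: port.parse().ok()?,
    })
}
```

The adapter would place the resulting host and port into the egress intent and leave every policy decision to the shared authorization boundary.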
+ +```mermaid +flowchart TD + Client["Client sends CONNECT host:port"] --> Parse["Parse target"] + Parse --> Intent["Build egress intent"] + Intent --> Auth["Shared authorization"] + Auth --> Allowed{"Destination allowed?"} + Allowed -- No --> Deny["CONNECT deny response"] + Allowed -- Yes --> Ready["Return tunnel-ready response"] + Ready --> Relay["Relay inspects tunneled bytes"] + Relay --> Dial["Relay or parser connects upstream when allowed"] +``` + +CONNECT should stay because forward proxy is only a plaintext HTTP request +format. CONNECT is still the generic explicit proxy mode for TLS and non-HTTP +TCP clients. + +### Forward HTTP Adapter + +Forward HTTP is compatibility for clients that send absolute-form HTTP requests. +The adapter parses the first request and hands it to the shared HTTP relay or +an equivalent guarded single-request relay. + +```mermaid +flowchart TD + Req["Absolute-form HTTP request"] --> Parse["Parse URI and first request"] + Parse --> Intent["Build egress intent"] + Intent --> Auth["Shared authorization"] + Auth --> Allowed{"Allowed?"} + Allowed -- No --> Deny["HTTP deny response"] + Allowed -- Yes --> Relay["Shared or guarded HTTP relay"] + Relay --> Mode{"Connection mode"} + Mode -- "Persistent" --> Loop["Evaluate every request on this connection"] + Mode -- "Single request" --> Close["Force Connection: close"] +``` + +### Transparent TCP Adapter + +Transparent TCP supports native clients that do not know they are using a +proxy. The capture mechanism should be network namespace interception into a +userland proxy listener. Since main now uses nftables for sandbox bypass +enforcement, transparent capture should be designed as nftables +REDIRECT/TPROXY state in the inner sandbox network namespace, not as an +iptables path. 
+ +```mermaid +flowchart TD + Policy["Policy load / reload"] --> Register["Register native TCP names"] + Lookup["Userland DNS lookup"] --> Dns["Policy DNS adapter"] + Register --> Dns + Dns --> Answer["Return approved IPs"] + Answer --> Capture["Enable capture for active IP:port"] + Connect["Userland connect(ip:port)"] --> Capture + Capture --> Adapter["Transparent TCP adapter"] + Adapter --> Intent["Build egress intent from original destination"] + Intent --> Shared["Shared authorization and relay"] +``` + +### Policy DNS + +Policy DNS replaces static `/etc/hosts` snapshots for native TCP names. It is +query-driven: check whether the name is policy-eligible, resolve through trusted +DNS, filter returned IPs, publish the active endpoint mapping, and answer +userland. + +```mermaid +flowchart TD + Query["DNS query from userland"] --> Adapter["Policy DNS adapter"] + Adapter --> Known{"Registered native TCP policy name?"} + Known -- No --> Refuse["NXDOMAIN / REFUSED / SERVFAIL"] + Known -- Yes --> Upstream["Trusted upstream DNS lookup"] + Upstream --> Filter["Filter answers against endpoint policy"] + Filter --> Publish["Publish active mapping and capture rule"] + Publish --> Answer["DNS answer"] +``` + +The later `connect(ip:port)` still creates the egress intent and runs through +normal authorization. + +### Network Enforcement Substrate + +Current main uses nftables for bypass enforcement. It accepts proxy-bound +traffic and loopback, accepts established flows, then rejects and optionally +logs other TCP/UDP traffic for the bypass monitor. That is enforcement, not +native TCP capture. 
+ +```mermaid +flowchart TD + Conn["Userland packet"] --> ProxyDest{"Proxy destination?"} + ProxyDest -- Yes --> AcceptProxy["nftables accept"] + ProxyDest -- No --> Capture{"Future native TCP capture match?"} + Capture -- Yes --> Redirect["nftables redirect/TPROXY to transparent adapter"] + Capture -- No --> Reject["nftables log + reject bypass"] + Reject --> Monitor["Bypass monitor emits OCSF"] +``` + +The transparent TCP work should extend this nftables model with explicit +capture rules that run before the reject path and are scoped to active policy +DNS mappings. + +### Local Service Adapters + +`inference.local` and `policy.local` are sandbox-local APIs. They should use +the adapter model, but they do not represent normal external egress. + +```mermaid +flowchart TD + A["Request to inference.local"] --> B["Inference local adapter"] + B --> C{"TLS and inference context available?"} + C -- No --> D["Local denial and log"] + C -- Yes --> E["Terminate client TLS"] + E --> F["Parse HTTP request"] + F --> G{"Known inference route?"} + G -- Yes --> H["Route through openshell-router"] + H --> I["Strip caller auth and inject provider auth/model"] + I --> J["Stream response with limits"] + G -- No --> K["403 and close"] +``` + +```mermaid +flowchart TD + A["Request to policy.local"] --> B["Policy local adapter"] + B --> C{"Local route"} + C -- "Current policy" --> D["Policy snapshot response"] + C -- "Recent denials" --> E["Bounded denial summaries"] + C -- "Policy proposal" --> F["Validate and submit proposal"] + D --> G["HTTP response"] + E --> G + F --> G +``` + +### Deployment Modes + +The first implementation can remain embedded in `openshell-sandbox`, but the +proxy should be shaped around explicit runtime contracts. 
+ +| Mode | Shape | Main concern | +|------|-------|--------------| +| Embedded | Current sandbox process owns proxy modules | Lowest migration cost | +| Standalone process | Sandbox supervisor launches a proxy binary | Clear process/API boundary | +| Sidecar | Proxy runs outside the payload container but inside the sandbox boundary | Reliable process identity across namespaces | + +A pluggable proxy must expose the configured userland surfaces, implement the +gateway APIs it needs, and prove equivalent policy enforcement through tests. +The nftables rules that force or reject userland traffic belong to the sandbox +network boundary even if the proxy process later moves into a standalone binary +or sidecar. + +## Implementation plan + +The migration plan lives in [implementation-plan.md](implementation-plan.md). +The intended order is: first add regression coverage, then introduce the shared +authorization result and destination validation, then preserve the current +forward HTTP single-request/guarded-relay invariant, then add shared TLS +handling, TCP parser boundaries, nftables-backed policy DNS capture, local +service adapters, and finally the runtime boundary cleanup. + +## Risks + +- Tightening endpoint metadata failures from fail-open to deny may expose + latent policy or Rego errors. +- Deterministic endpoint selection may reject policies that currently load but + only work by accident. +- Transparent TCP capture adds network namespace interception complexity. +- Transparent TCP capture must coexist with the current nftables bypass + reject/log table without creating gaps where direct egress bypasses the proxy. +- Sidecar mode needs a reliable identity source for binary/path scoped policy. +- `policy.local` expands the sandbox-local control surface and needs strict + route validation, body limits, redaction, and gateway authentication. + +## Alternatives + +- Keep patching each current proxy path separately. 
This has the lowest short + term cost but keeps the security surface duplicated. +- Replace CONNECT with forward proxy. This does not work for arbitrary TCP and + is not a replacement for HTTPS tunnels. +- Build only transparent TCP. This helps native clients but does not replace + explicit proxy support used by common HTTP tooling. + +## Open questions + +1. Should overlapping endpoint metadata be rejected at policy load time, or + should policy name plus endpoint index define precedence? +2. Should missing TLS state fail closed for credential-capable or inspected + endpoints? +3. Should direct IP connects to a policy-DNS-resolved TCP endpoint be accepted, + or should DNS query correlation be required for stricter modes? +4. What TTL cap and stale-generation grace period should policy DNS use? +5. Which process identity source should sidecar mode use when it cannot inspect + payload process metadata through local `/proc`? +6. Which proxy capabilities should be negotiated with the gateway at startup? + +## Expected result + +Adding a new HTTP-family protocol parser should require parser code, policy +schema/Rego support, tests, and docs. It should not require new CONNECT and +forward-proxy branches. REST, GraphQL, WebSocket upgrade policy, request-body +credential rewrite, and WebSocket text-frame rewrite should all remain behind +the shared HTTP/WebSocket relay boundary. + +Adding a native TCP application parser should require policy DNS/capture +support, a TCP application parser, policy rules, tests, and docs. Plain +`protocol: tcp` remains L4 authorization plus byte relay. 
diff --git a/rfc/0004-sandbox-proxy-egress-adapter/current-shape.md b/rfc/0004-sandbox-proxy-egress-adapter/current-shape.md new file mode 100644 index 000000000..b428fed14 --- /dev/null +++ b/rfc/0004-sandbox-proxy-egress-adapter/current-shape.md @@ -0,0 +1,167 @@ +# Current Shape Appendix + +This appendix records the current proxy shape and the review findings that +motivate the adapter model. The main RFC intentionally keeps these details out +of the direction document. + +## Current Entry Points + +The sandbox proxy currently handles multiple userland-facing paths in the same +large module: + +- CONNECT proxy traffic for HTTPS and generic TCP tunnels. +- Forward HTTP proxy traffic for absolute-form HTTP requests. +- Local service routes such as `inference.local`. +- Network namespace bypass enforcement through nftables reject/log rules. +- Policy and endpoint metadata lookups through OPA/Rego. +- DNS resolution and endpoint validation for CONNECT and forward HTTP egress. +- Credential injection and redaction for provider-backed HTTP egress. +- Opt-in REST request-body credential rewrite. +- L7 REST, GraphQL, WebSocket, and GraphQL-over-WebSocket enforcement. + +The issue is not that these features exist. The issue is that entry mechanisms, +policy evaluation, endpoint metadata lookup, credential injection, and byte +relay decisions are interleaved. 
+ +## Current CONNECT Shape + +```mermaid +flowchart TD + Client["Client CONNECT host:port"] --> Parse["Parse CONNECT target"] + Parse --> L4["Evaluate network policy"] + L4 --> Allowed{"Allowed?"} + Allowed -- No --> Deny["CONNECT denial"] + Allowed -- Yes --> Meta["Query endpoint metadata"] + Meta --> Config{"L7 or credential config?"} + Config -- No --> Raw["Open upstream and copy bytes"] + Config -- Yes --> Tunnel["Return tunnel-ready response"] + Tunnel --> Inspect["Parse tunneled HTTP when possible"] + Inspect --> L7["Evaluate HTTP policy"] + L7 --> Inject["Inject credentials if configured"] + Inject --> Upstream["Write upstream and relay response"] +``` + +This path has the strongest HTTP relay behavior because it can keep parsing +requests on a long-lived tunnel and enforce L7 rules per request. + +## Current Forward HTTP Shape + +```mermaid +flowchart TD + Client["Absolute-form HTTP request"] --> Parse["Parse first request"] + Parse --> L4["Evaluate network policy"] + L4 --> Allowed{"Allowed?"} + Allowed -- No --> Deny["HTTP denial"] + Allowed -- Yes --> L7{"Matching L7 endpoint?"} + L7 -- Yes --> Eval["Evaluate REST/GraphQL/WebSocket policy"] + Eval --> Rewrite["Rewrite to origin-form + configured credentials"] + L7 -- No --> Rewrite + Rewrite --> Close["Force Connection: close except WebSocket upgrade"] + Close --> Upstream["Open upstream"] + Upstream --> Relay["Guarded HTTP relay / upgrade relay"] +``` + +The latest main branch no longer has the old raw-copy-after-first-request shape +for ordinary forward HTTP. It rewrites ordinary requests with `Connection: +close`, uses guarded HTTP relay helpers for body handling, and sends allowed +WebSocket upgrades through the same upgrade relay. That is a narrower surface +than the historical bidirectional copy, but it is still implemented separately +from the CONNECT relay path. 
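As a rough illustration of the rewrite step described above, the sketch below turns an absolute-form request head into origin-form and forces `Connection: close`. It is not the code on main: `rewrite_forward_head` is a hypothetical name, the set of stripped headers is an assumption, and the sketch covers only ordinary requests, not the WebSocket upgrade exception or credential rewrite.

```rust
/// Hypothetical helper: rewrite an absolute-form request head for upstream.
/// Takes the request line and already-split header pairs (no body).
pub fn rewrite_forward_head(request_line: &str, headers: &[(&str, &str)]) -> Option<String> {
    let mut parts = request_line.split(' ');
    let method = parts.next()?;
    let target = parts.next()?;
    let version = parts.next()?;
    if parts.next().is_some() {
        return None;
    }
    // Only plaintext absolute-form targets are forward-proxy requests.
    let rest = target.strip_prefix("http://")?;
    let split = rest.find(|c: char| c == '/' || c == '?').unwrap_or(rest.len());
    let (authority, tail) = rest.split_at(split);
    if authority.is_empty() {
        return None;
    }
    // Origin-form always starts with '/', even for an empty or query-only path.
    let path = if tail.starts_with('/') {
        tail.to_string()
    } else {
        format!("/{tail}")
    };
    let mut out = format!("{method} {path} {version}\r\nHost: {authority}\r\n");
    for (name, value) in headers {
        // Assumed strip list: hop-by-hop and proxy-only headers; Host is re-derived.
        let lower = name.to_ascii_lowercase();
        if matches!(
            lower.as_str(),
            "host" | "connection" | "proxy-connection" | "keep-alive" | "proxy-authorization"
        ) {
            continue;
        }
        out.push_str(&format!("{name}: {value}\r\n"));
    }
    // One request per upstream connection.
    out.push_str("Connection: close\r\n\r\n");
    Some(out)
}
```

Because the rewritten head always ends with `Connection: close`, a conforming upstream closes after one response, which is what keeps follow-on bytes from being relayed unevaluated in the single-request shape.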
+ +## Current Network Namespace Enforcement + +```mermaid +flowchart TD + Start["Process in sandbox network namespace"] --> Dest{"Destination"} + Dest -- "Proxy host_ip:port" --> Proxy["Accept to sandbox proxy"] + Dest -- "Loopback" --> Loopback["Accept loopback"] + Dest -- "Established/related" --> Established["Accept response packet"] + Dest -- "Other TCP/UDP" --> Reject["nftables log + reject"] + Reject --> Monitor["Bypass monitor reads dmesg"] + Monitor --> OCSF["OCSF network + detection events"] +``` + +The sandbox now installs an `inet` nftables filter table for bypass +enforcement. The table accepts proxy-bound traffic, loopback, and established +flows, then rejects and optionally logs other TCP/UDP traffic. It does not +currently redirect native TCP connections into the proxy. + +## Current Local Service Shape + +```mermaid +flowchart TD + Request["Request to local name"] --> Match{"Known local route?"} + Match -- "inference.local" --> Inference["Inference routing logic"] + Match -- "policy.local" --> Policy["Policy local logic"] + Match -- No --> External["Normal egress path"] + Inference --> LocalResponse["Local response"] + Policy --> LocalResponse +``` + +Local routes are userland-facing proxy surfaces. They should stay distinct from +external egress while still fitting the adapter model. + +## Findings To Preserve + +### Invariant: forward proxy must not relay unevaluated follow-on HTTP bytes + +The historical forward path evaluated at most the first absolute-form request, +rewrote it, then switched to bidirectional copy. Bytes already buffered after +the first header block, or later pipelined requests on the same client/upstream +connection, could reach upstream without the CONNECT L7 relay's per-request +parser/evaluator. + +Latest main mitigates this by forcing ordinary forward HTTP to one request per +connection and by using guarded relay helpers. 
The adapter model should +preserve the invariant either by keeping forward HTTP single-request/close or +by passing the first parsed request into a shared HTTP relay loop. + +### Endpoint config is not tied to deterministic matched policy + +The policy name used for L4 authorization and logging can be selected through a +different precedence rule than endpoint metadata. With overlapping host, port, +and binary rules, allowed IPs, TLS behavior, enforcement, and +`allow_encoded_slash` can come from a different endpoint than the policy name +logged and used for L4 allow. + +The adapter model requires authorization to return one decision with one +deterministic matched endpoint. + +### Endpoint metadata query failures fail open to L4 behavior + +If endpoint metadata lookup fails, callers can interpret the result as no L7 +configuration and downgrade to credential-only or raw L4 relay. + +The adapter model treats endpoint metadata as part of the authorization result. +Failure to materialize required metadata should deny rather than erase extended +configuration. + +### Control-plane port block only applies on one resolution path + +Blocked control-plane ports are enforced inside one allowed-IPs validation +path, while the normal host-based path uses a different validation route. + +The adapter model moves resolution, allowed IP checks, SSRF checks, and +control-plane port blocks into shared destination validation. + +## Existing Feature Inventory + +The refactor should preserve: + +- CONNECT explicit proxy support. +- Forward HTTP explicit proxy support. +- nftables bypass reject/log enforcement. +- Provider credential injection and redaction. +- REST request-body credential rewrite. +- WebSocket text-frame credential rewrite. +- REST endpoint method/path policy. +- GraphQL L7 policy. +- WebSocket transport and GraphQL-over-WebSocket policy. +- Inference routing through `inference.local`. +- Agent-facing policy routes through `policy.local`. 
+- Timeout and resource tracking for client, upstream, and local service work. +- Structured OCSF logging for network and HTTP policy outcomes. +- SSRF and internal address protections. +- Control-plane port protection. +- `allowed_ips` endpoint restrictions. +- TLS termination for inspectable client connections. diff --git a/rfc/0004-sandbox-proxy-egress-adapter/implementation-plan.md b/rfc/0004-sandbox-proxy-egress-adapter/implementation-plan.md new file mode 100644 index 000000000..94ba53b7f --- /dev/null +++ b/rfc/0004-sandbox-proxy-egress-adapter/implementation-plan.md @@ -0,0 +1,127 @@ +# Implementation Plan + +This plan is intentionally separate from the main RFC so the proposal can stay +direction-focused. + +## Phase 0 - Regression Tests + +- Add tests for forward HTTP pipelining and keep-alive follow-on requests, + including the current `Connection: close` mitigation. +- Add tests for overlapping endpoint metadata selection. +- Add tests for endpoint metadata query failures. +- Add tests for control-plane port blocking through all destination validation + paths. +- Add nftables bypass enforcement tests that verify proxy-bound traffic is + accepted while direct TCP/UDP egress is rejected and logged when available. + +## Phase 1 - Authorization Result + +- Introduce `EgressIntent` and `EgressDecision`. +- Make authorization return matched policy and matched endpoint metadata + together. +- Fail closed when required endpoint metadata cannot be materialized. +- Emit consistent OCSF network denial events from the shared boundary. + +## Phase 2 - Shared Destination Validation + +- Move DNS resolution, allowed IP filtering, SSRF checks, and control-plane port + checks into one destination validation path. +- Return an `UpstreamConnector` rather than an opened upstream socket. +- Add tests proving CONNECT, forward HTTP, and transparent TCP use the same + validation behavior. 
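A minimal sketch of what one shared destination validation path might look like, under stated assumptions: `ValidationError`, `ValidatedDestination`, and `validate_destination` are hypothetical names, the internal-address check is deliberately simplified, and the rule that an explicit `allowed_ips` list replaces the default internal-address filter is an assumption rather than something this plan specifies. DNS resolution is left out; the function receives already-resolved addresses.

```rust
use std::net::IpAddr;

/// Hypothetical denial reasons; the RFC does not define this enum.
#[derive(Debug, PartialEq, Eq)]
pub enum ValidationError {
    ControlPlanePort,
    NoPermittedAddress,
}

/// Hypothetical stand-in for the data behind `UpstreamConnector`:
/// validated addresses only, no open socket.
#[derive(Debug, PartialEq, Eq)]
pub struct ValidatedDestination {
    pub addrs: Vec<IpAddr>,
    pub port: u16,
}

/// Simplified internal-address check used as the default SSRF filter.
fn is_internal(ip: &IpAddr) -> bool {
    match ip {
        IpAddr::V4(v4) => {
            v4.is_loopback() || v4.is_private() || v4.is_link_local() || v4.is_unspecified()
        }
        IpAddr::V6(v6) => v6.is_loopback() || v6.is_unspecified(),
    }
}

/// Apply the port block, then filter resolved addresses, in one place.
pub fn validate_destination(
    resolved: &[IpAddr],
    port: u16,
    allowed_ips: Option<&[IpAddr]>,
    blocked_ports: &[u16],
) -> Result<ValidatedDestination, ValidationError> {
    if blocked_ports.contains(&port) {
        return Err(ValidationError::ControlPlanePort);
    }
    let addrs: Vec<IpAddr> = resolved
        .iter()
        .copied()
        .filter(|ip| match allowed_ips {
            // Assumption: an explicit list is the only way to reach an address.
            Some(list) => list.contains(ip),
            // Without one, drop internal addresses by default.
            None => !is_internal(ip),
        })
        .collect();
    if addrs.is_empty() {
        return Err(ValidationError::NoPermittedAddress);
    }
    Ok(ValidatedDestination { addrs, port })
}
```

Every adapter would call the same function, so a control-plane port block or an `allowed_ips` restriction cannot apply on one resolution path and be skipped on another.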
+ +## Phase 3 - Forward HTTP Adapter + +- Convert forward HTTP into an adapter that parses the first absolute-form + request and builds an egress intent. +- Route the parsed first request into the shared HTTP relay or preserve the + current guarded single-request relay behavior. +- Keep the no-raw-copy invariant after the first request. + +## Phase 4 - HTTP And WebSocket Relay Consolidation + +- Centralize HTTP request parsing, REST policy, GraphQL policy, WebSocket + upgrade policy, credential resolution, redaction, request rewrite, upstream + dial, and response relay. +- Evaluate every HTTP request before upstream write. +- Ensure denied HTTP requests do not create upstream TCP sessions. +- Preserve opt-in REST request-body credential rewrite behind the shared HTTP + relay, including bounded buffering, supported content-type handling, + `Content-Length` recomputation, and fail-closed unresolved placeholders. +- Preserve WebSocket upgrade handling behind the shared relay, including + opt-in client-to-server text-frame credential rewrite, WebSocket transport + message policy, GraphQL-over-WebSocket policy, and raw passthrough for other + upgraded protocols. + +## Phase 5 - Shared TLS Termination + +- Move client-side TLS detection and termination before the HTTP/TCP relay + split. +- Keep endpoint TLS behavior on `EgressDecision`. +- Remove duplicate HTTP-specific and TCP-specific TLS termination decisions. + +## Phase 6 - TCP Relay And Parser Boundary + +- Rename raw TCP relay concepts to `TcpRelay`. +- Add a TCP application parser dispatch point for future protocol enforcement. +- Keep `protocol: tcp` as L4 authorization plus byte copy. +- Let TCP application parsers own their message loop and call the connector + when protocol state allows. + +## Phase 7 - Policy DNS And Transparent TCP + +- Add policy DNS registration for native TCP endpoint names. +- Replace static host-file mapping with query-driven DNS answers. 
+- Publish active DNS answer state and capture rules. +- Implement nftables REDIRECT/TPROXY capture rules ahead of the bypass reject + path; do not add a parallel iptables path. +- Implement transparent TCP adapter lookup from captured original destination + to active endpoint generation. +- Decide TTL and stale-generation behavior. + +## Phase 8 - Local Service Adapters + +- Model `inference.local` as a local adapter with TLS termination, route + validation, provider auth injection, streaming limits, and OCSF logging. +- Model `policy.local` as a local adapter for current policy, bounded denial + summaries, and policy proposals. +- Keep both paths outside normal external egress relay. + +## Phase 9 - Runtime Boundary + +- Keep embedded mode for the first migration. +- Define the proxy runtime API needed for a future standalone binary: + configured listeners, policy updates, gateway calls, telemetry, and shutdown. +- Identify process identity requirements for standalone and sidecar modes. + +## Phase 10 - Cleanup + +- Remove duplicated endpoint metadata queries from relay paths. +- Remove duplicated deny rendering where adapters can own response shape. +- Remove any remaining forward HTTP raw-copy fallback. +- Update architecture docs once implementation lands. + +## Testing Plan + +- Unit-test each adapter's intent construction and deny response shape. +- Unit-test authorization precedence for overlapping policy and endpoint rules. +- Integration-test shared destination validation across CONNECT, forward HTTP, + and transparent TCP. +- Integration-test HTTP keep-alive and pipelined requests with REST, GraphQL, + and WebSocket upgrade enforcement. +- Integration-test credential injection in L4-only HTTP and HTTP-inspected + paths. +- Integration-test REST request-body credential rewrite for JSON, + form-url-encoded, `text/*`, unsupported content types, chunked framing, body + caps, and unresolved placeholders. 
+- Integration-test WebSocket text-frame credential rewrite, raw upgraded + passthrough, WebSocket message policy, GraphQL-over-WebSocket policy, and + safe compression negotiation. +- Integration-test TLS termination before HTTP/TCP relay split. +- Integration-test `protocol: tcp` byte-copy behavior. +- Add parser harness tests before adding Redis, Postgres, or similar TCP + application parsers. +- Integration-test policy DNS TTL, stale generation handling, and captured + connect correlation. +- Integration-test `inference.local` and `policy.local` body limits, timeout + behavior, redaction, and local denial responses. diff --git a/rfc/0004-sandbox-proxy-egress-adapter/technical-design.md b/rfc/0004-sandbox-proxy-egress-adapter/technical-design.md new file mode 100644 index 000000000..b13e259f4 --- /dev/null +++ b/rfc/0004-sandbox-proxy-egress-adapter/technical-design.md @@ -0,0 +1,259 @@ +# Technical Design Appendix + +This appendix carries the implementation-level design details behind the main +RFC. + +## Shared Data Boundaries + +### EgressIntent + +`EgressIntent` is the normalized description of what userland is trying to do. + +It should carry: + +- entry transport: CONNECT, forward HTTP, transparent TCP, or local HTTP; +- requested destination host/port or captured original IP/port; +- process identity inputs collected by the adapter/runtime; +- optional first HTTP request for forward proxy traffic; +- optional local service route. + +Adapters build intents. They should not query endpoint metadata or select +relays. + +### EgressDecision + +`EgressDecision` is the policy result consumed by validation and relay code. + +It should carry: + +- allow or deny; +- deterministic matched policy identifier; +- deterministic matched endpoint identifier and endpoint metadata; +- process identity used for evaluation; +- destination and allowed IP constraints; +- TLS behavior; +- protocol enforcement; +- logging context and denial reason. 
+
+Relay code should read this decision. It should not query OPA again for
+endpoint metadata, TLS mode, allowed IPs, or parser selection.
+
+## Protocol Enforcement
+
+Use a protocol enforcement value derived from endpoint policy:
+
+| Policy protocol | Enforcement | Relay behavior |
+|-----------------|-------------|----------------|
+| omitted / `tcp` | None | L4 authorization plus byte relay, with optional HTTP sniff for credential injection |
+| `rest` | HTTP | HTTP request parser with REST rules, plus opt-in request-body and WebSocket text-frame credential rewrite |
+| `graphql` | HTTP | HTTP request parser with GraphQL rules |
+| `websocket` | HTTP | HTTP upgrade policy followed by WebSocket frame policy or GraphQL-over-WebSocket policy |
+| future `redis`, `postgres`, `mysql`, ... | TCP application | Protocol-specific TCP parser owns the message loop |
+
+`protocol: tcp` is effectively the default L4 mode. It should not run TCP
+application parsers.
+
+Avoid using the term "provider" for these parser concepts because providers
+are already a first-class credential and routing domain in OpenShell.
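As a rough illustration of the enforcement table, the mapping from a policy `protocol` value to an enforcement class could be sketched as below. `EnforcementClass` and `classify_protocol` are hypothetical names, not part of the RFC, and rejecting unknown protocol strings is an assumption chosen to match the fail-closed direction described elsewhere in this appendix.

```rust
/// Hypothetical coarse enforcement classes mirroring the table.
#[derive(Debug, PartialEq, Eq)]
enum EnforcementClass {
    /// L4 authorization plus byte relay.
    None,
    /// HTTP request parsing with REST, GraphQL, or WebSocket rules.
    Http,
    /// A protocol-specific TCP parser owns the message loop.
    TcpApplication,
}

/// Classify the optional `protocol` value from endpoint policy.
/// Assumption: unknown values are an error so callers can deny instead of
/// silently downgrading to L4 behavior.
fn classify_protocol(protocol: Option<&str>) -> Result<EnforcementClass, String> {
    match protocol {
        // Omitted and `tcp` are both the default L4 mode.
        None | Some("tcp") => Ok(EnforcementClass::None),
        Some("rest") | Some("graphql") | Some("websocket") => Ok(EnforcementClass::Http),
        // Only the example future protocols named in the table.
        Some("redis") | Some("postgres") | Some("mysql") => Ok(EnforcementClass::TcpApplication),
        Some(other) => Err(format!("unsupported protocol: {}", other)),
    }
}
```

A relay would consult the class once, from the decision, rather than re-deriving it from policy on each request.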
+
+## Suggested Types
+
+The exact Rust shape can evolve, but the boundaries should look like this:
+
+```rust
+enum EgressTransport {
+    Connect,
+    ForwardHttp,
+    TransparentTcp,
+    LocalHttp,
+}
+
+struct EgressIntent {
+    transport: EgressTransport,
+    destination: RequestedDestination,
+    process: ProcessIdentity,
+    first_request: Option,
+    local_route: Option,
+}
+
+struct EgressDecision {
+    outcome: PolicyOutcome,
+    matched_policy: Option,
+    endpoint: Option,
+    log_context: EgressLogContext,
+}
+
+struct MatchedEndpoint {
+    id: EndpointId,
+    allowed_ips: AllowedIpPolicy,
+    tls: TlsPolicy,
+    enforcement: ProtocolEnforcement,
+}
+
+enum ProtocolEnforcement {
+    None,
+    Http(HttpL7Config),
+    TcpApplication(TcpApplicationConfig),
+}
+
+enum HttpL7Protocol {
+    Rest,
+    Graphql,
+    Websocket,
+}
+
+struct HttpL7Config {
+    protocol: HttpL7Protocol,
+    allow_encoded_slash: bool,
+    websocket_credential_rewrite: bool,
+    request_body_credential_rewrite: bool,
+    websocket_graphql_policy: bool,
+}
+
+struct RelayContext {
+    decision: EgressDecision,
+    connector: UpstreamConnector,
+    deadlines: RelayDeadlines,
+    telemetry: RelayTelemetry,
+}
+```
+
+`UpstreamConnector` is the relay-owned dial boundary. It encapsulates the
+validated destination and lets relays/parsers open an upstream connection only
+after protocol policy allows it.
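A minimal sketch of what the connector boundary might look like, assuming destination validation has already produced a list of approved socket addresses. The fields, the constructor, and the blocking `connect` method are illustrative assumptions rather than the proposed API, and a real implementation would likely be async.

```rust
use std::io;
use std::net::{SocketAddr, TcpStream};
use std::time::Duration;

/// Hypothetical connector: it holds only addresses that already passed
/// destination validation, so relays cannot dial anything else.
struct UpstreamConnector {
    validated: Vec<SocketAddr>,
    connect_timeout: Duration,
}

impl UpstreamConnector {
    /// Intended to be built by destination validation, not by adapters.
    fn new(validated: Vec<SocketAddr>, connect_timeout: Duration) -> Self {
        Self { validated, connect_timeout }
    }

    /// Dial the first reachable validated address. Relays and parsers call
    /// this only after protocol policy has allowed the request or session.
    fn connect(&self) -> io::Result<TcpStream> {
        let mut last_err = io::Error::new(io::ErrorKind::NotFound, "no validated address");
        for addr in &self.validated {
            match TcpStream::connect_timeout(addr, self.connect_timeout) {
                Ok(stream) => return Ok(stream),
                Err(err) => last_err = err,
            }
        }
        Err(last_err)
    }
}
```

Because the connector carries no hostname and exposes no way to add addresses, a relay that receives it cannot widen the destination after authorization.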
+
+## Module Layout
+
+A future split could look like:
+
+| Module | Responsibility |
+|--------|----------------|
+| `proxy::adapter::connect` | Parse CONNECT and render CONNECT responses |
+| `proxy::adapter::forward_http` | Parse absolute-form HTTP and preserve first request |
+| `proxy::adapter::transparent_tcp` | Recover captured original destination |
+| `proxy::adapter::policy_dns` | Answer eligible DNS queries and publish active mappings |
+| `proxy::adapter::local` | Implement `inference.local` and `policy.local` surfaces |
+| `proxy::auth` | Build decisions from intents and OPA results |
+| `proxy::destination` | Resolve, filter, and validate destinations |
+| `proxy::netfilter` | Own nftables bypass and future transparent capture rules |
+| `proxy::relay::http` | HTTP request loop, credentials, REST/GraphQL/WebSocket upgrade policy |
+| `proxy::relay::websocket` | WebSocket frame validation, text-frame rewrite, and message policy |
+| `proxy::relay::tcp` | TCP byte relay and TCP application parser dispatch |
+| `proxy::relay::tls` | Shared client-side TLS termination |
+| `proxy::parser` | HTTP, WebSocket, and TCP application parser traits/config |
+| `proxy::telemetry` | OCSF and tracing helpers |
+
+## Policy DNS And Resolved TCP State
+
+Policy DNS should be query-driven rather than a static `/etc/hosts` snapshot.
+
+1. Policy load registers eligible native TCP endpoint names.
+2. Userland performs DNS lookup.
+3. Policy DNS checks whether the name is registered for native TCP.
+4. Policy DNS resolves through trusted upstream DNS.
+5. Answers are filtered against endpoint metadata and SSRF controls.
+6. The adapter publishes the DNS answer, endpoint generation, and capture rule.
+7. Userland later calls `connect(ip:port)`.
+8. Transparent TCP recovers the original destination and maps it to the active
+   endpoint generation.
+9. Normal egress authorization and relay selection run.
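Steps 6 through 8 imply a small store keyed by resolved `ip:port`. The sketch below is one possible shape, assuming a per-policy generation counter and a per-mapping expiry. `ResolvedEndpointStore`, `publish`, and `lookup` are hypothetical names, and the strict "drop on generation mismatch or expiry" rule is only a placeholder, since TTL and stale-generation behavior are still open questions in this RFC.

```rust
use std::collections::HashMap;
use std::net::SocketAddr;
use std::time::{Duration, Instant};

/// One active mapping produced by a policy-eligible DNS answer.
struct ActiveMapping {
    endpoint_id: String,
    generation: u64,
    expires_at: Instant,
}

/// Hypothetical store shared by the policy DNS and transparent TCP adapters.
#[derive(Default)]
struct ResolvedEndpointStore {
    by_addr: HashMap<SocketAddr, ActiveMapping>,
}

impl ResolvedEndpointStore {
    /// Step 6: publish a filtered DNS answer for one resolved address.
    fn publish(&mut self, addr: SocketAddr, endpoint_id: &str, generation: u64, ttl: Duration, now: Instant) {
        let mapping = ActiveMapping {
            endpoint_id: endpoint_id.to_string(),
            generation,
            expires_at: now + ttl,
        };
        self.by_addr.insert(addr, mapping);
    }

    /// Step 8: map a captured original destination back to its endpoint.
    /// Assumption: stale generations and expired answers are not returned.
    fn lookup(&self, original_dst: SocketAddr, current_generation: u64, now: Instant) -> Option<&str> {
        let mapping = self.by_addr.get(&original_dst)?;
        if mapping.generation != current_generation || now >= mapping.expires_at {
            return None;
        }
        Some(mapping.endpoint_id.as_str())
    }
}
```

A lookup that returns `None` would leave the connection to the existing reject/log path, and a successful lookup still feeds normal egress authorization (step 9) rather than replacing it.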
+ +The resolved endpoint store is therefore not a preemptive global DNS snapshot. +It is active state produced by policy-eligible lookups and consumed by +transparent TCP connects. + +## nftables Boundary + +Current main uses nftables, not iptables, for sandbox network bypass +enforcement. The installed `inet` table accepts traffic to the sandbox proxy, +loopback, and established/related flows, then rejects and optionally logs other +TCP/UDP traffic. The bypass monitor reads those log lines and emits OCSF +network and detection events. + +Transparent TCP capture should build on this same nftables substrate: + +- capture rules must run before the generic bypass reject rules; +- capture rules should be scoped to active policy DNS IP/port mappings; +- capture state should be updated atomically with endpoint generation changes; +- reject/log rules remain the fallback for unmatched TCP/UDP egress; +- VM or Podman driver nftables rules are infrastructure NAT/isolation and + should not be treated as the proxy policy enforcement point. + +## Endpoint Selection And OPA + +OPA/Rego should return policy and endpoint metadata through one deterministic +authorization result. It should not let policy name and endpoint config be +selected by different precedence rules. + +Two acceptable approaches: + +- Reject overlapping endpoint metadata at load or merge time. +- Define a single deterministic precedence key and use it for both policy name + and endpoint metadata. + +Endpoint metadata query failures should fail closed when metadata is required +for the selected endpoint. They should not silently downgrade to L4 behavior. + +## Credential Injection Boundary + +Credential injection belongs in the HTTP relay: + +1. Authorization selects the endpoint and confirms credentials may be used. +2. The HTTP relay resolves credentials only when it has an allowed HTTP request. +3. Secrets are redacted from logs and policy-visible metadata. +4. 
The final upstream request or frame is rewritten with real credentials + immediately before write. + +Both L4-only HTTP and HTTP-inspected paths can inject credentials. The +difference is whether REST, GraphQL, or WebSocket policy is evaluated before +the rewrite. + +Credential rewrite slots should be explicit: + +- request target, query values, and headers for HTTP-family traffic; +- REST request bodies only when `request_body_credential_rewrite` is enabled; +- client-to-server WebSocket text frames only when + `websocket_credential_rewrite` is enabled; +- GraphQL-over-WebSocket connection/control messages when they are carried in + text frames and the endpoint enables the WebSocket rewrite path. + +Request-body rewrite is REST-only. It should buffer bounded UTF-8 textual +bodies, including JSON, form-url-encoded, and `text/*`, recompute +`Content-Length`, preserve unsupported bodies that contain no reserved +credential markers, and fail closed when a reserved placeholder cannot be +resolved safely. Binary WebSocket frames are not rewritten. + +## Parser Boundary + +Protocol parsers operate on streams owned by the relay. + +- HTTP parsing converts bytes into request metadata, evaluates request policy, + and loops for keep-alive or pipelined requests. +- WebSocket parsing starts only after an allowed HTTP upgrade. It validates the + handshake/frame stream and owns client-to-server text-frame inspection when + credential rewrite, transport message policy, GraphQL-over-WebSocket policy, + or compression handling is configured. +- TCP application parsers read client and upstream streams as needed and own + their message loop. +- A TCP parser can deny before dialing, dial for a server handshake, or keep + evaluating commands/queries throughout the session. + +This avoids a separate dial strategy enum. The parser knows which protocol +milestone is sufficient to call the validated connector. 
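To make the "parser decides when to dial" idea concrete, here is a toy sketch. The trait, the `ParserStep` enum, and `FirstLineParser` are all hypothetical and deliberately simplistic: the parser buffers client bytes until the first newline and only then asks the relay to call the validated connector. A real parser would also bound its buffer and keep evaluating later messages.

```rust
/// What the parser asks the relay to do after seeing more client bytes.
#[derive(Debug, PartialEq, Eq)]
enum ParserStep {
    /// Not enough bytes yet to reach a protocol milestone.
    NeedMoreBytes,
    /// Deny locally; the upstream connection is never opened.
    Deny(&'static str),
    /// Policy is satisfied; the relay may call the validated connector.
    DialUpstream,
}

/// Hypothetical parser boundary for TCP application protocols.
trait TcpApplicationParser {
    fn on_client_bytes(&mut self, bytes: &[u8]) -> ParserStep;
}

/// Toy parser: allow a session only if its first line starts with an
/// allowed command prefix.
struct FirstLineParser {
    allowed_prefix: &'static [u8],
    buffered: Vec<u8>,
}

impl FirstLineParser {
    fn new(allowed_prefix: &'static [u8]) -> Self {
        Self { allowed_prefix, buffered: Vec::new() }
    }
}

impl TcpApplicationParser for FirstLineParser {
    fn on_client_bytes(&mut self, bytes: &[u8]) -> ParserStep {
        self.buffered.extend_from_slice(bytes);
        // The milestone for this toy protocol is one complete line.
        if !self.buffered.contains(&b'\n') {
            return ParserStep::NeedMoreBytes;
        }
        if self.buffered.starts_with(self.allowed_prefix) {
            ParserStep::DialUpstream
        } else {
            ParserStep::Deny("command not allowed")
        }
    }
}
```

The relay never needs to know which milestone matters for a given protocol; it only reacts to the step the parser returns.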
+ +## Timeout And Resource Ownership + +| Owner | Resource | +|-------|----------| +| Adapter | Client-side parse timeout and adapter-specific deny response | +| Authorization | OPA deadline and policy evaluation telemetry | +| Destination validator | DNS timeout, allowed IP checks, SSRF checks, control-plane port checks | +| TLS terminator | Client TLS handshake timeout and certificate selection | +| HTTP relay | Per-request read/write deadlines, body caps, request-body rewrite caps, upstream reuse | +| WebSocket relay | Upgrade validation, frame limits, text-frame rewrite, compression limits, message policy | +| TCP relay | Byte-copy idle timeout and half-close handling | +| TCP parser | Protocol message timeouts and parser-specific limits | +| Local service adapter | Local route body limits, response caps, gateway call timeout | + +Timeouts should be recorded in telemetry at the owner boundary that can explain +the failure. From f565618e1631db49c2f61f85a26ed4d0a09510d1 Mon Sep 17 00:00:00 2001 From: John Myers Date: Thu, 21 May 2026 16:10:58 -0700 Subject: [PATCH 2/3] docs(rfc): propose sandbox proxy egress adapter model Signed-off-by: John Myers --- .../README.md | 420 ++++++++++++++++++ .../current-shape.md | 167 +++++++ .../implementation-plan.md | 127 ++++++ .../technical-design.md | 259 +++++++++++ 4 files changed, 973 insertions(+) create mode 100644 rfc/0004-sandbox-proxy-egress-adapter/README.md create mode 100644 rfc/0004-sandbox-proxy-egress-adapter/current-shape.md create mode 100644 rfc/0004-sandbox-proxy-egress-adapter/implementation-plan.md create mode 100644 rfc/0004-sandbox-proxy-egress-adapter/technical-design.md diff --git a/rfc/0004-sandbox-proxy-egress-adapter/README.md b/rfc/0004-sandbox-proxy-egress-adapter/README.md new file mode 100644 index 000000000..01001ca15 --- /dev/null +++ b/rfc/0004-sandbox-proxy-egress-adapter/README.md @@ -0,0 +1,420 @@ +--- +authors: + - "@johntmyers" +state: draft +links: + - 
https://github.com/NVIDIA/OpenShell/issues/1107 + - https://github.com/NVIDIA/OpenShell/pull/1083 + - https://github.com/NVIDIA/OpenShell/pull/1151 +--- + +# RFC 0004 - Sandbox Proxy Egress Adapter Model + + + +## Summary + +Refactor sandbox egress around one shared authorization and relay pipeline. +CONNECT, forward HTTP proxy, transparent native TCP, policy DNS, +`inference.local`, and `policy.local` should become adapters that translate +userland entry points into a common egress intent. Policy evaluation, +destination validation, credential injection, request-body rewrite, +WebSocket upgrade handling, protocol parsing, and relay ownership should happen +behind shared boundaries. + +This RFC keeps the main direction in this document. Supporting detail lives in: + +- [Current shape appendix](current-shape.md) +- [Technical design appendix](technical-design.md) +- [Implementation plan](implementation-plan.md) + +## Motivation + +The sandbox proxy has accumulated separate egress paths for CONNECT, forward +HTTP, local services, inference routing, endpoint metadata, credential +injection, and L7 policy. That makes security changes easy to apply to one path +and miss in another. + +The target shape separates three concerns: + +- **Adapters** describe how userland reached the proxy. +- **Authorization** decides whether that egress is allowed and what endpoint + behavior applies. +- **Relays** own bytes, credentials, protocol parsing, and upstream dialing. + +## Non-goals + +- Replace CONNECT with forward proxy as the only explicit proxy mode. +- Add SOCKS support. +- Add HTTP/2 L7 parsing in this refactor. +- Redesign provider credential storage. +- Reintroduce iptables as the sandbox packet filtering backend. +- Use eBPF connect hooks for transparent capture. Native TCP capture needs a + userland proxy in the byte stream for TLS termination and protocol parsing. + +## Proposal + +### Migration Big Rocks + +1. 
**Transport adapters.** CONNECT, forward HTTP, transparent TCP, policy DNS, + and local service routes become small entry adapters. They parse their + surface and produce either an egress intent, a local response, or a DNS + answer. They do not duplicate policy evaluation. +2. **Egress intent and decision.** The shared authorization boundary evaluates + L4 policy once per connection intent and returns one decision containing the + matched policy, matched endpoint, process identity, allowed IP metadata, TLS + behavior, and protocol enforcement. +3. **Relays.** Relays receive an authorized destination connector, not an + already-open upstream socket. HTTP relays evaluate every request before + dialing, own REST request-body credential rewrite, and hand allowed + WebSocket upgrades to the WebSocket relay. TCP application parsers own their + protocol loop and decide when a validated upstream connection is needed. + +### Unified Adapter Flow + +```mermaid +flowchart TD + User["Userland payload / harness"] + + subgraph ExplicitProxy["Explicit proxy listener"] + ProxyBytes["HTTP proxy bytes"] + IsConnect{"CONNECT request?"} + Connect["CONNECT adapter"] + Forward["Forward HTTP adapter"] + ProxyBytes --> IsConnect + IsConnect -- Yes --> Connect + IsConnect -- No --> Forward + end + + subgraph NativeTcp["Policy DNS + native TCP"] + NameLookup["Userland DNS lookup"] + PolicyDns["Policy DNS adapter"] + DnsAnswer["DNS answer"] + NativeConnect["Userland connect(ip:port)"] + TcpAdapter["Transparent TCP adapter"] + NameLookup --> PolicyDns + PolicyDns --> DnsAnswer + DnsAnswer --> NativeConnect + NativeConnect --> TcpAdapter + end + + subgraph LocalApis["Sandbox-local services"] + InferenceReq["Request to inference.local"] + PolicyReq["Request to policy.local"] + InferenceAdapter["Inference local adapter"] + PolicyAdapter["Policy local adapter"] + InferenceReq --> InferenceAdapter + PolicyReq --> PolicyAdapter + end + + subgraph Shared["Shared egress pipeline"] + Intent["Egress 
intent"]
+        Auth["Authorize and select endpoint"]
+        Decision["Egress decision"]
+        Validate["Resolve and validate destination"]
+        Relay["Relay"]
+        Deny["Adapter-specific deny response"]
+        Intent --> Auth
+        Auth --> Allowed{"Allowed?"}
+        Allowed -- No --> Deny
+        Allowed -- Yes --> Decision
+        Decision --> Validate
+        Validate --> Relay
+    end
+
+    User --> ProxyBytes
+    User --> NameLookup
+    User --> NativeConnect
+    User --> InferenceReq
+    User --> PolicyReq
+
+    Connect --> Intent
+    Forward --> Intent
+    TcpAdapter --> Intent
+    InferenceAdapter --> InferenceResp["Local inference response"]
+    PolicyAdapter --> PolicyResp["Local policy response"]
+```
+
+### Relay Flow
+
+```mermaid
+flowchart TD
+    Start["Authorized egress + destination connector"]
+    Start --> HasFirst{"First HTTP request already parsed?"}
+
+    HasFirst -- Yes --> ForwardMode{"Selected enforcement"}
+    ForwardMode -- "L4 only" --> HttpCred["HTTP relay<br/>credential injection only"]
+    ForwardMode -- "HTTP rules" --> HttpL7["HTTP relay<br/>REST/GraphQL/WebSocket policy"]
+    ForwardMode -- "TCP app rules" --> BadForward["Deny: HTTP request for TCP app endpoint"]
+
+    HasFirst -- No --> Inspect["Inspect tunnel or native stream bytes"]
+    Inspect --> SkipTls{"Endpoint says skip TLS handling?"}
+    SkipTls -- Yes --> TcpBytes["TCP relay<br/>byte copy"]
+    SkipTls -- No --> Peek["Peek client bytes"]
+    Peek --> IsTls{"TLS ClientHello?"}
+    IsTls -- Yes --> Tls["Shared TLS terminator"]
+    IsTls -- No --> Readable["Readable client stream"]
+    Tls --> Readable
+
+    Readable --> Mode{"Selected enforcement"}
+    Mode -- "L4 only" --> SniffHttp{"Looks like HTTP?"}
+    SniffHttp -- Yes --> HttpCred
+    SniffHttp -- No --> TcpBytes
+
+    Mode -- "HTTP rules" --> MustHttp{"Looks like HTTP?"}
+    MustHttp -- Yes --> HttpL7
+    MustHttp -- No --> DenyHttp["Deny: expected HTTP"]
+
+    Mode -- "TCP app rules" --> TcpParser["TCP relay<br/>application parser owns loop"]
+
+    HttpCred --> Creds["Resolve and redact credentials"]
+    HttpL7 --> CredsL7["Resolve and redact credentials"]
+    CredsL7 --> ParseHttp["Parse and evaluate each HTTP request"]
+    ParseHttp --> HttpAllowed{"Request allowed?"}
+    HttpAllowed -- No --> HttpDeny["Local HTTP deny<br/>no upstream connect"]
+    HttpAllowed -- Yes --> Rewrite["Rewrite configured credential slots"]
+    Creds --> Rewrite
+    Rewrite --> HttpDial["Connect or reuse upstream"]
+    HttpDial --> HttpResponse["Write request and relay response"]
+    HttpResponse --> Upgrade{"101 WebSocket upgrade?"}
+    Upgrade -- No --> NextHttp["Continue HTTP request loop"]
+    Upgrade -- Yes --> WsMode{"WebSocket inspection needed?"}
+    WsMode -- No --> RawUpgrade["Raw upgraded stream"]
+    WsMode -- Yes --> WsRelay["WebSocket relay<br/>text-frame rewrite / message policy"]
+    NextHttp --> ParseHttp
+
+    TcpParser --> ParserDial["Parser dials upstream when protocol allows"]
+    TcpBytes --> TcpDial["Connect upstream"]
+    TcpDial --> ByteCopy["Copy bytes"]
+```
+
+Relay rules:
+
+- HTTP credential injection happens in both HTTP modes: L4-only HTTP and
+  HTTP-inspected.
+- Credential injection includes request target, query, headers, opt-in REST
+  request bodies, and opt-in client-to-server WebSocket text frames.
+- HTTP L7 policy is evaluated before upstream dial for each request.
+- WebSocket upgrade policy is evaluated as HTTP first. After an allowed `101`
+  upgrade, the WebSocket relay owns frame parsing when text-frame credential
+  rewrite, WebSocket transport policy, GraphQL-over-WebSocket policy, or safe
+  compression handling is configured. Other upgraded streams remain raw.
+- Forward HTTP must stay in the shared HTTP relay loop. It must not evaluate
+  one request and then switch to raw bidirectional copy. Keeping forward HTTP
+  single-request with `Connection: close` is also acceptable, but the invariant
+  is that no follow-on request bytes reach upstream unevaluated.
+- `protocol: tcp` means L4 authorization plus byte copy unless HTTP is detected
+  for credential injection.
+- Future TCP application parsers, such as Redis or Postgres, own the full
+  message loop and can parse multiple commands over one TCP session.
+
+### CONNECT Adapter
+
+CONNECT remains the standard explicit proxy tunnel for HTTPS and arbitrary TCP.
+It parses the CONNECT line into an egress intent, then waits for the shared
+relay to decide if and when an upstream connection should be opened.
+ +```mermaid +flowchart TD + Client["Client sends CONNECT host:port"] --> Parse["Parse target"] + Parse --> Intent["Build egress intent"] + Intent --> Auth["Shared authorization"] + Auth --> Allowed{"Destination allowed?"} + Allowed -- No --> Deny["CONNECT deny response"] + Allowed -- Yes --> Ready["Return tunnel-ready response"] + Ready --> Relay["Relay inspects tunneled bytes"] + Relay --> Dial["Relay or parser connects upstream when allowed"] +``` + +CONNECT should stay because forward proxy is only a plaintext HTTP request +format. CONNECT is still the generic explicit proxy mode for TLS and non-HTTP +TCP clients. + +### Forward HTTP Adapter + +Forward HTTP is compatibility for clients that send absolute-form HTTP requests. +The adapter parses the first request and hands it to the shared HTTP relay or +an equivalent guarded single-request relay. + +```mermaid +flowchart TD + Req["Absolute-form HTTP request"] --> Parse["Parse URI and first request"] + Parse --> Intent["Build egress intent"] + Intent --> Auth["Shared authorization"] + Auth --> Allowed{"Allowed?"} + Allowed -- No --> Deny["HTTP deny response"] + Allowed -- Yes --> Relay["Shared or guarded HTTP relay"] + Relay --> Mode{"Connection mode"} + Mode -- "Persistent" --> Loop["Evaluate every request on this connection"] + Mode -- "Single request" --> Close["Force Connection: close"] +``` + +### Transparent TCP Adapter + +Transparent TCP supports native clients that do not know they are using a +proxy. The capture mechanism should be network namespace interception into a +userland proxy listener. Since main now uses nftables for sandbox bypass +enforcement, transparent capture should be designed as nftables +REDIRECT/TPROXY state in the inner sandbox network namespace, not as an +iptables path. 
+ +```mermaid +flowchart TD + Policy["Policy load / reload"] --> Register["Register native TCP names"] + Lookup["Userland DNS lookup"] --> Dns["Policy DNS adapter"] + Register --> Dns + Dns --> Answer["Return approved IPs"] + Answer --> Capture["Enable capture for active IP:port"] + Connect["Userland connect(ip:port)"] --> Capture + Capture --> Adapter["Transparent TCP adapter"] + Adapter --> Intent["Build egress intent from original destination"] + Intent --> Shared["Shared authorization and relay"] +``` + +### Policy DNS + +Policy DNS replaces static `/etc/hosts` snapshots for native TCP names. It is +query-driven: check whether the name is policy-eligible, resolve through trusted +DNS, filter returned IPs, publish the active endpoint mapping, and answer +userland. + +```mermaid +flowchart TD + Query["DNS query from userland"] --> Adapter["Policy DNS adapter"] + Adapter --> Known{"Registered native TCP policy name?"} + Known -- No --> Refuse["NXDOMAIN / REFUSED / SERVFAIL"] + Known -- Yes --> Upstream["Trusted upstream DNS lookup"] + Upstream --> Filter["Filter answers against endpoint policy"] + Filter --> Publish["Publish active mapping and capture rule"] + Publish --> Answer["DNS answer"] +``` + +The later `connect(ip:port)` still creates the egress intent and runs through +normal authorization. + +### Network Enforcement Substrate + +Current main uses nftables for bypass enforcement. It accepts proxy-bound +traffic and loopback, accepts established flows, then rejects and optionally +logs other TCP/UDP traffic for the bypass monitor. That is enforcement, not +native TCP capture. 
+ +```mermaid +flowchart TD + Conn["Userland packet"] --> ProxyDest{"Proxy destination?"} + ProxyDest -- Yes --> AcceptProxy["nftables accept"] + ProxyDest -- No --> Capture{"Future native TCP capture match?"} + Capture -- Yes --> Redirect["nftables redirect/TPROXY to transparent adapter"] + Capture -- No --> Reject["nftables log + reject bypass"] + Reject --> Monitor["Bypass monitor emits OCSF"] +``` + +The transparent TCP work should extend this nftables model with explicit +capture rules that run before the reject path and are scoped to active policy +DNS mappings. + +### Local Service Adapters + +`inference.local` and `policy.local` are sandbox-local APIs. They should use +the adapter model, but they do not represent normal external egress. + +```mermaid +flowchart TD + A["Request to inference.local"] --> B["Inference local adapter"] + B --> C{"TLS and inference context available?"} + C -- No --> D["Local denial and log"] + C -- Yes --> E["Terminate client TLS"] + E --> F["Parse HTTP request"] + F --> G{"Known inference route?"} + G -- Yes --> H["Route through openshell-router"] + H --> I["Strip caller auth and inject provider auth/model"] + I --> J["Stream response with limits"] + G -- No --> K["403 and close"] +``` + +```mermaid +flowchart TD + A["Request to policy.local"] --> B["Policy local adapter"] + B --> C{"Local route"} + C -- "Current policy" --> D["Policy snapshot response"] + C -- "Recent denials" --> E["Bounded denial summaries"] + C -- "Policy proposal" --> F["Validate and submit proposal"] + D --> G["HTTP response"] + E --> G + F --> G +``` + +### Deployment Modes + +The first implementation can remain embedded in `openshell-sandbox`, but the +proxy should be shaped around explicit runtime contracts. 
+
+| Mode | Shape | Main concern |
+|------|-------|--------------|
+| Embedded | Current sandbox process owns proxy modules | Lowest migration cost |
+| Standalone process | Sandbox supervisor launches a proxy binary | Clear process/API boundary |
+| Sidecar | Proxy runs outside the payload container but inside the sandbox boundary | Reliable process identity across namespaces |
+
+A pluggable proxy must expose the configured userland surfaces, implement the
+gateway APIs it needs, and prove equivalent policy enforcement through tests.
+The nftables rules that force or reject userland traffic belong to the sandbox
+network boundary even if the proxy process later moves into a standalone binary
+or sidecar.
+
+## Implementation plan
+
+The migration plan lives in [implementation-plan.md](implementation-plan.md).
+The intended order is: first add regression coverage, then introduce the shared
+authorization result and destination validation, then preserve the current
+forward HTTP single-request/guarded-relay invariant, then add shared TLS
+handling, TCP parser boundaries, nftables-backed policy DNS capture, local
+service adapters, and finally the runtime boundary cleanup.
+
+## Risks
+
+- Tightening endpoint metadata failures from fail-open to deny may expose
+  latent policy or Rego errors.
+- Deterministic endpoint selection may reject policies that currently load but
+  only work by accident.
+- Transparent TCP capture adds network namespace interception complexity.
+- Transparent TCP capture must coexist with the current nftables bypass
+  reject/log table without creating gaps where direct egress bypasses the proxy.
+- Sidecar mode needs a reliable identity source for binary/path scoped policy.
+- `policy.local` expands the sandbox-local control surface and needs strict
+  route validation, body limits, redaction, and gateway authentication.
+
+## Alternatives
+
+- Keep patching each current proxy path separately. This has the lowest short
+  term cost but keeps the security surface duplicated.
+- Replace CONNECT with forward proxy. This does not work for arbitrary TCP and
+  is not a replacement for HTTPS tunnels.
+- Build only transparent TCP. This helps native clients but does not replace
+  explicit proxy support used by common HTTP tooling.
+
+## Open questions
+
+1. Should overlapping endpoint metadata be rejected at policy load time, or
+   should policy name plus endpoint index define precedence?
+2. Should missing TLS state fail closed for credential-capable or inspected
+   endpoints?
+3. Should direct IP connects to a policy-DNS-resolved TCP endpoint be accepted,
+   or should DNS query correlation be required for stricter modes?
+4. What TTL cap and stale-generation grace period should policy DNS use?
+5. Which process identity source should sidecar mode use when it cannot inspect
+   payload process metadata through local `/proc`?
+6. Which proxy capabilities should be negotiated with the gateway at startup?
+
+## Expected result
+
+Adding a new HTTP-family protocol parser should require parser code, policy
+schema/Rego support, tests, and docs. It should not require new CONNECT and
+forward-proxy branches. REST, GraphQL, WebSocket upgrade policy, request-body
+credential rewrite, and WebSocket text-frame rewrite should all remain behind
+the shared HTTP/WebSocket relay boundary.
+
+Adding a native TCP application parser should require policy DNS/capture
+support, a TCP application parser, policy rules, tests, and docs. Plain
+`protocol: tcp` remains L4 authorization plus byte relay.
diff --git a/rfc/0004-sandbox-proxy-egress-adapter/current-shape.md b/rfc/0004-sandbox-proxy-egress-adapter/current-shape.md new file mode 100644 index 000000000..b428fed14 --- /dev/null +++ b/rfc/0004-sandbox-proxy-egress-adapter/current-shape.md @@ -0,0 +1,167 @@ +# Current Shape Appendix + +This appendix records the current proxy shape and the review findings that +motivate the adapter model. The main RFC intentionally keeps these details out +of the direction document. + +## Current Entry Points + +The sandbox proxy currently handles multiple userland-facing paths in the same +large module: + +- CONNECT proxy traffic for HTTPS and generic TCP tunnels. +- Forward HTTP proxy traffic for absolute-form HTTP requests. +- Local service routes such as `inference.local`. +- Network namespace bypass enforcement through nftables reject/log rules. +- Policy and endpoint metadata lookups through OPA/Rego. +- DNS resolution and endpoint validation for CONNECT and forward HTTP egress. +- Credential injection and redaction for provider-backed HTTP egress. +- Opt-in REST request-body credential rewrite. +- L7 REST, GraphQL, WebSocket, and GraphQL-over-WebSocket enforcement. + +The issue is not that these features exist. The issue is that entry mechanisms, +policy evaluation, endpoint metadata lookup, credential injection, and byte +relay decisions are interleaved. 
+ +## Current CONNECT Shape + +```mermaid +flowchart TD + Client["Client CONNECT host:port"] --> Parse["Parse CONNECT target"] + Parse --> L4["Evaluate network policy"] + L4 --> Allowed{"Allowed?"} + Allowed -- No --> Deny["CONNECT denial"] + Allowed -- Yes --> Meta["Query endpoint metadata"] + Meta --> Config{"L7 or credential config?"} + Config -- No --> Raw["Open upstream and copy bytes"] + Config -- Yes --> Tunnel["Return tunnel-ready response"] + Tunnel --> Inspect["Parse tunneled HTTP when possible"] + Inspect --> L7["Evaluate HTTP policy"] + L7 --> Inject["Inject credentials if configured"] + Inject --> Upstream["Write upstream and relay response"] +``` + +This path has the strongest HTTP relay behavior because it can keep parsing +requests on a long-lived tunnel and enforce L7 rules per request. + +## Current Forward HTTP Shape + +```mermaid +flowchart TD + Client["Absolute-form HTTP request"] --> Parse["Parse first request"] + Parse --> L4["Evaluate network policy"] + L4 --> Allowed{"Allowed?"} + Allowed -- No --> Deny["HTTP denial"] + Allowed -- Yes --> L7{"Matching L7 endpoint?"} + L7 -- Yes --> Eval["Evaluate REST/GraphQL/WebSocket policy"] + Eval --> Rewrite["Rewrite to origin-form + configured credentials"] + L7 -- No --> Rewrite + Rewrite --> Close["Force Connection: close except WebSocket upgrade"] + Close --> Upstream["Open upstream"] + Upstream --> Relay["Guarded HTTP relay / upgrade relay"] +``` + +The latest main branch no longer has the old raw-copy-after-first-request shape +for ordinary forward HTTP. It rewrites ordinary requests with `Connection: +close`, uses guarded HTTP relay helpers for body handling, and sends allowed +WebSocket upgrades through the same upgrade relay. That is a narrower surface +than the historical bidirectional copy, but it is still implemented separately +from the CONNECT relay path. 
+ +## Current Network Namespace Enforcement + +```mermaid +flowchart TD + Start["Process in sandbox network namespace"] --> Dest{"Destination"} + Dest -- "Proxy host_ip:port" --> Proxy["Accept to sandbox proxy"] + Dest -- "Loopback" --> Loopback["Accept loopback"] + Dest -- "Established/related" --> Established["Accept response packet"] + Dest -- "Other TCP/UDP" --> Reject["nftables log + reject"] + Reject --> Monitor["Bypass monitor reads dmesg"] + Monitor --> OCSF["OCSF network + detection events"] +``` + +The sandbox now installs an `inet` nftables filter table for bypass +enforcement. The table accepts proxy-bound traffic, loopback, and established +flows, then rejects and optionally logs other TCP/UDP traffic. It does not +currently redirect native TCP connections into the proxy. + +## Current Local Service Shape + +```mermaid +flowchart TD + Request["Request to local name"] --> Match{"Known local route?"} + Match -- "inference.local" --> Inference["Inference routing logic"] + Match -- "policy.local" --> Policy["Policy local logic"] + Match -- No --> External["Normal egress path"] + Inference --> LocalResponse["Local response"] + Policy --> LocalResponse +``` + +Local routes are userland-facing proxy surfaces. They should stay distinct from +external egress while still fitting the adapter model. + +## Findings To Preserve + +### Invariant: forward proxy must not relay unevaluated follow-on HTTP bytes + +The historical forward path evaluated at most the first absolute-form request, +rewrote it, then switched to bidirectional copy. Bytes already buffered after +the first header block, or later pipelined requests on the same client/upstream +connection, could reach upstream without the CONNECT L7 relay's per-request +parser/evaluator. + +Latest main mitigates this by forcing ordinary forward HTTP to one request per +connection and by using guarded relay helpers. 
The adapter model should +preserve the invariant either by keeping forward HTTP single-request/close or +by passing the first parsed request into a shared HTTP relay loop. + +### Endpoint config is not tied to deterministic matched policy + +The policy name used for L4 authorization and logging can be selected through a +different precedence rule than endpoint metadata. With overlapping host, port, +and binary rules, allowed IPs, TLS behavior, enforcement, and +`allow_encoded_slash` can come from a different endpoint than the policy name +logged and used for L4 allow. + +The adapter model requires authorization to return one decision with one +deterministic matched endpoint. + +### Endpoint metadata query failures fail open to L4 behavior + +If endpoint metadata lookup fails, callers can interpret the result as no L7 +configuration and downgrade to credential-only or raw L4 relay. + +The adapter model treats endpoint metadata as part of the authorization result. +Failure to materialize required metadata should deny rather than erase extended +configuration. + +### Control-plane port block only applies on one resolution path + +Blocked control-plane ports are enforced inside one allowed-IPs validation +path, while the normal host-based path uses a different validation route. + +The adapter model moves resolution, allowed IP checks, SSRF checks, and +control-plane port blocks into shared destination validation. + +## Existing Feature Inventory + +The refactor should preserve: + +- CONNECT explicit proxy support. +- Forward HTTP explicit proxy support. +- nftables bypass reject/log enforcement. +- Provider credential injection and redaction. +- REST request-body credential rewrite. +- WebSocket text-frame credential rewrite. +- REST endpoint method/path policy. +- GraphQL L7 policy. +- WebSocket transport and GraphQL-over-WebSocket policy. +- Inference routing through `inference.local`. +- Agent-facing policy routes through `policy.local`. 
+- Timeout and resource tracking for client, upstream, and local service work. +- Structured OCSF logging for network and HTTP policy outcomes. +- SSRF and internal address protections. +- Control-plane port protection. +- `allowed_ips` endpoint restrictions. +- TLS termination for inspectable client connections. diff --git a/rfc/0004-sandbox-proxy-egress-adapter/implementation-plan.md b/rfc/0004-sandbox-proxy-egress-adapter/implementation-plan.md new file mode 100644 index 000000000..94ba53b7f --- /dev/null +++ b/rfc/0004-sandbox-proxy-egress-adapter/implementation-plan.md @@ -0,0 +1,127 @@ +# Implementation Plan + +This plan is intentionally separate from the main RFC so the proposal can stay +direction-focused. + +## Phase 0 - Regression Tests + +- Add tests for forward HTTP pipelining and keep-alive follow-on requests, + including the current `Connection: close` mitigation. +- Add tests for overlapping endpoint metadata selection. +- Add tests for endpoint metadata query failures. +- Add tests for control-plane port blocking through all destination validation + paths. +- Add nftables bypass enforcement tests that verify proxy-bound traffic is + accepted while direct TCP/UDP egress is rejected and logged when available. + +## Phase 1 - Authorization Result + +- Introduce `EgressIntent` and `EgressDecision`. +- Make authorization return matched policy and matched endpoint metadata + together. +- Fail closed when required endpoint metadata cannot be materialized. +- Emit consistent OCSF network denial events from the shared boundary. + +## Phase 2 - Shared Destination Validation + +- Move DNS resolution, allowed IP filtering, SSRF checks, and control-plane port + checks into one destination validation path. +- Return an `UpstreamConnector` rather than an opened upstream socket. +- Add tests proving CONNECT, forward HTTP, and transparent TCP use the same + validation behavior. 
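
The Phase 2 validation path could be sketched roughly as follows. This is a minimal illustration, not the sandbox implementation: `validate_destination`, `ValidationError`, `is_internal`, and the `CONTROL_PLANE_PORTS` values are hypothetical names and numbers chosen for the example, and it assumes DNS resolution has already produced a candidate list. It also assumes that an explicit `allowed_ips` list takes the place of the default internal-address filter, which the RFC does not specify.

```rust
use std::net::{IpAddr, SocketAddr};

/// Hypothetical control-plane ports, for illustration only.
const CONTROL_PLANE_PORTS: &[u16] = &[6443, 10250];

#[derive(Debug, PartialEq)]
enum ValidationError {
    ControlPlanePort(u16),
    NoPermittedAddress,
}

/// Validated destination that a relay can dial later; no socket is opened here.
#[derive(Debug, PartialEq)]
struct UpstreamConnector {
    addrs: Vec<SocketAddr>,
}

/// Simplified SSRF check: loopback, private, link-local, and unspecified ranges.
fn is_internal(ip: &IpAddr) -> bool {
    match ip {
        IpAddr::V4(v4) => {
            v4.is_loopback() || v4.is_private() || v4.is_link_local() || v4.is_unspecified()
        }
        IpAddr::V6(v6) => v6.is_loopback() || v6.is_unspecified(),
    }
}

/// One validation path shared by every adapter: the port block runs first,
/// then each resolved address is filtered.
fn validate_destination(
    resolved: &[IpAddr],
    port: u16,
    allowed_ips: Option<&[IpAddr]>,
) -> Result<UpstreamConnector, ValidationError> {
    // The control-plane block applies whether or not allowed_ips is set.
    if CONTROL_PLANE_PORTS.contains(&port) {
        return Err(ValidationError::ControlPlanePort(port));
    }
    let addrs: Vec<SocketAddr> = resolved
        .iter()
        .filter(|ip| match allowed_ips {
            // Assumption: an explicit allowlist replaces the default filter.
            Some(list) => list.contains(*ip),
            None => !is_internal(*ip),
        })
        .map(|ip| SocketAddr::new(*ip, port))
        .collect();
    if addrs.is_empty() {
        return Err(ValidationError::NoPermittedAddress);
    }
    Ok(UpstreamConnector { addrs })
}
```

Because the function returns a connector rather than a socket, a denied HTTP request in a later phase never causes an upstream TCP session to be created.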
+ +## Phase 3 - Forward HTTP Adapter + +- Convert forward HTTP into an adapter that parses the first absolute-form + request and builds an egress intent. +- Route the parsed first request into the shared HTTP relay or preserve the + current guarded single-request relay behavior. +- Keep the no-raw-copy invariant after the first request. + +## Phase 4 - HTTP And WebSocket Relay Consolidation + +- Centralize HTTP request parsing, REST policy, GraphQL policy, WebSocket + upgrade policy, credential resolution, redaction, request rewrite, upstream + dial, and response relay. +- Evaluate every HTTP request before upstream write. +- Ensure denied HTTP requests do not create upstream TCP sessions. +- Preserve opt-in REST request-body credential rewrite behind the shared HTTP + relay, including bounded buffering, supported content-type handling, + `Content-Length` recomputation, and fail-closed unresolved placeholders. +- Preserve WebSocket upgrade handling behind the shared relay, including + opt-in client-to-server text-frame credential rewrite, WebSocket transport + message policy, GraphQL-over-WebSocket policy, and raw passthrough for other + upgraded protocols. + +## Phase 5 - Shared TLS Termination + +- Move client-side TLS detection and termination before the HTTP/TCP relay + split. +- Keep endpoint TLS behavior on `EgressDecision`. +- Remove duplicate HTTP-specific and TCP-specific TLS termination decisions. + +## Phase 6 - TCP Relay And Parser Boundary + +- Rename raw TCP relay concepts to `TcpRelay`. +- Add a TCP application parser dispatch point for future protocol enforcement. +- Keep `protocol: tcp` as L4 authorization plus byte copy. +- Let TCP application parsers own their message loop and call the connector + when protocol state allows. + +## Phase 7 - Policy DNS And Transparent TCP + +- Add policy DNS registration for native TCP endpoint names. +- Replace static host-file mapping with query-driven DNS answers. 
+- Publish active DNS answer state and capture rules. +- Implement nftables REDIRECT/TPROXY capture rules ahead of the bypass reject + path; do not add a parallel iptables path. +- Implement transparent TCP adapter lookup from captured original destination + to active endpoint generation. +- Decide TTL and stale-generation behavior. + +## Phase 8 - Local Service Adapters + +- Model `inference.local` as a local adapter with TLS termination, route + validation, provider auth injection, streaming limits, and OCSF logging. +- Model `policy.local` as a local adapter for current policy, bounded denial + summaries, and policy proposals. +- Keep both paths outside normal external egress relay. + +## Phase 9 - Runtime Boundary + +- Keep embedded mode for the first migration. +- Define the proxy runtime API needed for a future standalone binary: + configured listeners, policy updates, gateway calls, telemetry, and shutdown. +- Identify process identity requirements for standalone and sidecar modes. + +## Phase 10 - Cleanup + +- Remove duplicated endpoint metadata queries from relay paths. +- Remove duplicated deny rendering where adapters can own response shape. +- Remove any remaining forward HTTP raw-copy fallback. +- Update architecture docs once implementation lands. + +## Testing Plan + +- Unit-test each adapter's intent construction and deny response shape. +- Unit-test authorization precedence for overlapping policy and endpoint rules. +- Integration-test shared destination validation across CONNECT, forward HTTP, + and transparent TCP. +- Integration-test HTTP keep-alive and pipelined requests with REST, GraphQL, + and WebSocket upgrade enforcement. +- Integration-test credential injection in L4-only HTTP and HTTP-inspected + paths. +- Integration-test REST request-body credential rewrite for JSON, + form-url-encoded, `text/*`, unsupported content types, chunked framing, body + caps, and unresolved placeholders. 
+- Integration-test WebSocket text-frame credential rewrite, raw upgraded + passthrough, WebSocket message policy, GraphQL-over-WebSocket policy, and + safe compression negotiation. +- Integration-test TLS termination before HTTP/TCP relay split. +- Integration-test `protocol: tcp` byte-copy behavior. +- Add parser harness tests before adding Redis, Postgres, or similar TCP + application parsers. +- Integration-test policy DNS TTL, stale generation handling, and captured + connect correlation. +- Integration-test `inference.local` and `policy.local` body limits, timeout + behavior, redaction, and local denial responses. diff --git a/rfc/0004-sandbox-proxy-egress-adapter/technical-design.md b/rfc/0004-sandbox-proxy-egress-adapter/technical-design.md new file mode 100644 index 000000000..b13e259f4 --- /dev/null +++ b/rfc/0004-sandbox-proxy-egress-adapter/technical-design.md @@ -0,0 +1,259 @@ +# Technical Design Appendix + +This appendix carries the implementation-level design details behind the main +RFC. + +## Shared Data Boundaries + +### EgressIntent + +`EgressIntent` is the normalized description of what userland is trying to do. + +It should carry: + +- entry transport: CONNECT, forward HTTP, transparent TCP, or local HTTP; +- requested destination host/port or captured original IP/port; +- process identity inputs collected by the adapter/runtime; +- optional first HTTP request for forward proxy traffic; +- optional local service route. + +Adapters build intents. They should not query endpoint metadata or select +relays. + +### EgressDecision + +`EgressDecision` is the policy result consumed by validation and relay code. + +It should carry: + +- allow or deny; +- deterministic matched policy identifier; +- deterministic matched endpoint identifier and endpoint metadata; +- process identity used for evaluation; +- destination and allowed IP constraints; +- TLS behavior; +- protocol enforcement; +- logging context and denial reason. 
+ +Relay code should read this decision. It should not query OPA again for +endpoint metadata, TLS mode, allowed IPs, or parser selection. + +## Protocol Enforcement + +Use a protocol enforcement value derived from endpoint policy: + +| Policy protocol | Enforcement | Relay behavior | +|-----------------|-------------|----------------| +| omitted / `tcp` | None | L4 authorization plus byte relay, with optional HTTP sniff for credential injection | +| `rest` | HTTP | HTTP request parser with REST rules, plus opt-in request-body and WebSocket text-frame credential rewrite | +| `graphql` | HTTP | HTTP request parser with GraphQL rules | +| `websocket` | HTTP | HTTP upgrade policy followed by WebSocket frame policy or GraphQL-over-WebSocket policy | +| future `redis`, `postgres`, `mysql`, ... | TCP application | Protocol-specific TCP parser owns the message loop | + +`protocol: tcp` is effectively the default L4 mode. It should not run TCP +application parsers. + +Avoid using the term "provider" for these parser concepts because providers +are already a first-class credential and routing domain in OpenShell. 
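
As a rough sketch of the table above, the mapping from policy protocol to enforcement could be a single total function. The function name `enforcement_for` is hypothetical, the enums are simplified stand-ins that omit per-protocol configuration, and rejecting unknown protocol names (rather than defaulting them to L4) is an assumption of this example.

```rust
#[derive(Debug, PartialEq)]
enum HttpL7Protocol {
    Rest,
    Graphql,
    Websocket,
}

/// Simplified stand-in: real variants would carry configuration.
#[derive(Debug, PartialEq)]
enum ProtocolEnforcement {
    None,
    Http(HttpL7Protocol),
    TcpApplication(String),
}

/// Derive enforcement from the endpoint's `protocol` field.
fn enforcement_for(protocol: Option<&str>) -> Result<ProtocolEnforcement, String> {
    match protocol {
        // Omitted and `tcp` are both plain L4 authorization plus byte relay.
        None | Some("tcp") => Ok(ProtocolEnforcement::None),
        Some("rest") => Ok(ProtocolEnforcement::Http(HttpL7Protocol::Rest)),
        Some("graphql") => Ok(ProtocolEnforcement::Http(HttpL7Protocol::Graphql)),
        Some("websocket") => Ok(ProtocolEnforcement::Http(HttpL7Protocol::Websocket)),
        // Future TCP application parsers from the table's last row.
        Some(name @ ("redis" | "postgres" | "mysql")) => {
            Ok(ProtocolEnforcement::TcpApplication(name.to_string()))
        }
        // Assumption: unknown names are a policy error, not a silent L4 fallback.
        Some(other) => Err(format!("unknown protocol: {other}")),
    }
}
```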
+ +## Suggested Types + +The exact Rust shape can evolve, but the boundaries should look like this: + +```rust +enum EgressTransport { + Connect, + ForwardHttp, + TransparentTcp, + LocalHttp, +} + +struct EgressIntent { + transport: EgressTransport, + destination: RequestedDestination, + process: ProcessIdentity, + first_request: Option, + local_route: Option, +} + +struct EgressDecision { + outcome: PolicyOutcome, + matched_policy: Option, + endpoint: Option, + log_context: EgressLogContext, +} + +struct MatchedEndpoint { + id: EndpointId, + allowed_ips: AllowedIpPolicy, + tls: TlsPolicy, + enforcement: ProtocolEnforcement, +} + +enum ProtocolEnforcement { + None, + Http(HttpL7Config), + TcpApplication(TcpApplicationConfig), +} + +enum HttpL7Protocol { + Rest, + Graphql, + Websocket, +} + +struct HttpL7Config { + protocol: HttpL7Protocol, + allow_encoded_slash: bool, + websocket_credential_rewrite: bool, + request_body_credential_rewrite: bool, + websocket_graphql_policy: bool, +} + +struct RelayContext { + decision: EgressDecision, + connector: UpstreamConnector, + deadlines: RelayDeadlines, + telemetry: RelayTelemetry, +} +``` + +`UpstreamConnector` is the relay-owned dial boundary. It encapsulates the +validated destination and lets relays/parsers open an upstream connection only +after protocol policy allows it. 
+ +## Module Layout + +A future split could look like: + +| Module | Responsibility | +|--------|----------------| +| `proxy::adapter::connect` | Parse CONNECT and render CONNECT responses | +| `proxy::adapter::forward_http` | Parse absolute-form HTTP and preserve first request | +| `proxy::adapter::transparent_tcp` | Recover captured original destination | +| `proxy::adapter::policy_dns` | Answer eligible DNS queries and publish active mappings | +| `proxy::adapter::local` | Implement `inference.local` and `policy.local` surfaces | +| `proxy::auth` | Build decisions from intents and OPA results | +| `proxy::destination` | Resolve, filter, and validate destinations | +| `proxy::netfilter` | Own nftables bypass and future transparent capture rules | +| `proxy::relay::http` | HTTP request loop, credentials, REST/GraphQL/WebSocket upgrade policy | +| `proxy::relay::websocket` | WebSocket frame validation, text-frame rewrite, and message policy | +| `proxy::relay::tcp` | TCP byte relay and TCP application parser dispatch | +| `proxy::relay::tls` | Shared client-side TLS termination | +| `proxy::parser` | HTTP, WebSocket, and TCP application parser traits/config | +| `proxy::telemetry` | OCSF and tracing helpers | + +## Policy DNS And Resolved TCP State + +Policy DNS should be query-driven rather than a static `/etc/hosts` snapshot. + +1. Policy load registers eligible native TCP endpoint names. +2. Userland performs DNS lookup. +3. Policy DNS checks whether the name is registered for native TCP. +4. Policy DNS resolves through trusted upstream DNS. +5. Answers are filtered against endpoint metadata and SSRF controls. +6. The adapter publishes the DNS answer, endpoint generation, and capture rule. +7. Userland later calls `connect(ip:port)`. +8. Transparent TCP recovers the original destination and maps it to the active + endpoint generation. +9. Normal egress authorization and relay selection run. 
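
Steps 6 and 8 above could be backed by a small store like the one sketched below. The names `ResolvedEndpointStore`, `ActiveMapping`, `publish`, `bump_generation`, and `lookup` are hypothetical. TTL and stale-generation behavior are still undecided in the implementation plan, so this sketch assumes the strictest option: a mapping is usable only within its TTL and only for the current generation, with no grace period.

```rust
use std::collections::HashMap;
use std::net::SocketAddr;
use std::time::{Duration, Instant};

#[derive(Debug, Clone, PartialEq)]
struct ActiveMapping {
    endpoint: String,
    generation: u64,
    expires_at: Instant,
}

#[derive(Default)]
struct ResolvedEndpointStore {
    current_generation: u64,
    by_addr: HashMap<SocketAddr, ActiveMapping>,
}

impl ResolvedEndpointStore {
    /// Step 6: record a filtered DNS answer under the current generation.
    fn publish(&mut self, endpoint: &str, addr: SocketAddr, ttl: Duration, now: Instant) {
        let mapping = ActiveMapping {
            endpoint: endpoint.to_string(),
            generation: self.current_generation,
            expires_at: now + ttl,
        };
        self.by_addr.insert(addr, mapping);
    }

    /// Policy reload: mappings published earlier become stale.
    fn bump_generation(&mut self) {
        self.current_generation += 1;
    }

    /// Step 8: map a captured original destination to a live mapping.
    /// Assumption: no grace period for expired or stale entries.
    fn lookup(&self, original_dst: SocketAddr, now: Instant) -> Option<&ActiveMapping> {
        self.by_addr
            .get(&original_dst)
            .filter(|m| m.generation == self.current_generation && now < m.expires_at)
    }
}
```

A lookup miss here would leave the connection to the nftables reject path or to a deny from the transparent TCP adapter; a hit still goes through normal egress authorization (step 9).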
+ +The resolved endpoint store is therefore not a preemptive global DNS snapshot. +It is active state produced by policy-eligible lookups and consumed by +transparent TCP connects. + +## nftables Boundary + +Current main uses nftables, not iptables, for sandbox network bypass +enforcement. The installed `inet` table accepts traffic to the sandbox proxy, +loopback, and established/related flows, then rejects and optionally logs other +TCP/UDP traffic. The bypass monitor reads those log lines and emits OCSF +network and detection events. + +Transparent TCP capture should build on this same nftables substrate: + +- capture rules must run before the generic bypass reject rules; +- capture rules should be scoped to active policy DNS IP/port mappings; +- capture state should be updated atomically with endpoint generation changes; +- reject/log rules remain the fallback for unmatched TCP/UDP egress; +- VM or Podman driver nftables rules are infrastructure NAT/isolation and + should not be treated as the proxy policy enforcement point. + +## Endpoint Selection And OPA + +OPA/Rego should return policy and endpoint metadata through one deterministic +authorization result. It should not let policy name and endpoint config be +selected by different precedence rules. + +Two acceptable approaches: + +- Reject overlapping endpoint metadata at load or merge time. +- Define a single deterministic precedence key and use it for both policy name + and endpoint metadata. + +Endpoint metadata query failures should fail closed when metadata is required +for the selected endpoint. They should not silently downgrade to L4 behavior. + +## Credential Injection Boundary + +Credential injection belongs in the HTTP relay: + +1. Authorization selects the endpoint and confirms credentials may be used. +2. The HTTP relay resolves credentials only when it has an allowed HTTP request. +3. Secrets are redacted from logs and policy-visible metadata. +4. 
The final upstream request or frame is rewritten with real credentials + immediately before write. + +Both L4-only HTTP and HTTP-inspected paths can inject credentials. The +difference is whether REST, GraphQL, or WebSocket policy is evaluated before +the rewrite. + +Credential rewrite slots should be explicit: + +- request target, query values, and headers for HTTP-family traffic; +- REST request bodies only when `request_body_credential_rewrite` is enabled; +- client-to-server WebSocket text frames only when + `websocket_credential_rewrite` is enabled; +- GraphQL-over-WebSocket connection/control messages when they are carried in + text frames and the endpoint enables the WebSocket rewrite path. + +Request-body rewrite is REST-only. It should buffer bounded UTF-8 textual +bodies, including JSON, form-url-encoded, and `text/*`, recompute +`Content-Length`, preserve unsupported bodies that contain no reserved +credential markers, and fail closed when a reserved placeholder cannot be +resolved safely. Binary WebSocket frames are not rewritten. + +## Parser Boundary + +Protocol parsers operate on streams owned by the relay. + +- HTTP parsing converts bytes into request metadata, evaluates request policy, + and loops for keep-alive or pipelined requests. +- WebSocket parsing starts only after an allowed HTTP upgrade. It validates the + handshake/frame stream and owns client-to-server text-frame inspection when + credential rewrite, transport message policy, GraphQL-over-WebSocket policy, + or compression handling is configured. +- TCP application parsers read client and upstream streams as needed and own + their message loop. +- A TCP parser can deny before dialing, dial for a server handshake, or keep + evaluating commands/queries throughout the session. + +This avoids a separate dial strategy enum. The parser knows which protocol +milestone is sufficient to call the validated connector. 
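
The idea that a parser owns its loop and decides when to dial can be illustrated with a toy example. Everything here is hypothetical: the `Connector` and `TcpApplicationParser` traits, the `ParserStep` enum, and the newline-delimited command protocol are stand-ins chosen for brevity, not a proposed API or a real wire format.

```rust
/// Hypothetical dial boundary handed to parsers; `connect` opens upstream.
trait Connector {
    fn connect(&mut self) -> Result<(), String>;
}

#[derive(Debug, PartialEq)]
enum ParserStep {
    NeedMore,
    Forwarded,
    Denied(String),
}

trait TcpApplicationParser {
    fn on_client_bytes(&mut self, bytes: &[u8], connector: &mut dyn Connector) -> ParserStep;
}

/// Toy protocol: one command per newline-terminated line.
struct LineCommandParser {
    denied_commands: Vec<String>,
    buffer: Vec<u8>,
    dialed: bool,
}

impl TcpApplicationParser for LineCommandParser {
    /// Handles at most one buffered command per call, for brevity.
    fn on_client_bytes(&mut self, bytes: &[u8], connector: &mut dyn Connector) -> ParserStep {
        self.buffer.extend_from_slice(bytes);
        let Some(pos) = self.buffer.iter().position(|b| *b == b'\n') else {
            return ParserStep::NeedMore;
        };
        let line: Vec<u8> = self.buffer.drain(..=pos).collect();
        let command = String::from_utf8_lossy(&line)
            .split_whitespace()
            .next()
            .unwrap_or("")
            .to_ascii_uppercase();
        // Deny before dialing: a rejected command never opens an upstream session.
        if self.denied_commands.contains(&command) {
            return ParserStep::Denied(command);
        }
        // The first allowed command is the milestone that triggers the dial.
        if !self.dialed {
            if let Err(e) = connector.connect() {
                return ParserStep::Denied(e);
            }
            self.dialed = true;
        }
        ParserStep::Forwarded
    }
}

/// Test double that counts dial attempts instead of opening sockets.
struct CountingConnector {
    dials: usize,
}

impl Connector for CountingConnector {
    fn connect(&mut self) -> Result<(), String> {
        self.dials += 1;
        Ok(())
    }
}
```

A protocol that needs a server greeting before any client command would call `connect` earlier, at whatever milestone its handshake requires.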
+ +## Timeout And Resource Ownership + +| Owner | Resource | +|-------|----------| +| Adapter | Client-side parse timeout and adapter-specific deny response | +| Authorization | OPA deadline and policy evaluation telemetry | +| Destination validator | DNS timeout, allowed IP checks, SSRF checks, control-plane port checks | +| TLS terminator | Client TLS handshake timeout and certificate selection | +| HTTP relay | Per-request read/write deadlines, body caps, request-body rewrite caps, upstream reuse | +| WebSocket relay | Upgrade validation, frame limits, text-frame rewrite, compression limits, message policy | +| TCP relay | Byte-copy idle timeout and half-close handling | +| TCP parser | Protocol message timeouts and parser-specific limits | +| Local service adapter | Local route body limits, response caps, gateway call timeout | + +Timeouts should be recorded in telemetry at the owner boundary that can explain +the failure. From 03d25482b099cccd75644ec971fc760bde999bdb Mon Sep 17 00:00:00 2001 From: John Myers Date: Fri, 26 Jun 2026 09:46:49 -0700 Subject: [PATCH 3/3] docs(rfc): update sandbox proxy adapter proposal Signed-off-by: John Myers --- .../README.md | 420 ------------------ .../current-shape.md | 167 ------- .../README.md | 404 +++++++++++++++++ .../current-shape.md | 223 ++++++++++ .../implementation-plan.md | 73 ++- .../technical-design.md | 157 +++++-- 6 files changed, 802 insertions(+), 642 deletions(-) delete mode 100644 rfc/0004-sandbox-proxy-egress-adapter/README.md delete mode 100644 rfc/0004-sandbox-proxy-egress-adapter/current-shape.md create mode 100644 rfc/0005-sandbox-proxy-egress-adapter/README.md create mode 100644 rfc/0005-sandbox-proxy-egress-adapter/current-shape.md rename rfc/{0004-sandbox-proxy-egress-adapter => 0005-sandbox-proxy-egress-adapter}/implementation-plan.md (58%) rename rfc/{0004-sandbox-proxy-egress-adapter => 0005-sandbox-proxy-egress-adapter}/technical-design.md (57%) diff --git 
a/rfc/0004-sandbox-proxy-egress-adapter/README.md b/rfc/0004-sandbox-proxy-egress-adapter/README.md deleted file mode 100644 index 01001ca15..000000000 --- a/rfc/0004-sandbox-proxy-egress-adapter/README.md +++ /dev/null @@ -1,420 +0,0 @@ ---- -authors: - - "@johntmyers" -state: draft -links: - - https://github.com/NVIDIA/OpenShell/issues/1107 - - https://github.com/NVIDIA/OpenShell/pull/1083 - - https://github.com/NVIDIA/OpenShell/pull/1151 ---- - -# RFC 0004 - Sandbox Proxy Egress Adapter Model - - - -## Summary - -Refactor sandbox egress around one shared authorization and relay pipeline. -CONNECT, forward HTTP proxy, transparent native TCP, policy DNS, -`inference.local`, and `policy.local` should become adapters that translate -userland entry points into a common egress intent. Policy evaluation, -destination validation, credential injection, request-body rewrite, -WebSocket upgrade handling, protocol parsing, and relay ownership should happen -behind shared boundaries. - -This RFC keeps the main direction in this document. Supporting detail lives in: - -- [Current shape appendix](current-shape.md) -- [Technical design appendix](technical-design.md) -- [Implementation plan](implementation-plan.md) - -## Motivation - -The sandbox proxy has accumulated separate egress paths for CONNECT, forward -HTTP, local services, inference routing, endpoint metadata, credential -injection, and L7 policy. That makes security changes easy to apply to one path -and miss in another. - -The target shape separates three concerns: - -- **Adapters** describe how userland reached the proxy. -- **Authorization** decides whether that egress is allowed and what endpoint - behavior applies. -- **Relays** own bytes, credentials, protocol parsing, and upstream dialing. - -## Non-goals - -- Replace CONNECT with forward proxy as the only explicit proxy mode. -- Add SOCKS support. -- Add HTTP/2 L7 parsing in this refactor. -- Redesign provider credential storage. 
-- Reintroduce iptables as the sandbox packet filtering backend. -- Use eBPF connect hooks for transparent capture. Native TCP capture needs a - userland proxy in the byte stream for TLS termination and protocol parsing. - -## Proposal - -### Migration Big Rocks - -1. **Transport adapters.** CONNECT, forward HTTP, transparent TCP, policy DNS, - and local service routes become small entry adapters. They parse their - surface and produce either an egress intent, a local response, or a DNS - answer. They do not duplicate policy evaluation. -2. **Egress intent and decision.** The shared authorization boundary evaluates - L4 policy once per connection intent and returns one decision containing the - matched policy, matched endpoint, process identity, allowed IP metadata, TLS - behavior, and protocol enforcement. -3. **Relays.** Relays receive an authorized destination connector, not an - already-open upstream socket. HTTP relays evaluate every request before - dialing, own REST request-body credential rewrite, and hand allowed - WebSocket upgrades to the WebSocket relay. TCP application parsers own their - protocol loop and decide when a validated upstream connection is needed. 
- -### Unified Adapter Flow - -```mermaid -flowchart TD - User["Userland payload / harness"] - - subgraph ExplicitProxy["Explicit proxy listener"] - ProxyBytes["HTTP proxy bytes"] - IsConnect{"CONNECT request?"} - Connect["CONNECT adapter"] - Forward["Forward HTTP adapter"] - ProxyBytes --> IsConnect - IsConnect -- Yes --> Connect - IsConnect -- No --> Forward - end - - subgraph NativeTcp["Policy DNS + native TCP"] - NameLookup["Userland DNS lookup"] - PolicyDns["Policy DNS adapter"] - DnsAnswer["DNS answer"] - NativeConnect["Userland connect(ip:port)"] - TcpAdapter["Transparent TCP adapter"] - NameLookup --> PolicyDns - PolicyDns --> DnsAnswer - DnsAnswer --> NativeConnect - NativeConnect --> TcpAdapter - end - - subgraph LocalApis["Sandbox-local services"] - InferenceReq["Request to inference.local"] - PolicyReq["Request to policy.local"] - InferenceAdapter["Inference local adapter"] - PolicyAdapter["Policy local adapter"] - InferenceReq --> InferenceAdapter - PolicyReq --> PolicyAdapter - end - - subgraph Shared["Shared egress pipeline"] - Intent["Egress intent"] - Auth["Authorize and select endpoint"] - Decision["Egress decision"] - Validate["Resolve and validate destination"] - Relay["Relay"] - Deny["Adapter-specific deny response"] - Intent --> Auth - Auth --> Allowed{"Allowed?"} - Allowed -- No --> Deny - Allowed -- Yes --> Decision - Decision --> Validate - Validate --> Relay - end - - User --> ProxyBytes - User --> NameLookup - User --> NativeConnect - User --> InferenceReq - User --> PolicyReq - - Connect --> Intent - Forward --> Intent - TcpAdapter --> Intent - InferenceAdapter --> InferenceResp["Local inference response"] - PolicyAdapter --> PolicyResp["Local policy response"] -``` - -### Relay Flow - -```mermaid -flowchart TD - Start["Authorized egress + destination connector"] - Start --> HasFirst{"First HTTP request already parsed?"} - - HasFirst -- Yes --> ForwardMode{"Selected enforcement"} - ForwardMode -- "L4 only" --> HttpCred["HTTP relay<br/>credential injection only"] - ForwardMode -- "HTTP rules" --> HttpL7["HTTP relay<br/>REST/GraphQL/WebSocket policy"] - ForwardMode -- "TCP app rules" --> BadForward["Deny: HTTP request for TCP app endpoint"] - - HasFirst -- No --> Inspect["Inspect tunnel or native stream bytes"] - Inspect --> SkipTls{"Endpoint says skip TLS handling?"} - SkipTls -- Yes --> TcpBytes["TCP relay<br/>byte copy"] - SkipTls -- No --> Peek["Peek client bytes"] - Peek --> IsTls{"TLS ClientHello?"} - IsTls -- Yes --> Tls["Shared TLS terminator"] - IsTls -- No --> Readable["Readable client stream"] - Tls --> Readable - - Readable --> Mode{"Selected enforcement"} - Mode -- "L4 only" --> SniffHttp{"Looks like HTTP?"} - SniffHttp -- Yes --> HttpCred - SniffHttp -- No --> TcpBytes - - Mode -- "HTTP rules" --> MustHttp{"Looks like HTTP?"} - MustHttp -- Yes --> HttpL7 - MustHttp -- No --> DenyHttp["Deny: expected HTTP"] - - Mode -- "TCP app rules" --> TcpParser["TCP relay<br/>application parser owns loop"] - - HttpCred --> Creds["Resolve and redact credentials"] - HttpL7 --> CredsL7["Resolve and redact credentials"] - CredsL7 --> ParseHttp["Parse and evaluate each HTTP request"] - ParseHttp --> HttpAllowed{"Request allowed?"} - HttpAllowed -- No --> HttpDeny["Local HTTP deny<br/>no upstream connect"] - HttpAllowed -- Yes --> Rewrite["Rewrite configured credential slots"] - Creds --> Rewrite - Rewrite --> HttpDial["Connect or reuse upstream"] - HttpDial --> HttpResponse["Write request and relay response"] - HttpResponse --> Upgrade{"101 WebSocket upgrade?"} - Upgrade -- No --> NextHttp["Continue HTTP request loop"] - Upgrade -- Yes --> WsMode{"WebSocket inspection needed?"} - WsMode -- No --> RawUpgrade["Raw upgraded stream"] - WsMode -- Yes --> WsRelay["WebSocket relay<br/>text-frame rewrite / message policy"] - NextHttp --> ParseHttp - - TcpParser --> ParserDial["Parser dials upstream when protocol allows"] - TcpBytes --> TcpDial["Connect upstream"] - TcpDial --> ByteCopy["Copy bytes"] -``` - -Relay rules: - -- HTTP credential injection happens in both HTTP modes: L4-only HTTP and - HTTP-inspected. -- Credential injection includes request target, query, headers, opt-in REST - request bodies, and opt-in client-to-server WebSocket text frames. -- HTTP L7 policy is evaluated before upstream dial for each request. -- WebSocket upgrade policy is evaluated as HTTP first. After an allowed `101` - upgrade, the WebSocket relay owns frame parsing when text-frame credential - rewrite, WebSocket transport policy, GraphQL-over-WebSocket policy, or safe - compression handling is configured. Other upgraded streams remain raw. -- Forward HTTP must stay in the shared HTTP relay loop. It must not evaluate - one request and then switch to raw bidirectional copy. Keeping forward HTTP - single-request with `Connection: close` is also acceptable, but the invariant - is that no follow-on request bytes reach upstream unevaluated. -- `protocol: tcp` means L4 authorization plus byte copy unless HTTP is detected - for credential injection. -- Future TCP application parsers, such as Redis or Postgres, own the full - message loop and can parse multiple commands over one TCP session. - -### CONNECT Adapter - -CONNECT remains the standard explicit proxy tunnel for HTTPS and arbitrary TCP. -It parses the CONNECT line into an egress intent, then waits for the shared -relay to decide if and when an upstream connection should be opened. 
- -```mermaid -flowchart TD - Client["Client sends CONNECT host:port"] --> Parse["Parse target"] - Parse --> Intent["Build egress intent"] - Intent --> Auth["Shared authorization"] - Auth --> Allowed{"Destination allowed?"} - Allowed -- No --> Deny["CONNECT deny response"] - Allowed -- Yes --> Ready["Return tunnel-ready response"] - Ready --> Relay["Relay inspects tunneled bytes"] - Relay --> Dial["Relay or parser connects upstream when allowed"] -``` - -CONNECT should stay because forward proxy is only a plaintext HTTP request -format. CONNECT is still the generic explicit proxy mode for TLS and non-HTTP -TCP clients. - -### Forward HTTP Adapter - -Forward HTTP is compatibility for clients that send absolute-form HTTP requests. -The adapter parses the first request and hands it to the shared HTTP relay or -an equivalent guarded single-request relay. - -```mermaid -flowchart TD - Req["Absolute-form HTTP request"] --> Parse["Parse URI and first request"] - Parse --> Intent["Build egress intent"] - Intent --> Auth["Shared authorization"] - Auth --> Allowed{"Allowed?"} - Allowed -- No --> Deny["HTTP deny response"] - Allowed -- Yes --> Relay["Shared or guarded HTTP relay"] - Relay --> Mode{"Connection mode"} - Mode -- "Persistent" --> Loop["Evaluate every request on this connection"] - Mode -- "Single request" --> Close["Force Connection: close"] -``` - -### Transparent TCP Adapter - -Transparent TCP supports native clients that do not know they are using a -proxy. The capture mechanism should be network namespace interception into a -userland proxy listener. Since main now uses nftables for sandbox bypass -enforcement, transparent capture should be designed as nftables -REDIRECT/TPROXY state in the inner sandbox network namespace, not as an -iptables path. 
- -```mermaid -flowchart TD - Policy["Policy load / reload"] --> Register["Register native TCP names"] - Lookup["Userland DNS lookup"] --> Dns["Policy DNS adapter"] - Register --> Dns - Dns --> Answer["Return approved IPs"] - Answer --> Capture["Enable capture for active IP:port"] - Connect["Userland connect(ip:port)"] --> Capture - Capture --> Adapter["Transparent TCP adapter"] - Adapter --> Intent["Build egress intent from original destination"] - Intent --> Shared["Shared authorization and relay"] -``` - -### Policy DNS - -Policy DNS replaces static `/etc/hosts` snapshots for native TCP names. It is -query-driven: check whether the name is policy-eligible, resolve through trusted -DNS, filter returned IPs, publish the active endpoint mapping, and answer -userland. - -```mermaid -flowchart TD - Query["DNS query from userland"] --> Adapter["Policy DNS adapter"] - Adapter --> Known{"Registered native TCP policy name?"} - Known -- No --> Refuse["NXDOMAIN / REFUSED / SERVFAIL"] - Known -- Yes --> Upstream["Trusted upstream DNS lookup"] - Upstream --> Filter["Filter answers against endpoint policy"] - Filter --> Publish["Publish active mapping and capture rule"] - Publish --> Answer["DNS answer"] -``` - -The later `connect(ip:port)` still creates the egress intent and runs through -normal authorization. - -### Network Enforcement Substrate - -Current main uses nftables for bypass enforcement. It accepts proxy-bound -traffic and loopback, accepts established flows, then rejects and optionally -logs other TCP/UDP traffic for the bypass monitor. That is enforcement, not -native TCP capture. 
- -```mermaid -flowchart TD - Conn["Userland packet"] --> ProxyDest{"Proxy destination?"} - ProxyDest -- Yes --> AcceptProxy["nftables accept"] - ProxyDest -- No --> Capture{"Future native TCP capture match?"} - Capture -- Yes --> Redirect["nftables redirect/TPROXY to transparent adapter"] - Capture -- No --> Reject["nftables log + reject bypass"] - Reject --> Monitor["Bypass monitor emits OCSF"] -``` - -The transparent TCP work should extend this nftables model with explicit -capture rules that run before the reject path and are scoped to active policy -DNS mappings. - -### Local Service Adapters - -`inference.local` and `policy.local` are sandbox-local APIs. They should use -the adapter model, but they do not represent normal external egress. - -```mermaid -flowchart TD - A["Request to inference.local"] --> B["Inference local adapter"] - B --> C{"TLS and inference context available?"} - C -- No --> D["Local denial and log"] - C -- Yes --> E["Terminate client TLS"] - E --> F["Parse HTTP request"] - F --> G{"Known inference route?"} - G -- Yes --> H["Route through openshell-router"] - H --> I["Strip caller auth and inject provider auth/model"] - I --> J["Stream response with limits"] - G -- No --> K["403 and close"] -``` - -```mermaid -flowchart TD - A["Request to policy.local"] --> B["Policy local adapter"] - B --> C{"Local route"} - C -- "Current policy" --> D["Policy snapshot response"] - C -- "Recent denials" --> E["Bounded denial summaries"] - C -- "Policy proposal" --> F["Validate and submit proposal"] - D --> G["HTTP response"] - E --> G - F --> G -``` - -### Deployment Modes - -The first implementation can remain embedded in `openshell-sandbox`, but the -proxy should be shaped around explicit runtime contracts. 
- -| Mode | Shape | Main concern | -|------|-------|--------------| -| Embedded | Current sandbox process owns proxy modules | Lowest migration cost | -| Standalone process | Sandbox supervisor launches a proxy binary | Clear process/API boundary | -| Sidecar | Proxy runs outside the payload container but inside the sandbox boundary | Reliable process identity across namespaces | - -A pluggable proxy must expose the configured userland surfaces, implement the -gateway APIs it needs, and prove equivalent policy enforcement through tests. -The nftables rules that force or reject userland traffic belong to the sandbox -network boundary even if the proxy process later moves into a standalone binary -or sidecar. - -## Implementation plan - -The migration plan lives in [implementation-plan.md](implementation-plan.md). -The intended order is: first add regression coverage, then introduce the shared -authorization result and destination validation, then preserve the current -forward HTTP single-request/guarded-relay invariant, then add shared TLS -handling, TCP parser boundaries, nftables-backed policy DNS capture, local -service adapters, and finally the runtime boundary cleanup. - -## Risks - -- Tightening endpoint metadata failures from fail-open to deny may expose - latent policy or Rego errors. -- Deterministic endpoint selection may reject policies that currently load but - only work by accident. -- Transparent TCP capture adds network namespace interception complexity. -- Transparent TCP capture must coexist with the current nftables bypass - reject/log table without creating gaps where direct egress bypasses the proxy. -- Sidecar mode needs a reliable identity source for binary/path scoped policy. -- `policy.local` expands the sandbox-local control surface and needs strict - route validation, body limits, redaction, and gateway authentication. - -## Alternatives - -- Keep patching each current proxy path separately. 
This has the lowest short - term cost but keeps the security surface duplicated. -- Replace CONNECT with forward proxy. This does not work for arbitrary TCP and - is not a replacement for HTTPS tunnels. -- Build only transparent TCP. This helps native clients but does not replace - explicit proxy support used by common HTTP tooling. - -## Open questions - -1. Should overlapping endpoint metadata be rejected at policy load time, or - should policy name plus endpoint index define precedence? -2. Should missing TLS state fail closed for credential-capable or inspected - endpoints? -3. Should direct IP connects to a policy-DNS-resolved TCP endpoint be accepted, - or should DNS query correlation be required for stricter modes? -4. What TTL cap and stale-generation grace period should policy DNS use? -5. Which process identity source should sidecar mode use when it cannot inspect - payload process metadata through local `/proc`? -6. Which proxy capabilities should be negotiated with the gateway at startup? - -## Expected result - -Adding a new HTTP-family protocol parser should require parser code, policy -schema/Rego support, tests, and docs. It should not require new CONNECT and -forward-proxy branches. REST, GraphQL, WebSocket upgrade policy, request-body -credential rewrite, and WebSocket text-frame rewrite should all remain behind -the shared HTTP/WebSocket relay boundary. - -Adding a native TCP application parser should require policy DNS/capture -support, a TCP application parser, policy rules, tests, and docs. Plain -`protocol: tcp` remains L4 authorization plus byte relay. 
diff --git a/rfc/0004-sandbox-proxy-egress-adapter/current-shape.md b/rfc/0004-sandbox-proxy-egress-adapter/current-shape.md deleted file mode 100644 index b428fed14..000000000 --- a/rfc/0004-sandbox-proxy-egress-adapter/current-shape.md +++ /dev/null @@ -1,167 +0,0 @@ -# Current Shape Appendix - -This appendix records the current proxy shape and the review findings that -motivate the adapter model. The main RFC intentionally keeps these details out -of the direction document. - -## Current Entry Points - -The sandbox proxy currently handles multiple userland-facing paths in the same -large module: - -- CONNECT proxy traffic for HTTPS and generic TCP tunnels. -- Forward HTTP proxy traffic for absolute-form HTTP requests. -- Local service routes such as `inference.local`. -- Network namespace bypass enforcement through nftables reject/log rules. -- Policy and endpoint metadata lookups through OPA/Rego. -- DNS resolution and endpoint validation for CONNECT and forward HTTP egress. -- Credential injection and redaction for provider-backed HTTP egress. -- Opt-in REST request-body credential rewrite. -- L7 REST, GraphQL, WebSocket, and GraphQL-over-WebSocket enforcement. - -The issue is not that these features exist. The issue is that entry mechanisms, -policy evaluation, endpoint metadata lookup, credential injection, and byte -relay decisions are interleaved. 
- -## Current CONNECT Shape - -```mermaid -flowchart TD - Client["Client CONNECT host:port"] --> Parse["Parse CONNECT target"] - Parse --> L4["Evaluate network policy"] - L4 --> Allowed{"Allowed?"} - Allowed -- No --> Deny["CONNECT denial"] - Allowed -- Yes --> Meta["Query endpoint metadata"] - Meta --> Config{"L7 or credential config?"} - Config -- No --> Raw["Open upstream and copy bytes"] - Config -- Yes --> Tunnel["Return tunnel-ready response"] - Tunnel --> Inspect["Parse tunneled HTTP when possible"] - Inspect --> L7["Evaluate HTTP policy"] - L7 --> Inject["Inject credentials if configured"] - Inject --> Upstream["Write upstream and relay response"] -``` - -This path has the strongest HTTP relay behavior because it can keep parsing -requests on a long-lived tunnel and enforce L7 rules per request. - -## Current Forward HTTP Shape - -```mermaid -flowchart TD - Client["Absolute-form HTTP request"] --> Parse["Parse first request"] - Parse --> L4["Evaluate network policy"] - L4 --> Allowed{"Allowed?"} - Allowed -- No --> Deny["HTTP denial"] - Allowed -- Yes --> L7{"Matching L7 endpoint?"} - L7 -- Yes --> Eval["Evaluate REST/GraphQL/WebSocket policy"] - Eval --> Rewrite["Rewrite to origin-form + configured credentials"] - L7 -- No --> Rewrite - Rewrite --> Close["Force Connection: close except WebSocket upgrade"] - Close --> Upstream["Open upstream"] - Upstream --> Relay["Guarded HTTP relay / upgrade relay"] -``` - -The latest main branch no longer has the old raw-copy-after-first-request shape -for ordinary forward HTTP. It rewrites ordinary requests with `Connection: -close`, uses guarded HTTP relay helpers for body handling, and sends allowed -WebSocket upgrades through the same upgrade relay. That is a narrower surface -than the historical bidirectional copy, but it is still implemented separately -from the CONNECT relay path. 
- -## Current Network Namespace Enforcement - -```mermaid -flowchart TD - Start["Process in sandbox network namespace"] --> Dest{"Destination"} - Dest -- "Proxy host_ip:port" --> Proxy["Accept to sandbox proxy"] - Dest -- "Loopback" --> Loopback["Accept loopback"] - Dest -- "Established/related" --> Established["Accept response packet"] - Dest -- "Other TCP/UDP" --> Reject["nftables log + reject"] - Reject --> Monitor["Bypass monitor reads dmesg"] - Monitor --> OCSF["OCSF network + detection events"] -``` - -The sandbox now installs an `inet` nftables filter table for bypass -enforcement. The table accepts proxy-bound traffic, loopback, and established -flows, then rejects and optionally logs other TCP/UDP traffic. It does not -currently redirect native TCP connections into the proxy. - -## Current Local Service Shape - -```mermaid -flowchart TD - Request["Request to local name"] --> Match{"Known local route?"} - Match -- "inference.local" --> Inference["Inference routing logic"] - Match -- "policy.local" --> Policy["Policy local logic"] - Match -- No --> External["Normal egress path"] - Inference --> LocalResponse["Local response"] - Policy --> LocalResponse -``` - -Local routes are userland-facing proxy surfaces. They should stay distinct from -external egress while still fitting the adapter model. - -## Findings To Preserve - -### Invariant: forward proxy must not relay unevaluated follow-on HTTP bytes - -The historical forward path evaluated at most the first absolute-form request, -rewrote it, then switched to bidirectional copy. Bytes already buffered after -the first header block, or later pipelined requests on the same client/upstream -connection, could reach upstream without the CONNECT L7 relay's per-request -parser/evaluator. - -Latest main mitigates this by forcing ordinary forward HTTP to one request per -connection and by using guarded relay helpers. 
The adapter model should -preserve the invariant either by keeping forward HTTP single-request/close or -by passing the first parsed request into a shared HTTP relay loop. - -### Endpoint config is not tied to deterministic matched policy - -The policy name used for L4 authorization and logging can be selected through a -different precedence rule than endpoint metadata. With overlapping host, port, -and binary rules, allowed IPs, TLS behavior, enforcement, and -`allow_encoded_slash` can come from a different endpoint than the policy name -logged and used for L4 allow. - -The adapter model requires authorization to return one decision with one -deterministic matched endpoint. - -### Endpoint metadata query failures fail open to L4 behavior - -If endpoint metadata lookup fails, callers can interpret the result as no L7 -configuration and downgrade to credential-only or raw L4 relay. - -The adapter model treats endpoint metadata as part of the authorization result. -Failure to materialize required metadata should deny rather than erase extended -configuration. - -### Control-plane port block only applies on one resolution path - -Blocked control-plane ports are enforced inside one allowed-IPs validation -path, while the normal host-based path uses a different validation route. - -The adapter model moves resolution, allowed IP checks, SSRF checks, and -control-plane port blocks into shared destination validation. - -## Existing Feature Inventory - -The refactor should preserve: - -- CONNECT explicit proxy support. -- Forward HTTP explicit proxy support. -- nftables bypass reject/log enforcement. -- Provider credential injection and redaction. -- REST request-body credential rewrite. -- WebSocket text-frame credential rewrite. -- REST endpoint method/path policy. -- GraphQL L7 policy. -- WebSocket transport and GraphQL-over-WebSocket policy. -- Inference routing through `inference.local`. -- Agent-facing policy routes through `policy.local`. 
-- Timeout and resource tracking for client, upstream, and local service work. -- Structured OCSF logging for network and HTTP policy outcomes. -- SSRF and internal address protections. -- Control-plane port protection. -- `allowed_ips` endpoint restrictions. -- TLS termination for inspectable client connections. diff --git a/rfc/0005-sandbox-proxy-egress-adapter/README.md b/rfc/0005-sandbox-proxy-egress-adapter/README.md new file mode 100644 index 000000000..cbbb97374 --- /dev/null +++ b/rfc/0005-sandbox-proxy-egress-adapter/README.md @@ -0,0 +1,404 @@ +--- +authors: + - "@johntmyers" +state: draft +links: + - https://github.com/NVIDIA/OpenShell/issues/1107 + - https://github.com/NVIDIA/OpenShell/pull/1083 + - https://github.com/NVIDIA/OpenShell/pull/1151 + - https://github.com/NVIDIA/OpenShell/pull/1286 + - https://github.com/NVIDIA/OpenShell/pull/1511 +--- + +# RFC 0005 - Sandbox Proxy Egress Adapter Model + + + +## Summary + +Refactor sandbox egress around one shared authorization and relay pipeline. +CONNECT, forward HTTP, native TCP capture, policy DNS, `inference.local`, +`policy.local`, and metadata loopback should become narrow adapters that +translate userland entry points into common runtime intents. Policy evaluation, +destination validation, credential injection, request-body rewrite, WebSocket +handling, protocol parsing, and upstream dialing should happen behind shared +boundaries. + +The codebase has already moved in this direction by splitting networking into +`openshell-supervisor-network` and process/netns work into +`openshell-supervisor-process`. This RFC proposes the next internal boundary: +make proxy entry mechanisms pluggable without duplicating authorization, +destination validation, or relay behavior. 
+ +Supporting detail lives in: + +- [Current shape appendix](current-shape.md) +- [Technical design appendix](technical-design.md) +- [Implementation plan](implementation-plan.md) + +## Motivation + +The sandbox proxy supports several connection surfaces: explicit CONNECT, +forward HTTP, local inference and policy APIs, metadata loopback, TLS +termination, REST and GraphQL inspection, WebSocket inspection, credential +injection, and nftables-backed bypass detection. These features are valuable, +but changes to policy and enforcement still tend to touch multiple entry paths. + +The risk is asymmetric enforcement. A security fix can be added to CONNECT and +missed in forward HTTP; endpoint metadata can be selected differently from the +logged policy; a credential path can gain request-body or WebSocket support +without the same behavior existing in another relay mode. + +The target shape separates three concerns: + +- **Adapters** describe how userland reached the networking component. +- **Authorization** decides whether the egress is allowed and what endpoint + behavior applies. +- **Relays** own bytes, credentials, protocol parsing, and upstream dialing. + +This also prepares the proxy for future deployment modes. Today the proxy runs +inside the sandbox supervisor process. The networking leaf can already run in a +network-only mode, and a future standalone binary or sidecar should be possible +if it implements the same userland surfaces, gateway APIs, and policy +enforcement contracts. + +## Non-goals + +- Replace CONNECT with forward proxy as the only explicit proxy mode. +- Add SOCKS support. +- Add HTTP/2 L7 parsing in this refactor. Inspected HTTP paths should continue + to reject unsupported h2c upgrades instead of silently upgrading to raw + traffic. +- Redesign provider credential storage. +- Reintroduce iptables as the sandbox packet filtering backend. +- Use eBPF connect hooks for transparent capture. 
Native TCP capture needs a + userland proxy in the byte stream for TLS termination and protocol parsing. + +## Proposal + +### Migration Big Rocks + +1. **Transport and local-service adapters.** CONNECT, forward HTTP, + transparent TCP, policy DNS, `inference.local`, `policy.local`, and metadata + loopback become small adapters. They parse their surface and produce either + an egress intent, a local response, or a DNS answer. They do not duplicate + policy evaluation. +2. **Egress intent and decision.** Shared authorization evaluates L4 policy and + endpoint selection once per connection intent and returns one decision + containing the matched policy, matched endpoint, process identity, allowed + IP metadata, TLS behavior, protocol enforcement, and credential injection + plan. +3. **Relays.** Relays receive an authorized destination connector, not an + already-open upstream socket. HTTP relays evaluate every request before + upstream write. TCP relays copy bytes for L4-only endpoints or hand the + stream to a protocol parser when endpoint policy requires native protocol + enforcement. 
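
The intent and decision boundary described in the list above can be sketched as plain data types. This is a minimal illustration only: the names `EgressIntent`, `EgressDecision`, and the enforcement values `None`, `Http`, and `TcpApplication` come from this RFC, while every field, the `EndpointRule` and `RelayKind` types, and the `authorize` and `select_relay` functions are hypothetical. In particular, the first-match rule in `authorize` is a placeholder and does not settle the endpoint precedence question left open in this RFC.

```rust
/// How userland reached the proxy (hypothetical enum, one variant per
/// external-egress adapter).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum EntryKind {
    Connect,
    ForwardHttp,
    TransparentTcp,
}

/// Protocol enforcement carried by the matched endpoint.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Enforcement {
    None,
    Http,
    TcpApplication,
}

/// What an adapter hands to shared authorization (fields are assumptions).
#[derive(Debug, Clone)]
pub struct EgressIntent {
    pub entry: EntryKind,
    pub host: String,
    pub port: u16,
    pub binary_path: Option<String>,
}

/// A stand-in for one policy endpoint (hypothetical shape).
#[derive(Debug, Clone)]
pub struct EndpointRule {
    pub policy_name: String,
    pub host: String,
    pub port: u16,
    pub enforcement: Enforcement,
    pub terminate_tls: bool,
}

/// The single result of authorization: one policy, one endpoint.
#[derive(Debug, Clone)]
pub struct EgressDecision {
    pub policy_name: String,
    pub endpoint_index: usize,
    pub enforcement: Enforcement,
    pub terminate_tls: bool,
}

/// Which relay takes over after authorization (hypothetical enum).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum RelayKind {
    HttpCredentialOnly,
    HttpL7,
    TcpByteCopy,
    TcpParser,
    Deny,
}

/// Evaluate the intent once. `None` means deny (fail closed).
/// First match wins here purely as a placeholder precedence rule.
pub fn authorize(intent: &EgressIntent, rules: &[EndpointRule]) -> Option<EgressDecision> {
    rules
        .iter()
        .enumerate()
        .find(|(_, r)| r.host.eq_ignore_ascii_case(&intent.host) && r.port == intent.port)
        .map(|(index, r)| EgressDecision {
            policy_name: r.policy_name.clone(),
            endpoint_index: index,
            enforcement: r.enforcement,
            terminate_tls: r.terminate_tls,
        })
}

/// Pick a relay from the decision alone, so every adapter gets the same answer.
/// `first_request_parsed` is true when forward HTTP already parsed a request.
pub fn select_relay(
    decision: &EgressDecision,
    first_request_parsed: bool,
    looks_like_http: bool,
) -> RelayKind {
    match (decision.enforcement, first_request_parsed, looks_like_http) {
        (Enforcement::TcpApplication, true, _) => RelayKind::Deny,
        (Enforcement::TcpApplication, false, _) => RelayKind::TcpParser,
        (Enforcement::Http, true, _) | (Enforcement::Http, false, true) => RelayKind::HttpL7,
        (Enforcement::Http, false, false) => RelayKind::Deny,
        (Enforcement::None, true, _) | (Enforcement::None, false, true) => {
            RelayKind::HttpCredentialOnly
        }
        (Enforcement::None, false, false) => RelayKind::TcpByteCopy,
    }
}
```

The point of the sketch is that adapters only build an `EgressIntent`; relay selection depends on the returned decision and the observed bytes, never on which adapter produced the intent.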
+ +### Unified Adapter Flow + +```mermaid +flowchart TD + User["Userland payload / harness"] + + subgraph ExplicitProxy["Explicit proxy listener"] + ProxyBytes["HTTP proxy bytes"] + IsConnect{"CONNECT request?"} + Connect["CONNECT adapter"] + Forward["Forward HTTP adapter"] + ProxyBytes --> IsConnect + IsConnect -- Yes --> Connect + IsConnect -- No --> Forward + end + + subgraph NativeTcp["Policy DNS + native TCP"] + NameLookup["Userland DNS lookup"] + PolicyDns["Policy DNS adapter"] + DnsAnswer["DNS answer + active mapping"] + NativeConnect["Userland connect(ip:port)"] + TcpAdapter["Transparent TCP adapter"] + NameLookup --> PolicyDns + PolicyDns --> DnsAnswer + DnsAnswer --> NativeConnect + NativeConnect --> TcpAdapter + end + + subgraph LocalApis["Sandbox-local services"] + InferenceReq["Request to inference.local"] + PolicyReq["Request to policy.local"] + MetadataReq["Request to metadata loopback"] + InferenceAdapter["Inference local adapter"] + PolicyAdapter["Policy local adapter"] + MetadataAdapter["Metadata loopback adapter"] + InferenceReq --> InferenceAdapter + PolicyReq --> PolicyAdapter + MetadataReq --> MetadataAdapter + end + + subgraph Shared["Shared external egress pipeline"] + Intent["EgressIntent"] + Auth["Authorize and select endpoint"] + Decision["EgressDecision"] + Validate["Resolve and validate destination"] + Relay["Relay"] + Deny["Adapter-specific deny response"] + Intent --> Auth + Auth --> Allowed{"Allowed?"} + Allowed -- No --> Deny + Allowed -- Yes --> Decision + Decision --> Validate + Validate --> Relay + end + + User --> ProxyBytes + User --> NameLookup + User --> NativeConnect + User --> InferenceReq + User --> PolicyReq + User --> MetadataReq + + Connect --> Intent + Forward --> Intent + TcpAdapter --> Intent + InferenceAdapter --> InferenceResp["Local inference response"] + PolicyAdapter --> PolicyResp["Local policy response"] + MetadataAdapter --> MetadataResp["Local metadata credential response"] +``` + +Each adapter still owns 
its response shape. If authorization denies a CONNECT +intent, the CONNECT adapter returns a tunnel denial. If forward HTTP is denied, +the forward adapter returns an HTTP denial. If policy DNS refuses a name, it +returns the appropriate DNS response. The shared layer decides the outcome; the +adapter renders it for its protocol. + +### Relay Flow + +```mermaid +flowchart TD + Start["Authorized egress + destination connector"] + Start --> FirstReq{"First HTTP request already parsed?"} + + FirstReq -- Yes --> ForwardMode{"decision.endpoint.enforcement"} + ForwardMode -- "None" --> HttpCred["HTTP relay
credential injection only"] + ForwardMode -- "Http" --> HttpL7["HTTP relay
REST/GraphQL/WebSocket policy"] + ForwardMode -- "TcpApplication" --> BadForward["Deny: HTTP request for TCP app endpoint"] + + FirstReq -- No --> TlsPolicy{"TLS handling skipped?"} + TlsPolicy -- Yes --> Readable["Readable client stream"] + TlsPolicy -- No --> Peek["Peek client bytes"] + Peek --> Tls{"TLS ClientHello?"} + Tls -- Yes --> Terminate["Shared TLS terminator"] + Tls -- No --> Readable + Terminate --> Readable + + Readable --> Enforce{"decision.endpoint.enforcement"} + Enforce -- "None" --> Sniff{"Looks like HTTP?"} + Sniff -- Yes --> HttpCred + Sniff -- No --> TcpRelay["TcpRelay
byte copy"] + + Enforce -- "Http" --> MustHttp{"Looks like HTTP?"} + MustHttp -- Yes --> HttpL7 + MustHttp -- No --> DenyHttp["Deny: expected HTTP"] + + Enforce -- "TcpApplication" --> TcpParser["TcpRelay
protocol parser owns loop"] + + HttpCred --> ReqLoop["HTTP request loop"] + HttpL7 --> ReqLoop + ReqLoop --> ReqPolicy{"Request allowed?"} + ReqPolicy -- No --> ReqDeny["Local HTTP deny
no upstream write"] + ReqPolicy -- Yes --> StaticCreds["Resolve static placeholders"] + StaticCreds --> TokenGrant["Apply endpoint token grant if configured"] + TokenGrant --> Rewrite["Rewrite configured credential slots"] + Rewrite --> HttpDial["Connect or reuse upstream"] + HttpDial --> HttpResponse["Write request and relay response"] + HttpResponse --> Upgrade{"101 WebSocket upgrade?"} + Upgrade -- No --> ReqLoop + Upgrade -- Yes --> WsInspect{"WebSocket inspection or rewrite configured?"} + WsInspect -- No --> RawUpgrade["Raw upgraded stream"] + WsInspect -- Yes --> WsRelay["WebSocket relay
text-frame rewrite / message policy"] + + TcpParser --> ParserDial["Parser calls connector when protocol allows"] + TcpRelay --> TcpDial["Connect upstream"] + TcpDial --> ByteCopy["Copy bytes"] +``` + +Relay rules: + +- HTTP credential injection happens in both HTTP modes: L4-only HTTP and + HTTP-inspected. +- Credential injection includes static placeholder rewrite and endpoint-bound + dynamic token grants. Token grants run after policy allow and before upstream + write; failures deny without forwarding the request. +- Static credential rewrite covers request target, query, headers, opt-in REST + request bodies, and opt-in client-to-server WebSocket text frames. +- HTTP L7 policy is evaluated before upstream write for each request. +- WebSocket upgrade policy is evaluated as HTTP first. After an allowed `101` + upgrade, the WebSocket relay owns frame parsing when text-frame credential + rewrite, WebSocket transport policy, GraphQL-over-WebSocket policy, or safe + compression handling is configured. Other upgraded streams remain raw. +- Forward HTTP must stay in the shared HTTP relay loop or in an equivalent + guarded single-request relay. It must not evaluate one request and then + switch to raw bidirectional copy. +- `protocol: tcp` or an omitted protocol means L4 authorization plus byte copy, + except that HTTP-looking streams may still use HTTP credential injection. +- Future native protocol parsers, such as Redis, Postgres, or MySQL, own the + full message loop and can parse multiple commands or queries on one TCP + session. + +### Adapter Responsibilities + +CONNECT remains the generic explicit proxy mode for HTTPS and arbitrary TCP. +The CONNECT adapter parses `CONNECT host:port` into an `EgressIntent`, asks the +shared authorization boundary for an `EgressDecision`, returns the tunnel-ready +response only after the connection is allowed, and then hands the tunnel to the +relay. 
 The upstream connection is opened by the HTTP relay or TCP parser when +payload policy allows it. + +Forward HTTP exists for compatibility with clients that send absolute-form HTTP +requests. The adapter parses the first request, rewrites proxy framing only at +the relay boundary, rejects `https://` absolute-form requests, rejects +unsupported h2c upgrades on inspected routes, and either stays in a shared HTTP +request loop or forces `Connection: close` for a guarded single request. + +Transparent TCP is for native clients that do not know they are using a proxy. +It depends on policy DNS and nftables capture: DNS answers create active +endpoint mappings, userland later calls `connect(ip:port)`, nftables redirects +matching traffic to a userland listener, and the TCP adapter recovers the +original destination before building an intent. + +Policy DNS replaces static `/etc/hosts` snapshots for native TCP names. It is +query-driven: check whether the name is policy-eligible, resolve through +trusted DNS, filter returned IPs, publish the active endpoint mapping, and +answer userland. The later `connect(ip:port)` still runs through normal +authorization. + +Local service adapters stay outside the normal external egress relay: +`inference.local` routes chat, completion, model discovery, embeddings, and +provider-specific inference traffic through the router with local limits; +`policy.local` exposes current policy, denial summaries, proposal submission, +and proposal wait routes; metadata loopback serves provider metadata +credentials to SDKs that bypass HTTP proxy variables. + +### Network Enforcement Substrate + +Current main uses nftables for sandbox bypass enforcement. It accepts +proxy-bound traffic, loopback, and established flows, then rejects and +optionally logs other TCP/UDP traffic for the bypass monitor. That is +enforcement, not native TCP capture.
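
The rule ordering can be modeled in a few lines to make the intended precedence explicit. This is a Rust model of the verdict order only, not nftables syntax and not the existing ruleset: `Ruleset`, `Packet`, and `Verdict` are hypothetical names, and the capture step represents the future transparent TCP work rather than current behavior.

```rust
use std::collections::HashSet;
use std::net::SocketAddr;

/// Outcome for one outbound packet (hypothetical model type).
#[derive(Debug, PartialEq, Eq)]
pub enum Verdict {
    AcceptProxy,
    AcceptLoopback,
    AcceptEstablished,
    RedirectToTransparentAdapter,
    RejectAndLog,
}

/// The facts about a packet that the ordering depends on.
pub struct Packet {
    pub dest: SocketAddr,
    pub established: bool,
}

/// Proxy address plus the ip:port pairs published by policy DNS.
pub struct Ruleset {
    pub proxy: SocketAddr,
    pub active_capture: HashSet<SocketAddr>,
}

impl Ruleset {
    /// Accept rules run first, capture runs before reject, reject is the default.
    pub fn verdict(&self, packet: &Packet) -> Verdict {
        if packet.dest == self.proxy {
            Verdict::AcceptProxy
        } else if packet.dest.ip().is_loopback() {
            Verdict::AcceptLoopback
        } else if packet.established {
            Verdict::AcceptEstablished
        } else if self.active_capture.contains(&packet.dest) {
            Verdict::RedirectToTransparentAdapter
        } else {
            Verdict::RejectAndLog
        }
    }
}
```

With an empty `active_capture` set the model reduces to today's accept-or-reject behavior, which is the property the capture work needs to preserve for destinations that have no active policy DNS mapping.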
+ +```mermaid +flowchart TD + Packet["Userland packet"] --> ProxyDest{"Proxy destination?"} + ProxyDest -- Yes --> AcceptProxy["nftables accept"] + ProxyDest -- No --> Capture{"Future native TCP capture match?"} + Capture -- Yes --> Redirect["nftables redirect/TPROXY to transparent adapter"] + Capture -- No --> Reject["nftables log + reject bypass"] + Reject --> Monitor["Bypass monitor emits OCSF"] +``` + +Transparent TCP work should extend this nftables model with explicit capture +rules that run before the reject path and are scoped to active policy DNS +mappings. It should not add a parallel iptables path. + +### Deployment Modes + +| Mode | Shape | Status | +|------|-------|--------| +| Embedded supervisor | `openshell-sandbox` orchestrates `openshell-supervisor-network` and `openshell-supervisor-process` | Current | +| Network-only supervisor | Networking, policy, proxy, local services, and background tasks run without a payload process leaf | Current runtime mode | +| Standalone proxy binary | Supervisor launches networking as a separate process with explicit APIs | Future packaging/API work | +| Sidecar proxy | Proxy runs outside the payload container but inside the sandbox boundary | Future isolation mode | + +A pluggable proxy must expose the right userland surfaces, implement the +gateway APIs it needs, and prove equivalent policy enforcement through tests. +The nftables rules that force or reject userland traffic belong to the sandbox +network boundary even if the proxy process later moves into a standalone binary +or sidecar. + +## Implementation plan + +The detailed migration plan lives in [implementation-plan.md](implementation-plan.md). +The intended order is: + +1. Add regression coverage around the current split, forward HTTP invariants, + endpoint selection, token grants, WebSocket/body rewrite, metadata loopback, + and nftables bypass enforcement. +2. Introduce `EgressIntent` and `EgressDecision` inside + `openshell-supervisor-network`. +3. 
Move destination validation and endpoint metadata materialization behind the + shared decision and connector boundary. +4. Consolidate forward HTTP, CONNECT HTTP inspection, credential injection, + request-body rewrite, and WebSocket handling behind shared HTTP/WebSocket + relay code. +5. Move TLS detection and termination ahead of the HTTP/TCP relay split. +6. Add the TCP relay/parser boundary, then policy DNS and native TCP capture. +7. Treat local services and deployment modes as explicit runtime contracts. + +## Risks + +- Tightening endpoint metadata failures from fail-open to deny may expose + latent policy or Rego errors. +- Deterministic endpoint selection may reject policies that currently load but + only work by accident. +- Token grants add a runtime dependency on SPIFFE Workload API and token + endpoints. Failures should remain fail-closed and sanitized. +- Transparent TCP capture adds network namespace interception complexity and + must coexist with the nftables bypass reject/log table. +- Sidecar mode needs a reliable identity source for binary/path scoped policy. +- Metadata loopback and `policy.local` expand sandbox-local control surfaces + and need strict route validation, body limits, redaction, and authentication + boundaries. +- Provider-composed policy rules use a reserved namespace. Decisions and logs + must distinguish provider-derived policy from user-authored policy without + exposing provider rules as editable sandbox proposals. + +## Alternatives + +### Keep patching each entry path + +This has the lowest short-term cost but keeps security behavior duplicated +across CONNECT, forward HTTP, and local services. It also makes future TCP +application protocol support harder because each parser must be wired through +multiple entry mechanisms. + +### Replace CONNECT with forward proxy + +Forward proxy only covers plaintext absolute-form HTTP requests. It is not a +replacement for HTTPS tunnels, WebSocket tunnels, or arbitrary TCP clients. 
+CONNECT should remain the generic explicit proxy mode. + +### Build only transparent TCP + +Transparent TCP helps native clients but does not replace explicit proxy +support used by common HTTP tooling. It also requires policy DNS and nftables +capture before it can safely preserve endpoint identity. + +## Prior art + +The current `openshell-supervisor-network` split is the immediate prior step: +it already separates proxy, OPA, L7, inference routing, policy-local routes, +TLS, and token grants from process supervision. + +The current `openshell-supervisor-process` netns and bypass monitor are the +packet-enforcement substrate. Transparent TCP should extend that nftables +model rather than creating a second firewall path. + +The existing L7 relay is the behavioral prior art for this RFC. It already +proves per-request HTTP evaluation, GraphQL parsing, WebSocket frame handling, +request-body rewrite, and token-grant injection can live behind relay +boundaries. + +## Open questions + +1. Should overlapping endpoint metadata be rejected at policy load time, or + should policy name plus endpoint index define precedence? +2. Should direct IP connects to a policy-DNS-resolved TCP endpoint be accepted, + or should DNS query correlation be required for stricter modes? +3. What TTL cap and stale-generation grace period should policy DNS use? +4. Which process identity source should sidecar mode use when it cannot inspect + payload process metadata through local `/proc`? +5. Which proxy capabilities should be negotiated with the gateway at startup? +6. Should metadata loopback be modeled as an adapter inside + `openshell-supervisor-network`, or remain orchestrated by `openshell-sandbox` + with shared credential/provider helpers? 
diff --git a/rfc/0005-sandbox-proxy-egress-adapter/current-shape.md b/rfc/0005-sandbox-proxy-egress-adapter/current-shape.md new file mode 100644 index 000000000..d4090e4f6 --- /dev/null +++ b/rfc/0005-sandbox-proxy-egress-adapter/current-shape.md @@ -0,0 +1,223 @@ +# Current Shape Appendix + +This appendix records the current proxy shape and the review findings that +motivate the adapter model. The main RFC intentionally keeps these details out +of the direction document. + +## Current Runtime Split + +The proxy is no longer only a large module inside `openshell-sandbox`. +Current main has three relevant runtime owners: + +```mermaid +flowchart TD + Sandbox["openshell-sandbox
orchestrator"] + Network["openshell-supervisor-network
proxy, OPA, L7, TLS, inference,
policy.local, token grants"] + Process["openshell-supervisor-process
process leaf, SSH, netns,
nftables, bypass monitor"] + Denials["Denial/activity aggregators"] + Gateway["Gateway policy/provider APIs"] + + Sandbox --> Network + Sandbox --> Process + Network --> Denials + Process --> Denials + Sandbox --> Gateway + Network --> Gateway +``` + +`openshell-sandbox` creates the shared network namespace, owns denial/activity +channels, starts the policy poll loop, starts networking, starts the metadata +loopback server when needed, and then optionally starts the process leaf. If +`process_enabled` is false, the supervisor can run in network-only mode and +keep networking/background tasks alive until shutdown. + +`openshell-supervisor-network` owns the explicit proxy listener, OPA engine +integration, L7 enforcement, TLS termination, inference routing, policy-local +routes, identity cache, provider credential injection, and token grants. + +`openshell-supervisor-process` owns process execution, SSH, network namespace +helpers, nftables bypass rules, and the bypass monitor that turns nftables LOG +entries into OCSF events. + +## Current Userland-Facing Surfaces + +The networking surface currently includes: + +- CONNECT proxy traffic for HTTPS and generic TCP tunnels. +- Forward HTTP proxy traffic for absolute-form HTTP requests. +- `inference.local` for local inference routing. +- `policy.local` for current policy, denial summaries, proposal submission, + and proposal wait routes. +- GCE metadata loopback for SDKs that bypass HTTP proxy variables. +- nftables bypass enforcement for direct TCP/UDP egress that does not enter + the proxy. +- OPA/Rego policy and endpoint metadata lookups. +- DNS resolution and endpoint validation for CONNECT and forward HTTP egress. +- Static provider credential injection and redaction. +- Endpoint-bound dynamic token grant injection. +- Opt-in REST request-body credential rewrite. +- L7 REST, GraphQL, WebSocket, and GraphQL-over-WebSocket enforcement. + +The issue is not that these features exist. 
The issue is that entry mechanisms, +policy evaluation, endpoint metadata lookup, credential injection, and byte +relay decisions are still interleaved. + +## Current CONNECT Shape + +```mermaid +flowchart TD + Client["Client CONNECT host:port"] --> Parse["Parse CONNECT target"] + Parse --> L4["Evaluate network policy"] + L4 --> Allowed{"Allowed?"} + Allowed -- No --> Deny["CONNECT denial"] + Allowed -- Yes --> Meta["Query endpoint metadata"] + Meta --> Config{"L7, TLS, or credential config?"} + Config -- No --> Tunnel["Return tunnel-ready response"] + Config -- Yes --> Tunnel + Tunnel --> Inspect["Inspect tunneled bytes when possible"] + Inspect --> Relay["HTTP/WebSocket/TCP relay selection"] + Relay --> Inject["Static credentials and token grants if configured"] + Inject --> Upstream["Open upstream when relay policy allows"] +``` + +CONNECT is still the strongest entry shape because the tunnel relay can keep +parsing HTTP requests on long-lived connections and enforce request policy per +request. + +## Current Forward HTTP Shape + +```mermaid +flowchart TD + Client["Absolute-form HTTP request"] --> Parse["Parse first request"] + Parse --> L4["Evaluate network policy"] + L4 --> Allowed{"Allowed?"} + Allowed -- No --> Deny["HTTP denial"] + Allowed -- Yes --> L7{"Matching L7 endpoint?"} + L7 -- Yes --> Eval["Evaluate REST/GraphQL/WebSocket policy"] + Eval --> Guard["Reject unsupported h2c upgrade when inspected"] + Guard --> Rewrite["Rewrite to origin-form + configured credentials"] + L7 -- No --> Rewrite + Rewrite --> Token["Apply token grant if endpoint-bound"] + Token --> Close["Force Connection: close except WebSocket upgrade"] + Close --> Upstream["Open upstream"] + Upstream --> Relay["Guarded HTTP relay / upgrade relay"] +``` + +Latest main no longer has the old raw-copy-after-first-request shape for +ordinary forward HTTP. 
It rewrites ordinary requests with `Connection: close`, +uses guarded HTTP relay helpers for body handling, rejects inspected h2c +upgrades, injects token grants, and sends allowed WebSocket upgrades through +the upgrade relay. That is a narrower surface than the historical bidirectional +copy, but it is still orchestrated separately from the CONNECT relay path. + +## Current Local Service Shape + +```mermaid +flowchart TD + Request["Request to local name"] --> Match{"Known local route?"} + Match -- "inference.local" --> Inference["Inference route adapter"] + Match -- "policy.local" --> Policy["Policy local adapter"] + Match -- "metadata loopback" --> Metadata["Metadata credential server"] + Match -- No --> External["Normal egress path"] + Inference --> InferenceResp["Local inference response"] + Policy --> PolicyResp["Local policy response"] + Metadata --> MetadataResp["Metadata response"] +``` + +`inference.local` now covers buffered and streaming inference shapes including +chat/completion routes, model discovery, embeddings, and provider-specific +routes. `policy.local` supports the agentic approval loop: agents can submit +narrow proposals and wait on approval/reload before retrying. Metadata +loopback exists for provider credentials consumed by SDKs that do not honor +HTTP proxy variables. + +These are userland-facing network surfaces. They should stay distinct from +external egress while still fitting the adapter model. 
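The "Known local route?" branch in the diagram above can be sketched as a small host classifier. This is a minimal sketch, not the proxy's actual code: `LocalRoute` and `match_local_route` are hypothetical names, and the normalization rules (case-insensitive comparison, optional port, trailing root dot) are assumptions. Metadata loopback is left out because the text does not say how that route is keyed.

```rust
/// Hypothetical local routes; the names are illustrative, not from the codebase.
#[derive(Debug, PartialEq, Eq)]
enum LocalRoute {
    Inference,
    Policy,
}

/// Classify a request host as a known local route, or return `None` so the
/// caller falls through to the normal external egress path.
/// Assumes `host` may carry a ":port" suffix and a trailing root dot.
fn match_local_route(host: &str) -> Option<LocalRoute> {
    // Drop an optional ":port" suffix.
    let name = host.rsplit_once(':').map_or(host, |(name, _port)| name);
    // Drop a trailing root dot and compare case-insensitively.
    let name = name.trim_end_matches('.').to_ascii_lowercase();
    // Exact match only, so "inference.local.example.com" stays external.
    match name.as_str() {
        "inference.local" => Some(LocalRoute::Inference),
        "policy.local" => Some(LocalRoute::Policy),
        _ => None,
    }
}
```

Matching on the full normalized name, rather than a prefix or suffix, keeps look-alike external hosts on the normal egress path.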
+ +## Current Network Namespace Enforcement + +```mermaid +flowchart TD + Start["Process in sandbox network namespace"] --> Dest{"Destination"} + Dest -- "Proxy host_ip:port" --> Proxy["Accept to sandbox proxy"] + Dest -- "Loopback" --> Loopback["Accept loopback"] + Dest -- "Established/related" --> Established["Accept response packet"] + Dest -- "Other TCP/UDP" --> Reject["nftables log + reject"] + Reject --> Monitor["Bypass monitor reads dmesg"] + Monitor --> OCSF["OCSF network + detection events"] +``` + +The process leaf installs an `inet` nftables filter table for bypass +enforcement. The table accepts proxy-bound traffic, loopback, and established +flows, then rejects and optionally logs other TCP/UDP traffic. It does not +currently redirect native TCP connections into the proxy. + +## Findings To Preserve + +### Invariant: forward proxy must not relay unevaluated follow-on HTTP bytes + +The historical forward path evaluated at most the first absolute-form request, +rewrote it, then switched to bidirectional copy. Bytes already buffered after +the first header block, or later pipelined requests on the same client/upstream +connection, could reach upstream without the CONNECT L7 relay's per-request +parser/evaluator. + +Latest main mitigates this by forcing ordinary forward HTTP to one request per +connection and by using guarded relay helpers. The adapter model should +preserve the invariant either by keeping forward HTTP single-request/close or +by passing the first parsed request into a shared HTTP relay loop. + +### Endpoint config is not tied to deterministic matched policy + +The policy name used for L4 authorization and logging can be selected through a +different precedence rule than endpoint metadata. With overlapping host, port, +and binary rules, allowed IPs, TLS behavior, enforcement, and +`allow_encoded_slash` can come from a different endpoint than the policy name +logged and used for L4 allow. 
+ +The adapter model requires authorization to return one decision with one +deterministic matched endpoint. + +### Endpoint metadata query failures should not erase enforcement + +If endpoint metadata lookup fails, callers can interpret the result as no L7 +configuration and downgrade to credential-only or raw L4 relay. + +The adapter model treats endpoint metadata as part of the authorization result. +Failure to materialize required metadata should deny rather than erase extended +configuration. + +### Destination validation must be shared + +Private address checks, `allowed_ips`, exact declared private endpoint trust, +trusted gateway aliases, SSRF checks, and control-plane port blocks have grown +over time. They should be centralized so CONNECT, forward HTTP, future +transparent TCP, and local-service egress use the same resolved-destination +rules. + +## Existing Feature Inventory + +The refactor should preserve: + +- CONNECT explicit proxy support. +- Forward HTTP explicit proxy support. +- Network-only supervisor mode. +- nftables bypass reject/log enforcement. +- Provider credential injection and redaction. +- Dynamic token grant injection through SPIFFE-backed provider credentials. +- REST request-body credential rewrite. +- WebSocket text-frame credential rewrite. +- REST endpoint method/path policy. +- GraphQL-over-HTTP policy. +- WebSocket transport and GraphQL-over-WebSocket policy. +- h2c rejection on inspected HTTP routes. +- Inference routing through `inference.local`, including embeddings. +- Agent-facing policy advisor routes through `policy.local`. +- GCE metadata loopback for supported provider credentials. +- Timeout and resource tracking for client, upstream, and local service work. +- Structured OCSF logging for network and HTTP policy outcomes. +- SSRF and internal address protections. +- Exact declared private endpoint handling. +- Control-plane port protection. +- `allowed_ips` endpoint restrictions. 
+- TLS auto-detection and termination for inspectable client connections. diff --git a/rfc/0004-sandbox-proxy-egress-adapter/implementation-plan.md b/rfc/0005-sandbox-proxy-egress-adapter/implementation-plan.md similarity index 58% rename from rfc/0004-sandbox-proxy-egress-adapter/implementation-plan.md rename to rfc/0005-sandbox-proxy-egress-adapter/implementation-plan.md index 94ba53b7f..3b00a512b 100644 --- a/rfc/0004-sandbox-proxy-egress-adapter/implementation-plan.md +++ b/rfc/0005-sandbox-proxy-egress-adapter/implementation-plan.md @@ -7,28 +7,46 @@ direction-focused. - Add tests for forward HTTP pipelining and keep-alive follow-on requests, including the current `Connection: close` mitigation. +- Add tests for forward HTTP h2c rejection on inspected endpoints. - Add tests for overlapping endpoint metadata selection. - Add tests for endpoint metadata query failures. - Add tests for control-plane port blocking through all destination validation paths. +- Add tests for exact declared private endpoint trust and `allowed_ips` + behavior across CONNECT and forward HTTP. +- Add tests proving static credential injection works in L4-only HTTP and + HTTP-inspected paths. +- Add tests proving token grant success injects the configured header and token + grant failure does not forward upstream. +- Add tests for REST request-body credential rewrite, WebSocket text-frame + credential rewrite, WebSocket GraphQL policy, and compression handling. +- Add tests for `policy.local` proposal wait behavior and `inference.local` + buffered/streaming route limits. +- Add tests for metadata loopback startup/failure behavior when provider + credentials require it. - Add nftables bypass enforcement tests that verify proxy-bound traffic is accepted while direct TCP/UDP egress is rejected and logged when available. ## Phase 1 - Authorization Result -- Introduce `EgressIntent` and `EgressDecision`. +- Introduce `EgressIntent` and `EgressDecision` inside + `openshell-supervisor-network`. 
- Make authorization return matched policy and matched endpoint metadata together. +- Include policy source on the decision: user-authored, provider-derived, or + local-service internal. +- Include protocol enforcement and credential injection plan on the decision. - Fail closed when required endpoint metadata cannot be materialized. - Emit consistent OCSF network denial events from the shared boundary. ## Phase 2 - Shared Destination Validation -- Move DNS resolution, allowed IP filtering, SSRF checks, and control-plane port - checks into one destination validation path. +- Move DNS resolution, allowed IP filtering, SSRF checks, exact declared + endpoint handling, trusted gateway aliases, and control-plane port checks + into one destination validation path. - Return an `UpstreamConnector` rather than an opened upstream socket. -- Add tests proving CONNECT, forward HTTP, and transparent TCP use the same - validation behavior. +- Add tests proving CONNECT, forward HTTP, and future transparent TCP use the + same validation behavior. ## Phase 3 - Forward HTTP Adapter @@ -36,15 +54,20 @@ direction-focused. request and builds an egress intent. - Route the parsed first request into the shared HTTP relay or preserve the current guarded single-request relay behavior. +- Preserve `https://` absolute-form rejection. +- Preserve h2c rejection on inspected routes. - Keep the no-raw-copy invariant after the first request. -## Phase 4 - HTTP And WebSocket Relay Consolidation +## Phase 4 - HTTP, WebSocket, And Credential Relay Consolidation - Centralize HTTP request parsing, REST policy, GraphQL policy, WebSocket upgrade policy, credential resolution, redaction, request rewrite, upstream dial, and response relay. - Evaluate every HTTP request before upstream write. - Ensure denied HTTP requests do not create upstream TCP sessions. +- Preserve static placeholder rewrite for target, query, and headers. 
+- Preserve dynamic token grant injection after request allow and before + upstream write. - Preserve opt-in REST request-body credential rewrite behind the shared HTTP relay, including bounded buffering, supported content-type handling, `Content-Length` recomputation, and fail-closed unresolved placeholders. @@ -58,13 +81,14 @@ direction-focused. - Move client-side TLS detection and termination before the HTTP/TCP relay split. - Keep endpoint TLS behavior on `EgressDecision`. +- Treat `tls: skip` as the explicit opt-out for TLS handling. - Remove duplicate HTTP-specific and TCP-specific TLS termination decisions. ## Phase 6 - TCP Relay And Parser Boundary -- Rename raw TCP relay concepts to `TcpRelay`. +- Use `TcpRelay` for byte relay and TCP application parser dispatch. +- Keep `protocol: tcp` or omitted protocol as L4 authorization plus byte copy. - Add a TCP application parser dispatch point for future protocol enforcement. -- Keep `protocol: tcp` as L4 authorization plus byte copy. - Let TCP application parsers own their message loop and call the connector when protocol state allows. @@ -75,6 +99,7 @@ direction-focused. - Publish active DNS answer state and capture rules. - Implement nftables REDIRECT/TPROXY capture rules ahead of the bypass reject path; do not add a parallel iptables path. +- Coordinate capture rule ownership with `openshell-supervisor-process::netns`. - Implement transparent TCP adapter lookup from captured original destination to active endpoint generation. - Decide TTL and stale-generation behavior. @@ -82,35 +107,51 @@ direction-focused. ## Phase 8 - Local Service Adapters - Model `inference.local` as a local adapter with TLS termination, route - validation, provider auth injection, streaming limits, and OCSF logging. + validation, provider auth injection, streaming/buffered limits, and OCSF + logging. - Model `policy.local` as a local adapter for current policy, bounded denial - summaries, and policy proposals. 
-- Keep both paths outside normal external egress relay. + summaries, policy proposals, and proposal wait. +- Decide whether metadata loopback remains orchestrated in `openshell-sandbox` + or moves behind a local adapter boundary in `openshell-supervisor-network`. +- Keep these paths outside normal external egress relay while preserving + credential redaction and route validation. ## Phase 9 - Runtime Boundary -- Keep embedded mode for the first migration. +- Keep embedded supervisor mode as the first migration target. +- Treat the existing `openshell-supervisor-network` and + `openshell-supervisor-process` split as the structural baseline. - Define the proxy runtime API needed for a future standalone binary: - configured listeners, policy updates, gateway calls, telemetry, and shutdown. + configured listeners, policy updates, provider credentials, token grants, + gateway calls, telemetry, denial/activity events, and shutdown. - Identify process identity requirements for standalone and sidecar modes. +- Add capability negotiation with the gateway if standalone proxy versions can + differ from gateway versions. ## Phase 10 - Cleanup - Remove duplicated endpoint metadata queries from relay paths. -- Remove duplicated deny rendering where adapters can own response shape. +- Remove duplicated destination validation and deny rendering where adapters + can own response shape. - Remove any remaining forward HTTP raw-copy fallback. +- Remove stale references to iptables or static `/etc/hosts` native TCP + mapping from proxy design docs. - Update architecture docs once implementation lands. ## Testing Plan - Unit-test each adapter's intent construction and deny response shape. - Unit-test authorization precedence for overlapping policy and endpoint rules. +- Unit-test provider-derived rule namespace handling and `policy.local` + filtering. - Integration-test shared destination validation across CONNECT, forward HTTP, and transparent TCP. 
- Integration-test HTTP keep-alive and pipelined requests with REST, GraphQL, and WebSocket upgrade enforcement. - Integration-test credential injection in L4-only HTTP and HTTP-inspected paths. +- Integration-test token grant success, cache hit, malformed token, resolver + unavailable, and token endpoint failure. - Integration-test REST request-body credential rewrite for JSON, form-url-encoded, `text/*`, unsupported content types, chunked framing, body caps, and unresolved placeholders. @@ -123,5 +164,5 @@ direction-focused. application parsers. - Integration-test policy DNS TTL, stale generation handling, and captured connect correlation. -- Integration-test `inference.local` and `policy.local` body limits, timeout - behavior, redaction, and local denial responses. +- Integration-test `inference.local`, `policy.local`, and metadata loopback + body limits, timeout behavior, redaction, and local denial responses. diff --git a/rfc/0004-sandbox-proxy-egress-adapter/technical-design.md b/rfc/0005-sandbox-proxy-egress-adapter/technical-design.md similarity index 57% rename from rfc/0004-sandbox-proxy-egress-adapter/technical-design.md rename to rfc/0005-sandbox-proxy-egress-adapter/technical-design.md index b13e259f4..837a638d9 100644 --- a/rfc/0004-sandbox-proxy-egress-adapter/technical-design.md +++ b/rfc/0005-sandbox-proxy-egress-adapter/technical-design.md @@ -1,7 +1,19 @@ # Technical Design Appendix -This appendix carries the implementation-level design details behind the main -RFC. +This appendix carries implementation-level design details behind the main RFC. + +## Existing Runtime Boundary + +`openshell-supervisor-network::run::run_networking` is the current networking +startup boundary. It builds policy-local context, waits for policy binary +symlink resolution, creates the identity cache, writes the TLS CA, builds TLS +state, resolves inference routes, wires provider credentials and token grants, +and starts the proxy. 
+ +This is a useful outer boundary, but it is not yet the proxy adapter boundary. +The proxy still needs internal `EgressIntent` and `EgressDecision` boundaries +so CONNECT, forward HTTP, local routes, and future native TCP capture do not +duplicate policy and relay orchestration. ## Shared Data Boundaries @@ -11,14 +23,16 @@ RFC. It should carry: -- entry transport: CONNECT, forward HTTP, transparent TCP, or local HTTP; +- entry transport: CONNECT, forward HTTP, transparent TCP, local HTTP, policy + DNS, or metadata loopback; - requested destination host/port or captured original IP/port; - process identity inputs collected by the adapter/runtime; - optional first HTTP request for forward proxy traffic; -- optional local service route. +- optional local service route; +- policy generation or DNS mapping generation when relevant. -Adapters build intents. They should not query endpoint metadata or select -relays. +Adapters build intents. They should not query endpoint metadata, select TLS +mode, or select relays. ### EgressDecision @@ -28,15 +42,19 @@ It should carry: - allow or deny; - deterministic matched policy identifier; +- whether the policy is user-authored, provider-derived, or local-service + internal; - deterministic matched endpoint identifier and endpoint metadata; - process identity used for evaluation; - destination and allowed IP constraints; - TLS behavior; - protocol enforcement; +- credential injection plan; - logging context and denial reason. Relay code should read this decision. It should not query OPA again for -endpoint metadata, TLS mode, allowed IPs, or parser selection. +endpoint metadata, TLS mode, allowed IPs, credential behavior, or parser +selection. 
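One way to get a deterministic matched policy and endpoint is to select them together as a single candidate under a total order. The sketch below assumes a precedence that the RFC does not define (binary-scoped rules first, then exact hosts, then identifier order as a tie-break); `Candidate` and `select_match` are hypothetical names.

```rust
/// Hypothetical candidate: one (policy, endpoint) pair that matched the intent.
#[derive(Debug, Clone, PartialEq, Eq)]
struct Candidate {
    policy_id: String,
    endpoint_id: String,
    /// False when the rule matched through a wildcard host pattern.
    exact_host: bool,
    /// True when the rule names the calling binary.
    binary_scoped: bool,
}

/// Pick one candidate using a total order, so the result does not depend on
/// rule iteration order. The precedence itself is an assumption.
fn select_match(candidates: &[Candidate]) -> Option<&Candidate> {
    candidates.iter().min_by(|a, b| {
        // `true` should sort first, so the boolean flags are compared in reverse.
        b.binary_scoped
            .cmp(&a.binary_scoped)
            .then(b.exact_host.cmp(&a.exact_host))
            .then_with(|| a.policy_id.cmp(&b.policy_id))
            .then_with(|| a.endpoint_id.cmp(&b.endpoint_id))
    })
}
```

Because the policy identifier and the endpoint metadata come from the same selected candidate, the logged policy name and the enforced endpoint configuration cannot diverge.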
## Protocol Enforcement @@ -46,15 +64,14 @@ Use a protocol enforcement value derived from endpoint policy: |-----------------|-------------|----------------| | omitted / `tcp` | None | L4 authorization plus byte relay, with optional HTTP sniff for credential injection | | `rest` | HTTP | HTTP request parser with REST rules, plus opt-in request-body and WebSocket text-frame credential rewrite | -| `graphql` | HTTP | HTTP request parser with GraphQL rules | +| `graphql` | HTTP | HTTP request parser with GraphQL-over-HTTP rules | | `websocket` | HTTP | HTTP upgrade policy followed by WebSocket frame policy or GraphQL-over-WebSocket policy | | future `redis`, `postgres`, `mysql`, ... | TCP application | Protocol-specific TCP parser owns the message loop | `protocol: tcp` is effectively the default L4 mode. It should not run TCP -application parsers. - -Avoid using the term "provider" for these parser concepts because providers -are already a first-class credential and routing domain in OpenShell. +application parsers. Avoid using the term "provider" for parser concepts +because providers are already a first-class credential and routing domain in +OpenShell. 
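The table rows above can be read as a mapping from the endpoint `protocol` field to a parser family. A minimal sketch follows, using simplified stand-in types rather than the RFC's suggested ones; `HttpL7`, `Enforcement`, and `enforcement_for` are hypothetical names, and rejecting unknown protocol values is an assumption (fail closed), not stated policy.

```rust
/// Simplified stand-ins for the enforcement concepts; hypothetical shapes.
#[derive(Debug, PartialEq, Eq)]
enum HttpL7 {
    Rest,
    Graphql,
    Websocket,
}

#[derive(Debug, PartialEq, Eq)]
enum Enforcement {
    /// L4 authorization plus byte relay; no application parser runs.
    L4Only,
    /// HTTP parser family with the given L7 rules.
    Http(HttpL7),
}

/// Map the endpoint `protocol` field to an enforcement value.
/// An omitted protocol and `tcp` both mean the default L4 mode.
fn enforcement_for(protocol: Option<&str>) -> Result<Enforcement, String> {
    match protocol {
        None | Some("tcp") => Ok(Enforcement::L4Only),
        Some("rest") => Ok(Enforcement::Http(HttpL7::Rest)),
        Some("graphql") => Ok(Enforcement::Http(HttpL7::Graphql)),
        Some("websocket") => Ok(Enforcement::Http(HttpL7::Websocket)),
        // Assumption: unknown values are rejected rather than treated as L4.
        Some(other) => Err(format!("unsupported protocol: {other}")),
    }
}
```

Future TCP application protocols such as `redis` would add a third variant; until then this sketch reports them as unsupported.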
## Suggested Types @@ -65,7 +82,9 @@ enum EgressTransport { Connect, ForwardHttp, TransparentTcp, + PolicyDns, LocalHttp, + MetadataLoopback, } struct EgressIntent { @@ -74,6 +93,7 @@ struct EgressIntent { process: ProcessIdentity, first_request: Option, local_route: Option, + generation: Option, } struct EgressDecision { @@ -83,11 +103,23 @@ struct EgressDecision { log_context: EgressLogContext, } +struct MatchedPolicy { + id: PolicyId, + source: PolicySource, +} + +enum PolicySource { + User, + ProviderDerived, + LocalService, +} + struct MatchedEndpoint { id: EndpointId, allowed_ips: AllowedIpPolicy, tls: TlsPolicy, enforcement: ProtocolEnforcement, + credentials: CredentialInjectionPlan, } enum ProtocolEnforcement { @@ -104,10 +136,30 @@ enum HttpL7Protocol { struct HttpL7Config { protocol: HttpL7Protocol, + path: EndpointPathScope, allow_encoded_slash: bool, + enforcement_mode: L7EnforcementMode, websocket_credential_rewrite: bool, request_body_credential_rewrite: bool, websocket_graphql_policy: bool, + graphql_max_body_bytes: usize, +} + +struct CredentialInjectionPlan { + static_placeholders: StaticPlaceholderPlan, + token_grant: Option, +} + +struct StaticPlaceholderPlan { + http_target_query_header: bool, + rest_request_body: bool, + websocket_text_frames: bool, +} + +struct TokenGrantPlan { + provider_key: String, + auth_style: TokenGrantAuthStyle, + token_endpoint: String, } struct RelayContext { @@ -122,26 +174,19 @@ struct RelayContext { validated destination and lets relays/parsers open an upstream connection only after protocol policy allows it. 
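The connector idea can be illustrated with a small sketch: the relay holds already-validated addresses and dials only after protocol policy allows the request. The field names, the `relay_if_allowed` helper, and the use of blocking `std::net` sockets are assumptions for illustration; the real proxy may be asynchronous and shaped differently.

```rust
use std::io;
use std::net::{SocketAddr, TcpStream};
use std::time::Duration;

/// Hypothetical connector: carries validated addresses, not an open socket.
struct UpstreamConnector {
    validated_addrs: Vec<SocketAddr>,
    connect_timeout: Duration,
}

impl UpstreamConnector {
    /// Dial the first validated address that accepts the connection.
    fn connect(&self) -> io::Result<TcpStream> {
        let mut last_err = io::Error::new(
            io::ErrorKind::AddrNotAvailable,
            "no validated upstream address",
        );
        for addr in &self.validated_addrs {
            match TcpStream::connect_timeout(addr, self.connect_timeout) {
                Ok(stream) => return Ok(stream),
                Err(err) => last_err = err,
            }
        }
        Err(last_err)
    }
}

/// Open the upstream only when protocol policy allowed the request.
/// A denied request returns `Ok(None)` without creating a TCP session.
fn relay_if_allowed(
    connector: &UpstreamConnector,
    policy_allows: bool,
) -> io::Result<Option<TcpStream>> {
    if !policy_allows {
        return Ok(None);
    }
    connector.connect().map(Some)
}
```

Keeping the dial behind the connector means a denied request never causes an upstream connection attempt, and relays cannot reach addresses that destination validation did not approve.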
-## Module Layout - -A future split could look like: - -| Module | Responsibility | -|--------|----------------| -| `proxy::adapter::connect` | Parse CONNECT and render CONNECT responses | -| `proxy::adapter::forward_http` | Parse absolute-form HTTP and preserve first request | -| `proxy::adapter::transparent_tcp` | Recover captured original destination | -| `proxy::adapter::policy_dns` | Answer eligible DNS queries and publish active mappings | -| `proxy::adapter::local` | Implement `inference.local` and `policy.local` surfaces | -| `proxy::auth` | Build decisions from intents and OPA results | -| `proxy::destination` | Resolve, filter, and validate destinations | -| `proxy::netfilter` | Own nftables bypass and future transparent capture rules | -| `proxy::relay::http` | HTTP request loop, credentials, REST/GraphQL/WebSocket upgrade policy | -| `proxy::relay::websocket` | WebSocket frame validation, text-frame rewrite, and message policy | -| `proxy::relay::tcp` | TCP byte relay and TCP application parser dispatch | -| `proxy::relay::tls` | Shared client-side TLS termination | -| `proxy::parser` | HTTP, WebSocket, and TCP application parser traits/config | -| `proxy::telemetry` | OCSF and tracing helpers | +## Current Owners And Proposed Cleanup + +| Current owner | Current responsibility | Proposed cleanup | +|---------------|------------------------|------------------| +| `openshell-sandbox` | Orchestrator, policy poll loop, denial/activity channels, metadata loopback startup, network-only lifecycle | Keep as orchestration; avoid embedding per-entry proxy policy decisions | +| `openshell-supervisor-network::run` | Networking startup and handles | Become the stable runtime API for embedded and future standalone modes | +| `openshell-supervisor-network::proxy` | CONNECT, forward HTTP, local route dispatch, destination validation, denial rendering | Split into adapters, authorization, destination, relay selection, and adapter response rendering | +| 
`openshell-supervisor-network::opa` | Policy engine and Rego queries | Return deterministic `EgressDecision` data instead of separate policy and endpoint lookups | +| `openshell-supervisor-network::l7` | REST, GraphQL, WebSocket, inference helpers, TLS, token grants | Keep as protocol/relay implementation behind shared relay boundaries | +| `openshell-supervisor-network::policy_local` | `policy.local` state and routes | Model as a local adapter with explicit limits and proposal/wait behavior | +| `openshell-supervisor-process::netns` | nftables bypass rules and namespace helpers | Remain owner of bypass enforcement; coordinate future capture rules with network proxy mappings | +| `openshell-supervisor-process::bypass_monitor` | nftables LOG parsing and OCSF bypass telemetry | Remain telemetry producer for bypass violations | +| `openshell-core::secrets` and provider credential state | Static placeholder sources and dynamic credential metadata | Feed credential injection plans; do not leak secrets into decision logs | ## Policy DNS And Resolved TCP State @@ -194,15 +239,23 @@ Two acceptable approaches: Endpoint metadata query failures should fail closed when metadata is required for the selected endpoint. They should not silently downgrade to L4 behavior. +Provider-derived policies use a reserved rule-name namespace. The gateway and +sandbox sync should prevent user-authored `_provider_*` rules, and +`policy.local` proposal surfaces should not expose provider-derived rules as +editable user policy. `EgressDecision` should still identify provider-derived +matches for logging and debugging. + ## Credential Injection Boundary -Credential injection belongs in the HTTP relay: +Credential injection belongs in the HTTP/WebSocket relay after policy allow and +before upstream write. -1. Authorization selects the endpoint and confirms credentials may be used. -2. The HTTP relay resolves credentials only when it has an allowed HTTP request. -3. 
Secrets are redacted from logs and policy-visible metadata. -4. The final upstream request or frame is rewritten with real credentials - immediately before write. +1. Authorization selects the endpoint and computes a credential injection plan. +2. The HTTP relay resolves credentials only when it has an allowed request. +3. Static placeholder values are resolved and redacted from logs. +4. Endpoint-bound token grants obtain or reuse a dynamic access token. +5. The final upstream request or WebSocket frame is rewritten immediately + before write. Both L4-only HTTP and HTTP-inspected paths can inject credentials. The difference is whether REST, GraphQL, or WebSocket policy is evaluated before @@ -215,7 +268,8 @@ Credential rewrite slots should be explicit: - client-to-server WebSocket text frames only when `websocket_credential_rewrite` is enabled; - GraphQL-over-WebSocket connection/control messages when they are carried in - text frames and the endpoint enables the WebSocket rewrite path. + text frames and the endpoint enables the WebSocket rewrite path; +- token grant headers for endpoint-bound provider credentials. Request-body rewrite is REST-only. It should buffer bounded UTF-8 textual bodies, including JSON, form-url-encoded, and `text/*`, recompute @@ -223,6 +277,12 @@ bodies, including JSON, form-url-encoded, and `text/*`, recompute credential markers, and fail closed when a reserved placeholder cannot be resolved safely. Binary WebSocket frames are not rewritten. +Token grants are dynamic credential injection. They use provider metadata to +request a SPIFFE JWT-SVID, exchange it for an OAuth2 access token, cache the +token, and inject either an `Authorization: Bearer` header or a configured +custom header. Token grant failures should return a local relay error and must +not forward the request upstream. + ## Parser Boundary Protocol parsers operate on streams owned by the relay. @@ -241,6 +301,24 @@ Protocol parsers operate on streams owned by the relay. 
This avoids a separate dial strategy enum. The parser knows which protocol milestone is sufficient to call the validated connector. +## Local Service Adapter Boundary + +Local services are network surfaces but not normal external egress: + +- `inference.local` terminates local client traffic, validates known inference + routes, strips caller auth, injects provider routing/auth, and applies + streaming or buffered limits based on route type. +- `policy.local` serves policy snapshots, denial summaries, proposal + submission, and proposal wait. It should never expose secrets or provider + rules as editable policy. +- Metadata loopback serves provider metadata credentials for SDKs that bypass + HTTP proxy variables. It should use the same provider credential state and + redaction discipline as other credential paths. + +These adapters may call gateway APIs or local credential helpers, but they +should not bypass policy and credential invariants that apply to external +egress. + ## Timeout And Resource Ownership | Owner | Resource | @@ -254,6 +332,7 @@ milestone is sufficient to call the validated connector. | TCP relay | Byte-copy idle timeout and half-close handling | | TCP parser | Protocol message timeouts and parser-specific limits | | Local service adapter | Local route body limits, response caps, gateway call timeout | +| Token grant resolver | SPIFFE Workload API timeout, token endpoint timeout, cache TTL | Timeouts should be recorded in telemetry at the owner boundary that can explain the failure.
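
As an illustration of recording a timeout at its owner boundary, the sketch below tags each timeout event with the owner that set the limit. The type names and the summary format are hypothetical, only the owners visible in the table rows above are modeled, and real OCSF fields are not represented.

```rust
use std::time::Duration;

/// Owners taken from the table rows above; the enum itself is hypothetical.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum TimeoutOwner {
    TcpRelay,
    TcpParser,
    LocalServiceAdapter,
    TokenGrantResolver,
}

/// Hypothetical telemetry record emitted by the boundary that owns the timer.
#[derive(Debug)]
struct TimeoutEvent {
    owner: TimeoutOwner,
    resource: &'static str,
    limit: Duration,
}

impl TimeoutEvent {
    /// Render a log-friendly summary naming the owner, resource, and limit.
    fn summary(&self) -> String {
        format!(
            "timeout owner={:?} resource={} limit_ms={}",
            self.owner,
            self.resource,
            self.limit.as_millis()
        )
    }
}
```

Naming the owner in the event lets an operator tell a slow token endpoint apart from an idle byte relay without correlating several log sources.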