docs(rfc): add RFC-0011 multi-player support design by derekwaynecarr · Pull Request #1980 · NVIDIA/OpenShell

derekwaynecarr · 2026-06-23T13:37:49Z

Summary

This RFC introduces multi-player support for OpenShell by adding namespaces as hard isolation boundaries, expanding the role model to five roles (Platform Admin, Namespace Admin, Operator, User, Service
Account), and threading ownership through the sandbox lifecycle. The Kubernetes compute driver gains two namespace mapping modes — managed (default), which creates gateway-scoped Kubernetes namespaces
(openshell-{gateway-id}-{namespace}), and operator mode for 1:1 passthrough to pre-existing namespaces. The design preserves backwards compatibility for single-player support via a default namespace.
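A minimal sketch of the two mapping modes described above, assuming the managed-mode pattern is a plain string template. The function name `k8s_namespace` and the validation step are illustrative and not part of the RFC. The only fact relied on beyond the RFC text is that Kubernetes namespace names must be DNS-1123 labels of at most 63 characters.

```python
import re

# Kubernetes namespace names must be DNS-1123 labels: lowercase alphanumerics
# or '-', starting and ending with an alphanumeric, at most 63 characters.
_DNS_LABEL = re.compile(r"[a-z0-9]([-a-z0-9]*[a-z0-9])?")


def k8s_namespace(mode: str, gateway_id: str, namespace: str) -> str:
    """Map an OpenShell namespace to a Kubernetes namespace name (hypothetical helper)."""
    if mode == "managed":
        # Gateway-scoped name, so two gateways in one cluster do not collide.
        name = f"openshell-{gateway_id}-{namespace}"
    elif mode == "operator":
        # 1:1 passthrough to a pre-existing Kubernetes namespace.
        name = namespace
    else:
        raise ValueError(f"unknown mapping mode: {mode!r}")
    if len(name) > 63 or not _DNS_LABEL.fullmatch(name):
        raise ValueError(f"not a valid Kubernetes namespace name: {name!r}")
    return name
```

For example, `k8s_namespace("managed", "gw1", "team-a")` yields `openshell-gw1-team-a`, while operator mode returns `team-a` unchanged. One consequence of the prefix is that managed mode leaves fewer of the 63 characters for the OpenShell namespace name itself.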

Related Issue

#1977

Changes

Namespaces as first-class hard isolation boundaries for sandboxes, providers, and policies, with a default namespace for backwards compatibility

Expanded role model from two-tier (admin/user) to five roles: Platform Admin, Namespace Admin, Operator, User, Service Account
Ownership tracking via created_by on ObjectMeta, with owner-scoped access guards on all sandbox operations
Kubernetes namespace mapping with two modes: managed (default, creates openshell-{gateway-id}-{namespace-name}) and operator (1:1 name passthrough to pre-existing K8s namespaces)
Multi-gateway cluster support via gateway-identifier-scoped Kubernetes namespace naming to avoid collisions
Provider credential scoping to namespaces, with delegation from Namespace Admins to users/service accounts
Policy inheritance where Namespace Admins can tighten (but not loosen) gateway-wide defaults
Multi-provider OIDC with identity federation, plus API key authentication for service accounts
Control-plane audit trail via OCSF ApiActivity events on every mutating gRPC call, with session attribution back to the creating principal
Per-namespace quotas for concurrent sandboxes, GPU allocations, and sandbox lifetime
Cost attribution metadata tagging sandbox consumption with owner, namespace, and labels
Sandbox sharing within namespaces (read-only or exec access) without global visibility
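To make the "tighten but not loosen" rule for policy inheritance concrete, here is a rough sketch. The `Policy` fields (`allowed_hosts`, `max_lifetime_s`) are hypothetical stand-ins and do not come from the RFC; the point is only that a namespace policy is accepted when every setting is at least as strict as the gateway-wide default.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Policy:
    allowed_hosts: frozenset   # hypothetical: egress allow-list
    max_lifetime_s: int        # hypothetical: sandbox lifetime cap in seconds


def is_tightening(gateway: Policy, namespace: Policy) -> bool:
    """True if the namespace policy only narrows the gateway default."""
    return (
        namespace.allowed_hosts <= gateway.allowed_hosts       # subset of allowed hosts
        and namespace.max_lifetime_s <= gateway.max_lifetime_s  # same or shorter lifetime
    )


gateway_default = Policy(frozenset({"api.example.com", "pypi.org"}), 3600)
tighter = Policy(frozenset({"pypi.org"}), 1800)
looser = Policy(frozenset({"pypi.org", "evil.example.net"}), 1800)
```

Here `is_tightening(gateway_default, tighter)` holds, while `looser` is rejected because it adds a host the gateway default does not allow.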

Testing

[x] mise run pre-commit passes
Unit tests added/updated
E2E tests added/updated (if applicable)

Checklist

[x] Follows Conventional Commits
[x] Commits are signed off (DCO)
[x] Architecture docs updated (if applicable)

copy-pr-bot · 2026-06-23T13:37:52Z

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

jhjaggars · 2026-06-23T13:57:36Z

+- **Phase 1: Namespace and ownership model.** Add `namespace` and `created_by`
+  fields to `ObjectMeta` in the proto. Implement namespace-scoped storage and
+  filtering in gRPC handlers. Create the `default` namespace for backwards
+  compatibility. Sandbox name uniqueness shifts from globally unique to


Is it critical to implement this in a backward compatible way right now?

If we didn't create a default namespace what would be the single-player UX?

The spirit of the default namespace is that a user never thinks about namespaces at all when using OpenShell in a single-player setup. The default (or some other token) is just there so that adding multi-player support introduces no friction in the single-player experience.

We could also (automatically) create a namespace per user account in single player mode. This sets us up to have a single gateway for the workstation while supporting different user accounts on that workstation.

I was mostly thinking about the upgrade path from N-1 to N (it's ok to require users to destroy their old setup and start over still). I think I naively assume we'll have some form of authentication for gateway users and can create them a user-associated namespace automatically. This probably works out to be about the same as default though.
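A small sketch of what this thread is circling around, assuming an in-memory store: a request that omits the namespace falls back to `default`, and sandbox names are unique per namespace rather than globally. The `SandboxStore` class and its method names are hypothetical; only the `namespace` and `created_by` fields and the `default` namespace come from the RFC text.

```python
DEFAULT_NAMESPACE = "default"


class SandboxStore:
    """Hypothetical in-memory store keyed by (namespace, name)."""

    def __init__(self):
        self._by_key = {}

    def create(self, name, namespace=None, created_by="local-user"):
        # Single-player callers never pass a namespace; they land in "default".
        ns = namespace or DEFAULT_NAMESPACE
        key = (ns, name)
        if key in self._by_key:
            raise KeyError(f"sandbox {name!r} already exists in namespace {ns!r}")
        record = {"namespace": ns, "name": name, "created_by": created_by}
        self._by_key[key] = record
        return record


store = SandboxStore()
store.create("dev")                      # stored as ("default", "dev")
store.create("dev", namespace="team-a")  # same name, different namespace: allowed
```

A per-user namespace in single-player mode, as suggested above, would only change how `ns` is chosen when the caller omits it.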

jhjaggars · 2026-06-23T14:03:49Z

+
+### Kubernetes Compute Driver: Namespace Mapping
+
+OpenShell namespaces are a logical concept. When the Kubernetes compute driver


I think it'd be useful to outline what behaviors/patterns we hope to enable and control via namespaces. I found myself inventing reasons that it'd be useful to have namespaces, but I think a list of practical applications would be helpful.

The simplest example I have is the friction we hit when doing a team-level gateway setup. Within a team, it's common for users to have their own dedicated API keys to access Claude or Codex, and these are private to the individual. That friction leads folks towards wanting a gateway per trust/security domain when a common gateway with some credential segmentation would suffice. This proposal enables that concept. It would also now be safe to share a sandbox (for connect/exec actions) among users in shared coding sessions, etc., since the literal credentials are left outside the sandbox.

@derekwaynecarr in that scenario, as described in the Provider Credential Scoping section, would a namespace admin have to create and lifecycle-manage every user's credentials in their namespace, or would each user have their own namespace and therefore be a namespace admin?

jhjaggars · 2026-06-23T14:06:50Z

+
+- What is the identity mapping strategy for multi-provider OIDC? If a user
+  authenticates via both corporate SSO and GitHub, how are those identities
+  linked to a single internal principal?


Should this even be a thing? Intuitively, this feels like 2 principals to me, or at least I'd not be surprised if it were treated that way. Granting me the same permissions twice and/or sharing with myself (my two principals) feels acceptable.
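The "two principals" reading could look roughly like the sketch below, assuming a principal is keyed by the standard OIDC `iss` and `sub` claims and no linking is attempted. The `Principal` type, the issuer URLs, and the `grants` table are illustrative only.

```python
from typing import NamedTuple


class Principal(NamedTuple):
    issuer: str   # OIDC "iss" claim
    subject: str  # OIDC "sub" claim


def principal_from_claims(claims: dict) -> Principal:
    """Derive the internal principal from token claims without identity linking."""
    return Principal(issuer=claims["iss"], subject=claims["sub"])


# The same human, authenticated through two providers, is two principals.
sso = principal_from_claims({"iss": "https://sso.example.com", "sub": "jdoe"})
github = principal_from_claims({"iss": "https://github.example.com", "sub": "12345"})

# Hypothetical grant table: the same role is simply granted twice.
grants = {sso: {"user"}, github: {"user"}}
```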

jhjaggars · 2026-06-23T14:07:21Z

+  authenticates via both corporate SSO and GitHub, how are those identities
+  linked to a single internal principal?
+
+- Should per-namespace quota limits be hard (reject sandbox creation) or soft


Sounds like a reasonable configuration option (to be implemented at any time)

jhjaggars · 2026-06-23T14:09:08Z

+  also be namespace-scoped from the start, or should they remain global and be
+  extended later as the organizational model matures?
+
+- In operator mode, should the driver validate that the target Kubernetes


Fail seems right here. Even if you check, it could go away by the time you try to make it.
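A sketch of the "just fail" approach, assuming a stand-in cluster client: the driver does not pre-check the namespace, it attempts the create and surfaces the error, which avoids the check-then-act race mentioned above. `FakeCluster`, `NamespaceNotFound`, and `create_sandbox` are hypothetical names, not the driver's API.

```python
class NamespaceNotFound(Exception):
    """Raised by the (fake) cluster when the target namespace is missing."""


class FakeCluster:
    """Stand-in for a Kubernetes client, for illustration only."""

    def __init__(self, namespaces):
        self.namespaces = set(namespaces)
        self.pods = []

    def create_pod(self, namespace, name):
        if namespace not in self.namespaces:
            raise NamespaceNotFound(namespace)
        self.pods.append((namespace, name))


def create_sandbox(cluster, namespace, name):
    # No existence check up front: the namespace could disappear between a
    # check and the create, so the create call itself is the source of truth.
    try:
        cluster.create_pod(namespace, name)
    except NamespaceNotFound as exc:
        raise RuntimeError(
            f"sandbox {name!r} rejected: Kubernetes namespace {namespace!r} does not exist"
        ) from exc
```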

jhjaggars · 2026-06-23T14:10:58Z

+|------|-------------|
+| **Platform Admin** | Manages gateway configuration, auth providers, compute drivers, and quotas. Full visibility across all namespaces. |
+| **Namespace Admin** | Manages users, providers, policies, and quotas within a single namespace. Cannot change gateway infra or access other namespaces. |
+| **Operator** | Read-only view of all sandboxes and audit logs across namespaces for monitoring, incident response, and compliance. Cannot create or modify sandboxes. |


I still like the term Auditor for this. Maybe operator means the same in other (kube?) communities?

i like auditor as well.

Will auditor/operator have elevated security privileges? It could be a security concern for Sandboxed applications with sensitive data/credentials.
At the same time, how are they helping enforce compliance?

jhjaggars · 2026-06-23T14:12:20Z

+  currently does not have a durable store beyond configuration files.
+
+- Which resources beyond sandboxes are namespace-scoped? Sandboxes are the
+  primary namespaced resource. Should settings, policies, and provider configs


Earlier in the doc you said this:

Providers belong to a namespace.

Should probably ask the agent to make sure the entire document agrees with itself.

haha, good catch! will update.

Signed-off-by: Derek Carr <decarr@redhat.com>

drew · 2026-06-23T14:48:19Z

Could we rename this to RFC-0011? I'll reserve the number in our tracker 😄.

maxamillion · 2026-06-23T20:46:56Z

+- Gives a clear security boundary (namespace) without over-modeling
+  organizational hierarchy.
+- Allows multiple overlapping groupings within a namespace via labels.
+- Reuses Kubernetes-style patterns that users already understand.


I'm not necessarily against this, but I think assuming users of AI Agents already understand kubernetes design patterns might be a stretch.

maxamillion · 2026-06-23T21:46:39Z

+  unique-within-namespace. Existing sandboxes are backfilled into the `default`
+  namespace. All existing single-player behavior continues unchanged.
+
+- **Phase 2: Kubernetes driver — managed mode (default).** The driver creates


I was thinking about building out a Podman driver version of this too. The scenario I have in mind is where a user/student/homelabber who wants to tinker with their team or learn on their own with minimal setup and admin overhead could spin up a Linux VM or cloud instance, install the openshell-gateway rpm, run the openshell gateway service, and create an OpenShell namespace for themselves or each member of their team. This obviously won't scale and is a single point of failure, but could be an interesting means to test it out and provides an easy/simple path to "my agents aren't running on my laptop".

Basically one local Linux user openshell, one rootless Podman socket, one gateway process, many OpenShell namespaces, one Podman network per OpenShell namespace, one workspace volume per sandbox, no arbitrary bind mounts, namespace-scoped volumes/providers, gateway-enforced RBAC/quotas, OIDC/API-key auth required, OCSF attribution, and a strict shared-host driver mode that the user would have to opt into.

Or maybe that's a fool's errand and the answer is just to show people how to do this with kind, minikube, k3s, or microshift. Thoughts? 🤔

drew · 2026-06-29T17:00:24Z

+- Gives a clear security boundary (namespace) without over-modeling
+  organizational hierarchy.
+- Allows multiple overlapping groupings within a namespace via labels.
+- Reuses Kubernetes-style patterns that users already understand.


Alternatively, namespace might become overloaded and ambiguous w.r.t kubernetes. workspace or project might be a more product-oriented term to use.

Disambiguating also keeps the door open if we want to evolve this unit in a way that doesn't match Kubernetes namespace semantics. For example, maybe a Sandbox can create a temp workspace to spawn multiple subagents in. In that case, we might not want a 1:1 mapping between OpenShell and Kubernetes namespaces.

FWIW I don't have a strong objection to using namespace either, just curious to explore alternatives since naming is hard and this is largely a one way door.

@drew naming is hard, i am happy with workspace as well.

drew · 2026-06-29T19:31:17Z

+their namespace when creating a sandbox. Users cannot see raw credential
+material; they reference providers by name. Namespace Admins grant specific


"Users cannot see raw credential material": should other roles be able to see raw credential material? E.g., a service account role could decrypt credentials to be used by the supervisor.

drew · 2026-06-29T19:33:43Z

+- **Users** can only exec into, delete, or view sandboxes they own within their
+  namespace.


Suggested change

-- **Users** can only exec into, delete, or view sandboxes they own within their
-  namespace.
+- **Users** can only create, exec into, delete, or view sandboxes they own within their
+  namespace.

Is it correct to say users have full CRUD access over all resources that are owned by them?

User should be granted access to Sandboxes using standard K8s resource rbac, correct?
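For discussion, an owner-scoped guard along these lines might look like the sketch below. The role strings, the `Caller` and `Sandbox` types, and the action names are hypothetical. Treating Platform Admin and Namespace Admin as allowed to act on sandboxes they do not own is an assumption the RFC text quoted here does not settle; the Operator and User branches follow the role table and the bullet under review.

```python
from dataclasses import dataclass

READ_ACTIONS = {"view"}


@dataclass(frozen=True)
class Caller:
    subject: str
    role: str        # "platform_admin" | "namespace_admin" | "operator" | "user"
    namespace: str


@dataclass(frozen=True)
class Sandbox:
    name: str
    namespace: str
    created_by: str


def is_allowed(caller: Caller, action: str, sandbox: Sandbox) -> bool:
    if caller.role == "platform_admin":
        return True                          # assumption: full access everywhere
    if caller.role == "operator":
        return action in READ_ACTIONS        # read-only across namespaces
    if caller.namespace != sandbox.namespace:
        return False                         # namespace is the hard boundary
    if caller.role == "namespace_admin":
        return True                          # assumption: all sandboxes in own namespace
    if caller.role == "user":
        return sandbox.created_by == caller.subject  # owner-scoped
    return False
```

Creation has no existing `created_by` to compare against, so a real guard would presumably check only the caller's namespace for `create` and stamp `created_by` afterwards.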

drew · 2026-06-29T19:36:38Z

+A User can share a sandbox with another user within the same namespace
+(read-only or exec access) without making it globally visible. Platform Admins
+can grant targeted cross-namespace access for specific use cases (e.g., a shared
+services namespace).


I'm curious how we represent this in the data model and enable it in the UX. E.g., do we want to create an explicit share method? Do we capture and store a list of principals each resource is shared with?

Also wondering about transitive resources. For example, if I shared a Sandbox with providers attached, do those transitive providers also become shared?

@drew i had imagined a share method with a list of principals, and likely some type of shared-with-me grpc endpoint to see those explicit sandboxes.
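As a rough illustration of that idea, the sketch below stores a per-sandbox map of principals to access level and derives a "shared with me" listing from it. The `shared_with` field, the `share` and `shared_with_me` functions, and the access strings are hypothetical, and the same-namespace check on the recipient is left out for brevity.

```python
from dataclasses import dataclass, field


@dataclass
class Sandbox:
    name: str
    namespace: str
    created_by: str
    # principal -> "read" | "exec"; hypothetical representation of the share list
    shared_with: dict = field(default_factory=dict)


def share(sandbox: Sandbox, caller: str, principal: str, access: str) -> None:
    """Grant `principal` read-only or exec access; only the owner may share."""
    if caller != sandbox.created_by:
        raise PermissionError("only the owner can share a sandbox")
    if access not in ("read", "exec"):
        raise ValueError(f"unknown access level: {access!r}")
    sandbox.shared_with[principal] = access


def shared_with_me(sandboxes, principal: str):
    """What a shared-with-me endpoint might return for `principal`."""
    return [s.name for s in sandboxes if principal in s.shared_with]
```

This leaves the transitive-provider question open: nothing in the share list says whether providers attached to the sandbox are usable by the recipient.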

drew · 2026-06-29T20:04:10Z

+### Audit Trail
+
+- **Control-plane audit log.** Every mutating gRPC call (`CreateSandbox`,
+  `DeleteSandbox`, `CreateProvider`, `UpdatePolicy`) emits an OCSF
+  `ConfigStateChange` or `ApiActivity` event with the authenticated principal,
+  action, target resource, and timestamp. Builds on the existing OCSF
+  infrastructure.
+- **Session attribution.** Sandbox activity (network, process, SSH events)
+  tagged with the creating principal's subject, so security teams can trace
+  sandbox behavior back to a human or service account.
+- **Audit log export.** Structured OCSF JSONL shipped to SIEM/log aggregation.
+  Operators can query "who created sandbox X" or "what did user Y do between T1
+  and T2."


This feels somewhat orthogonal to the multi-player design. I wonder if we could break this out into a separate effort. Does anything related to auditing depend on the outcome of multi-player mode?
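For reference while deciding where this belongs, a control-plane audit record of the kind quoted above could be sketched as follows. The field names loosely follow OCSF conventions but this is a simplified subset, not a schema-complete event, and `audit_event` and `to_jsonl` are hypothetical helpers.

```python
import json
from datetime import datetime, timezone


def audit_event(principal: str, action: str, resource: str, now=None) -> dict:
    """Build a simplified audit record for one mutating gRPC call."""
    now = now or datetime.now(timezone.utc)
    return {
        "class_name": "API Activity",
        "time": int(now.timestamp() * 1000),   # milliseconds since the epoch
        "actor": {"user": {"name": principal}},
        "api": {"operation": action},
        "resources": [{"name": resource}],
    }


def to_jsonl(events) -> str:
    """Serialize events one JSON object per line, ready for export."""
    return "".join(json.dumps(e, sort_keys=True) + "\n" for e in events)
```

The only multi-player dependency visible in this shape is the `actor` field: without authenticated principals there is nothing meaningful to put there.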

drew · 2026-06-29T20:08:08Z

+- **Per-namespace quotas.** Max concurrent sandboxes, max GPU allocations, max
+  sandbox lifetime per namespace. Enforced at the gateway before sandbox
+  creation.


How do you see us representing these values? Maybe as properties on namespace data model?

i actually was debating dropping some of these quota elements, or thinking about them in terms of an out-of-band add-on. I had put the use cases here primarily to drive discussion as we figure out project boundary. In general, I think to protect the openshell control/data-plane, we will need some way to protect against DoS or abuse of a target compute driver, so some type of quota system is potentially useful.

+1 to keeping this an out-of-band add-on rather than core, and to framing it as
DoS/abuse protection rather than chargeback. That's how we run our multi-tenant agent
fleet at DeepInfra: quota lives in the control plane and the compute layer stays a simple
driver.
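If quota does stay in some form, an admission check at the gateway could be as small as the sketch below. Representing the limit as a `Quota` value attached to the namespace is one possible answer to the data-model question above. The `hard` flag and the reading of "soft" as "admit but warn" are assumptions, since the RFC text quoted in this PR does not spell out soft-limit behavior.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Quota:
    max_concurrent_sandboxes: int
    hard: bool = True   # assumption: hard rejects, soft admits with a warning


def admit(quota: Quota, running: int):
    """Return (admitted, warning) for a sandbox creation request."""
    if running < quota.max_concurrent_sandboxes:
        return True, None
    msg = f"namespace at quota: {running}/{quota.max_concurrent_sandboxes} sandboxes"
    if quota.hard:
        return False, msg   # reject before the compute driver is ever called
    return True, msg        # soft limit: allow, but surface the warning
```

Because the check runs before the driver is invoked, the compute layer does not need to know quotas exist, which matches the "out-of-band add-on" framing.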

drew · 2026-06-29T20:12:34Z

+- **Agent orchestration.** One agent's service account creates sandboxes for
+  sub-agents, each getting their own sandbox principal. The parent service
+  account retains visibility.


Does the parent service account create "sub" service accounts?

Does each sandbox have to use a unique service account, or can it be shared while still maintaining a unique agent identity, for example by using the pod name in its SPIFFE ID?
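To illustrate the second option, here is a sketch of a shared service account with per-pod identities. SPIFFE IDs have the form `spiffe://<trust domain>/<path>`; the specific path layout below (`ns/.../sa/.../pod/...`), the trust domain, and the function name are hypothetical.

```python
def sandbox_spiffe_id(trust_domain: str, namespace: str,
                      service_account: str, pod_name: str) -> str:
    """Build a per-sandbox SPIFFE ID under a shared service account (hypothetical layout)."""
    return f"spiffe://{trust_domain}/ns/{namespace}/sa/{service_account}/pod/{pod_name}"


# Two sub-agent sandboxes share one service account but remain distinguishable.
parent_sa = "agent-orchestrator"
id_a = sandbox_spiffe_id("openshell.example", "team-a", parent_sa, "sandbox-a")
id_b = sandbox_spiffe_id("openshell.example", "team-a", parent_sa, "sandbox-b")
```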

dhirajsb · 2026-07-01T07:38:19Z

We are missing a persona in this approach. CMIW, but the user persona as modeled sounds more like an end-user who's directly interacting with an agent in a Sandbox.

OpenShell multi-user architecture should support agentic app development for app developers. It should leverage K8s rbac for access control in app namespaces. It should also decouple policy management through Gateway in such a way that app developers or app workloads (or compromised workloads) can't override certain org/platform wide security policies enforced via Gateway.

I hope that makes sense.

ats3v · 2026-07-01T12:03:16Z

+  Namespace Admin tightens a policy, does it retroactively affect shared
+  sandboxes that were created under the looser policy?
+
+- What is the storage backend for API keys and quota state? The gateway


I think this may already be answered on main. There's a durable object store now (SQLite/Postgres, with optimistic concurrency), and #1577 layered a reconciler lease on top of it for HA. API key and quota state could build on that, so this question might be closable, or narrowed to just what those records should look like.
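As a sketch of what "build on that" could mean for quota state, the example below uses a version column for optimistic concurrency in SQLite. The table name, columns, and helper functions are hypothetical and are not the schema of the store on main.

```python
import sqlite3


def open_store() -> sqlite3.Connection:
    """Create an in-memory store with a hypothetical quota_state table."""
    db = sqlite3.connect(":memory:")
    db.execute(
        "CREATE TABLE quota_state ("
        " namespace TEXT PRIMARY KEY,"
        " running INTEGER NOT NULL,"
        " version INTEGER NOT NULL)"
    )
    return db


def read(db, namespace):
    """Return (running, version) for a namespace, or None if absent."""
    return db.execute(
        "SELECT running, version FROM quota_state WHERE namespace = ?",
        (namespace,),
    ).fetchone()


def compare_and_set(db, namespace, running, expected_version) -> bool:
    """Write only if nobody else updated the row since we read it."""
    cur = db.execute(
        "UPDATE quota_state SET running = ?, version = version + 1"
        " WHERE namespace = ? AND version = ?",
        (running, namespace, expected_version),
    )
    db.commit()
    return cur.rowcount == 1   # 0 rows means a concurrent writer won; re-read and retry
```

A caller reads the row, computes the new count, and calls `compare_and_set` with the version it saw. A `False` result means the write lost a race and should be retried against fresh state.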

derekwaynecarr requested review from a team, maxamillion and mrunalp as code owners June 23, 2026 13:37

jhjaggars reviewed Jun 23, 2026


docs(rfc): add RFC 1977 multi-player support design

85e9054

Signed-off-by: Derek Carr <decarr@redhat.com>

derekwaynecarr force-pushed the decarr/multi-player-design branch from 3713b9b to 85e9054 Compare June 23, 2026 14:16

drew added this to OpenShell Roadmap Jun 23, 2026

github-project-automation Bot moved this to Todo in OpenShell Roadmap Jun 23, 2026

drew added the rfc label Jun 23, 2026

johntmyers added this to the OpenShell Beta milestone Jun 23, 2026

maxamillion reviewed Jun 23, 2026


johntmyers changed the title ~~docs(rfc): add RFC 1977 multi-player support design~~ docs(rfc): add RFC-0011 multi-player support design Jun 25, 2026

rhuss mentioned this pull request Jun 26, 2026

Feat: openshell per-user sandbox ownership e2e tests and docs kagenti/kagenti#1990

Merged

3 tasks

drew reviewed Jun 29, 2026


ats3v reviewed Jul 1, 2026



