feat(EC-1816): add multi-component stress benchmark by dheerajodha · Pull Request #3331 · conforma/cli

dheerajodha · 2026-06-04T13:28:32Z

- Adds a stress benchmark under benchmark/stress/ that validates a multi-component snapshot with 35 workers, simulating the workload that caused the OOM incident (EC-1805).
- Component count (EC_STRESS_COMPONENTS, default 10) and worker count (EC_STRESS_WORKERS, default 35) are parameterized via env vars for CI tuning.
- Reuses existing benchmark infrastructure (benchmark/internal/suite, registry, untar) and the same golden-container image data, duplicated across components at runtime.

coderabbitai · 2026-06-04T13:28:41Z

📝 Walkthrough

Three new files are added under benchmark/stress/: a Bash script to prepare data.tar.gz by pulling from Quay or regenerating it locally, a Bash script to push that archive to Quay as an OCI artifact, and a Go benchmark program that consumes the archive to run concurrent validate image policy evaluations using the golang.org/x/benchmarks/driver.

Changes

Stress Benchmark Infrastructure

| Layer / File(s) | Summary |
|---|---|
| Data archive preparation and push scripts: `benchmark/stress/prepare_data.sh`, `benchmark/stress/push_data.sh` | `prepare_data.sh` tries `oras pull` for the prebuilt `stress-v1` archive from Quay and falls back to regenerating it via `go run benchmark/offliner` plus a cloned git policy dependency; `push_data.sh` validates `data.tar.gz` exists and uploads it to `quay.io/conforma/benchmark-data:stress-v1` via `oras push`. Both use strict shell error handling. |
| Benchmark wiring, env config, workspace setup, and snapshot building: `benchmark/stress/stress.go` | Registers the `Stress` benchmark with the driver; `envInt` parses `EC_STRESS_COMPONENTS` and `EC_STRESS_WORKERS` with defaults and minimum-value validation; `setup` untars `data.tar.gz`, starts a local registry from the extracted data, and returns a cleanup closure; snapshot structs and `buildSnapshot(n)` generate N identical components with fixed image and git source metadata as JSON. |
| Policy execution and parallel run: `benchmark/stress/stress.go` | `ec(dir, components, workers)` constructs a JSON policy string and returns a `suite.Execute` closure with `validate`/`image` subcommands, snapshot, policy, worker count, and a fixed effective time; `run` wraps that closure in `driver.Parallel` for concurrent benchmark iterations. |
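The snapshot-building step described in the table can be sketched roughly as follows. This is a minimal illustration, not the code from stress.go: only the `snapshot` struct with its `components` field and the `buildSnapshot(n)` name appear in this PR. The remaining field names, the per-component naming scheme, and the image, URL, and revision values are placeholder assumptions.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// gitSource and source model a nested "source.git" object.
// These field names are assumptions, not copied from stress.go.
type gitSource struct {
	URL      string `json:"url"`
	Revision string `json:"revision"`
}

type source struct {
	Git gitSource `json:"git"`
}

// component is one entry of the snapshot (field names assumed).
type component struct {
	Name           string `json:"name"`
	ContainerImage string `json:"containerImage"`
	Source         source `json:"source"`
}

type snapshot struct {
	Components []component `json:"components"`
}

// buildSnapshot returns a JSON snapshot holding n components that share the
// same image and git metadata and differ only by an indexed name.
func buildSnapshot(n int) string {
	s := snapshot{Components: make([]component, 0, n)}
	for i := 0; i < n; i++ {
		s.Components = append(s.Components, component{
			Name:           fmt.Sprintf("component-%d", i),
			ContainerImage: "registry.example/golden-container:latest", // placeholder
			Source: source{Git: gitSource{
				URL:      "https://example.com/golden-container.git", // placeholder
				Revision: "abc1234",                                  // placeholder
			}},
		})
	}
	data, err := json.Marshal(s)
	if err != nil {
		panic(err)
	}
	return string(data)
}

func main() {
	fmt.Println(buildSnapshot(2))
}
```

Because every component points at the same image, the archive only needs to carry one copy of the image data while the validator still processes n entries.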

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~12 minutes

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
|---|---|---|---|
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (4 passed)

| Check name | Status | Explanation |
|---|---|---|
| Title check | ✅ Passed | The title clearly and concisely summarizes the main change: adding a stress benchmark for multiple components, which aligns with the changeset. |
| Description check | ✅ Passed | The description is directly related to the changeset, providing context about the stress benchmark's purpose, parameterized components, worker counts, and reuse of existing infrastructure. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |

fullsend-ai-review · 2026-06-04T13:37:27Z

Review

Findings

Medium

[stale-reference] benchmark/stress/stress.go:97 — The git source URL for golden-container uses the old organization name enterprise-contract (https://github.com/enterprise-contract/golden-container.git) while the rest of the codebase (simple benchmark, hack scripts) has migrated to https://github.com/conforma/golden-container. The git revision 8327c1ce7472b017b9396fe26d5d5e1ed0eb61cc also differs from the simple benchmark's 2dec8f515a64ef2f21ee3e7b1ed41da77a5c5a9a, suggesting it may reference a commit in the old repo that could become unavailable if the old repo is archived.
Remediation: Use https://github.com/conforma/golden-container to match the existing simple benchmark pattern, and verify the revision hash exists in the conforma fork.

Low

[edge-case] benchmark/stress/stress.go:42 — The envInt function panics on values < 1 but does not guard against unreasonably large values. For EC_STRESS_COMPONENTS, an extremely large value would cause buildSnapshot to allocate a massive slice, likely causing an OOM before the benchmark runs. Minor robustness concern since this is a developer tool.
[incomplete-doc] benchmark/README.md — The benchmark README describes benchmarks generically but doesn't mention the new stress benchmark, its env vars (EC_STRESS_COMPONENTS, EC_STRESS_WORKERS), or its distinct purpose of simulating multi-component workloads.

Info

[pattern-violation] benchmark/stress/stress.go:119 — The policy JSON string in the stress benchmark is well-formed JSON (no trailing commas), while the simple benchmark's policy string contains trailing commas (invalid JSON). The stress benchmark is more correct here, but the inconsistency between benchmarks is notable.

Previous run

Review

Findings

Low

[resource-leak] benchmark/stress/stress.go:68 — In setup(), if registry.Launch() fails, the temporary directory created by untar.UnTar() is never cleaned up. The panic exits without removing the temp dir. This matches the existing pattern in benchmark/simple/simple.go and the OS reclaims the directory on process exit, so practical risk is minimal.
Remediation: Call os.RemoveAll(dir) before panicking on registry.Launch failure, or defer cleanup unconditionally.
[missing-input-validation] benchmark/stress/stress.go:52 — envInt accepts zero and negative values for EC_STRESS_COMPONENTS and EC_STRESS_WORKERS. Setting these to zero or negative values could produce confusing benchmark results.
Remediation: Add a check that the returned value is at least 1.

Info

[sub-agent-failure] The style-conventions, intent-coherence, and docs-currency sub-agents did not return findings: model claude-sonnet-4-5@20250929 not available on the deployment. These are sonnet-tier dimensions; correctness (opus) completed successfully.

Previous run (2)

Review

Findings

Low

[edge-case] benchmark/stress/stress.go:56 — envInt does not validate that the returned integer is positive. Setting EC_STRESS_COMPONENTS=0 produces a snapshot with zero components, and EC_STRESS_WORKERS=0 or a negative value is passed directly to --workers. These degenerate inputs silently produce meaningless benchmark results rather than failing fast. Consider adding a lower-bound check (e.g., if n < 1 { panic(...) }).

Info

[design-direction] benchmark/stress/ — The stress benchmark is well-placed in the benchmark infrastructure alongside the existing simple benchmark. Both use golang.org/x/benchmarks/driver for performance measurement. If the goal evolves toward OOM reproduction/regression testing specifically, acceptance tests could complement this benchmark.
[sub-agent-gap] The style-conventions sub-agent could not access PR branch files. Manual inspection confirms the stress benchmark follows the established patterns from benchmark/simple/: identical setup()/Closer lifecycle, same driver.Parallel(n, 1, fn) shape, consistent license headers, and proper reuse of benchmark/internal/ packages.

fullsend-ai-review · 2026-06-04T13:37:29Z

+	}
+	n, err := strconv.Atoi(v)
+	if err != nil {
+		panic(fmt.Sprintf("invalid %s value %q: %v", name, v, err))


[low] edge-case

envInt does not validate that the returned integer is positive. Setting EC_STRESS_COMPONENTS=0 produces a snapshot with zero components, and EC_STRESS_WORKERS=0 or a negative value is passed directly to --workers. These degenerate inputs silently produce meaningless benchmark results rather than failing fast.

Suggested fix: Add a lower-bound check after parsing: if n < 1 { panic(fmt.Sprintf("%s must be >= 1, got %d", name, n)) }
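A minimal sketch of the suggested lower-bound check, assuming the `envInt(name, fallback)` shape visible in the diff hunks of this PR. The `parsePositive` helper is hypothetical and only exists here to keep the validation separate from the environment lookup; the real function may be structured differently.

```go
package main

import (
	"fmt"
	"os"
	"strconv"
)

// parsePositive (hypothetical helper) converts raw to an int and rejects
// anything below 1, so zero and negative settings fail fast.
func parsePositive(name, raw string) (int, error) {
	n, err := strconv.Atoi(raw)
	if err != nil {
		return 0, fmt.Errorf("invalid %s value %q: %v", name, raw, err)
	}
	if n < 1 {
		return 0, fmt.Errorf("%s must be >= 1, got %d", name, n)
	}
	return n, nil
}

// envInt returns fallback when the variable is unset and panics when the
// variable is set to something that is not a positive integer.
func envInt(name string, fallback int) int {
	v, ok := os.LookupEnv(name)
	if !ok {
		return fallback
	}
	n, err := parsePositive(name, v)
	if err != nil {
		panic(err)
	}
	return n
}

func main() {
	fmt.Println(envInt("EC_STRESS_COMPONENTS", 10), envInt("EC_STRESS_WORKERS", 35))
}
```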

already addressed.

codecov · 2026-06-04T13:40:00Z

Codecov Report

❌ Patch coverage is 0% with 80 lines in your changes missing coverage. Please review.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| benchmark/stress/stress.go | 0.00% | 80 Missing ⚠️ |

| Flag | Coverage Δ | |
|---|---|---|
| acceptance | `53.56% <ø> (-2.04%)` | ⬇️ |
| generative | `16.76% <0.00%> (-1.06%)` | ⬇️ |
| integration | `27.62% <0.00%> (+1.06%)` | ⬆️ |
| unit | `68.61% <0.00%> (-0.44%)` | ⬇️ |

Flags with carried forward coverage won't be shown.

| Files with missing lines | Coverage Δ |
|---|---|
| benchmark/stress/stress.go | `0.00% <0.00%> (ø)` |

... and 13 files with indirect coverage changes

Add a stress benchmark under benchmark/stress/ that validates a multi-component snapshot with configurable worker count, simulating real-world release pipeline workloads that caused OOM (EC-1805).

- Component count controlled via EC_STRESS_COMPONENTS (default 10)
- Worker count controlled via EC_STRESS_WORKERS (default 35)
- Uses the same golden-container image as the simple benchmark, duplicated across components at runtime
- Reuses the existing benchmark/internal/suite harness
- Includes prepare_data.sh to regenerate offline data archive
- Automatically supported by make benchmark_stress via Makefile wildcard rules

Resolves: EC-1816
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

fullsend-ai-review · 2026-06-11T12:58:45Z

🤖 Finished Review · ✅ Success · Started 12:58 PM UTC · Completed 1:06 PM UTC
Commit: 47d3320 · View workflow run →

fullsend-ai-review · 2026-06-11T13:06:33Z

+	}
+
+	closer, err := registry.Launch(path.Join(dir, "data/registry/data"))
+	if err != nil {


[low] resource-leak

In setup(), if registry.Launch() fails, the temporary directory created by untar.UnTar() is never cleaned up. The panic exits without removing the temp dir. This matches the existing pattern in benchmark/simple/simple.go and the OS reclaims the directory on process exit, so practical risk is minimal.

Suggested fix: Call os.RemoveAll(dir) before panicking on registry.Launch failure, or defer cleanup unconditionally.
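The suggested cleanup can be illustrated with a self-contained sketch. The internal `untar.UnTar` and `registry.Launch` packages are not reproduced here: `os.MkdirTemp` stands in for the extraction step, and `launchFunc` is a hypothetical signature standing in for the registry launcher. The sketch also returns an error instead of panicking, which is a simplification for demonstration.

```go
package main

import (
	"errors"
	"fmt"
	"os"
)

// errLaunch simulates a registry start-up failure for the demo in main.
var errLaunch = errors.New("registry failed to start")

// launchFunc is a hypothetical stand-in for registry.Launch: it starts a
// service from dir and returns a function that stops it.
type launchFunc func(dir string) (closer func(), err error)

// setup creates a temporary workspace, launches the service from it, and
// removes the workspace again if the launch fails.
func setup(launch launchFunc) (string, func(), error) {
	dir, err := os.MkdirTemp("", "stress-*")
	if err != nil {
		return "", nil, err
	}
	closer, err := launch(dir)
	if err != nil {
		// Remove the workspace before reporting the failure so it does not leak.
		_ = os.RemoveAll(dir)
		return "", nil, fmt.Errorf("launching registry from %s: %w", dir, err)
	}
	cleanup := func() {
		closer()
		_ = os.RemoveAll(dir)
	}
	return dir, cleanup, nil
}

// dirExists reports whether p can be stat'ed.
func dirExists(p string) bool {
	_, err := os.Stat(p)
	return err == nil
}

func main() {
	var workspace string
	_, _, err := setup(func(dir string) (func(), error) {
		workspace = dir
		return nil, errLaunch
	})
	fmt.Println("error:", err)
	fmt.Println("workspace still exists:", dirExists(workspace))
}
```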

Since the current code matches the current setup in simple.go file, and the OS reclaims the dir on process exit, let's not apply this fix.

fullsend-ai-review · 2026-06-11T13:06:33Z

+func envInt(name string, fallback int) int {
+	v, ok := os.LookupEnv(name)
+	if !ok {
+		return fallback


[low] missing-input-validation

envInt accepts zero and negative values for EC_STRESS_COMPONENTS and EC_STRESS_WORKERS. Setting these to zero or negative values could produce confusing benchmark results.

Suggested fix: Add a check that the returned value is at least 1.

good point, addressed.

Pull pre-built data.tar.gz from quay.io/conforma/benchmark-data in prepare_data.sh, falling back to upstream regeneration. Add push_data.sh for uploading the archive.

Resolves: EC-1816
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

fullsend-ai-review · 2026-06-16T12:25:46Z

🤖 Review · Started 12:25 PM UTC
Commit: 47d3320 · View workflow run →

Reject zero and negative values for EC_STRESS_COMPONENTS and EC_STRESS_WORKERS to fail fast instead of producing meaningless benchmark results.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

fullsend-ai-review · 2026-06-16T12:34:15Z

🤖 Finished Review · ✅ Success · Started 12:34 PM UTC · Completed 12:44 PM UTC
Commit: 47d3320 · View workflow run →

coderabbitai

Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@benchmark/stress/prepare_data.sh`:
- Around line 30-36: The oras pull command suppresses error output with
2>/dev/null and the script always falls back to regenerating from upstream on
failure, making CI runs non-reproducible and hiding infrastructure issues.
Remove the error suppression (2>/dev/null) from the oras pull command on line 30
and restructure the logic so that if the oras pull fails, the script exits with
an error rather than continuing to the regeneration fallback. This ensures
benchmark input remains deterministic and surfaces any Quay or authentication
failures instead of silently working around them.

In `@benchmark/stress/stress.go`:
- Around line 26-38: The imports in the stress.go file are not properly ordered
according to the gci formatting standards. Run the project's Go import
formatting tool (typically gci write or go fmt) on the stress.go file to
automatically reorder the imports into the correct grouping: standard library
imports first, followed by blank line, then third-party imports (like
golang.org/x/benchmarks), followed by blank line, then local package imports
(like github.com/conforma/cli). This will resolve the gci formatting check
failure.

ℹ️ Review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Enterprise

Run ID: 03e32d5c-c048-4271-92dd-eba0be016eaa

📥 Commits

Reviewing files that changed from the base of the PR and between c6df9ad and aa42a5a.

📒 Files selected for processing (3)

benchmark/stress/prepare_data.sh
benchmark/stress/push_data.sh
benchmark/stress/stress.go

coderabbitai · 2026-06-16T12:41:50Z

+if command -v oras &>/dev/null && oras pull "${quay_ref}" -o . 2>/dev/null; then
+    echo "Downloaded data.tar.gz from ${quay_ref}"
+    exit 0
+fi
+
+echo "Quay pull failed or oras not available, regenerating from upstream..."
+


⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Fail closed on Quay pull failures to keep benchmark input deterministic

Line 30 suppresses pull errors and Line 35 always regenerates from upstream on any failure. That makes CI runs non-reproducible and can hide artifact-hosting/auth outages behind a “successful” local rebuild.

Suggested change

-if command -v oras &>/dev/null && oras pull "${quay_ref}" -o . 2>/dev/null; then
+if command -v oras &>/dev/null && oras pull "${quay_ref}" -o .; then
     echo "Downloaded data.tar.gz from ${quay_ref}"
     exit 0
 fi

-echo "Quay pull failed or oras not available, regenerating from upstream..."
+if [[ "${EC_STRESS_ALLOW_REGEN:-0}" != "1" ]]; then
+    echo "Failed to pull ${quay_ref}. Set EC_STRESS_ALLOW_REGEN=1 to regenerate from upstream." >&2
+    exit 1
+fi
+echo "Quay pull failed or oras not available, regenerating from upstream (EC_STRESS_ALLOW_REGEN=1)..."

coderabbitai · 2026-06-16T12:41:50Z

+import (
+	"encoding/json"
+	"fmt"
+	"os"
+	"path"
+	"strconv"
+
+	"golang.org/x/benchmarks/driver"
+
+	"github.com/conforma/cli/benchmark/internal/registry"
+	"github.com/conforma/cli/benchmark/internal/suite"
+	"github.com/conforma/cli/benchmark/internal/untar"
+)


⚠️ Potential issue | 🟠 Major | ⚡ Quick win

stress.go is currently failing gci formatting checks

Static analysis reports a gci formatting error (reported at Line 82). Please run the repo’s Go formatting/import-order step for this file to unblock lint/CI.

Source: Linters/SAST tools

fullsend-ai-review

See the review comment for full details.

fullsend-ai-review · 2026-06-16T12:44:16Z

+}
+
+type snapshot struct {
+	Components []component `json:"components"`


[medium] stale-reference

The git source URL for golden-container uses the old organization name enterprise-contract (https://github.com/enterprise-contract/golden-container.git) while the rest of the codebase has migrated to https://github.com/conforma/golden-container. The git revision also differs from the simple benchmark, suggesting it may reference a commit in the old repo.

Suggested fix: Use https://github.com/conforma/golden-container to match the existing simple benchmark pattern, and verify the revision hash exists in the conforma fork.

fullsend-ai-review · 2026-06-16T12:44:16Z

+
+const (
+	defaultComponents = 10
+	defaultWorkers    = 35


[low] edge-case

The envInt function panics on values < 1 but does not guard against unreasonably large values. An extremely large EC_STRESS_COMPONENTS value would cause buildSnapshot to allocate a massive slice, likely causing OOM.
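One way to address this would be a range check rather than a lower bound alone. The sketch below is an assumption-laden illustration: `parseBounded` is a hypothetical helper, and the `maxComponents` ceiling of 1000 is an arbitrary value chosen for the example, since the PR defines no upper limit.

```go
package main

import (
	"fmt"
	"strconv"
)

// maxComponents is a hypothetical ceiling chosen for illustration only.
const maxComponents = 1000

// parseBounded (hypothetical helper) parses raw and requires lo <= n <= hi,
// so an accidental huge value is rejected before any slice is allocated.
func parseBounded(name, raw string, lo, hi int) (int, error) {
	n, err := strconv.Atoi(raw)
	if err != nil {
		return 0, fmt.Errorf("invalid %s value %q: %v", name, raw, err)
	}
	if n < lo || n > hi {
		return 0, fmt.Errorf("%s must be in [%d, %d], got %d", name, lo, hi, n)
	}
	return n, nil
}

func main() {
	for _, raw := range []string{"10", "1000000000"} {
		n, err := parseBounded("EC_STRESS_COMPONENTS", raw, 1, maxComponents)
		fmt.Println(n, err)
	}
}
```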

fullsend-ai-review · 2026-06-16T12:44:16Z

+	if err != nil {
+		panic(err)
+	}
+	return string(data)


[info] pattern-violation

The policy JSON string uses well-formed JSON (no trailing commas) while the simple benchmark has trailing commas. The stress benchmark is more correct, but the inconsistency is notable.
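The difference can be checked with the standard library's strict validator. This is a small sketch using `json.Valid` from `encoding/json`; the two policy strings are made-up examples, not the strings from either benchmark, and `isValidJSON` is a hypothetical wrapper.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// isValidJSON reports whether s is syntactically valid JSON according to
// encoding/json, which does not accept trailing commas.
func isValidJSON(s string) bool {
	return json.Valid([]byte(s))
}

func main() {
	// Made-up policy strings for illustration only.
	strict := `{"sources": [{"policy": ["git::example"]}]}`
	trailing := `{"sources": [{"policy": ["git::example"],}],}`

	fmt.Println("strict:", isValidJSON(strict))
	fmt.Println("trailing commas:", isValidJSON(trailing))
}
```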

github-actions Bot added the size: L label Jun 4, 2026

fullsend-ai-review Bot approved these changes Jun 4, 2026

fullsend-ai-review Bot added the ready-for-merge All reviewers approved — ready to merge label Jun 4, 2026

dheerajodha force-pushed the EC-1816 branch from 6325eb6 to 0e56368 Compare June 11, 2026 12:57

fullsend-ai-review Bot approved these changes Jun 11, 2026

fullsend-ai-review Bot added ready-for-merge All reviewers approved — ready to merge and removed ready-for-merge All reviewers approved — ready to merge labels Jun 11, 2026

github-actions Bot added size: XL and removed size: L labels Jun 16, 2026

fix(EC-1816): validate envInt returns positive value

aa42a5a

Reject zero and negative values for EC_STRESS_COMPONENTS and EC_STRESS_WORKERS to fail fast instead of producing meaningless benchmark results.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

dheerajodha marked this pull request as ready for review June 16, 2026 12:32

coderabbitai Bot reviewed Jun 16, 2026

fullsend-ai-review Bot reviewed Jun 16, 2026

fullsend-ai-review Bot added requires-manual-review Review requires human judgment and removed ready-for-merge All reviewers approved — ready to merge labels Jun 16, 2026
