refactor(encoding): ADR 0001 — all 6 phases #30

Closed
dfa1 wants to merge 2 commits into main from worktree-adr-0001-phase0

@dfa1 dfa1 commented Jun 11, 2026

Summary

Lands every phase of ADR 0001 as 7 commits on one branch. ADR 0001 proposed splitting the read and write runtimes out of core; this PR delivers the structural foundation plus exemplars for the per-family lifts.

| Commit | Phase | What |
| --- | --- | --- |
| b6437ee | 0 | ReadRegistry + WriteRegistry delegating facades over Registry. |
| 1750aa1 | 1 | EncodingDecoder + EncodingEncoder interfaces. Encoding becomes a marker that extends both. |
| ed297cd | 2 | BoolEncoding decoder lifted into reader/decode/BoolEncodingDecoder. Registry.standaloneDecoders map + SL load. Standalone decoders take precedence over bifunctional fallback. |
| f21e5ab | 3 | Mirror: BoolEncoding encoder lifted into writer/encode/BoolEncodingEncoder. Registry.standaloneEncoders map. |
| 71e39c6 | 4 | VortexHandle.decodeFlatSegment(...) typed accessor. ScanIterator.readFlat and VortexInspectorTui dictionary preview migrated off raw slice(). |
| a772b00 | 5 | ExtensionEncoder interface. Extension extends ExtensionEncoder. |
| 9eb2d32 | 6 | docs/compatibility.md documents the read-only deployment artifact subset (./mvnw -pl core,reader,inspector verify is the verified subset). |
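
For reviewers, a minimal sketch of the Phase 1 interface split and the Phase 2 precedence rule. Only the names EncodingDecoder, EncodingEncoder, Encoding, Registry, standaloneDecoders and the getOrDefault expression come from this PR; the String ids, the String-based decode/encode signatures and the bifunctional helper are assumptions made to keep the example self-contained.

```java
import java.util.HashMap;
import java.util.Map;

final class DispatchSketch {
    // Phase 1 split. The String-in/String-out signatures are placeholders.
    interface EncodingDecoder { String decode(String segment); }
    interface EncodingEncoder { String encode(String values); }

    // Marker that keeps the old bifunctional contract alive during the migration.
    interface Encoding extends EncodingDecoder, EncodingEncoder {}

    static final class Registry {
        private final Map<String, Encoding> encodings = new HashMap<>();
        private final Map<String, EncodingDecoder> standaloneDecoders = new HashMap<>();

        void register(String id, Encoding encoding) { encodings.put(id, encoding); }
        void registerStandalone(String id, EncodingDecoder decoder) { standaloneDecoders.put(id, decoder); }

        String decode(String id, String segment) {
            // Standalone decoder wins; the bifunctional class is only the fallback.
            EncodingDecoder decoder = standaloneDecoders.getOrDefault(id, encodings.get(id));
            if (decoder == null) {
                throw new IllegalArgumentException("no decoder registered for " + id);
            }
            return decoder.decode(segment);
        }
    }

    // Hypothetical helper: a bifunctional encoding that tags its output.
    static Encoding bifunctional(String tag) {
        return new Encoding() {
            @Override public String decode(String segment) { return tag + ":" + segment; }
            @Override public String encode(String values) { return tag + ":" + values; }
        };
    }

    public static void main(String[] args) {
        Registry registry = new Registry();
        registry.register("bool", bifunctional("core"));
        registry.register("alp", bifunctional("core"));
        registry.registerStandalone("bool", segment -> "standalone:" + segment);

        System.out.println(registry.decode("bool", "x")); // lifted family: standalone decoder
        System.out.println(registry.decode("alp", "x"));  // unlifted family: core fallback
    }
}
```

Note that Map.getOrDefault evaluates its fallback argument eagerly, so encodings.get(id) runs on every call even when a standalone decoder exists. That is harmless for a map lookup but worth keeping in mind if the fallback ever becomes expensive.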

Tests

./mvnw verify green after every commit. 938 unit + 243 integration tests pass, including the Rust cross-language round-trip suite.

What's left as mechanical follow-up

Phases 2 and 3 each lifted one encoding family (BoolEncoding) as exemplar. The remaining ~29 encoding families follow the same shape:

  • Per family: create *EncodingDecoder in reader/decode/ and *EncodingEncoder in writer/encode/; add lines to the respective META-INF/services/ manifests; remove the corresponding decode() / encode() methods from the core *Encoding class once both sides are lifted; eventually delete the core *Encoding file entirely.
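
A rough sketch of what one per-family lift could look like on the read side, assuming decoders are discovered with java.util.ServiceLoader as the manifests suggest. The id() method, the decode signature and the bit layout are hypothetical stand-ins, not the actual Vortex types or format. In a real module the manifest would be a file under META-INF/services/ named after the fully qualified EncodingDecoder interface, with one implementation class name per line.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.ServiceLoader;

final class ServiceLoaderLiftSketch {
    interface EncodingDecoder {
        String id();                                // hypothetical: how a provider names its family
        boolean[] decode(byte[] packed, int count); // placeholder signature
    }

    // Stand-in for reader/decode/BoolEncodingDecoder; unpacks bits LSB-first (illustrative layout).
    static final class BoolEncodingDecoder implements EncodingDecoder {
        @Override public String id() { return "bool"; }

        @Override public boolean[] decode(byte[] packed, int count) {
            boolean[] out = new boolean[count];
            for (int i = 0; i < count; i++) {
                out[i] = ((packed[i >> 3] >> (i & 7)) & 1) == 1;
            }
            return out;
        }
    }

    // Builds the standalone-decoder map from any provider source and rejects duplicate ids.
    static Map<String, EncodingDecoder> index(Iterable<EncodingDecoder> providers) {
        Map<String, EncodingDecoder> byId = new HashMap<>();
        for (EncodingDecoder decoder : providers) {
            if (byId.putIfAbsent(decoder.id(), decoder) != null) {
                throw new IllegalStateException("duplicate decoder for " + decoder.id());
            }
        }
        return byId;
    }

    // Production path: providers come from the META-INF/services manifest on the classpath.
    static Map<String, EncodingDecoder> loadFromClasspath() {
        return index(ServiceLoader.load(EncodingDecoder.class));
    }
}
```

Keeping index separate from loadFromClasspath lets tests pass an explicit list of decoders without touching the manifest.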

Phase 4's typed accessor migration is partial — 2 of the 6 cross-package slice() callers were migrated. The remaining 4 (ScanIterator.readFlatStats, InspectorTree.peek ×2, VortexInspectorTui hex peek, integration test inspector walk) need their own typed accessors before slice() can be removed from the public VortexHandle interface. Each is a small focused PR.
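
To make the intent of the typed accessor concrete, here is a hedged sketch of the idea: callers hand the handle a decoder instead of receiving raw bytes, so slice() can become private. The Handle class, the (offset, length, decoder) parameters and the generic FlatSegmentDecoder shape are assumptions for illustration; the real decodeFlatSegment signature is still up for review.

```java
import java.nio.ByteBuffer;

final class TypedAccessorSketch {
    // Hypothetical shape: turns the bytes of one flat segment into a typed value.
    interface FlatSegmentDecoder<T> { T decode(ByteBuffer bytes); }

    static final class Handle {
        private final ByteBuffer file;

        Handle(byte[] bytes) { this.file = ByteBuffer.wrap(bytes).asReadOnlyBuffer(); }

        // Raw accessor; private here to show the end state once all callers are migrated.
        private ByteBuffer slice(int offset, int length) {
            ByteBuffer view = file.duplicate();
            view.position(offset);
            view.limit(offset + length);
            return view.slice();
        }

        // Typed accessor: the caller never sees the underlying buffer.
        <T> T decodeFlatSegment(int offset, int length, FlatSegmentDecoder<T> decoder) {
            return decoder.decode(slice(offset, length));
        }
    }

    // Example decoder: big-endian 32-bit integers (ByteBuffer's default byte order).
    static final FlatSegmentDecoder<int[]> INT32_BE = bytes -> {
        int[] out = new int[bytes.remaining() / Integer.BYTES];
        for (int i = 0; i < out.length; i++) {
            out[i] = bytes.getInt();
        }
        return out;
    };

    public static void main(String[] args) {
        Handle handle = new Handle(new byte[] {9, 0, 0, 0, 7, 0, 0, 1, 0, 9});
        int[] values = handle.decodeFlatSegment(1, 8, INT32_BE);
        System.out.println(java.util.Arrays.toString(values));
    }
}
```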

Phase 5's Extension split is likewise only a foundation: the concrete spec-extension encoders (Date/Time/Timestamp/Uuid) are not yet lifted into the writer module. They follow the same mechanical shape as Phases 2 and 3.

Test plan

  • ./mvnw verify green after every commit
  • ./mvnw -pl core,reader,inspector verify builds the documented read-only subset
  • Reviewer: confirm the standaloneDecoders.getOrDefault(id, encodings.get(id)) precedence rule
  • Reviewer: agree on the per-family commit shape for the remaining ~29 encoding lifts
  • Reviewer: confirm the decodeFlatSegment typed accessor signature

🤖 Generated with Claude Code

@dfa1 dfa1 changed the title refactor(encoding): ADR 0001 Phases 0–1 foundation refactor(encoding): ADR 0001 Phases 0-3 Jun 11, 2026
@dfa1 dfa1 changed the title refactor(encoding): ADR 0001 Phases 0-3 refactor(encoding): ADR 0001 — all 6 phases Jun 11, 2026

dfa1 commented Jun 11, 2026

Honest accounting after an attempted deletion cleanup

Tried to do the deletion pass tonight. Reverted because it would have broken ~30 test files that exercise decode logic by directly constructing the bifunctional Encoding class (e.g. new NullEncoding().decode(ctx)).

Current state of PR:

  • 29 standalone *EncodingDecoder in reader/decode/ and 29 *EncodingEncoder in writer/encode/
  • Runtime dispatch via Registry.decode prefers standalone (per Phase 2 commit's standaloneDecoders.getOrDefault(id, encodings.get(id)))
  • 4 unlifted (Alp, AlpRd, Bitpacked, Pco) — their bifunctional *Encoding.java in core/ is hit at runtime
  • Original *Encoding.java files in core/ unchanged — decode() / encode() methods still there for the 29 lifted families. They're dead at runtime (standalone wins) but on disk.

Net effect: PR is pure code addition (~12K LOC). No deletions. Lift is duplication, not migration.

Why the deletion didn't happen:

  1. Encoding extends EncodingDecoder, EncodingEncoder — bifunctional. Removing decode() from the 29 lifted core files requires the interface to drop the decode requirement.
  2. Doable: change to Encoding extends EncodingEncoder only, and have the 4 unlifted classes declare implements Encoding, EncodingDecoder explicitly.
  3. But ~30 test files do new NullEncoding().decode(ctx), a direct method call. Removing the method from the class breaks compilation.
  4. Real fix: migrate tests to either (a) use standalone decoder via new NullEncodingDecoder() — but that's in reader/ so needs reader as a core/test dependency, or (b) construct a Registry and call registry.decode(ctx) per-test — workable but ~30 file rewrites.
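
The steps above can be sketched as follows. This is an illustration under assumptions, not the PR's code: the String-based signatures, the two-argument Registry.decode(id, ctx) and the testRegistry helper are placeholders. It shows Encoding dropping the decode requirement (step 2), an unlifted family opting back in explicitly, and a test dispatching through a registry as in option (b) instead of calling new NullEncoding().decode(ctx).

```java
import java.util.HashMap;
import java.util.Map;

final class InterfaceSplitSketch {
    interface EncodingDecoder { String decode(String ctx); }
    interface EncodingEncoder { String encode(String values); }

    // Step 2: Encoding no longer extends EncodingDecoder.
    interface Encoding extends EncodingEncoder {}

    // Lifted family: the core class keeps only the encode side.
    static final class NullEncoding implements Encoding {
        @Override public String encode(String values) { return "null-encoded"; }
    }

    // Its decode logic lives in the standalone class.
    static final class NullEncodingDecoder implements EncodingDecoder {
        @Override public String decode(String ctx) { return "nulls(" + ctx + ")"; }
    }

    // Unlifted family: stays bifunctional by implementing both interfaces explicitly.
    static final class AlpEncoding implements Encoding, EncodingDecoder {
        @Override public String encode(String values) { return "alp-encoded"; }
        @Override public String decode(String ctx) { return "alp(" + ctx + ")"; }
    }

    // Option (b): tests go through a registry rather than constructing the core class.
    static final class Registry {
        private final Map<String, EncodingDecoder> decoders = new HashMap<>();

        Registry with(String id, EncodingDecoder decoder) {
            decoders.put(id, decoder);
            return this;
        }

        String decode(String id, String ctx) {
            EncodingDecoder decoder = decoders.get(id);
            if (decoder == null) {
                throw new IllegalArgumentException("no decoder registered for " + id);
            }
            return decoder.decode(ctx);
        }
    }

    // Hypothetical per-test fixture.
    static Registry testRegistry() {
        return new Registry()
            .with("null", new NullEncodingDecoder())
            .with("alp", new AlpEncoding());
    }
}
```

With this shape, a test that used to call new NullEncoding().decode(ctx) would call testRegistry().decode("null", ctx) and would keep compiling after decode() is removed from NullEncoding.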

Both paths are multi-hour follow-up work. Stopping here rather than landing half-broken changes.

Recommended next PRs (each independent):

  • Migrate core/src/test decode-side tests off direct new XxxEncoding().decode(ctx) calls. Either move them to reader/src/test (test moves, not just rewrites — about 25 files) or rewrite them to dispatch through Registry.decode.
  • Once tests no longer reference XxxEncoding.decode, bulk-delete the decode() methods + inner Decoder classes from the 29 lifted core files. Same for encode() + Encoder inner classes once consumer fallback chains (MaskedEncoding's INNER_FALLBACK, FixedSizeListEncoding's FALLBACK, etc.) route through WriteRegistry.lookupEncoder instead of new PrimitiveEncoding().
  • Lift the remaining 4 (Alp, AlpRd, Bitpacked, Pco) into the same pattern.
  • Eventually: delete the *Encoding.java files in core entirely.
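
As an illustration of the fallback-chain change in the second item, the sketch below has a wrapping encoder resolve its inner encoder through a registry lookup instead of hard-wiring new PrimitiveEncoding(). WriteRegistry.lookupEncoder is named in this PR, but its signature here, the String ids and the MaskedEncodingEncoder class are assumptions.

```java
import java.util.HashMap;
import java.util.Map;

final class FallbackRoutingSketch {
    interface EncodingEncoder { String encode(String values); }

    static final class WriteRegistry {
        private final Map<String, EncodingEncoder> encoders = new HashMap<>();

        WriteRegistry register(String id, EncodingEncoder encoder) {
            encoders.put(id, encoder);
            return this;
        }

        // Hypothetical signature: fails loudly if the family was never registered.
        EncodingEncoder lookupEncoder(String id) {
            EncodingEncoder encoder = encoders.get(id);
            if (encoder == null) {
                throw new IllegalStateException("no encoder registered for " + id);
            }
            return encoder;
        }
    }

    // Stand-in for a wrapping encoder whose fallback used to be `new PrimitiveEncoding()`.
    static final class MaskedEncodingEncoder implements EncodingEncoder {
        private final WriteRegistry registry;

        MaskedEncodingEncoder(WriteRegistry registry) { this.registry = registry; }

        @Override public String encode(String values) {
            // The inner encoder is resolved at call time, so the core class is no longer referenced.
            EncodingEncoder inner = registry.lookupEncoder("primitive");
            return "masked(" + inner.encode(values) + ")";
        }
    }

    public static void main(String[] args) {
        WriteRegistry registry = new WriteRegistry().register("primitive", v -> "primitive:" + v);
        System.out.println(new MaskedEncodingEncoder(registry).encode("x"));
    }
}
```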

The structural foundation (interfaces, facades, ServiceLoader manifests, standalone dispatch) is solid. Bulk deletion is mechanical once the test-side dependency is broken.

@dfa1 dfa1 force-pushed the worktree-adr-0001-phase0 branch 6 times, most recently from f354409 to ae5019a Compare June 12, 2026 15:59
dfa1 and others added 2 commits June 12, 2026 18:20
`elementCount` is a final field — JIT hoists it; local alias adds noise.
Spotted via IntelliJ MCP integration (pretty cool tbh).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

Separate core's bifunctional encoding model into distinct read and write
runtimes.

**Encoder/decoder lift**
Each encoding gets a standalone EncodingDecoder (reader module) and
EncodingEncoder (writer module). 33 *EncodingTest classes move to
writer/encode or reader/decode per their primary role.

**Phase 0 — Encoding metadata-only**
Encoding interface and all 32 *Encoding stub classes deleted from core.
Shared algorithmic constants (F10 tables, FL_ORDER, FL_CHUNK_SIZE,
dtype constants) and helpers (transposeIndex, iterateIndex, etc.)
inlined as private static into the *EncodingDecoder/*EncodingEncoder
that use them. EncodeContext.encodings (Registry) replaced by
encoders (Map<EncodingId,EncodingEncoder>). CascadingCompressor moves
to writer.encode. Registry becomes extension-only registry.

**Phase 1 — decode types to reader**
DecodeContext, ArrayNode (+subtypes), EncodingDecoder, and
FlatSegmentDecoder move from core.encoding to reader.decode / reader.
ReadRegistry replaces Registry.decode() as the canonical read
dispatcher. VortexReader, VortexHttpReader, VortexHandle, ScanIterator
all take ReadRegistry. Test infra: TestRegistry, TestDecodeContexts,
DecodeTestHelper move to reader/test (new test-jar); writer/test gains
vortex-reader:test-jar dep. ReadRegistryTest replaces the decode subset
of RegistryTest.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@dfa1 dfa1 force-pushed the worktree-adr-0001-phase0 branch from c9f036a to 1e7884b Compare June 12, 2026 16:20
@dfa1 dfa1 closed this Jun 12, 2026
@dfa1 dfa1 deleted the worktree-adr-0001-phase0 branch June 12, 2026 19:45