interactive: port the data model from flat [i64] to a Value ADT + Term scalar language#760
Draft
frankmcsherry wants to merge 8 commits into
Replace the flat [i64]/FieldExpr data model with the Value ADT (Int/Tuple/
Variant/List) and the Term scalar language, on master-next's scope-tree IR +
substrate-generic backend.
- ir.rs: Value + the tree-walking Term interpreter (eval); LinearOp gains
FlatMap, Filter/EnterAt now carry Term. Drops RowLike/FieldExpr eval and the
arity transfer functions (those were explain-only).
- parse: Projection is now {key: Term, val: Term}; Reducer gains Collect; Expr
gains FlatMap. Both front-ends parse the full Term grammar (tuples/lists/
spread, proj, inject/case, fold, builtins) plus named constructors + pattern
`case` (pipe), reconciled with master-next's import/export syntax.
- backend/vec.rs: Row = Value; render_linear/join/reduce evaluate Terms;
Collect NEST reducer. Value derives serde (ExchangeData bound).
- gen_row produces (Tuple[Int;arity], unit); ddir_vec gains EDGES_FILE input.
Deferred to later stages: explain + its folded helper (need RowModel for
Value/Term), and the col substrate (needs a Columnar story for Value).
Verified: lib tests pass; reach.ddp (root 0, chain 0-1-2-3) -> 4 reachable;
scc.ddp (cycle 0-1-2 + trivial 3-4) -> 3 cycle edges.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
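To make the data-model change concrete, here is a minimal sketch of a `Value` ADT with a tree-walking `Term` evaluator. It is not the code in ir.rs: the `Term` variants shown (`Row`, `Lit`, `Tuple`, `Proj`, `Add`, `Inject`, `Case`), the `eval` signature, and the choice to return `None` on a shape mismatch are all assumptions made for illustration, and the real grammar (lists, spread, fold, builtins, named constructors) is much larger.

```rust
/// Hypothetical, simplified stand-in for the PR's `Value`.
#[derive(Clone, Debug, PartialEq, Eq, PartialOrd, Ord)]
pub enum Value {
    Int(i64),
    Tuple(Vec<Value>),
    Variant(usize, Box<Value>),
    List(Vec<Value>),
}

/// Hypothetical, simplified stand-in for the PR's `Term`.
#[derive(Clone, Debug)]
pub enum Term {
    /// The row the term is evaluated against.
    Row,
    /// An integer literal.
    Lit(i64),
    /// Build a tuple from sub-terms.
    Tuple(Vec<Term>),
    /// Project field `i` out of a tuple-valued term.
    Proj(Box<Term>, usize),
    /// Integer addition.
    Add(Box<Term>, Box<Term>),
    /// Wrap a term's value in the variant with the given tag.
    Inject(usize, Box<Term>),
    /// Evaluate the scrutinee, pick the arm for its tag, and evaluate
    /// that arm with the variant's payload as the row.
    Case(Box<Term>, Vec<Term>),
}

/// Tree-walking evaluation; `None` signals a shape mismatch or overflow
/// (an assumption of this sketch, not necessarily the PR's behavior).
pub fn eval(term: &Term, row: &Value) -> Option<Value> {
    match term {
        Term::Row => Some(row.clone()),
        Term::Lit(n) => Some(Value::Int(*n)),
        Term::Tuple(ts) => ts
            .iter()
            .map(|t| eval(t, row))
            .collect::<Option<Vec<_>>>()
            .map(Value::Tuple),
        Term::Proj(t, i) => match eval(t, row)? {
            Value::Tuple(mut fields) if *i < fields.len() => Some(fields.swap_remove(*i)),
            _ => None,
        },
        Term::Add(a, b) => match (eval(a, row)?, eval(b, row)?) {
            (Value::Int(x), Value::Int(y)) => x.checked_add(y).map(Value::Int),
            _ => None,
        },
        Term::Inject(tag, t) => Some(Value::Variant(*tag, Box::new(eval(t, row)?))),
        Term::Case(scrutinee, arms) => match eval(scrutinee, row)? {
            Value::Variant(tag, payload) => eval(arms.get(tag)?, &payload),
            _ => None,
        },
    }
}
```

With this shape, a `Filter` or `Projection` carries a `Term` and calls `eval` once per row, which is what lets the backend drop the per-operator `FieldExpr` column arithmetic.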
Port the Value/ADT example programs onto the scope-tree base (old `result …;`
-> `export "result" = …;`), exercising the new scalar language end-to-end:
- unnest.ddp: flatmap (UNNEST) / collect (NEST) list round-trip
- binders.ddp: fold with named pattern-`case` binders
- adt.ddp: named constructors + pattern `case`
- congruence.ddp / eqsat.ddp: variable-arity e-node congruence and the full
equality-saturation fixpoint
- cse_tree.ddp: common-subexpression sharing over expression trees
Verified on master-next: eqsat reproduces both scenarios (pure congruence 5~1
then mul(5,2)~mul(1,2); and the a~b cascade collapsing all three muls); unnest
round-trips position-ordered; adt yields the same 98/102 buckets.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
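The unnest/collect round-trip can be pictured with a small in-memory sketch. This is not the unnest.ddp program or the backend's `Collect` reducer: the `unnest` and `collect` functions, the `(key, value)` row shape, and the choice to tag each element with its list position so that sorting restores the order are assumptions chosen to illustrate a position-ordered round-trip.

```rust
use std::collections::BTreeMap;

/// Hypothetical, trimmed-down `Value` (no variants) for this sketch.
#[derive(Clone, Debug, PartialEq, Eq, PartialOrd, Ord)]
pub enum Value {
    Int(i64),
    Tuple(Vec<Value>),
    List(Vec<Value>),
}

/// UNNEST: emit one `(key, (position, element))` row per list element.
/// Rows whose value is not a list produce no output.
pub fn unnest(rows: &[(Value, Value)]) -> Vec<(Value, Value)> {
    let mut out = Vec::new();
    for (key, val) in rows {
        if let Value::List(items) = val {
            for (pos, item) in items.iter().enumerate() {
                let tagged = Value::Tuple(vec![Value::Int(pos as i64), item.clone()]);
                out.push((key.clone(), tagged));
            }
        }
    }
    out
}

/// NEST: group rows by key, order each group by position, and rebuild
/// the list from the elements (dropping the position tag).
pub fn collect(rows: &[(Value, Value)]) -> Vec<(Value, Value)> {
    let mut groups: BTreeMap<Value, Vec<Value>> = BTreeMap::new();
    for (key, val) in rows {
        groups.entry(key.clone()).or_default().push(val.clone());
    }
    groups
        .into_iter()
        .map(|(key, mut vals)| {
            // `(position, element)` tuples sort by position first.
            vals.sort();
            let items: Vec<Value> = vals
                .into_iter()
                .map(|v| match v {
                    Value::Tuple(mut fields) if fields.len() == 2 => fields.pop().unwrap(),
                    other => other,
                })
                .collect();
            (key, Value::List(items))
        })
        .collect()
}
```

One limitation of this sketch: a key whose list is empty produces no unnested rows, so it does not reappear after `collect`.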
Re-enable the explanation rewrite on the Value data model by implementing the
decoupled `RowModel`/`Dataflow` traits for `Value`/`Term`.
- explain/mod.rs: a `Val` RowModel whose demand envelope is a flat value tuple
`[V | chain (innermost-first) | q]`, matching the host lift's `append_iter`.
Each rule builds `Term`-based projections/predicates over field indices
(replacing the flat `[i64]` `FieldExpr` column ranges); `time_le`/`strip` are
inlined (the `folded` algebra), and a `Spread`-bounding `expand_value_fields`
keeps bare-row refs from pulling in chain coords. `Sb`'s `Dataflow` predicate
is now `Term`. The clone/resolve/shape machinery is unchanged; the shape pass
is `Term`-arity.
- Count now yields a one-field tuple `(count)`, keeping "a value is a tuple" so
`$1[0]` and the explain envelope hold uniformly.
- decouple.rs: drop the flat executable contract; the `nested_contract`
model-agnostic proof remains the runnable spec. `folded.rs` retired.
- tests/explain.rs restored, ported to Value rows + the flat query envelope.
Verified: all 8 sufficiency tests pass, plus the heavy --ignored sweeps (scc
100/110, the join partner-time regression at 1000/1100, tc/reach fuzz).
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
- dump_explain: re-enabled (prints the scope-tree IR before/after the rewrite);
it has no data-model dependencies and works as-is now that explain is online.
- ddir_vec --explain / --query=K:V[,q] / --debug-demand: re-enabled. The query
input is seeded with the flat demand envelope `(key ; val ++ q)`; demand
collections can be tapped with --debug-demand.
The CLI assigns every source the uniform shape (arity, 0), so --explain is for
single-input-arity programs (e.g. scc); mixed-arity programs (reach's arity-1
roots) need explicit per-input shapes, as the integration tests use.
Verified: scc.ddp --explain demands the cycle edges that produced the queried
output. The columnar substrate (ddir_col / backend::col) stays deferred: it
needs a Columnar story for Value.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…(stage 5)
Restore a unit-level by-example spec for the reverse rules, but over the model
the crate actually evaluates. The removed `[i64]` `contract` tested `Flat` via
`eval_fields`/`eval_condition`; this `value_contract` runs the same six specs
on real `Value` rows in `Val`'s flat envelope `[V | chain | q]`, through an
in-memory `Value` dataflow against `explain::Val`.
`nested_contract` (a different, nested layout over a toy model) stays as the
proof that the *rules* are model-agnostic; `value_contract` pins the *model*
the backend runs, closing the unit-coverage gap the deletion opened.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Consolidate the front-end language docs into one reference, on the `pipe`
module (the .ddp front-end): the collection language (sources, pipe operators
incl. flatmap/collect, statements, `con` decls) and the scalar `Term` language
(row/field access, arithmetic, products/lists/sums, named constructors, pattern
`case`, `fold` with `^0`/`^1`, binders, `if`).
Doc-only; previously this had to be teased out of the `Term` variants,
`build_builtin`, and example programs. `Term`'s doc now points here for the
concrete syntax.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…rses
Capture the design for restoring the explainability invariant (writable =>
explainable) once the data model is Value/ADTs, and for unifying the per-op
reverse rules.
Core: a universal backstop (witness the inputs, key by output, join on output)
makes every op explainable with no op-specific logic; an optional inverse,
factored as (PRESERVED_out, PRESERVED_in, RESIDUAL, REFORM) with Total/None as
the endpoints and a precision dial between, is a per-op optimization. Maps
today's lossy_*/keyed/join/folded onto the interface, notes the opaque-envelope
interaction (which also deletes the shape pass), and lays out a phased plan.
Doc-only; no behavior change.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…tmap
Per inverse-design.md, a contract-style test proving the universal backstop
reverses `flatmap` (the op the live rewrite currently panics on) using only the
existing `Dataflow` primitives over real `Value` rows (with a `List`): a
forward-built `(output -> input)` pair table, one join on the output, and a
`REFORM` projection that recovers the whole input (the `None` endpoint). It
demands one exploded output and gets back exactly the input row whose list
carried that element, with the query id.
Additive and isolated to the test harness; the live rewrite is untouched. This
turns "the regression is closable" into a running demonstration; the phased
plan (opaque envelope, refactor lossy_* through the interface, wire FlatMap
into the real walk) follows.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
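The backstop described above (a forward-built pair table, one join on the output, then a projection back to the input) can be sketched without any dataflow machinery. This is not the test in the PR: the functions `flatmap`, `witness`, and `explain`, the `(key, list)` input shape, and the use of a `BTreeMap` in place of a dataflow join are assumptions made for illustration.

```rust
use std::collections::BTreeMap;

/// Hypothetical, trimmed-down `Value` (no variants) for this sketch.
#[derive(Clone, Debug, PartialEq, Eq, PartialOrd, Ord)]
pub enum Value {
    Int(i64),
    Tuple(Vec<Value>),
    List(Vec<Value>),
}

/// Forward op: explode an input row `(key, [e0, e1, ..])` into one
/// `(key, e)` output per element. Other shapes produce nothing.
pub fn flatmap(input: &Value) -> Vec<Value> {
    match input {
        Value::Tuple(fields) => match fields.as_slice() {
            [key, Value::List(items)] => items
                .iter()
                .map(|item| Value::Tuple(vec![key.clone(), item.clone()]))
                .collect(),
            _ => Vec::new(),
        },
        _ => Vec::new(),
    }
}

/// Witness table: run the op forward and record `(output, input)` pairs.
pub fn witness(inputs: &[Value]) -> Vec<(Value, Value)> {
    inputs
        .iter()
        .flat_map(|input| {
            flatmap(input)
                .into_iter()
                .map(move |output| (output, input.clone()))
        })
        .collect()
}

/// Reverse rule: key the witness table by output, join the demanded
/// `(output, query id)` pairs against it, and project each match back
/// to `(whole input row, query id)`.
pub fn explain(inputs: &[Value], demands: &[(Value, i64)]) -> Vec<(Value, i64)> {
    let mut by_output: BTreeMap<Value, Vec<Value>> = BTreeMap::new();
    for (output, input) in witness(inputs) {
        by_output.entry(output).or_default().push(input);
    }
    let mut needed = Vec::new();
    for (output, q) in demands {
        if let Some(sources) = by_output.get(output) {
            for input in sources {
                needed.push((input.clone(), *q));
            }
        }
    }
    needed.sort();
    needed.dedup();
    needed
}
```

Nothing in `witness` or `explain` inspects how `flatmap` works; swapping in a different forward function leaves the reverse rule unchanged, which is the sense in which the backstop needs no op-specific logic.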
Replaces the flat [i64]/FieldExpr data model with a Value ADT (Int/Tuple/Variant/List) and a Term scalar language, on top of master-next's scope-tree IR +
substrate-generic backend, and brings the explanation rewrite back online over the new model. Four reviewable commits:
both parsers parse the full Term grammar (tuples/lists/spread, proj, inject/case, fold, builtins, named constructors). backend/vec.rs evaluates Terms over Value rows.
Existing programs verified (reach → 4 reachable; scc → 3 cycle edges).
congruence and the full equality-saturation fixpoint), cse_tree.
lift; time_le/strip are inlined and folded retired. All sufficiency tests pass, including the --ignored sweeps (scc 100/110, the join partner-time regression at
1000/1100, tc/reach fuzz).
Deferred: the columnar substrate (backend::col/ddir_col) needs a Columnar story for Value.
🤖 Generated with Claude Code