fix: pass question_date through to answer prompts (temporal questions ran undated) by thameema · Pull Request #46 · supermemoryai/memorybench

thameema · 2026-06-11T02:50:52Z

The bug

Orchestrator.run() initializes each question's checkpoint entry here:

https://github.com/supermemoryai/memorybench/blob/main/src/orchestrator/index.ts#L241

but never copies questionDate from the loaded question into the checkpoint. Everything downstream is already wired for it:

the LongMemEval loader sets metadata.questionDate from the dataset's question_date (src/benchmarks/longmemeval/index.ts:203)
initQuestion accepts and stores metadata.questionDate (src/orchestrator/checkpoint.ts:205)
the answer phase reads checkpoint.questions[id].questionDate and substitutes it into the {{questionDate}} / Question Date: slot of every provider's answer prompt (src/orchestrator/phases/answer.ts:120, src/prompts/defaults.ts:13)

Because the orchestrator drops it at the hand-off, the value is always undefined, and every answer prompt renders Question Date: Not specified — for every provider, on every run.

Impact

Temporal-reasoning questions like "What kitchen appliance did I buy 10 days ago?" are unanswerable without the question date: the model has the dated evidence in context but no anchor to resolve "10 days ago" against. So temporal-reasoning category scores are understated for all providers measured on the harness, and cross-provider comparisons on that category are mostly noise from the model guessing an anchor date.

The fix

One line: pass q.metadata?.questionDate through to initQuestion, matching the signature it already accepts.

How we found it / measured effect

We hit this while evaluating our own memory system on the harness: temporal answers kept resolving relative dates against the wrong anchor despite correct retrieval. On a 20-question LongMemEval pilot (gpt-4o answering + judging), overall accuracy went 60% → 75% from this fix alone, with the gains concentrated in temporal-reasoning questions.

Found while evaluating memnos (github.com/thameema/memnos) on this harness.

… the real date The benchmark loaders (e.g. LongMemEval) put question_date into question.metadata.questionDate, and the answer phase reads checkpoint.questions[id].questionDate to fill the {{questionDate}} / 'Question Date:' slot in every answer prompt. But the orchestrator's initQuestion call never copied questionDate from the loaded question into the checkpoint, so the value was always undefined and every answer prompt rendered 'Question Date: Not specified'. Temporal-reasoning questions ('what did I buy 10 days ago?') are unanswerable without the question date, so temporal category scores were understated for every provider measured on the harness.

thameema force-pushed the fix/question-date-plumbing branch from 058486c to 6dd52c6 Compare June 11, 2026 03:22

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

fix: pass question_date through to answer prompts (temporal questions ran undated)#46

fix: pass question_date through to answer prompts (temporal questions ran undated)#46
thameema wants to merge 1 commit into
supermemoryai:mainfrom
thameema:fix/question-date-plumbing

thameema commented Jun 11, 2026 •

edited

Loading

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

thameema commented Jun 11, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

The bug

Impact

The fix

How we found it / measured effect

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

thameema commented Jun 11, 2026 •

edited

Loading