
Discriminate WG21 index refresh failures and propagate stale signal t…#76

Merged
wpak-ai merged 8 commits into develop from feat/index-refresh-error-discrimination
Jun 24, 2026

Conversation

@henry0816191

@henry0816191 henry0816191 commented Jun 23, 2026

Collaborator

Summary

  • Add IndexRefreshError and IndexRefreshResult so WG21Index.refresh() / fetch() distinguish transient download failures (TIMEOUT, RATE_LIMIT, NETWORK) from permanent ones (CONFIGURATION).
  • Return IndexRefreshResult(papers, stale=True) on stale cache fallback instead of silently returning an empty or in-memory dict.
  • Raise ConfigurationError when no index data is available (e.g. non-dict JSON with no cache).
  • Update Scheduler to unwrap IndexRefreshResult, log INDEX-STALE, skip advancing _last_successful_poll on stale index, and handle IndexRefreshError in run_forever() (retry next cycle).
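
The contracts listed above might look roughly like the following. This is a minimal sketch, not the code from this PR: the names `IndexRefreshError`, `IndexRefreshResult`, `FailureCategory`, and `ConfigurationError` come from this thread, while the field layout, the `retryable` property, and the exception base classes are assumptions made for illustration.

```python
from dataclasses import dataclass, field
from enum import Enum


class FailureCategory(Enum):
    TIMEOUT = "timeout"
    RATE_LIMIT = "rate_limit"
    NETWORK = "network"
    CONFIGURATION = "configuration"

    @property
    def retryable(self) -> bool:
        # Assumption: every category except CONFIGURATION is transient.
        return self is not FailureCategory.CONFIGURATION


class ConfigurationError(Exception):
    """Permanent failure; retrying on the next cycle will not help."""


class IndexRefreshError(Exception):
    """Refresh failure tagged with the category that caused it."""

    def __init__(self, message: str, category: FailureCategory) -> None:
        super().__init__(message)
        self.category = category


@dataclass(frozen=True)
class IndexRefreshResult:
    """Parsed papers plus a flag saying whether they came from a stale cache."""

    papers: dict = field(default_factory=dict)
    stale: bool = False
```

With this shape, a stale fallback is spelled `IndexRefreshResult(papers, stale=True)`, matching the call quoted in the summary, and a caller can branch on `err.category.retryable` rather than on exception type alone.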

Test plan

  • pytest — full suite (392 passed)
  • ruff check / ruff format --check
  • mypy on protocols, monitor, __main__
  • New unit tests: timeout/429/stale/configuration error paths in test_sources.py
  • New integration tests: error propagation and run_forever() behavior in test_monitor.py

Related Issue

Summary by CodeRabbit

  • Bug Fixes

    • Improved WG21 index refresh/fetch reliability with clearer, category-based handling for timeouts and rate limits (retry-eligible), while keeping configuration errors fatal.
    • Ensures stale cached index data is consistently used when transient failures occur, with updated “index stale” and failure-category logging.
    • Refined scheduler poll-success timing so “last successful poll” advancement is suppressed appropriately when the index is stale.
  • Tests

    • Expanded coverage for timeout-to-category mapping, stale-cache fallback, and correct behavior in both continue-poll vs fatal-stop scenarios, including expected log markers.

@henry0816191 henry0816191 self-assigned this Jun 23, 2026
@henry0816191 henry0816191 requested a review from wpak-ai as a code owner June 23, 2026 18:38
@coderabbitai

coderabbitai Bot commented Jun 23, 2026

Review Change Stack

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.


Walkthrough

Adds typed WG21 refresh outcomes and categorized refresh failures, updates scheduler staleness handling, and expands protocol and integration tests for the new return shapes and error paths.

Changes

WG21 Index Refresh Error Discrimination

| Layer | File(s) | Summary |
|---|---|---|
| New error and result contracts | src/paperscout/errors.py, src/paperscout/models.py | IndexRefreshError carries a FailureCategory, and IndexRefreshResult carries papers plus a stale flag. |
| WG21 refresh and download outcomes | src/paperscout/sources.py | WG21Index.refresh() returns IndexRefreshResult, falls back to stale cached data when available, and raises typed failures when no usable data remains; _download() now returns (data, category) with explicit category mapping, and fetch() returns the new result type. |
| Scheduler staleness tracking and refresh errors | src/paperscout/monitor.py | Scheduler tracks _index_stale across polls, reads IndexRefreshResult.stale from WG21 refreshes, suppresses last-successful-poll advancement when stale, and handles IndexRefreshError separately in run_forever(). |
| Protocol mocks return refresh results | tests/test_callback_protocols.py, tests/test_datasource_protocol.py | WG21 mocks return IndexRefreshResult instead of bare dictionaries. |
| WG21 source tests for categories and stale results | tests/test_sources.py | refresh() tests assert IndexRefreshResult, stale-cache behavior, tuple download returns, and the mapped failure categories for timeout, network, rate limit, and configuration paths. |
| Scheduler integration tests for WG21 refresh errors | tests/test_monitor.py | The scheduler test setup now uses IndexRefreshResult, adds a real WG21Index path, and covers timeout propagation, stale-cache polling, continued retries, and configuration-fatal shutdown. |
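
The scheduler row above describes suppressing the last-successful-poll clock when the index is stale. A minimal sketch of that behavior, assuming a simplified `Scheduler` and a hypothetical `FakeIndex` stand-in (only `_index_stale`, `_last_successful_poll`, `_advance_staleness_clock_if_ok`, and the `INDEX-STALE` marker are names taken from this thread):

```python
import asyncio
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass
class IndexRefreshResult:
    papers: dict
    stale: bool = False


class FakeIndex:
    """Hypothetical stand-in for WG21Index that replays queued results."""

    def __init__(self, results):
        self._results = list(results)

    async def refresh(self) -> IndexRefreshResult:
        return self._results.pop(0)


class Scheduler:
    def __init__(self, index) -> None:
        self._index = index
        self._index_stale = False
        self._last_successful_poll = None

    async def poll_once(self) -> None:
        result = await self._index.refresh()
        self._index_stale = result.stale
        if result.stale:
            print("INDEX-STALE")  # a real scheduler would log this instead
        # Diffing and notifications would run here.
        self._advance_staleness_clock_if_ok()

    def _advance_staleness_clock_if_ok(self) -> None:
        # Only a poll backed by fresh index data counts as successful.
        if not self._index_stale:
            self._last_successful_poll = datetime.now(timezone.utc)
```

Polling once with a stale result leaves `_last_successful_poll` as `None`; a later poll with fresh data sets it.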

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Possibly related PRs

Suggested labels

bug

Suggested reviewers

  • wpak-ai

Poem

🐇 I hopped through cache and network fog,
and tagged each fall with rabbit log.
If stale appears, I still can see,
a stale=True leaf on the tree.
Timeout, rate, and config too —
each hop now knows just what to do.

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
|---|---|---|---|
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 20.37%, which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (4 passed)

| Check name | Status | Explanation |
|---|---|---|
| Title check | ✅ Passed | The title is concise and accurately describes the main change: discriminating WG21 refresh failures and propagating stale state. |
| Description check | ✅ Passed | The description matches the template with Summary, Test plan, and Related issues, and it includes the core implementation and test details. |
| Linked Issues check | ✅ Passed | The changes satisfy #72 by discriminating refresh failures, signaling stale fallback, raising ConfigurationError, and adding the required tests. |
| Out of Scope Changes check | ✅ Passed | The refactors and test updates support the refresh/error-handling goals and do not appear unrelated to the linked issue. |

Comment @coderabbitai help to get the list of available commands.

@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@src/paperscout/sources.py`:
- Around line 99-107: The code currently accepts any dict as a successful
response, including empty dicts, which should be rejected. In the
isinstance(data, dict) check, add a condition that the dict is non-empty before
returning it as successful. Treat an empty dict the same as a non-dict response:
log a warning with FailureCategory.CONFIGURATION and return
(None, FailureCategory.CONFIGURATION), so that empty payloads cannot overwrite
the cache.
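
The check requested here could be sketched as below. This is an illustration under assumptions, not the code in `sources.py`: `classify_payload` is a hypothetical helper name, and only the `(data, category)` return shape and `FailureCategory.CONFIGURATION` come from this thread.

```python
from enum import Enum
from typing import Any, Optional, Tuple


class FailureCategory(Enum):
    CONFIGURATION = "configuration"


def classify_payload(data: Any) -> Tuple[Optional[dict], Optional[FailureCategory]]:
    """Accept only a non-empty dict; everything else is a configuration failure."""
    if isinstance(data, dict) and data:
        return data, None
    # Empty dicts, lists, None, and so on are all treated as unusable payloads.
    return None, FailureCategory.CONFIGURATION
```

Because an empty dict is falsy in Python, `isinstance(data, dict) and data` covers both the type check and the emptiness check in one condition.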

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 7e6c6db1-416a-4e62-a98f-ca48d5cb20cc

📥 Commits

Reviewing files that changed from the base of the PR and between be4f507 and 3009c86.

📒 Files selected for processing (8)
  • src/paperscout/errors.py
  • src/paperscout/models.py
  • src/paperscout/monitor.py
  • src/paperscout/sources.py
  • tests/test_callback_protocols.py
  • tests/test_datasource_protocol.py
  • tests/test_monitor.py
  • tests/test_sources.py

Comment thread src/paperscout/sources.py
@henry0816191

Collaborator Author

@coderabbitai full review again

@coderabbitai

coderabbitai Bot commented Jun 23, 2026

@henry0816191 Sure! I'll kick off a full review of the PR now.

✅ Action performed

Full review finished.

@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 3

🧹 Nitpick comments (2)
tests/test_sources.py (1)

264-272: 📐 Maintainability & Code Quality | 🔵 Trivial | ⚡ Quick win

Add a malformed-JSON regression for the terminal config path.

The no-data test covers non-dict JSON, but not resp.json() raising ValueError; that branch determines whether no-cache refresh becomes ConfigurationError or retryable IndexRefreshError.

🧪 Proposed test
     async def test_refresh_no_data_raises_configuration_error(self, fake_pool):
         index = WG21Index(fake_pool)
         mock_resp = _make_response(200, json_data=[1, 2, 3])
         mock_client = _make_async_client(get_resp=mock_resp)
         with patch("paperscout.sources.httpx.AsyncClient") as mock_cls:
             mock_cls.return_value.__aenter__ = AsyncMock(return_value=mock_client)
             mock_cls.return_value.__aexit__ = AsyncMock(return_value=False)
             with pytest.raises(ConfigurationError, match="No index data available"):
                 await index.refresh()
+
+    async def test_refresh_malformed_json_raises_configuration_error(self, fake_pool):
+        index = WG21Index(fake_pool)
+        mock_resp = _make_response(200, json_data=SAMPLE_INDEX_DATA)
+        mock_resp.json = MagicMock(side_effect=ValueError("invalid json"))
+        mock_client = _make_async_client(get_resp=mock_resp)
+        with patch("paperscout.sources.httpx.AsyncClient") as mock_cls:
+            mock_cls.return_value.__aenter__ = AsyncMock(return_value=mock_client)
+            mock_cls.return_value.__aexit__ = AsyncMock(return_value=False)
+            with pytest.raises(ConfigurationError, match="No index data available"):
+                await index.refresh()
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@tests/test_sources.py` around lines 264 - 272, Add a regression test for
malformed JSON handling in the WG21Index refresh flow. Currently,
test_refresh_no_data_raises_configuration_error covers non-dict JSON responses,
but not the case where resp.json() raises a ValueError due to malformed JSON.
Create a new test (or extend the existing one) that mocks the response object so
that .json() raises a ValueError (using MagicMock with side_effect, since
httpx's Response.json() is a synchronous method), then verify that
index.refresh() still raises ConfigurationError
with an appropriate error message. This ensures the error path that determines
whether no-cache refresh becomes ConfigurationError versus retryable
IndexRefreshError is properly covered.
tests/test_monitor.py (1)

812-827: 📐 Maintainability & Code Quality | 🔵 Trivial | ⚡ Quick win

Cover stale fallback after the scheduler is already seeded.

This test currently hits the cold-start seed early return, so it does not exercise the normal poll completion path where _advance_staleness_clock_if_ok() runs after diffing and notifications. Add a second stale poll or pre-seed the scheduler to cover that path.

Example extension
         assert "P2300R10" in wg21.papers
         assert "INDEX-STALE" in caplog.text
         assert scheduler._last_successful_poll is None
         prober.fetch.assert_called_once()
+
+        prober.fetch.reset_mock()
+        with patch("paperscout.sources.httpx.AsyncClient") as mock_cls:
+            mock_cls.return_value.__aenter__ = AsyncMock(return_value=mock_client)
+            mock_cls.return_value.__aexit__ = AsyncMock(return_value=False)
+            await scheduler.poll_once()
+        assert scheduler._last_successful_poll is None
+        prober.fetch.assert_called_once()
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@tests/test_monitor.py` around lines 812 - 827, The
test_poll_once_uses_stale_index test currently exits early during the cold-start
seed phase, so it does not exercise the normal poll completion path where
_advance_staleness_clock_if_ok() runs. To cover this path, add a second await
scheduler.poll_once() call after the first one (before the assertions) to ensure
the scheduler is already seeded and initialized, so the second poll attempt will
go through the full completion path instead of taking the cold-start early
return.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@src/paperscout/sources.py`:
- Around line 59-63: The code checks if cached data is not None but doesn't
validate that it contains actual entries. An empty dictionary from
`_cache.read_if_fresh()` will pass the `is not None` check and result in
`_parse_and_index(cached)` returning empty papers while still returning
`stale=False`, which incorrectly advances downstream staleness clocks. Modify
the condition that checks `if cached is not None` to also verify that cached is
not empty (has length greater than zero) before proceeding with
`_parse_and_index(cached)` and returning the IndexRefreshResult with
stale=False. Apply this fix in both locations: the cached read_if_fresh path
and the code around lines 72-79, to which this comment also applies.
- Around line 136-143: The except clause in the WG21 index fetching logic
currently catches both httpx.HTTPError and ValueError together and classifies
both as NETWORK failures, but ValueError from malformed JSON should be treated
as a CONFIGURATION failure instead. Split the exception handling into two
separate except clauses: keep httpx.HTTPError with FailureCategory.NETWORK
classification, and add a new except clause specifically for ValueError that
logs and returns FailureCategory.CONFIGURATION to properly surface parse
failures as terminal configuration errors rather than retryable network errors.
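
The split asked for in the second item can be sketched as follows. To keep the example dependency-free, `TransportError` is a hypothetical stand-in for `httpx.HTTPError`, and `download` takes a plain callable rather than an HTTP client; neither is the PR's actual `_download()`.

```python
import json
from enum import Enum


class FailureCategory(Enum):
    NETWORK = "network"
    CONFIGURATION = "configuration"


class TransportError(Exception):
    """Hypothetical stand-in for httpx.HTTPError."""


def download(fetch_text):
    """Return (data, category); category is None on success."""
    try:
        return json.loads(fetch_text()), None
    except TransportError:
        # Transport problems are transient, so they stay retryable.
        return None, FailureCategory.NETWORK
    except ValueError:
        # json.JSONDecodeError subclasses ValueError: a parse failure is terminal.
        return None, FailureCategory.CONFIGURATION


def failing_fetch() -> str:
    raise TransportError("connection reset")
```

Here `download(failing_fetch)` yields the NETWORK category, while `download(lambda: "{not json")` yields CONFIGURATION, which is the distinction the comment wants surfaced.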

In `@tests/test_monitor.py`:
- Around line 872-874: The test for the fatal-path in scheduler.run_forever()
lacks a shutdown mechanism to prevent hanging if the expected ConfigurationError
path regresses. Replace the AsyncMock() for the patched asyncio.sleep with a
mock that raises an exception instead of returning immediately. This ensures
that if the code path ever reaches the sleep call (indicating the
ConfigurationError was not properly raised), the test will fail immediately with
an exception rather than entering an infinite loop.
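
One way to build the guard described for the fatal-path test is to give the patched `asyncio.sleep` a `side_effect`. The sketch below uses a toy `run_forever` loop and a hypothetical `check_fatal_path` helper, so it illustrates the mocking pattern only, not the scheduler under test.

```python
import asyncio
from unittest.mock import AsyncMock, patch


class ConfigurationError(Exception):
    pass


async def run_forever(poll_once, interval: float = 60.0) -> None:
    """Toy loop: a ConfigurationError from poll_once propagates and ends it."""
    while True:
        await poll_once()
        await asyncio.sleep(interval)


async def check_fatal_path() -> str:
    poll = AsyncMock(side_effect=ConfigurationError("No index data available"))
    # If the loop ever reaches sleep, fail loudly instead of spinning forever.
    guard = AsyncMock(side_effect=AssertionError("sleep reached; loop did not stop"))
    with patch("asyncio.sleep", guard):
        try:
            await run_forever(poll)
        except ConfigurationError:
            guard.assert_not_awaited()
            return "stopped"
    return "unexpected"
```

If a regression made the loop swallow the error, the first awaited sleep would raise `AssertionError`, so the test would fail at once rather than hang.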

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 6ed944b6-b15b-4e7a-8401-9bf606446a8b

📥 Commits

Reviewing files that changed from the base of the PR and between be4f507 and fb2e835.

📒 Files selected for processing (8)
  • src/paperscout/errors.py
  • src/paperscout/models.py
  • src/paperscout/monitor.py
  • src/paperscout/sources.py
  • tests/test_callback_protocols.py
  • tests/test_datasource_protocol.py
  • tests/test_monitor.py
  • tests/test_sources.py

Comment thread src/paperscout/sources.py Outdated
Comment thread src/paperscout/sources.py Outdated
Comment thread tests/test_monitor.py

@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@src/paperscout/sources.py`:
- Around line 60-70: The cache/download success path in `PaperScoutSource` can
still yield zero parsed papers because `_parse_and_index()` filters out non-dict
entries, so validate the parsed result before returning `IndexRefreshResult(...,
stale=False)` or calling `_cache.write()`. In `sources.py`, use the `cached` and
`raw` handling around `_download()`, `_parse_and_index()`, and `_cache.write()`
to only mark success and overwrite the persisted cache when the parsed paper set
is non-empty; otherwise keep the existing cache and treat the payload as
unusable.
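
The ordering suggested here (parse first, persist only if something usable came out) could look like the sketch below. The helper names and the dict used as a cache are hypothetical; the only behavior taken from this thread is that parsing drops non-dict entries.

```python
from typing import Any, Optional


def parse_entries(raw: dict) -> dict:
    """Keep only entries whose value is a dict."""
    return {key: value for key, value in raw.items() if isinstance(value, dict)}


def accept_download(raw: Any, cache: dict) -> Optional[dict]:
    """Return parsed papers and persist raw only when the parse is non-empty."""
    papers = parse_entries(raw) if isinstance(raw, dict) else {}
    if not papers:
        # Unusable payload: leave the existing cache untouched.
        return None
    cache["index"] = raw
    return papers
```

A `None` return tells the caller to fall back to the stale cache, which is still intact because the write happens after validation.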

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 1e034380-434c-46cf-8961-e206df405c49

📥 Commits

Reviewing files that changed from the base of the PR and between fb2e835 and bb04d77.

📒 Files selected for processing (3)
  • src/paperscout/sources.py
  • tests/test_monitor.py
  • tests/test_sources.py
🚧 Files skipped from review as they are similar to previous changes (2)
  • tests/test_monitor.py
  • tests/test_sources.py

Comment thread src/paperscout/sources.py Outdated
…lid entry dicts, so len(cached) > 0 / if raw: was not enough to guarantee usable index data.

@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@src/paperscout/sources.py`:
- Around line 61-84: The current refresh flow in sources.py lets
_parse_and_index() mutate index metadata while testing cached/downloaded
payloads, which can leave self.papers out of sync if a candidate payload is
rejected or raises. Update the refresh logic around _parse_and_index(),
_download(), and the stale-cache fallback so parsing happens against local
temporary state first, then commit self.papers and the derived frontier metadata
only after a payload is accepted. Keep _max_rev, _max_p, and _sorted_p_nums
unchanged unless the new payload is successfully selected.
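
A parse-then-commit arrangement along these lines might look like the following sketch. The `Index` class, the `try_commit` method, and the ID scheme that reads digits 1 to 4 of a key such as `P2300R10` are assumptions for illustration; only `_max_p` and `_sorted_p_nums` are names from the comment above.

```python
class Index:
    def __init__(self) -> None:
        self.papers: dict = {}
        self._max_p = 0
        self._sorted_p_nums: list = []

    @staticmethod
    def _parse(raw: dict):
        """Pure function: builds candidate state without touching self."""
        papers = {k: v for k, v in raw.items() if isinstance(v, dict)}
        # Hypothetical ID scheme: "P2300R10" maps to paper number 2300.
        p_nums = sorted(
            {int(k[1:5]) for k in papers if k.startswith("P") and k[1:5].isdigit()}
        )
        return papers, p_nums

    def try_commit(self, raw: dict) -> bool:
        papers, p_nums = self._parse(raw)
        if not papers:
            return False  # rejected: existing state stays as it was
        # Commit papers and derived metadata together, only after acceptance.
        self.papers = papers
        self._sorted_p_nums = p_nums
        self._max_p = p_nums[-1] if p_nums else 0
        return True
```

Because `_parse` never writes to `self`, a rejected payload cannot leave the frontier metadata out of step with `self.papers`.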

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 0122bac0-43f0-4669-9817-9c1a2bb71602

📥 Commits

Reviewing files that changed from the base of the PR and between bb04d77 and afc6ef2.

📒 Files selected for processing (2)
  • src/paperscout/sources.py
  • tests/test_sources.py
🚧 Files skipped from review as they are similar to previous changes (1)
  • tests/test_sources.py

Comment thread src/paperscout/sources.py Outdated
…utated _max_rev, _max_p, and _sorted_p_nums before acceptance — leaving frontier metadata out of sync with self.papers when a payload was rejected.
- Added a second await scheduler.poll_once() after resetting the ISO prober mock, asserting _last_successful_poll stays None on the incremental poll path.
- Added test_refresh_malformed_json_raises_configuration_error asserting refresh() raises ConfigurationError("No index data available") when resp.json() raises ValueError with no cache.
@wpak-ai wpak-ai merged commit f0766ed into develop Jun 24, 2026
10 checks passed
@wpak-ai wpak-ai deleted the feat/index-refresh-error-discrimination branch June 24, 2026 16:27

Development

Successfully merging this pull request may close these issues.

WG21Index.refresh() error discrimination (5pt)

2 participants