Feat/index stats query params #1237

Open
vivek378521 wants to merge 5 commits into meilisearch:main from vivek378521:feat/index-stats-query-params

@vivek378521 vivek378521 commented May 31, 2026

Pull Request

Related issue

Fixes #1234

What does this PR do?

Meilisearch v1.44.0 adds two optional query parameters to GET /indexes/{indexUid}/stats and GET /stats:

  • showInternalDatabaseSizes: when true, index stat objects include an internalDatabaseSizes dictionary
  • sizeFormat: "human" for readable sizes (e.g. "19.64 MiB") or "raw" for byte counts (default)

This PR wires those parameters through the Python SDK:

  • Index.get_stats() — keyword-only show_internal_database_sizes and size_format
  • Client.get_all_stats() — same parameters
  • IndexStats — optional internal_database_sizes: Dict[str, Any] (loosely typed; keys may change between Meilisearch versions, per API guidance)
  • SizeFormat enum (raw / human) for optional typing on size_format
  • Integration tests for both endpoints (raw vs human sizes, combined params)
  • .code-samples.meilisearch.yaml — updated get_index_stats_1 and get_indexes_stats_1

Query params are only sent when the caller provides them (backward compatible). Boolean values are encoded as lowercase true/false.

Example:

stats = client.index("movies").get_stats(
    show_internal_database_sizes=True,
    size_format="human",
)
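
For readers who want to see the encoding rule in isolation, here is a minimal sketch of how such a query string could be built. The helper name `build_stats_query` is hypothetical and is not necessarily how the SDK structures this internally; only the parameter names and the lowercase `true`/`false` encoding come from the description above.

```python
from enum import Enum
from typing import Dict, Optional, Union
from urllib.parse import urlencode


def build_stats_query(
    *,
    show_internal_database_sizes: Optional[bool] = None,
    size_format: Optional[Union[Enum, str]] = None,
) -> str:
    """Hypothetical helper: return a query string, or "" when nothing was passed."""
    params: Dict[str, str] = {}
    if show_internal_database_sizes is not None:
        # str(True) would give "True"; the API expects lowercase "true"/"false".
        params["showInternalDatabaseSizes"] = (
            "true" if show_internal_database_sizes else "false"
        )
    if size_format is not None:
        # Accept either an enum member (use its value) or a plain string.
        params["sizeFormat"] = (
            str(size_format.value) if isinstance(size_format, Enum) else size_format
        )
    # Parameters left as None are never added, so old call sites hit the same URL.
    return f"?{urlencode(params)}" if params else ""
```

Calling `build_stats_query()` with no arguments returns an empty string, which is what keeps existing callers unaffected.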

PR checklist

Please check if your PR fulfills the following requirements:

  • Did you use any AI tool while implementing this PR (code, tests, docs, etc.)? If yes, disclose it in the PR description and describe what it was used for. AI usage is allowed when it is disclosed.
    • Yes. Cursor (AI-assisted IDE) was used for implementation guidance, test structure, PR description drafting, and debugging (e.g. boolean query encoding). All changes were reviewed and committed by the author.
  • Does this PR fix an existing issue, or have you listed the changes applied in the PR description (and why they are needed)?
  • Have you read the contributing guidelines?
  • Have you made sure that the title is accurate and descriptive of the changes?

Overview

Adds support for two optional query parameters introduced in Meilisearch v1.44.0 to the Python SDK’s index and global stats endpoints:

  • show_internal_database_sizes: When true, includes an internalDatabaseSizes dictionary showing internal database names and sizes (keys intentionally kept loosely typed due to API variability)
  • size_format: Controls output format—"human" for human-readable units (e.g., "19.64 MiB") or "raw" for byte counts (default)

Changes

New Type Definition (meilisearch/models/index.py)

  • Added SizeFormat enum with RAW = "raw" and HUMAN = "human"
  • Extended IndexStats with optional internal_database_sizes: Optional[Dict[str, Any]] = None
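
As a rough illustration of the shape described above, the sketch below models the enum and the new optional field. `IndexStatsSketch` and its `from_api` method are hypothetical stand-ins, not the SDK's actual `IndexStats` class, and the database key used in the usage note is invented for the example.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Any, Dict, Optional


class SizeFormat(Enum):
    RAW = "raw"
    HUMAN = "human"


@dataclass
class IndexStatsSketch:
    """Illustrative stand-in for IndexStats; only a few fields are modeled."""

    number_of_documents: int
    is_indexing: bool
    # Loosely typed on purpose: keys may change between Meilisearch versions.
    internal_database_sizes: Optional[Dict[str, Any]] = None

    @classmethod
    def from_api(cls, payload: Dict[str, Any]) -> "IndexStatsSketch":
        return cls(
            number_of_documents=payload["numberOfDocuments"],
            is_indexing=payload["isIndexing"],
            # Absent unless the caller asked for internal sizes.
            internal_database_sizes=payload.get("internalDatabaseSizes"),
        )
```

With a payload such as `{"numberOfDocuments": 3, "isIndexing": False}`, the field stays `None`; adding `"internalDatabaseSizes": {"exampleDatabase": "1.2 MiB"}` (a made-up key) populates it without any schema change.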

API Updates

  • Index.get_stats(): Added keyword-only parameters show_internal_database_sizes: Optional[bool] = None and size_format: Optional[Union[SizeFormat, str]] = None
  • Client.get_all_stats(): Added the same keyword-only parameters
  • Both endpoints now conditionally include query parameters only when provided (preserving backward compatibility)
  • Boolean query parameters are encoded as lowercase true/false
  • size_format supports both SizeFormat enum values and raw string values

Test Coverage

  • tests/index/test_index_stats_meilisearch.py
    • Verifies internal_database_sizes is present and non-empty when show_internal_database_sizes=True
    • Verifies human-readable formatting when combined with size_format=SizeFormat.HUMAN (validated via HUMAN_SIZE_PATTERN)
    • Verifies behavior when size_format="human" is passed as a string
  • tests/client/test_client_stats_meilisearch.py
    • Adds coverage for global stats (get_all_stats) for:
      • show_internal_database_sizes=True with raw sizes
      • show_internal_database_sizes=True with size_format=SizeFormat.HUMAN
      • show_internal_database_sizes=True with size_format="human"
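
The definition of HUMAN_SIZE_PATTERN is not shown in this summary. The sketch below is one plausible form, assuming sizes look like the "19.64 MiB" example and use binary unit suffixes; the unit list and the `is_human_size` helper are assumptions, not the pattern used in the test files.

```python
import re

# Assumed format: a decimal number, one space, then a binary unit suffix.
HUMAN_SIZE_PATTERN = re.compile(r"^\d+(?:\.\d+)? (?:B|KiB|MiB|GiB|TiB)$")


def is_human_size(value: object) -> bool:
    """Return True only for strings shaped like '19.64 MiB'."""
    return isinstance(value, str) and HUMAN_SIZE_PATTERN.match(value) is not None
```

Under these assumptions, a raw byte count such as `20594196` (an integer) or `"20594196"` (a string without a unit) is rejected, which is what lets a test tell the two formats apart.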

Documentation

  • Updated .code-samples.meilisearch.yaml to reflect the new parameters in get_index_stats_1 and get_indexes_stats_1, demonstrating show_internal_database_sizes=True with size_format='human'

Support showInternalDatabaseSizes and sizeFormat on index stats,
extend IndexStats with internal_database_sizes, and add integration tests.
Support showInternalDatabaseSizes and sizeFormat on global stats,
with integration tests and updated code sample.
coderabbitai Bot commented May 31, 2026

Contributor

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: a724a853-36cd-4c91-9a97-564f861eb9be

📥 Commits

Reviewing files that changed from the base of the PR and between 701755a and 2ffdff2.

📒 Files selected for processing (2)
  • tests/client/test_client_stats_meilisearch.py
  • tests/settings/test_settings_embedders.py
🚧 Files skipped from review as they are similar to previous changes (1)
  • tests/client/test_client_stats_meilisearch.py

Walkthrough

This PR extends the Meilisearch Python SDK's stats APIs: Client.get_all_stats() and Index.get_stats() now accept optional show_internal_database_sizes and size_format parameters. It also adds a SizeFormat enum and an internal_database_sizes field to IndexStats, updates the documentation code samples, and adds tests for the new behavior. Additionally, the embedder configuration update tests now use a longer timeout when waiting for embedder update tasks to complete.

Changes

Stats API Query Parameters Extension

| Layer / File(s) | Summary |
| --- | --- |
| Data Model & SizeFormat Enum (meilisearch/models/index.py) | Introduces SizeFormat enum with RAW and HUMAN values, and extends IndexStats with optional internal_database_sizes field to hold database-level size breakdowns. |
| Client.get_all_stats() API Extension (meilisearch/client.py) | Imports SizeFormat and updates get_all_stats() to accept optional keyword-only parameters show_internal_database_sizes and size_format (enum or string), building and appending them as query parameters to the stats endpoint request. |
| Index.get_stats() API Extension (meilisearch/index.py) | Imports SizeFormat and updates get_stats() to accept optional keyword-only parameters show_internal_database_sizes and size_format (enum or string), with identical query parameter construction logic mirroring the client method. |
| Index API Test Coverage (tests/index/test_index_stats_meilisearch.py) | Adds HUMAN_SIZE_PATTERN regex and three new tests validating get_stats() with internal database sizes, human-readable format via SizeFormat.HUMAN, and mixed string format parameter modes. |
| Client API Test Coverage (tests/client/test_client_stats_meilisearch.py) | Adds HUMAN_SIZE_PATTERN regex and three new tests validating get_all_stats() with internal database sizes, human-readable format via SizeFormat.HUMAN, and mixed string format parameter modes. |
| Documentation Code Samples (.code-samples.meilisearch.yaml) | Updates get_index_stats_1 and get_indexes_stats_1 examples to demonstrate calling the stats methods with show_internal_database_sizes=True and size_format='human' parameters. |

Test Infrastructure Improvements

| Layer / File(s) | Summary |
| --- | --- |
| Embedder Configuration Test Timeouts (tests/settings/test_settings_embedders.py) | Increase timeout in HuggingFace and composite embedder format tests from default to explicit 120 seconds (timeout_in_ms=120000) when waiting for embedder update tasks to complete. |

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Possibly related issues

Poem

🐰 I hopped through stats to find each size,
human-readable numbers and tiny surprise,
RAW or HUMAN, the choice now is near,
internal DB sizes for each index appear,
cheers — the stats are clearer this year!

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 inconclusive)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Title check | ❓ Inconclusive | The title 'Feat/index stats query params' is vague and doesn't clearly convey the specific changes to callers. | Consider using a more descriptive title like 'Add support for size format and internal DB sizes in stats endpoints' to better communicate the feature. |

✅ Passed checks (4 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Linked Issues check | ✅ Passed | All coding requirements from issue #1234 are fully implemented: both Index.get_stats() and Client.get_all_stats() accept the new parameters, IndexStats includes internal_database_sizes field, SizeFormat enum is provided, comprehensive tests are added, and code samples are updated. |
| Out of Scope Changes check | ✅ Passed | All changes are directly related to implementing the stats query parameters feature. Only two files with minimal unrelated changes (test_settings_embedders.py) appear to be part of the PR, and those timeout adjustments align with the stated implementation details. |
| Docstring Coverage | ✅ Passed | Docstring coverage is 93.33% which is sufficient. The required threshold is 80.00%. |


@coderabbitai coderabbitai Bot left a comment


🧹 Nitpick comments (1)
tests/client/test_client_stats_meilisearch.py (1)

39-54: ⚡ Quick win

The internal-sizes human-format check can silently no-op.

Unlike test_get_all_stats_with_internal_database_sizes, this test has no any(...) guard asserting that at least one index actually contains internalDatabaseSizes. If no index returns the key, the loop body never runs and the human-format assertion on internal sizes is skipped (only the top-level databaseSize check on Line 48 would cover the format). Add a presence guard to ensure the internal-size branch is exercised.
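
To make the failure mode concrete, here is a small self-contained sketch with a fabricated response that contains no internal sizes. The function names and the sample payload are illustrative only and do not come from the test file.

```python
from typing import Any, Dict


def check_sizes_unguarded(response: Dict[str, Any]) -> None:
    # If no index has the key, the inner assertions never execute.
    for index_stats in response["indexes"].values():
        if "internalDatabaseSizes" in index_stats:
            for size in index_stats["internalDatabaseSizes"].values():
                assert isinstance(size, str)


def check_sizes_guarded(response: Dict[str, Any]) -> None:
    # Presence guard: fail loudly when the branch below would be skipped.
    assert any(
        "internalDatabaseSizes" in index_stats
        for index_stats in response["indexes"].values()
    ), "no index returned internalDatabaseSizes"
    check_sizes_unguarded(response)


# Fabricated payload with the key missing everywhere.
response_without_sizes = {"indexes": {"movies": {"numberOfDocuments": 3}}}

check_sizes_unguarded(response_without_sizes)  # passes, but checked nothing

try:
    check_sizes_guarded(response_without_sizes)
except AssertionError:
    guard_fired = True
else:
    guard_fired = False
```

Here the unguarded check returns cleanly even though nothing was validated, while the guarded version raises, so `guard_fired` ends up `True`.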

✅ Proposed guard
     assert isinstance(response["databaseSize"], str)
     assert HUMAN_SIZE_PATTERN.match(response["databaseSize"])
+    assert any(
+        "internalDatabaseSizes" in index_stats for index_stats in response["indexes"].values()
+    )
     for index_stats in response["indexes"].values():
         if "internalDatabaseSizes" in index_stats:
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@tests/client/test_client_stats_meilisearch.py` around lines 39 - 54, The test
test_get_all_stats_with_size_format can silently skip the internalDatabaseSizes
checks if no index contains that key; update the test to assert the branch is
exercised by collecting presence from response (e.g., use any(...) on
response["indexes"].values() to check for "internalDatabaseSizes") and assert
that at least one index contains internalDatabaseSizes before running the loop
that validates HUMAN_SIZE_PATTERN on those values; operate on the existing
response variable and keep the databaseSize string checks, then add the presence
guard to fail the test if no internal sizes are present.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: e585e2a7-2f74-4d44-95cb-53d88495ad07

📥 Commits

Reviewing files that changed from the base of the PR and between ada25db and f2b7d58.

📒 Files selected for processing (6)
  • .code-samples.meilisearch.yaml
  • meilisearch/client.py
  • meilisearch/index.py
  • meilisearch/models/index.py
  • tests/client/test_client_stats_meilisearch.py
  • tests/index/test_index_stats_meilisearch.py

Address CodeRabbit feedback: fail test_get_all_stats_with_size_format if
no index returns internalDatabaseSizes, so human-size checks cannot pass silently.

@Strift Strift left a comment


Hello @vivek378521, thanks for your PR!

This looks good, but CI is not passing. I suggest you run tests + formatting locally before requesting another review 🙏

@vivek378521

Author

Hello @Strift

I ran all tests locally and now everything looks good. Please review.

Thank you for the time. I really appreciate it. :)


Development

Successfully merging this pull request may close these issues.

[Meilisearch v1.44.0] Add human-formatted sizes and detailed DB sizes in stats

2 participants