Fix tensor validation gaps in C and Rust APIs by fallintoplace · Pull Request #2238 · rapidsai/cuvs

fallintoplace · 2026-06-11T16:52:02Z

Summary

Keep cuvsMatrixSliceRows from publishing partially initialized output metadata on validation failures, with C API coverage for invalid ranges and ranks.
Validate cuvsPairwiseDistance dtypes against the actual x, y, and dist tensors, including the supported float16 input to float32 output case.
Make Rust ManagedTensor carry the ndarray borrow lifetime, own its DLPack shape metadata, reject non-standard ndarray layouts, and validate to_host() destination shape/dtype before copying.

Why

The C API paths were trusting or exposing metadata too early in a few validation cases. On the Rust side, ManagedTensor looked owning but stored borrowed ndarray data and shape pointers without a lifetime, which allowed safe Rust to build tensors with dangling metadata. to_host() also reused source tensor metadata for the destination, bypassing the C copy shape and dtype checks for the actual output buffer.
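
The lifetime problem described above can be illustrated with a minimal sketch. It is not the PR's `ManagedTensor`: `RawTensor`, `BorrowedTensor`, and `from_slice` are hypothetical names, and a plain `&[f32]` stands in for an ndarray view. The wrapper borrows the element buffer for `'a` through `PhantomData` and owns the shape vector that its raw descriptor points to.

```rust
use std::marker::PhantomData;

/// Hypothetical stand-in for a DLPack-style descriptor; not the cuVS type.
pub struct RawTensor {
    pub data: *const f32,
    pub shape: *const i64,
    pub ndim: usize,
}

/// Borrows the element buffer for `'a` and owns its shape metadata.
pub struct BorrowedTensor<'a> {
    raw: RawTensor,
    // Owned by the wrapper so that `raw.shape` cannot outlive its storage.
    // Moving a Vec does not move its heap buffer, so the pointer stays valid.
    shape: Vec<i64>,
    // Ties this value to the borrowed slice without storing a reference.
    _borrow: PhantomData<&'a [f32]>,
}

impl<'a> BorrowedTensor<'a> {
    /// Builds a 2D row-major view, rejecting shapes that do not match the data.
    pub fn from_slice(data: &'a [f32], rows: usize, cols: usize) -> Result<Self, String> {
        if rows.checked_mul(cols) != Some(data.len()) {
            return Err(format!(
                "shape {}x{} does not match {} elements",
                rows,
                cols,
                data.len()
            ));
        }
        let shape = vec![rows as i64, cols as i64];
        let raw = RawTensor {
            data: data.as_ptr(),
            shape: shape.as_ptr(),
            ndim: shape.len(),
        };
        Ok(Self { raw, shape, _borrow: PhantomData })
    }

    /// Descriptor that would be handed to a C API in a real binding.
    pub fn raw(&self) -> &RawTensor {
        &self.raw
    }

    pub fn shape(&self) -> &[i64] {
        &self.shape
    }
}
```

With the `'a` parameter in place, the borrow checker rejects code that drops the source buffer while a `BorrowedTensor` built from it is still in use. That is the kind of dangling metadata the description says safe Rust could previously construct.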

Validation

pre-commit run clang-format --files c/src/distance/pairwise_distance.cpp c/tests/distance/pairwise_distance_c.cu c/include/cuvs/distance/pairwise_distance.h
cargo fmt --all
cargo check -p cuvs --features doc-only --all-targets
git diff --check

I also tried cargo test -p cuvs --features doc-only dlpack -- --nocapture, but doc-only test binaries still need the cuVS C symbols at link time on this macOS ARM machine. CUDA C tests were not run locally because nvcc and nvidia-smi are not available here.

copy-pr-bot · 2026-06-11T16:52:06Z

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

coderabbitai · 2026-06-11T16:56:49Z

📝 Walkthrough

Two independent change sets: (1) C API hardening and dtype validation with added tests for matrix row-slicing and pairwise distance; (2) Rust refactor introducing lifetime-parameterized ManagedTensor, to_device/to_host fixes, dlpack byte-size and validation updates, and propagation of lifetime changes across indices, examples, docs, and tests.

Changes

Matrix Slice Rows API Hardening

| Layer / File(s) | Summary |
| --- | --- |
| API contract documentation `c/include/cuvs/core/c_api.h` | Reword `cuvsMatrixSliceRows` `end` parameter to be half-open: "one past the last row index to include". |
| Implementation: validation & shape handling `c/src/core/c_api.cpp` | Add null checks, enforce src.ndim == 1 or 2, validate src.shape/data, tighten slice bounds `0 <= start <= end <= src.shape[0]`, reset dst fields, and switch shape/strides allocation to unique_ptr with explicit 1D/2D handling and ownership transfer. |
| Tests: matrix slice rows `c/tests/core/c_api.c` | Add helpers and `test_matrix_slice_rows` that verify correct 1D/2D slicing behavior and failure cases (out-of-range, 0D); wire test into main. |
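
As a rough illustration of the bounds rule in the table above, the sketch below checks a half-open row range `[start, end)` against a 1D or 2D shape and only produces an output shape once every check has passed. `slice_rows_shape` is a hypothetical helper written for this note, not a function from `c/src/core/c_api.cpp`.

```rust
/// Hypothetical helper: validates `0 <= start <= end <= shape[0]` for a
/// 1D or 2D shape and returns the shape of the sliced result.
pub fn slice_rows_shape(shape: &[i64], start: i64, end: i64) -> Result<Vec<i64>, String> {
    // Only rank 1 and rank 2 sources are accepted; 0D is rejected.
    if shape.is_empty() || shape.len() > 2 {
        return Err(format!("expected a 1D or 2D source, got rank {}", shape.len()));
    }
    let rows = shape[0];
    if start < 0 || start > end || end > rows {
        return Err(format!(
            "invalid row range [{}, {}) for {} rows",
            start, end, rows
        ));
    }
    // Nothing is written until validation succeeds, so a caller never sees
    // a half-filled result on failure.
    let mut out = shape.to_vec();
    out[0] = end - start;
    Ok(out)
}
```

Returning the new shape by value, rather than writing into a caller-provided output, is one simple way to avoid publishing partially initialized metadata. An empty range such as `start == end` is accepted here and yields zero rows.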

Pairwise distance dtype validation and tests

| Layer / File(s) | Summary |
| --- | --- |
| Docs: dtype requirements `c/include/cuvs/distance/pairwise_distance.h` | Document dtype constraints: `x`/`y` must share floating-point dtype; `dist` must be float32 for float16 inputs, otherwise match input dtype. |
| Implementation: dtype checks `c/src/distance/pairwise_distance.cpp` | Fix dtype extraction for y/dist tensors; require single-lane floating dtypes; enforce x and y bit-width equality; require float32 output for float16 inputs or matching input bits otherwise; update error messages. |
| Tests: C/CUDA tests `c/tests/distance/pairwise_distance_c.cu` | Add helpers to allocate DLManagedTensor wrappers and gtests asserting error messages for dtype mismatches and success path for float16→float32 distances (with stream sync). |
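
The dtype rule summarized above can be written down compactly. The sketch below is an assumption-laden model rather than the C++ implementation: `Code`, `DType`, `float`, and `check_pairwise_dtypes` are hypothetical names, and the error strings are illustrative, not the messages the C API emits.

```rust
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum Code {
    Int,
    UInt,
    Float,
}

/// Simplified model of a DLPack dtype (type code, bit width, lane count).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct DType {
    pub code: Code,
    pub bits: u8,
    pub lanes: u16,
}

/// Convenience constructor for a single-lane float dtype.
pub fn float(bits: u8) -> DType {
    DType { code: Code::Float, bits, lanes: 1 }
}

/// Checks each tensor's own dtype: x and y must share a single-lane float
/// dtype; dist must be float32 for float16 inputs and match x otherwise.
pub fn check_pairwise_dtypes(x: DType, y: DType, dist: DType) -> Result<(), String> {
    for (name, dt) in [("x", x), ("y", y), ("dist", dist)] {
        if dt.code != Code::Float || dt.lanes != 1 {
            return Err(format!("{} must be a single-lane floating-point tensor", name));
        }
    }
    if x.bits != y.bits {
        return Err(format!("x ({} bits) and y ({} bits) must match", x.bits, y.bits));
    }
    let expected = if x.bits == 16 { 32 } else { x.bits };
    if dist.bits != expected {
        return Err(format!(
            "dist must be float{} for float{} inputs, got float{}",
            expected, x.bits, dist.bits
        ));
    }
    Ok(())
}
```

Each argument is inspected through its own descriptor, which mirrors the "fix dtype extraction for y/dist tensors" item: a check that reads only the `x` dtype would accept mismatched `y` or `dist` tensors.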

Rust ManagedTensor lifetime refactor & API updates

| Layer / File(s) | Summary |
| --- | --- |
| ManagedTensor / DLPack refactor `rust/cuvs/src/dlpack.rs` | Refactor `ManagedTensor<'a>` to hold owned shape and PhantomData, add `from_ndarray`/TryFrom, refactor `to_device`/`to_host`, correct dl_tensor_bytes to use dtype.bits and lanes, and add validation helpers and tests for layout/shape/dtype. |
| Index APIs lifetime-parameterized `rust/cuvs/src/brute_force.rs`, `rust/cuvs/src/cagra/index.rs`, `rust/cuvs/src/ivf_flat/index.rs`, `rust/cuvs/src/ivf_pq/index.rs`, `rust/cuvs/src/vamana/index.rs`, `rust/cuvs/src/distance/mod.rs` | Make Index and public API functions accept/retain `ManagedTensor<'a>` or `&ManagedTensor<'_>`, update generic bounds and return types (e.g., `Index<'a>`), and adjust Drop impls accordingly. |
| Tests, examples, and docs updates `rust/cuvs/examples/`, `rust/cuvs/src//mod.rs`, `rust//tests/` | Update examples, docs, and unit tests to construct host tensors via `ManagedTensor::from_ndarray(...).to_device(&res)` and to match the new lifetime-aware signatures; minor test wiring adjustments (to_owned, non-mutable hosts, etc.). |
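
For the byte-size item above, a plausible reading is the usual DLPack size formula: the number of elements times the per-element width, where the width is `bits * lanes` rounded up to whole bytes. The sketch below assumes that formula; `tensor_bytes` is a hypothetical function and may differ from the PR's `dl_tensor_bytes`.

```rust
use std::convert::TryFrom;

/// Hypothetical size helper: product(shape) * ceil(bits * lanes / 8).
/// Returns None for negative dimensions or arithmetic overflow.
pub fn tensor_bytes(shape: &[i64], bits: u8, lanes: u16) -> Option<usize> {
    let mut elems: usize = 1;
    for &dim in shape {
        // A negative dimension cannot be converted and is rejected.
        let dim = usize::try_from(dim).ok()?;
        elems = elems.checked_mul(dim)?;
    }
    let bits_per_elem = usize::from(bits).checked_mul(usize::from(lanes))?;
    // Round up to whole bytes so sub-byte dtypes are not truncated to zero.
    let bytes_per_elem = (bits_per_elem + 7) / 8;
    elems.checked_mul(bytes_per_elem)
}
```

For example, a 2x3 float32 tensor occupies 24 bytes and a 4-element float16 tensor occupies 8. A helper that assumed a fixed element width would report the wrong size for one of them.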

🎯 4 (Complex) | ⏱️ ~45 minutes

Suggested labels: improvement, non-breaking

Suggested reviewers:

robertmaynard
divyegala
tfeher

🚥 Pre-merge checks | ✅ 5

✅ Passed checks (5 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Title check | ✅ Passed | The title 'Fix tensor validation gaps in C and Rust APIs' clearly and specifically summarizes the primary change: improving validation in both C and Rust tensor APIs. |
| Docstring Coverage | ✅ Passed | Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Description check | ✅ Passed | The PR description clearly explains what changes were made, why they matter, and how they were validated. |


coderabbitai

🧹 Nitpick comments (1)

c/tests/core/c_api.c (1)

71-73: ⚡ Quick win

Add a start > end negative test case.

Lines 71-73 cover two invalid ranges, but not the explicit start > end branch now enforced in cuvsMatrixSliceRows. Add one assertion to lock in that contract path.

Suggested test addition

   expect_matrix_slice_error(res, &src_2d, -1, 1);
   expect_matrix_slice_error(res, &src_2d, 0, 4);
+  expect_matrix_slice_error(res, &src_2d, 2, 1);

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@c/tests/core/c_api.c` around lines 71 - 73, Add a negative test for the start
> end case by invoking expect_matrix_slice_error with the same result matrix and
source (&src_2d) but with start greater than end (e.g., start=2, end=1) so the
test asserts the new contract in cuvsMatrixSliceRows; place this new assertion
alongside the existing invalid-range checks near expect_matrix_slice_error(res,
&src_2d, -1, 1) and expect_matrix_slice_error(res, &src_2d, 0, 4) to cover the
explicit start > end branch.


ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: 86ba6ebd-3d99-42e6-9852-03a57d734c8a

📥 Commits

Reviewing files that changed from the base of the PR and between 8e9a78c and c727525.

📒 Files selected for processing (3)

c/include/cuvs/core/c_api.h
c/src/core/c_api.cpp
c/tests/core/c_api.c

coderabbitai

🧹 Nitpick comments (2)

c/tests/distance/pairwise_distance_c.cu (1)

134-141: 💤 Low value

Consider a more specific error substring for clarity.

The test uses float32 inputs but checks for an error substring mentioning "float16 inputs". While the full error message does contain this substring (the complete message is "...for float16 inputs and match the input dtype otherwise"), checking for a substring like "match the input dtype" would more clearly reflect the actual failure mode for this test case.
📝 Suggested alternative substring

   expect_pairwise_distance_error_contains(
     float_dtype(32),
     float_dtype(32),
     float_dtype(64),
-    "distances output to cuvsPairwiseDistance must have dtype float32 for float16 inputs");
+    "match the input dtype");

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@c/tests/distance/pairwise_distance_c.cu` around lines 134 - 141, The test
PairwiseDistanceC::FailsWithMismatchedFloatOutputDtype is asserting an error
message that references "float16 inputs" even though the inputs used are
float32; update the expected substring in
expect_pairwise_distance_error_contains to a more accurate and specific phrase
such as "match the input dtype" (or "must match the input dtype") so the
assertion reflects the actual failure mode for float_dtype(32) inputs and
mismatched float_dtype(64) output; locate the call to
expect_pairwise_distance_error_contains in the TEST and replace the current
substring accordingly.

rust/cuvs/src/dlpack.rs (1)

212-218: 💤 Low value

Panic risk in rmm_free_tensor deleter callback.

Resources::new().unwrap() can panic if resource creation fails. Since this runs inside Drop, a panic here could cause double-panic aborts. Consider handling the error gracefully (e.g., logging and continuing) or caching the Resources handle.

Note: This appears to be pre-existing behavior, so addressing it could be deferred.
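
One way to picture the suggestion above is a cleanup path that treats resource creation as fallible and never unwinds. The sketch is a self-contained model, not the cuVS deleter: `Handle`, `DeviceBuffer`, and the `available` flag are hypothetical, and a shared `Cell<bool>` stands in for the real free call.

```rust
use std::cell::Cell;
use std::rc::Rc;

/// Hypothetical resource handle whose construction can fail.
pub struct Handle;

impl Handle {
    pub fn new(available: bool) -> Result<Self, String> {
        if available {
            Ok(Handle)
        } else {
            Err("resource creation failed".to_string())
        }
    }
}

/// Hypothetical buffer that needs a Handle to release its memory.
pub struct DeviceBuffer {
    pub available: bool,
    // Records whether the simulated free call ran.
    pub freed: Rc<Cell<bool>>,
}

impl Drop for DeviceBuffer {
    fn drop(&mut self) {
        // No unwrap(): a failure is reported and the free is skipped, so
        // drop never panics, including while another panic is unwinding.
        match Handle::new(self.available) {
            Ok(_handle) => self.freed.set(true),
            Err(err) => eprintln!("skipping free: {}", err),
        }
    }
}
```

The trade-off in this sketch is that the failure path leaks the buffer instead of aborting the process. Caching a handle at construction time, as the comment also suggests, would avoid creating a resource during cleanup at all.
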
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@rust/cuvs/src/dlpack.rs` around lines 212 - 218, rmm_free_tensor currently
calls Resources::new().unwrap() inside the deleter which can panic during Drop;
change this to avoid panicking by replacing the unwrap with fallible handling
(e.g., call Resources::new().map(|res| { let bytes =
dl_tensor_bytes(&(*self_).dl_tensor); let _ = ffi::cuvsRMMFree(res.0,
(*self_).dl_tensor.data as *mut _, bytes); }).unwrap_or_else(|err| { log the
error via your logger and skip the free })) or alternatively cache a Resources
handle for reuse so the deleter never creates resources; ensure you reference
rmm_free_tensor, Resources::new(), dl_tensor_bytes, and ffi::cuvsRMMFree when
making the change.


ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: b9c71ae5-b00c-4f30-9d47-c12f03247074

📥 Commits

Reviewing files that changed from the base of the PR and between c727525 and 4d1d3e4.

📒 Files selected for processing (15)

c/include/cuvs/distance/pairwise_distance.h
c/src/distance/pairwise_distance.cpp
c/tests/distance/pairwise_distance_c.cu
rust/cuvs/examples/cagra.rs
rust/cuvs/src/brute_force.rs
rust/cuvs/src/cagra/index.rs
rust/cuvs/src/cagra/mod.rs
rust/cuvs/src/cluster/kmeans/mod.rs
rust/cuvs/src/distance/mod.rs
rust/cuvs/src/dlpack.rs
rust/cuvs/src/ivf_flat/index.rs
rust/cuvs/src/ivf_flat/mod.rs
rust/cuvs/src/ivf_pq/index.rs
rust/cuvs/src/ivf_pq/mod.rs
rust/cuvs/src/vamana/index.rs

✅ Files skipped from review due to trivial changes (2)

c/include/cuvs/distance/pairwise_distance.h
rust/cuvs/src/ivf_pq/mod.rs

Fix matrix row slice validation

c727525

fallintoplace requested a review from a team as a code owner June 11, 2026 16:52

github-project-automation Bot added this to Unstructured Data Processing Jun 11, 2026

coderabbitai Bot reviewed Jun 11, 2026

Fix DLPack and distance dtype validation

4d1d3e4

fallintoplace requested a review from a team as a code owner June 11, 2026 18:09

fallintoplace changed the title ~~Fix C matrix row slice validation~~ Fix tensor validation gaps in C and Rust APIs Jun 11, 2026

coderabbitai Bot reviewed Jun 11, 2026

fallintoplace wants to merge 2 commits into rapidsai:main from fallintoplace:fix-matrix-slice-rows-validation
