Fix tensor validation gaps in C and Rust APIs #2238

Open
fallintoplace wants to merge 2 commits into rapidsai:main from fallintoplace:fix-matrix-slice-rows-validation
Conversation

@fallintoplace

@fallintoplace fallintoplace commented Jun 11, 2026

Summary

  • Keep cuvsMatrixSliceRows from publishing partially initialized output metadata on validation failures, with C API coverage for invalid ranges and ranks.
  • Validate cuvsPairwiseDistance dtypes against the actual x, y, and dist tensors, including the supported float16 input to float32 output case.
  • Make Rust ManagedTensor carry the ndarray borrow lifetime, own its DLPack shape metadata, reject non-standard ndarray layouts, and validate to_host() destination shape/dtype before copying.

Why

The C API paths were trusting or exposing metadata too early in a few validation cases. On the Rust side, ManagedTensor looked owning but stored borrowed ndarray data and shape pointers without a lifetime, which allowed safe Rust to build tensors with dangling metadata. to_host() also reused source tensor metadata for the destination, bypassing the C copy shape and dtype checks for the actual output buffer.
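
The to_host() point above can be illustrated with a minimal C sketch: the destination's own rank, extents, and dtype are compared against the source before any bytes move. The `host_tensor` struct and both function names are hypothetical and exist only for this illustration; they are not the cuVS or DLPack types, and the PR's actual checks may differ.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical host tensor descriptor for illustration only
   (not the DLPack or cuVS type). */
typedef struct {
  void* data;
  int32_t ndim;
  const int64_t* shape;
  uint8_t dtype_code; /* 2 is the float code in DLPack's numbering */
  uint8_t dtype_bits;
} host_tensor;

/* True when dst can receive a bitwise copy of src:
   same rank, same extents, same dtype. */
bool copy_compatible(const host_tensor* src, const host_tensor* dst)
{
  if (src == NULL || dst == NULL) return false;
  if (src->ndim != dst->ndim) return false;
  if (src->dtype_code != dst->dtype_code) return false;
  if (src->dtype_bits != dst->dtype_bits) return false;
  for (int32_t i = 0; i < src->ndim; ++i) {
    if (src->shape[i] != dst->shape[i]) return false;
  }
  return true;
}

/* Copies src into dst only after checking dst's own metadata.
   Returns 0 on success and -1 on a mismatch; dst is untouched on failure. */
int checked_host_copy(const host_tensor* src, host_tensor* dst)
{
  if (!copy_compatible(src, dst)) return -1;
  size_t elems = 1;
  for (int32_t i = 0; i < src->ndim; ++i) elems *= (size_t)src->shape[i];
  memcpy(dst->data, src->data, elems * (size_t)(src->dtype_bits / 8));
  return 0;
}
```

The design point is that the destination is described by its own metadata rather than by a copy of the source's, so a wrongly sized or wrongly typed output buffer is rejected instead of being overrun.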

Validation

  • pre-commit run clang-format --files c/src/distance/pairwise_distance.cpp c/tests/distance/pairwise_distance_c.cu c/include/cuvs/distance/pairwise_distance.h
  • cargo fmt --all
  • cargo check -p cuvs --features doc-only --all-targets
  • git diff --check

I also tried cargo test -p cuvs --features doc-only dlpack -- --nocapture, but doc-only test binaries still need the cuVS C symbols at link time on this macOS ARM machine. CUDA C tests were not run locally because nvcc and nvidia-smi are not available here.

@copy-pr-bot

copy-pr-bot Bot commented Jun 11, 2026

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

@coderabbitai

coderabbitai Bot commented Jun 11, 2026

Review Change Stack

📝 Walkthrough

Two independent change sets: (1) C API hardening and dtype validation with added tests for matrix row-slicing and pairwise distance; (2) Rust refactor introducing lifetime-parameterized ManagedTensor, to_device/to_host fixes, dlpack byte-size and validation updates, and propagation of lifetime changes across indices, examples, docs, and tests.

Changes

Matrix Slice Rows API Hardening

| Layer | File(s) | Summary |
|---|---|---|
| API contract documentation | c/include/cuvs/core/c_api.h | Reword cuvsMatrixSliceRows end parameter to be half-open: "one past the last row index to include". |
| Implementation: validation & shape handling | c/src/core/c_api.cpp | Add null checks, enforce src.ndim == 1 or 2, validate src.shape/data, tighten slice bounds 0 <= start <= end <= src.shape[0], reset dst fields, and switch shape/strides allocation to unique_ptr with explicit 1D/2D handling and ownership transfer. |
| Tests: matrix slice rows | c/tests/core/c_api.c | Add helpers and test_matrix_slice_rows that verify correct 1D/2D slicing behavior and failure cases (out-of-range, 0D); wire test into main. |
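
The validation order summarized above (reset the output, check rank and the half-open bounds, then publish metadata in one step) might look roughly like the following C sketch. `row_view` and `slice_rows` are hypothetical names for a simplified float-only view; the real function operates on DLPack tensors and uses unique_ptr in C++, which this sketch does not reproduce.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical simplified view for illustration only
   (not DLPack's DLTensor and not cuVS code). */
typedef struct {
  float* data;
  int32_t ndim;
  int64_t* shape; /* heap-allocated by slice_rows; caller frees */
} row_view;

/* Slices rows [start, end) of src into dst (end is one past the last row).
   Returns 0 on success and -1 on failure. On failure dst is left zeroed,
   never partially filled. */
int slice_rows(const row_view* src, int64_t start, int64_t end, row_view* dst)
{
  if (dst == NULL) return -1;

  /* Reset the output first so a failed call cannot expose stale metadata. */
  dst->data = NULL;
  dst->ndim = 0;
  dst->shape = NULL;

  if (src == NULL || src->data == NULL || src->shape == NULL) return -1;
  if (src->ndim != 1 && src->ndim != 2) return -1;
  if (start < 0 || start > end || end > src->shape[0]) return -1;

  int64_t* shape = malloc((size_t)src->ndim * sizeof *shape);
  if (shape == NULL) return -1;

  shape[0] = end - start;
  int64_t row_len = 1;
  if (src->ndim == 2) {
    shape[1] = src->shape[1];
    row_len = src->shape[1];
  }

  /* Every check passed: publish the fully built metadata in one step. */
  dst->data = src->data + start * row_len;
  dst->ndim = src->ndim;
  dst->shape = shape;
  return 0;
}
```

With these bounds an empty slice (start == end) succeeds and yields zero rows, while start > end, a negative start, an end past the last row, and a 0D source all fail without touching dst beyond the initial reset.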

Pairwise distance dtype validation and tests

| Layer | File(s) | Summary |
|---|---|---|
| Docs: dtype requirements | c/include/cuvs/distance/pairwise_distance.h | Document dtype constraints: x/y must share floating-point dtype; dist must be float32 for float16 inputs, otherwise match input dtype. |
| Implementation: dtype checks | c/src/distance/pairwise_distance.cpp | Fix dtype extraction for y/dist tensors; require single-lane floating dtypes; enforce x and y bit-width equality; require float32 output for float16 inputs or matching input bits otherwise; update error messages. |
| Tests: C/CUDA tests | c/tests/distance/pairwise_distance_c.cu | Add helpers to allocate DLManagedTensor wrappers and gtests asserting error messages for dtype mismatches and success path for float16→float32 distances (with stream sync). |
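
The dtype rule described in this table can be written as a small predicate. The sketch below is an assumption-laden illustration in C: the `dtype` struct mirrors the field layout of DLPack's DLDataType (code, bits, lanes) and uses DLPack's float code value of 2, but `check_pairwise_dtypes` and its messages are hypothetical and are not the strings emitted by cuvsPairwiseDistance. It also does not restrict which bit widths the library actually supports.

```c
#include <stddef.h>
#include <stdint.h>

/* Local mirror of DLPack's DLDataType field layout, for illustration. */
typedef struct {
  uint8_t code;
  uint8_t bits;
  uint16_t lanes;
} dtype;

enum { DTYPE_FLOAT = 2 }; /* same value as DLPack's kDLFloat */

/* Returns NULL when the combination is acceptable, otherwise a static
   message (hypothetical wording) naming the first rule that failed. */
const char* check_pairwise_dtypes(dtype x, dtype y, dtype dist)
{
  if (x.code != DTYPE_FLOAT || y.code != DTYPE_FLOAT || dist.code != DTYPE_FLOAT) {
    return "x, y, and dist must be floating-point";
  }
  if (x.lanes != 1 || y.lanes != 1 || dist.lanes != 1) {
    return "x, y, and dist must be single-lane";
  }
  if (x.bits != y.bits) { return "x and y must have the same bit width"; }

  /* float16 inputs accumulate into float32; other widths keep the input width. */
  uint8_t expected_bits = (x.bits == 16) ? 32 : x.bits;
  if (dist.bits != expected_bits) {
    return "dist must be float32 for float16 inputs and match the input dtype otherwise";
  }
  return NULL;
}
```

Each tensor's dtype is passed in separately, which reflects the fix listed above: the check reads the dtype of y and dist themselves rather than reusing the dtype of x.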

Rust ManagedTensor lifetime refactor & API updates

| Layer | File(s) | Summary |
|---|---|---|
| ManagedTensor / DLPack refactor | rust/cuvs/src/dlpack.rs | Refactor ManagedTensor<'a> to hold owned shape and PhantomData, add from_ndarray/TryFrom, refactor to_device/to_host, correct dl_tensor_bytes to use dtype.bits and lanes, and add validation helpers and tests for layout/shape/dtype. |
| Index APIs lifetime-parameterized | rust/cuvs/src/brute_force.rs, rust/cuvs/src/cagra/index.rs, rust/cuvs/src/ivf_flat/index.rs, rust/cuvs/src/ivf_pq/index.rs, rust/cuvs/src/vamana/index.rs, rust/cuvs/src/distance/mod.rs | Make Index and public API functions accept/retain ManagedTensor<'a> or &ManagedTensor<'_>, update generic bounds and return types (e.g., Index<'a>), and adjust Drop impls accordingly. |
| Tests, examples, and docs updates | rust/cuvs/examples/*, rust/cuvs/src/*/mod.rs, rust/*/tests/* | Update examples, docs, and unit tests to construct host tensors via ManagedTensor::from_ndarray(...).to_device(&res) and to match the new lifetime-aware signatures; minor test wiring adjustments (to_owned, non-mutable hosts, etc.). |
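
For the dl_tensor_bytes correction mentioned in the first row, the size of a compact tensor follows from its shape, dtype.bits, and dtype.lanes. The Rust helper itself is not shown in this thread, so the following is only a sketch of the arithmetic, written in C to match the DLPack C layout; `tensor_bytes` is a hypothetical name, and the rounding follows the formula given in the DLPack header comments.

```c
#include <stddef.h>
#include <stdint.h>

/* Bytes needed for a compact tensor: element count times bytes per element,
   where one element spans bits * lanes bits, rounded up to whole bytes.
   A 0-dimensional tensor counts as a single element. */
size_t tensor_bytes(const int64_t* shape, int32_t ndim, uint8_t bits, uint16_t lanes)
{
  size_t elems = 1;
  for (int32_t i = 0; i < ndim; ++i) elems *= (size_t)shape[i];
  size_t elem_bytes = ((size_t)bits * lanes + 7) / 8;
  return elems * elem_bytes;
}
```

For a 4 x 3 float32 tensor this gives 48 bytes, and 24 bytes for float16, which is why a size computed from a fixed element width would be wrong for any other dtype.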

🎯 4 (Complex) | ⏱️ ~45 minutes

Suggested labels: improvement, non-breaking

Suggested reviewers:

  • robertmaynard
  • divyegala
  • tfeher
🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
| Check name | Status | Explanation |
|---|---|---|
| Title check | ✅ Passed | The title 'Fix tensor validation gaps in C and Rust APIs' clearly and specifically summarizes the primary change: improving validation in both C and Rust tensor APIs. |
| Docstring Coverage | ✅ Passed | Docstring coverage is 100.00%, which is sufficient. The required threshold is 80.00%. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Description check | ✅ Passed | The PR description clearly explains what changes were made, why they matter, and how they were validated. |

@coderabbitai coderabbitai Bot left a comment

🧹 Nitpick comments (1)
c/tests/core/c_api.c (1)

71-73: ⚡ Quick win

Add a start > end negative test case.

Lines 71-73 cover two invalid ranges, but not the explicit start > end branch now enforced in cuvsMatrixSliceRows. Add one assertion to lock in that contract path.

Suggested test addition
   expect_matrix_slice_error(res, &src_2d, -1, 1);
   expect_matrix_slice_error(res, &src_2d, 0, 4);
+  expect_matrix_slice_error(res, &src_2d, 2, 1);
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@c/tests/core/c_api.c` around lines 71 - 73, Add a negative test for the start
> end case by invoking expect_matrix_slice_error with the same result matrix and
source (&src_2d) but with start greater than end (e.g., start=2, end=1) so the
test asserts the new contract in cuvsMatrixSliceRows; place this new assertion
alongside the existing invalid-range checks near expect_matrix_slice_error(res,
&src_2d, -1, 1) and expect_matrix_slice_error(res, &src_2d, 0, 4) to cover the
explicit start > end branch.
ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: 86ba6ebd-3d99-42e6-9852-03a57d734c8a

📥 Commits

Reviewing files that changed from the base of the PR and between 8e9a78c and c727525.

📒 Files selected for processing (3)
  • c/include/cuvs/core/c_api.h
  • c/src/core/c_api.cpp
  • c/tests/core/c_api.c

@fallintoplace fallintoplace requested a review from a team as a code owner June 11, 2026 18:09
@fallintoplace fallintoplace changed the title from "Fix C matrix row slice validation" to "Fix tensor validation gaps in C and Rust APIs" Jun 11, 2026

@coderabbitai coderabbitai Bot left a comment

🧹 Nitpick comments (2)
c/tests/distance/pairwise_distance_c.cu (1)

134-141: 💤 Low value

Consider a more specific error substring for clarity.

The test uses float32 inputs but checks for an error substring mentioning "float16 inputs". While the full error message does contain this substring (the complete message is "...for float16 inputs and match the input dtype otherwise"), checking for a substring like "match the input dtype" would more clearly reflect the actual failure mode for this test case.

📝 Suggested alternative substring
   expect_pairwise_distance_error_contains(
     float_dtype(32),
     float_dtype(32),
     float_dtype(64),
-    "distances output to cuvsPairwiseDistance must have dtype float32 for float16 inputs");
+    "match the input dtype");
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@c/tests/distance/pairwise_distance_c.cu` around lines 134 - 141, The test
PairwiseDistanceC::FailsWithMismatchedFloatOutputDtype is asserting an error
message that references "float16 inputs" even though the inputs used are
float32; update the expected substring in
expect_pairwise_distance_error_contains to a more accurate and specific phrase
such as "match the input dtype" (or "must match the input dtype") so the
assertion reflects the actual failure mode for float_dtype(32) inputs and
mismatched float_dtype(64) output; locate the call to
expect_pairwise_distance_error_contains in the TEST and replace the current
substring accordingly.
rust/cuvs/src/dlpack.rs (1)

212-218: 💤 Low value

Panic risk in rmm_free_tensor deleter callback.

Resources::new().unwrap() can panic if resource creation fails. Since this runs inside Drop, a panic here could cause double-panic aborts. Consider handling the error gracefully (e.g., logging and continuing) or caching the Resources handle.

Note: This appears to be pre-existing behavior, so addressing it could be deferred.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@rust/cuvs/src/dlpack.rs` around lines 212 - 218, rmm_free_tensor currently
calls Resources::new().unwrap() inside the deleter which can panic during Drop;
change this to avoid panicking by replacing the unwrap with fallible handling
(e.g., call Resources::new().map(|res| { let bytes =
dl_tensor_bytes(&(*self_).dl_tensor); let _ = ffi::cuvsRMMFree(res.0,
(*self_).dl_tensor.data as *mut _, bytes); }).unwrap_or_else(|err| { log the
error via your logger and skip the free })) or alternatively cache a Resources
handle for reuse so the deleter never creates resources; ensure you reference
rmm_free_tensor, Resources::new(), dl_tensor_bytes, and ffi::cuvsRMMFree when
making the change.
ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: b9c71ae5-b00c-4f30-9d47-c12f03247074

📥 Commits

Reviewing files that changed from the base of the PR and between c727525 and 4d1d3e4.

📒 Files selected for processing (15)
  • c/include/cuvs/distance/pairwise_distance.h
  • c/src/distance/pairwise_distance.cpp
  • c/tests/distance/pairwise_distance_c.cu
  • rust/cuvs/examples/cagra.rs
  • rust/cuvs/src/brute_force.rs
  • rust/cuvs/src/cagra/index.rs
  • rust/cuvs/src/cagra/mod.rs
  • rust/cuvs/src/cluster/kmeans/mod.rs
  • rust/cuvs/src/distance/mod.rs
  • rust/cuvs/src/dlpack.rs
  • rust/cuvs/src/ivf_flat/index.rs
  • rust/cuvs/src/ivf_flat/mod.rs
  • rust/cuvs/src/ivf_pq/index.rs
  • rust/cuvs/src/ivf_pq/mod.rs
  • rust/cuvs/src/vamana/index.rs
✅ Files skipped from review due to trivial changes (2)
  • c/include/cuvs/distance/pairwise_distance.h
  • rust/cuvs/src/ivf_pq/mod.rs

Labels

None yet

Projects

Status: No status

1 participant