Skip to content

feat(storage): log additional bytes received from GCS in read path#16152

Draft
nidhiii-27 wants to merge 1 commit into
mainfrom
feat/storage-log-bytes-read
Draft

feat(storage): log additional bytes received from GCS in read path#16152
nidhiii-27 wants to merge 1 commit into
mainfrom
feat/storage-log-bytes-read

Conversation

@nidhiii-27

Copy link
Copy Markdown
Contributor

When the GCS client requests a specific range of bytes but the transport provides more bytes than expected, we now emit a warning log. This matches the behavior of other GCS client libraries and aids in debugging. Decompressed objects are ignored because their wire size varies from the requested size.

Fixes: #475824752

[Generated-by: AI]

When the GCS client requests a specific range of bytes but the transport provides more bytes than expected, we now emit a warning log. This matches the behavior of other GCS client libraries and aids in debugging. Decompressed objects are ignored because their wire size varies from the requested size.

Fixes: #475824752

[Generated-by: AI]
@nidhiii-27 nidhiii-27 requested review from a team as code owners June 11, 2026 09:39
@product-auto-label product-auto-label Bot added the api: storage Issues related to the Cloud Storage API. label Jun 11, 2026
@nidhiii-27 nidhiii-27 marked this pull request as draft June 11, 2026 09:40

@gemini-code-assist gemini-code-assist Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request introduces tracking of the total bytes received during GCS object reads in ObjectReadStreambuf to log a warning if more bytes are received than requested. The review feedback highlights three main areas for improvement: first, total_bytes_received_ is only updated in xsgetn() and must also be updated in underflow() to prevent under-counting; second, a defensive check is needed to ensure range.end > range.begin to avoid incorrect negative expected byte calculations; and finally, for consistency, the member variable transformation_ should be accessed directly instead of using the transformation() accessor.

if (!size_) size_ = std::move(read->size);
if (!transformation_) transformation_ = std::move(read->transformation);

total_bytes_received_ += read->bytes_received;

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

high

The total_bytes_received_ counter is only updated here in xsgetn(). However, ObjectReadStreambuf also overrides underflow() to handle buffered/single-character reads (e.g., via std::getline or operator>>). If a user reads from the stream using those methods, underflow() will read from the source but total_bytes_received_ will not be incremented, leading to an under-count and missed warnings.

Please ensure total_bytes_received_ is also updated in underflow(), or refactor the byte-tracking logic into a common helper method called by both xsgetn() and underflow().

Comment on lines +43 to +46
if (request.HasOption<ReadRange>()) {
auto const& range = request.GetOption<ReadRange>().value();
return range.end - range.begin;
}

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

If ReadRange is configured such that range.end <= range.begin (or if it represents an open-ended range where end is not set/defaults to 0), range.end - range.begin will be negative or zero. This would cause total_bytes_received_ > *expected_bytes_ to evaluate to true incorrectly, triggering false warning logs.

We should defensively check that range.end > range.begin before returning the difference.

  if (request.HasOption<ReadRange>()) {
    auto const& range = request.GetOption<ReadRange>().value();
    if (range.end > range.begin) {
      return range.end - range.begin;
    }
    return absl::nullopt;
  }
References
  1. Prefer defensive code, such as explicit checks, even if they seem redundant based on the current implementation of a framework, as the framework's contract may change in the future.


void ObjectReadStreambuf::Close() {
if (expected_bytes_.has_value() && total_bytes_received_ > *expected_bytes_) {
if (!transformation().has_value()) {

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

For consistency with how other member variables (like expected_bytes_, total_bytes_received_, bucket_name_, and object_name_) are accessed in this function and in xsgetn(), we should access the member variable transformation_ directly instead of calling the accessor method transformation().

    if (!transformation_.has_value()) {

@nidhiii-27

Copy link
Copy Markdown
Contributor Author

Addressed all PR review comments.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

ai-generated api: storage Issues related to the Cloud Storage API. storage-feature-parity

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant