Skip to content

feat(storage): implement GCS overrun logging and resume prevention#16150

Open
v-pratap wants to merge 8 commits into
googleapis:mainfrom
v-pratap:log-extra-reads-2
Open

feat(storage): implement GCS overrun logging and resume prevention#16150
v-pratap wants to merge 8 commits into
googleapis:mainfrom
v-pratap:log-extra-reads-2

Conversation

@v-pratap

Copy link
Copy Markdown
Contributor

This commit implements overrun logging, satisfied-resume prevention, and server-side transcoding bypass for both the synchronous and asynchronous GCS download paths.

Synchronous Path:

  • Introduced a private 'OverrunLoggingObjectReadSource' decorator inside 'object_read_streambuf.cc' that wraps the transport source to track received bytes lock-free.
  • Logs a warning with the exact overshoot byte count, bucket, and object name if GCS sends more bytes than requested.
  • Leaves 'object_read_streambuf.h' and 'object_read_stream.h' completely clean, untouched, and thread-safe.

Asynchronous Path (gRPC BiDi Stream & Streaming Reader):

  • Integrated chunk-level byte tracking inside 'ReadRange' and 'AsyncReaderConnectionResume'.
  • Under server-side transcoding (gzip), suppresses overrun warnings and bypasses checksum validation failures (since decompressed bytes naturally exceed the compressed request and mismatch the original checksums).
  • Implemented satisfied-resume prevention: if a connection fails but the requested number of bytes has already been received, we refuse to reconnect to prevent GCS from sending unwanted extra bytes, and return Status::OK directly, masking any subsequent stream errors.
  • Defined overloaded constructors to maintain 100% backward compatibility for all existing tests and client code.

This commit implements overrun logging, satisfied-resume prevention, and server-side transcoding bypass for both the synchronous and asynchronous GCS download paths.

Synchronous Path:
- Introduced a private 'OverrunLoggingObjectReadSource' decorator inside 'object_read_streambuf.cc' that wraps the transport source to track received bytes lock-free.
- Logs a warning with the exact overshoot byte count, bucket, and object name if GCS sends more bytes than requested.
- Leaves 'object_read_streambuf.h' and 'object_read_stream.h' completely clean, untouched, and thread-safe.

Asynchronous Path (gRPC BiDi Stream & Streaming Reader):
- Integrated chunk-level byte tracking inside 'ReadRange' and 'AsyncReaderConnectionResume'.
- Under server-side transcoding (gzip), suppresses overrun warnings and bypasses checksum validation failures (since decompressed bytes naturally exceed the compressed request and mismatch the original checksums).
- Implemented satisfied-resume prevention: if a connection fails but the requested number of bytes has already been received, we refuse to reconnect to prevent GCS from sending unwanted extra bytes, and return Status::OK directly, masking any subsequent stream errors.
- Defined overloaded constructors to maintain 100% backward compatibility for all existing tests and client code.

Testing:
- Added 15 new, comprehensive unit tests covering all overrun, tail-read, transcoding, and resume scenarios.
@v-pratap v-pratap requested review from a team as code owners June 10, 2026 10:52
@product-auto-label product-auto-label Bot added the api: storage Issues related to the Cloud Storage API. label Jun 10, 2026

@gemini-code-assist gemini-code-assist Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request introduces overrun logging and transcoding detection when reading objects from GCS, updating classes like ReadRange, AsyncReaderConnectionResume, and ObjectReadStreambuf to track received bytes and suppress warnings or checksum validation for transcoded data. The code review identified several critical issues: data races in ObjectDescriptorImpl::OnRead due to accessing metadata_ after unlocking and in AsyncReaderConnectionResume::Reconnect due to accessing impl_ without a lock; potential undefined behavior from signed integer overflow when negating p.start at its minimum value; duplicated overrun check logic in OverrunLoggingObjectReadSource; and a bug where casting a negative requested_length_ to std::size_t causes an unsigned wrap-around, bypassing the overrun check.

Comment thread google/cloud/storage/internal/async/object_descriptor_impl.cc
Comment thread google/cloud/storage/internal/async/reader_connection_resume.cc Outdated
Comment thread google/cloud/storage/internal/async/object_descriptor_impl.cc Outdated
Comment thread google/cloud/storage/internal/object_read_streambuf.cc Outdated
Comment thread google/cloud/storage/internal/object_read_streambuf.cc
v-pratap added 2 commits June 10, 2026 10:56
- Fix data race on metadata_ in ObjectDescriptorImpl::OnRead by extracting is_transcoded status before unlocking the mutex.
- Fix data race on impl_ in AsyncReaderConnectionResume::Reconnect by calling CurrentImpl() to safely acquire the lock.
- Fix potential signed integer negation overflow undefined behavior in ObjectDescriptorImpl::Read by explicitly checking for numeric_limits min() and capping the limit.
- De-duplicate overrun check logic in OverrunLoggingObjectReadSource by calling CheckOverrun() directly in Read().
- Fix potential signed-to-unsigned cast wrap-around bug in CheckOverrun() by checking that requested_length_ is non-negative before casting.
- Dynamically verify if server-side decompressive transcoding actually occurred by comparing total received bytes to the object's original stored size (when available).
- Prevents silently ignoring real data corruption when a user explicitly requests compressed gzip-encoded bytes and receives a corrupted file.
- Falls back to the standard is_transcoded flag if the object size is unknown (maintaining 100% backward compatibility and preventing test breakages).
- Applied this robust check to both ReadRange (async BiDi range reads) and AsyncReaderConnectionResume (async streaming reader).
- Added a new unit test 'TranscodingFailsOnRealChecksumMismatch' to verify that a checksum mismatch is correctly reported as an error if the received size matches the compressed object size.
- Resolved a compiler signedness warning in read_range.cc by casting object_size to std::size_t.
- Made the 'logged_warning_' flag in ObjectReadStreambuf a std::atomic<bool> to ensure absolute thread-safety.
@codecov

codecov Bot commented Jun 10, 2026

Copy link
Copy Markdown

Codecov Report

❌ Patch coverage is 98.61538% with 9 lines in your changes missing coverage. Please review.
✅ Project coverage is 92.21%. Comparing base (fd73bee) to head (c09eeec).

Files with missing lines Patch % Lines
...d/storage/internal/async/object_descriptor_impl.cc 80.00% 3 Missing ⚠️
...le/cloud/storage/internal/async/read_range_test.cc 98.27% 2 Missing ⚠️
...le/cloud/storage/internal/object_read_streambuf.cc 95.34% 2 Missing ⚠️
google/cloud/storage/internal/async/read_range.cc 95.65% 1 Missing ⚠️
...storage/internal/async/reader_connection_resume.cc 96.66% 1 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main   #16150      +/-   ##
==========================================
+ Coverage   92.20%   92.21%   +0.01%     
==========================================
  Files        2264     2264              
  Lines      208864   209497     +633     
==========================================
+ Hits       192579   193191     +612     
- Misses      16285    16306      +21     

☔ View full report in Codecov by Harness.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

v-pratap added 4 commits June 10, 2026 14:04
- Refactored overrun warning logs to print once at the end of the download (stream Close or completion/failure) rather than immediately on the first overrunning chunk.
- Aligns the C++ GCS client's logging behavior with the official Google Cloud Go client (cloud.google.com/go/storage).
- Ensures that the warning log reports the true, final total overrun byte count instead of a partial overrun count from the first chunk.
- Sync Path: Removed CheckOverrun from the hot Read() path in ObjectReadStreambuf, moving it to Close() and the decorator destructor (~OverrunLoggingObjectReadSource).
- Async Range Path: Deferred CheckOverrun in ReadRange to the final range_end and OnFinish paths.
- Async Resume Path: Deferred CheckOverrun in AsyncReaderConnectionResume to the final status.ok, satisfied, and stop paths.
- Updated all unit tests in object_read_streambuf_test.cc, read_range_test.cc, and reader_connection_resume_test.cc to drive the streams to completion (EOF/Close) before verifying the logs.
- Converted the logged_warning_ check-and-set in ObjectReadStreambuf to a thread-safe atomic exchange (!logged_warning_.exchange(true)) to prevent any potential check-then-act race conditions.
- Added explicit static_cast<std::size_t> casts to requested_length_ in both ReadRange and AsyncReaderConnectionResume overrun logging statements. This prevents signed/unsigned mixing arithmetic and conforms perfectly to strict compiler sign-comparison flags.
- Documented and verified architectural correctness of total_received_bytes_ relative counting (essential for ranged reads starting at non-zero offsets) and the >0 read_limit check (mandatory for GCS proto3 default semantics).
@v-pratap v-pratap requested a review from kalragauri June 11, 2026 04:56
std::move(*response->mutable_read_handle());
}
auto copy = it->active_ranges;
// Release the lock while notifying the ranges. The notifications may trigger

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: move this comment before lk.unlock()?


auto range = std::make_shared<ReadRange>(p.start, p.length, hash_function,
std::move(hash_validator));
absl::optional<std::int64_t> limit;

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: pls add a comment about how limit is calculated for readers who are unfamiliar with GCS range request behaviors.

std::string object_name_;
std::size_t received_bytes_ = 0;
bool is_transcoded_ = false;
std::atomic<bool> logged_warning_{false};

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why use std::atomic on just this member? If concurrent access is possible, this class has data races on its other members.

CheckOverrun();
auto result = std::move(*hash_validator_).Finish(hash_function_->Finish());
if (result.is_mismatch) {
bool transcoded_download =

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: add a comment explaining why we're performing this check. Same comment for this change under reader_connection_resume.cc.

absl::optional<std::int64_t> requested_length_;
std::string bucket_name_;
std::string object_name_;
std::size_t received_bytes_ = 0;

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can/should received_bytes_ be std::int64_t similar to total_received_bytes_ under reader_connection_resume.h? This might reduce some of the casting in the implementation.

@kalragauri

Copy link
Copy Markdown
Contributor

Overall looks good - added some minor comments.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

api: storage Issues related to the Cloud Storage API.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants