[feature](be) Add OLAP scan filter profile counters#64165
Conversation
|
Thank you for your contribution to Apache Doris. Please clearly describe your PR:
|
There was a problem hiding this comment.
Pull request overview
This PR introduces a new BE-side profiling framework to expose detailed, per-filter selectivity for OLAP scans, aiming to make each conjunct/runtime filter/key-range/index/execution-stage filtering step visible in the query profile via ScanFilterInfo and KeyRangeInfo.
Changes:
- Add
ScanFilterProfile/ScanFilterStatsinfrastructure and materialize per-filter/stage counters intoRuntimeProfile. - Plumb scan-local filter handles through scan operator normalization, storage predicates, index filtering (zone map/bloom/dict/inverted/ANN), and execution-stage conjunct evaluation.
- Extend runtime filter consumer profiling to publish RF wait/always-true/debug metadata into the unified scan-filter profile structure.
Reviewed changes
Copilot reviewed 28 out of 28 changed files in this pull request and generated 6 comments.
Show a summary per file
| File | Description |
|---|---|
| design/scan-filter-profile.md | Design draft describing the intended per-filter/stage profiling model and semantics. |
| be/src/runtime/scan_filter_profile.h | Declares scan-local filter kinds/stages, stats, handles, and profile materialization API. |
| be/src/runtime/scan_filter_profile.cpp | Implements stats recording, snapshots, sorting, and profile materialization to ScanFilterInfo/KeyRangeInfo. |
| be/src/exec/operator/scan_operator.h | Adds ScanFilterProfile ownership on scan local state and registration helper. |
| be/src/exec/operator/scan_operator.cpp | Creates scan filter profile based on profile_level, attaches handles to conjuncts/pushdowns, and materializes on close. |
| be/src/exec/operator/olap_scan_operator.h | Adds key-range synthetic filter handle storage in OLAP scan local state. |
| be/src/exec/operator/olap_scan_operator.cpp | Registers synthetic key-range filter and tracks source filter lineage for key ranges. |
| be/src/exec/scan/olap_scanner.cpp | Plumbs scan filter profile and key-range handle into tablet reader params. |
| be/src/storage/tablet/tablet_reader.h | Extends reader params to carry scan filter profile + key-range synthetic handle. |
| be/src/storage/tablet/tablet_reader.cpp | Propagates scan filter fields into reader context; attaches handles to function-filter predicates. |
| be/src/storage/rowset/rowset_reader_context.h | Plumbs scan filter profile + key-range handle through rowset reader context. |
| be/src/storage/rowset/beta_rowset_reader.cpp | Propagates scan filter fields into storage read options for segment iterators. |
| be/src/storage/iterators.h | Adds scan filter profile + key-range handle fields to StorageReadOptions. |
| be/src/storage/predicate/column_predicate.h | Adds ScanFilterHandle attachment/storage on column predicates. |
| be/src/storage/predicate/block_column_predicate.h | Exposes scan-filter handle from block predicates; adds scan-filter-aware AND evaluation methods. |
| be/src/storage/predicate/block_column_predicate.cpp | Implements scan-filter-aware evaluation wrappers that record per-predicate stage stats. |
| be/src/storage/segment/segment_iterator.h | Avoids emitting old predicate debug sections when scan-filter profiling is enabled. |
| be/src/storage/segment/segment_iterator.cpp | Records per-filter stats for key range, inverted/ANN, vector/short-circuit predicates, and common expr execution. |
| be/src/storage/segment/column_reader.h | Extends zone-map filtering helper API to accept input row ranges. |
| be/src/storage/segment/column_reader.cpp | Adds per-page input-row counting and per-filter stage recording for zone map/bloom/dict paths. |
| be/src/exprs/vexpr_context.h | Adds scan-filter handle attachment; adds optional stage parameter to conjunct execution/filter helpers. |
| be/src/exprs/vexpr_context.cpp | Records per-conjunct stage input/output rows during conjunct execution when stage is provided. |
| be/src/exec/scan/scanner.cpp | Passes residual stage tag into VExprContext::filter_block() for profiling. |
| be/src/exec/runtime_filter/runtime_filter_consumer.h | Adds API to publish runtime-filter stats into scan-filter profiling. |
| be/src/exec/runtime_filter/runtime_filter_consumer.cpp | Implements publishing RF wait/always-true/debug stats to ScanFilterProfile. |
| be/src/exec/runtime_filter/runtime_filter_consumer_helper.h | Extends realtime profile collection API to optionally target scan-filter profiling. |
| be/src/exec/runtime_filter/runtime_filter_consumer_helper.cpp | Publishes RF acquire time + per-RF stats to ScanFilterProfile when provided. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
5f54636 to
62dd05d
Compare
|
/review |
|
run buildall |
There was a problem hiding this comment.
I found one blocking observability regression in the new profile path. Existing review comments about scan-filter overhead, pass-through zone-map accounting, missing handles, and incremental inverted-index stats appear addressed in the current head, so I did not repeat them.
Critical checkpoint conclusions:
- Goal/test: The PR adds per-filter OLAP scan profile counters and mostly wires them through key range, index, vector, short-circuit, common-expr, residual, and runtime-filter paths. The runtime-filter profile fallback is incomplete for filters that do not become scan predicates. The PR currently reports only manual build/check testing; no automated test validates profile materialization or the timeout/disabled runtime-filter case.
- Scope/clarity: The implementation is focused on profile plumbing and keeps data-path behavior mostly unchanged, but it replaces the old runtime-filter profile output when scan-filter profiling is enabled.
- Concurrency/lifecycle: New shared profile state uses a mutex for descriptor vectors and relaxed atomics for counters, which is appropriate for statistics. Scanner shutdown happens before final materialization. No new lock-order issue found.
- Compatibility/config: No storage/protocol/config compatibility issue found. Profile output changes are user-visible.
- Parallel paths: OLAP scan storage, residual expression, common expr, runtime filter, and key-range paths were reviewed. Non-OLAP scan types still use the base scan profile behavior.
- Data correctness/transactions: No data visibility, transaction, delete-bitmap, or persistence impact found.
- Performance: The previously raised no-handle overhead paths are now guarded in current head. Some profile-enabled counting remains expected overhead.
- Observability: The new profile improves per-filter visibility, but currently drops per-runtime-filter details for runtime filters that never materialize as scan filter expressions.
User focus: No additional user-provided review focus was supplied.
62dd05d to
a4737f3
Compare
|
/review |
|
run buildall |
There was a problem hiding this comment.
I found one correctness issue in the new scan-filter profiling path.
Critical checkpoint conclusions:
- Goal/test: the PR adds unified scan-filter profiling for pushed-down, index, key-range, runtime-filter, and residual scan predicates. The implementation covers the main paths, but the late-arrival runtime-filter path can miss newly appended filters when cost-based conjunct ordering is enabled. No test coverage was found for late-arriving RFs with conjunct reordering and profile-level scan-filter output.
- Scope/minimality: the change is mostly focused on profiling, but it touches hot scan/storage paths; previously raised overhead issues are already covered by existing threads.
- Concurrency/lifecycle: scan-filter stats use relaxed atomics for statistics, which is appropriate. Late RF append is protected by
_conjuncts_lock, but the registration logic uses unstable vector indexes after sorting. - Config/compatibility: no new config or storage/protocol compatibility issue found.
- Parallel paths: OLAP residual/index/key-range paths are covered, but the late-arrival RF path is a distinct missed path.
- Tests: no additional user focus was provided; for that focus, no extra issue applies. The missing late-arrival/reorder profile test is the main coverage gap.
- Observability: the issue is an observability regression in the new profile itself, because the RF can execute as a residual predicate while being shown as not applied/no per-stage stats.
- Data correctness/transactions/memory: no data visibility, transaction, MoW delete-bitmap, or memory-safety blocking issue found in the reviewed changes.
Please fix the late-arrival handle assignment so newly appended RF contexts are registered before any full-vector reorder, or track the appended contexts directly instead of relying on [conjuncts_before, size) after sorting.
### What problem does this PR solve?
Issue Number: N/A
Related PR: N/A
Problem Summary: Existing OLAP scan profiles only expose aggregate filter counters, making it hard to understand the selectivity of each conjunct, runtime filter, key range, and index-related filtering step. This change adds scan-local filter ids and materializes detailed ScanFilterInfo and KeyRangeInfo profile sections. Per-filter counters now show input rows, output rows, filtered rows, stage participation, source expression or runtime filter debug text, and runtime filter wait or always-true data where applicable. Runtime filter dynamic partition pruning is exposed as a separate ScanFilterInfo child because it prunes partitions/tablets before segment-level row filtering. Empty mechanism-specific fields are omitted when they do not participate, and runtime filter ids remain distinct from scan filter ids.
### Release note
Add detailed OLAP scan filter profile counters in query profile.
### Check List (For Author)
- Test: Manual test
- ./build.sh --be
- git diff --cached --check
- Behavior changed: Yes. Query profile includes detailed OLAP ScanFilterInfo and KeyRangeInfo counters.
- Does this need documentation: No
a4737f3 to
dd78f25
Compare
|
/review |
|
run buildall |
There was a problem hiding this comment.
I found one remaining blocking issue in the current PR diff: native file-scan predicates that are applied inside Parquet/ORC readers are not integrated with the new scan-filter profiling stages, so their real filtering is invisible and later residual counters are misleading.
Critical checkpoint conclusions:
- Goal/test coverage: the scan-filter profiling goal is only partially achieved; OLAP paths are covered more deeply, but native file-reader pushed-down predicate paths are not proven and currently report incorrect stage rows.
- Scope/minimality: the scan-filter changes are focused, but one parallel scan path was missed.
- Concurrency/lifecycle: no new blocking concurrency or lifecycle issue found beyond already-known review threads.
- Configuration/compatibility: no new config or storage/protocol compatibility concern found.
- Parallel paths: native file readers (Parquet/ORC) are a parallel predicate-evaluation path and need the same scan-filter accounting.
- Data correctness: no query-result correctness issue found in the remaining PR diff.
- Observability/performance: observability is materially wrong for file scans with reader-local pushed-down predicates. Existing review threads already cover the other scan-filter overhead/counter issues, so I did not duplicate them.
User focus: no additional user-provided review focus was specified.
| auto old_rows = block->rows(); | ||
| Status st = VExprContext::filter_block(_conjuncts, block, block->columns()); | ||
| Status st = VExprContext::filter_block(_conjuncts, block, block->columns(), | ||
| ScanFilterStage::EXEC_RESIDUAL); |
There was a problem hiding this comment.
Recording all remaining scanner conjuncts as EXEC_RESIDUAL here makes file-scan ScanFilterInfo wrong when native readers already applied pushed-down predicates. For Parquet, RowGroupReader filters _filter_conjuncts inside vparquet_group_reader.cpp before this block reaches the scanner, and ORC has the same pattern around its reader-local execute_conjuncts() calls. Those reader-local filters do not receive a scan-filter stage, so the real filtered rows are invisible; this residual pass then sees only surviving rows and often reports FilteredRows=0. Please record the reader-local predicate evaluations with an appropriate stage, or avoid attributing predicates already applied by the reader as residual-only work.
BE UT Coverage ReportIncrement line coverage Increment coverage report
|
BE UT Coverage ReportIncrement line coverage Increment coverage report
|
TPC-H: Total hot run time: 29683 ms |
TPC-DS: Total hot run time: 170320 ms |
BE Regression && UT Coverage ReportIncrement line coverage Increment coverage report
|
What problem does this PR solve?
Issue Number: N/A
Related PR: N/A
Problem Summary: Existing OLAP scan profiles only expose aggregate filter counters, making it hard to understand the selectivity of each conjunct, runtime filter, key range, and index-related filtering step. This PR adds scan-local filter IDs and materializes detailed ScanFilterInfo and KeyRangeInfo profile sections. Per-filter counters now show input rows, output rows, filtered rows, stage participation, source expression or runtime filter debug text, and runtime filter wait/always-true data where applicable. Runtime filter dynamic partition pruning is exposed as a separate ScanFilterInfo child because it prunes partitions/tablets before segment-level row filtering. Empty mechanism-specific fields are omitted when they do not participate, and runtime filter IDs remain distinct from scan filter IDs.
Release note
Add detailed OLAP scan filter profile counters in query profile.
Check List (For Author)