[improvement](be) Optimize count on nullable column by zclllyybb · Pull Request #64166 · apache/doris

zclllyybb · 2026-06-06T07:24:13Z

Count aggregation without GROUP BY reaches AggFnEvaluator::execute_single_add(), which calls add_batch_single_place(). AggregateFunctionCount and AggregateFunctionCountNotNullUnary previously inherited the row-by-row helper there, so count(*) and count(nullable_expr) paid per-row add/is_null_at costs even when all rows were aggregated into one state.

This patch adds batch implementations: count(*) increments the state once by batch_size, while unary count(nullable_expr) checks the nullable null map once and fast-paths the no-NULL case to count += batch_size. When NULLs exist it uses simd::count_zero_num() over the null map to count non-NULL rows. The nullable class name is kept because SQL count(expr) counts non-NULL values, not NULL values.
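The batch logic described above can be sketched roughly as follows. This is a minimal illustration, not the Doris code: `CountState`, `count_star_batch`, `count_not_null_batch`, and `count_zero_bytes` are hypothetical names, the null map is assumed to hold one byte per row (0 for non-NULL, 1 for NULL), and a scalar loop stands in for `simd::count_zero_num()`.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical stand-in for the aggregate state; not the Doris type.
struct CountState {
    uint64_t count = 0;
};

// count(*): every row counts, so the whole batch is added in one step.
inline void count_star_batch(CountState& state, size_t batch_size) {
    state.count += batch_size;
}

// Scalar stand-in for a SIMD zero counter: number of bytes equal to 0.
inline size_t count_zero_bytes(const uint8_t* data, size_t n) {
    size_t zeros = 0;
    for (size_t i = 0; i < n; ++i) {
        zeros += (data[i] == 0) ? 1 : 0;
    }
    return zeros;
}

// count(nullable_expr): count rows whose null-map byte is 0.
// Assumption: null-map bytes are exactly 0 (non-NULL) or 1 (NULL).
inline void count_not_null_batch(CountState& state,
                                 const std::vector<uint8_t>& null_map,
                                 size_t batch_size) {
    assert(batch_size <= null_map.size());
    if (batch_size == 0) {
        return;
    }
    const uint8_t* data = null_map.data();
    // One scan decides whether any NULL is present in the batch.
    const bool has_null = std::memchr(data, 1, batch_size) != nullptr;
    if (!has_null) {
        // Fast path: no NULLs, so every row counts.
        state.count += batch_size;
        return;
    }
    // Slow path: count the zero bytes, i.e. the non-NULL rows.
    state.count += count_zero_bytes(data, batch_size);
}
```

In both paths the state is updated once per batch, which is what removes the per-row call overhead mentioned above.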

Performance: tested with the following SQL:

select count(nullable(number)) from numbers("number"="1000000000");

select count(nullable(if(number >= 0, null, number))) from numbers("number"="1000000000");

select count(nullable(if(number % 2 = 0, number, null))) from numbers("number"="1000000000");

Results:

 Scenario     before median / mean    after median / mean    median diff
━━━━━━━━━━━  ━━━━━━━━━━━━━━━━━━━━━━  ━━━━━━━━━━━━━━━━━━━━━  ━━━━━━━━━━━━━
 non NULL           645 / 648.6 ms         555 / 556.4 ms         -14.0%
───────────  ──────────────────────  ─────────────────────  ─────────────
 all NULL         1541 / 1539.6 ms       1448 / 1450.6 ms          -6.0%
───────────  ──────────────────────  ─────────────────────  ─────────────
 half NULL        4256 / 4261.2 ms       4192 / 4232.2 ms          -1.5%
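The "median diff" column appears to be the relative change of the median, (after - before) / before, rounded to one decimal place. A small sketch of that arithmetic, with `median_diff_percent` as a hypothetical helper name:

```cpp
#include <cassert>
#include <cmath>

// Relative change of the median in percent, rounded to one decimal place.
// Negative values mean the "after" run was faster.
inline double median_diff_percent(double before_ms, double after_ms) {
    assert(before_ms > 0.0);
    const double pct = (after_ms - before_ms) / before_ms * 100.0;
    return std::round(pct * 10.0) / 10.0;
}
```

Applied to the medians in the table, this reproduces the reported values: 645 to 555 ms gives -14.0, 1541 to 1448 ms gives -6.0, and 4256 to 4192 ms gives -1.5.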

Performance validation used release BE binaries before and after the patch on the same temporary single FE/BE cluster, with operator_test-like session settings, 1e9-row numbers() inputs, warmups excluded, and EXPLAIN confirming partial_count(nullable(...)) plans.

SQL cases and median results:

1. select count(nullable(number)) from numbers("number"="1000000000"); before 645 ms, after 555 ms, -14.0%.
2. select count(nullable(if(number >= 0, null, number))) from numbers("number"="1000000000"); before 1541 ms, after 1448 ms, -6.0%.
3. select count(nullable(if(number % 2 = 0, number, null))) from numbers("number"="1000000000"); before 4256 ms, after 4192 ms, -1.5%.

A second after-binary retest produced medians of 548 / 1447 / 4166 ms for the same three cases, matching the measured improvement shape: largest gain for no-NULL nullable input, modest gain for all-NULL, and near-neutral for partial NULL, where expression/null-map work dominates.

hello-stephen · 2026-06-06T07:24:18Z

Thank you for your contribution to Apache Doris.
Don't know what should be done next? See How to process your PR.

Please clearly describe your PR:

What problem was fixed (it's best to include specific error reporting information). How it was fixed.
Which behaviors were modified. What was the previous behavior, what is it now, why was it modified, and what possible impacts might there be.
What features were added. Why was this function added?
Which code was refactored and why was this part of the code refactored?
Which functions were optimized and what is the difference before and after the optimization?

zclllyybb · 2026-06-06T07:24:20Z

/review

zclllyybb · 2026-06-06T07:24:23Z

run buildall

github-actions

Review result: no blocking issues found.

Critical checkpoint conclusions:

Goal and proof: The PR optimizes no-GROUP-BY count aggregation by overriding add_batch_single_place for count(*) and unary count(nullable_expr). The code accomplishes that goal for the single-place path; added BE unit coverage checks nullable inputs with and without NULLs.
Scope and clarity: The change is small and focused on the aggregate count implementation plus targeted tests.
Concurrency and lifecycle: No new shared state, threads, locks, static initialization, or non-obvious lifecycle management are introduced.
Configuration and compatibility: No new config, protocol, storage format, or persisted metadata changes.
Parallel paths: Grouped add_batch, selected add, streaming serialization, merge, and window paths remain unchanged. The optimized path is specifically AggFnEvaluator::execute_single_add, matching the PR intent.
Conditional checks: The nullable fast path uses the existing ColumnNullable null-map API; batch_size is bounded by the null map with DCHECK and is consistent with execute_single_add passing the current block row count.
Testing: Added tests cover ordinary count, nullable all-non-null, and nullable mixed-null cases through the aggregate test harness. An all-null explicit case would be additional coverage but is not required for correctness of the reviewed change.
Observability: No new observability is needed for this local CPU optimization.
Transactions/persistence/data writes: Not applicable.
FE/BE variables: Not applicable.
Performance: The implementation removes per-row virtual/add/is_null_at overhead in the intended hot path and uses existing SIMD null-map counting; no obvious new hot-path anti-pattern found.

User focus points: No additional user-provided review focus was present.

Verification: I attempted ./run-be-ut.sh --run --filter=AggregateFunctionCountTest.*, but the runner environment failed before compiling BE UT because thirdparty/installed/bin/protoc is missing during gensrc generation. No code-level test failure was observed.
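The testing checkpoint above notes that an explicit all-NULL case would be additional coverage. One way to picture such a check is to compare a row-by-row reference count against a batched count over the same null map. The sketch below is independent of the Doris aggregate test harness; `count_not_null_rowwise`, `count_not_null_batched`, and `make_null_map` are hypothetical helpers, and null-map bytes are assumed to be 0 (non-NULL) or 1 (NULL).

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Row-by-row reference: one null check per row.
inline uint64_t count_not_null_rowwise(const std::vector<uint8_t>& null_map) {
    uint64_t count = 0;
    for (size_t row = 0; row < null_map.size(); ++row) {
        if (null_map[row] == 0) {
            ++count;
        }
    }
    return count;
}

// Batched variant: total rows minus the sum of the null flags.
inline uint64_t count_not_null_batched(const std::vector<uint8_t>& null_map) {
    uint64_t nulls = 0;
    for (uint8_t flag : null_map) {
        nulls += flag;
    }
    assert(nulls <= null_map.size()); // holds when flags are 0 or 1
    return null_map.size() - nulls;
}

// Builds a null map where every `null_every`-th row (starting at row 0)
// is NULL. null_every == 0 means no NULLs; null_every == 1 means all NULL.
inline std::vector<uint8_t> make_null_map(size_t rows, size_t null_every) {
    std::vector<uint8_t> null_map(rows, 0);
    if (null_every == 0) {
        return null_map;
    }
    for (size_t row = 0; row < rows; row += null_every) {
        null_map[row] = 1;
    }
    return null_map;
}
```

With 1000 rows, both functions should agree on 1000 for the no-NULL map, 0 for the all-NULL map, and 500 for the map where every second row is NULL.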

hello-stephen · 2026-06-06T08:06:50Z

TPC-H: Total hot run time: 29148 ms

machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit a6b434bf478cb07fc54f985166f90e9a61762dfd, data reload: false

------ Round 1 ----------------------------------
orders	Doris	NULL	NULL	0	0	0	NULL	0	NULL	NULL	2023-12-26 18:27:23	2023-12-26 18:42:55	NULL	utf-8	NULL	NULL	
============================================
q1	17715	4039	3976	3976
q2	q3	10755	1391	814	814
q4	4683	472	343	343
q5	7551	893	592	592
q6	187	176	135	135
q7	778	855	638	638
q8	9399	1720	1596	1596
q9	5775	4550	4492	4492
q10	6790	1803	1510	1510
q11	446	272	252	252
q12	623	430	297	297
q13	18107	3390	2786	2786
q14	264	261	244	244
q15	q16	824	776	716	716
q17	1009	987	870	870
q18	6748	5663	5496	5496
q19	1301	1328	1127	1127
q20	520	401	266	266
q21	6245	2882	2686	2686
q22	445	374	312	312
Total cold run time: 100165 ms
Total hot run time: 29148 ms

----- Round 2, with runtime_filter_mode=off -----
orders	Doris	NULL	NULL	150000000	42	6422171781	NULL	22778155	NULL	NULL	2023-12-26 18:27:23	2023-12-26 18:42:55	NULL	utf-8	NULL	NULL	
============================================
q1	5080	4627	4685	4627
q2	q3	4840	5268	4708	4708
q4	2078	2182	1379	1379
q5	4900	4850	4705	4705
q6	235	173	128	128
q7	1861	1836	1588	1588
q8	2406	2126	2129	2126
q9	7976	7622	7366	7366
q10	4728	4696	4251	4251
q11	523	380	355	355
q12	725	738	518	518
q13	2978	3340	2827	2827
q14	290	284	243	243
q15	q16	669	685	618	618
q17	1299	1245	1243	1243
q18	7170	6789	6893	6789
q19	1140	1091	1123	1091
q20	2198	2206	1950	1950
q21	5261	4555	4411	4411
q22	534	467	419	419
Total cold run time: 56891 ms
Total hot run time: 51342 ms

hello-stephen · 2026-06-06T08:17:41Z

TPC-DS: Total hot run time: 168921 ms

machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit a6b434bf478cb07fc54f985166f90e9a61762dfd, data reload: false

query5	4311	639	461	461
query6	442	207	178	178
query7	4817	564	311	311
query8	377	226	203	203
query9	8785	4031	4018	4018
query10	457	306	266	266
query11	5936	2331	2161	2161
query12	156	101	107	101
query13	1332	615	433	433
query14	6438	5440	5052	5052
query14_1	4395	4384	4405	4384
query15	209	199	180	180
query16	1001	459	441	441
query17	957	727	606	606
query18	2461	497	355	355
query19	220	191	146	146
query20	113	110	107	107
query21	219	137	119	119
query22	13631	13702	13387	13387
query23	17391	16494	16093	16093
query23_1	16270	16354	16236	16236
query24	7428	1767	1290	1290
query24_1	1298	1286	1299	1286
query25	550	458	373	373
query26	1320	322	164	164
query27	2694	568	342	342
query28	4458	2009	2022	2009
query29	1049	617	472	472
query30	303	233	188	188
query31	1126	1072	961	961
query32	112	63	58	58
query33	528	318	255	255
query34	1167	1119	646	646
query35	749	777	683	683
query36	1406	1454	1253	1253
query37	154	110	90	90
query38	3212	3129	3028	3028
query39	951	923	888	888
query39_1	895	899	874	874
query40	215	122	101	101
query41	65	62	60	60
query42	97	93	93	93
query43	316	317	278	278
query44	
query45	194	189	176	176
query46	1096	1232	763	763
query47	2395	2386	2260	2260
query48	378	431	293	293
query49	634	480	372	372
query50	990	351	257	257
query51	4334	4509	4230	4230
query52	88	89	77	77
query53	251	275	197	197
query54	272	248	199	199
query55	76	76	68	68
query56	241	220	229	220
query57	1439	1439	1319	1319
query58	263	211	212	211
query59	1605	1692	1440	1440
query60	288	250	234	234
query61	162	161	155	155
query62	692	664	590	590
query63	228	184	181	181
query64	2583	795	638	638
query65	
query66	1806	460	351	351
query67	29750	29647	29481	29481
query68	
query69	422	300	266	266
query70	926	962	924	924
query71	314	224	212	212
query72	2958	2699	2403	2403
query73	844	808	426	426
query74	5120	4931	4775	4775
query75	2658	2573	2239	2239
query76	2300	1147	805	805
query77	352	374	277	277
query78	12306	12312	11858	11858
query79	1279	1064	698	698
query80	522	466	385	385
query81	450	280	252	252
query82	237	156	121	121
query83	270	282	248	248
query84	289	141	112	112
query85	840	564	433	433
query86	327	309	295	295
query87	3346	3364	3262	3262
query88	3601	2727	2709	2709
query89	410	388	327	327
query90	2167	172	177	172
query91	173	166	141	141
query92	61	64	56	56
query93	1414	1550	964	964
query94	545	374	330	330
query95	655	396	347	347
query96	1003	885	339	339
query97	2708	2697	2547	2547
query98	217	210	202	202
query99	1168	1168	1050	1050
Total cold run time: 249945 ms
Total hot run time: 168921 ms

hello-stephen · 2026-06-06T10:13:54Z

BE Regression && UT Coverage Report

Increment line coverage 100.00% (15/15) 🎉

Increment coverage report
Complete coverage report

Category	Coverage
Function Coverage	73.87% (28265/38262)
Line Coverage	57.86% (307408/531309)
Region Coverage	54.76% (257869/470946)
Branch Coverage	56.10% (111763/199209)

github-actions · 2026-06-08T03:51:42Z

PR approved by at least one committer and no changes requested.

github-actions · 2026-06-08T03:51:44Z

PR approved by anyone and no changes requested.

github-actions Bot reviewed Jun 6, 2026

zclllyybb added the dev/4.1.x label Jun 7, 2026

Mryange approved these changes Jun 8, 2026

github-actions Bot added the approved Indicates a PR has been approved by one committer. label Jun 8, 2026

github-actions Bot added the reviewed label Jun 8, 2026

zclllyybb merged commit 5d97b29 into apache:master Jun 8, 2026
32 of 33 checks passed

zclllyybb deleted the opt_count branch June 8, 2026 03:52

github-actions Bot mentioned this pull request Jun 8, 2026

branch-4.1: [improvement](be) Optimize count on nullable column #64166 #64202

Open
