JIT: Tail-merge don't produce empty blocks #129092

Merged
AndyAyersMS merged 1 commit into dotnet:main from BoyBaykiller:tail-merge-remove-blocks
Jun 11, 2026
@BoyBaykiller

Contributor

I had another change that improves the crossJumpVictim selection logic in tail-merge, but it caused seemingly random regressions because downstream phases get confused depending on where empty blocks are. This PR stops tail-merge from producing empty blocks, which fixes that.
It also makes sense given that we run directly after the "optimize control flow" phase, which has just removed all empty blocks.
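
For readers unfamiliar with the transformation, here is a toy sketch of the idea in the description: merging a shared statement tail without leaving an empty block behind. It is not the JIT's code. `Block`, `tail_merge`, and the block names are hypothetical, statements are plain strings, and the sketch assumes every candidate block ends by jumping to the same join block.

```python
from dataclasses import dataclass, field


@dataclass
class Block:
    stmts: list                                  # statements, compared by equality
    succs: list = field(default_factory=list)    # names of successor blocks


def common_suffix_len(a, b):
    """Number of trailing statements that a and b share."""
    n = 0
    while n < len(a) and n < len(b) and a[-1 - n] == b[-1 - n]:
        n += 1
    return n


def tail_merge(blocks, preds, join):
    """Hoist the statement suffix shared by all `preds` into one block.

    Assumes every block in `preds` ends by jumping to `join` and that
    none of them is the method entry (a toy simplification).
    """
    if len(preds) < 2:
        return
    first = blocks[preds[0]].stmts
    n = min(common_suffix_len(first, blocks[p].stmts) for p in preds[1:])
    if n == 0:
        return

    # Prefer a block that is nothing but the shared tail as the merge target;
    # otherwise create a fresh block to hold the tail.
    target = next((p for p in preds if len(blocks[p].stmts) == n), None)
    if target is None:
        target = "tail_" + preds[0]
        blocks[target] = Block(list(first[-n:]), [join])

    for p in preds:
        if p == target:
            continue
        block = blocks[p]
        del block.stmts[-n:]
        if block.stmts:
            block.succs = [target]   # keep the unique prefix, jump to the tail
        else:
            # Stripping the tail would leave an empty block. Instead of
            # keeping it, point its predecessors at the target and drop it.
            for other in blocks.values():
                other.succs = [target if s == p else s for s in other.succs]
            del blocks[p]


blocks = {
    "entry": Block(["cond"], ["A", "B", "C"]),
    "A": Block(["x = 1", "call f", "y = x"], ["join"]),
    "B": Block(["call f", "y = x"], ["join"]),
    "C": Block(["call f", "y = x"], ["join"]),
    "join": Block(["ret"]),
}
tail_merge(blocks, ["A", "B", "C"], "join")
```

In this example `B` already consists only of the shared tail, so it becomes the target; `A` keeps its unique prefix and jumps to `B`, while `C` would have become empty and is removed, with `entry` retargeted to `B`.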

@github-actions github-actions Bot added the area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI label Jun 7, 2026
@dotnet-policy-service dotnet-policy-service Bot added the community-contribution Indicates that the PR has been added by a community member label Jun 7, 2026
@dotnet-policy-service

Contributor

Tagging subscribers to this area: @JulieLeeMSFT, @jakobbotsch
See info in area-owners.md if you want to be subscribed.

@BoyBaykiller

BoyBaykiller commented Jun 7, 2026

Contributor Author

This itself has some interesting regressions related to how the backend deals with the block layout.
Particularly around bbWeight=0 and BBJ_RETURN, it seems, but I haven't looked deeply.

I see cases like this where we get weird unnecessary jumps.

G_M48648_IG02:  ;; offset=0x0000
       vucomiss xmm0, xmm1
       jp       SHORT G_M48648_IG03
       je       SHORT G_M48648_IG08
						;; size=8 bbWeight=1 PerfScore 4.00
G_M48648_IG03:  ;; offset=0x0008
       vucomiss xmm0, xmm0
       jp       SHORT G_M48648_IG04
       vucomiss xmm0, xmm1
       jbe      SHORT G_M48648_IG07
						;; size=12 bbWeight=1 PerfScore 6.00
G_M48648_IG04:  ;; offset=0x0014
       vmovaps  xmm1, xmm0
						;; size=4 bbWeight=1 PerfScore 0.25
G_M48648_IG05:  ;; offset=0x0018
       vmovaps  xmm0, xmm1
						;; size=4 bbWeight=1 PerfScore 0.25
G_M48648_IG06:  ;; offset=0x001C
       ret      
						;; size=1 bbWeight=1 PerfScore 1.00
G_M48648_IG07:  ;; offset=0x001D
       jmp      SHORT G_M48648_IG05
						;; size=2 bbWeight=0 PerfScore 0.00
G_M48648_IG08:  ;; offset=0x001F
       vmovd    eax, xmm1
       test     eax, eax
       jl       SHORT G_M48648_IG04
       jmp      SHORT G_M48648_IG07
						;; size=10 bbWeight=0 PerfScore 0.00
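
One way to read the listing above: IG07 contains nothing but `jmp` to IG05, so every branch to IG07 takes two hops to reach IG05. The following is a small sketch for spotting such hops. The branch table is transcribed by hand from the listing with shortened labels, and `resolve` is a hypothetical helper, not a JIT API. Whether the backend should thread these jumps is a separate question that this sketch does not answer.

```python
def resolve(label, jump_only):
    """Follow chains of jump-only blocks to the final destination.

    `jump_only` maps a label whose block is a single unconditional jump
    to that jump's target. The `seen` set guards against cycles.
    """
    seen = set()
    while label in jump_only and label not in seen:
        seen.add(label)
        label = jump_only[label]
    return label


# Blocks from the listing that consist solely of an unconditional jump.
jump_only = {"IG07": "IG05"}

# (source block, opcode, target) for a few branches in the listing.
branches = [
    ("IG03", "jbe", "IG07"),
    ("IG08", "jl", "IG04"),
    ("IG08", "jmp", "IG07"),
]

# Branches whose target could be reached directly, skipping the extra hop.
threaded = [(src, op, resolve(dst, jump_only)) for src, op, dst in branches]
```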

@BoyBaykiller

Contributor Author

@AndyAyersMS PTAL.

@AndyAyersMS

Member

Do we get similar results if we swap the order of the second tail merge pass with the flow graph opts? Or is this enabling more merging?

@BoyBaykiller

Contributor Author

I've tried moving PHASE_OPTIMIZE_FLOW after PHASE_HEAD_TAIL_MERGE2 and also tried doTailDuplication: false, but the diffs were bad.
Maybe it's intentional that PHASE_OPTIMIZE_FLOW expands out all tails and PHASE_HEAD_TAIL_MERGE2 then merges them.

@BoyBaykiller

Contributor Author

@AndyAyersMS What do you think? 🙂

@AndyAyersMS

Member

This alters codegen in some of the important Contains methods. Can you look into what changes happen there and how they might impact perf?

e.g. System.MemoryExtensions:Contains[char](System.ReadOnlySpan`1[char],char):bool (Tier1)

@BoyBaykiller

Contributor Author

MemoryExtensions:Contains[char], String:Contains(char), and SpanHelpers:ContainsValueType[short] all call into PackedSpanHelpers.Contains and show the same pattern of diffs.
The crossJumpTarget is moved up, which lets us use SHORT jumps more often and saves some size. PerfScore is the same.

The only pattern of regression I could discern is the one I already wrote about above, plus another one where we no longer have fallthrough, but I have a feeling these two are closely related. Here is an example from System.Text.RegularExpressions.RegexNode:get_IsBacktrackingConstruct: https://www.diffchecker.com/3Jk1zdJf/.
I was planning on opening issues for these once this is merged.
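
As background for the size saving from SHORT jumps mentioned above: on x86-64 a jump with an 8-bit displacement encodes in 2 bytes, while the 32-bit form takes 5 bytes for `jmp` and 6 bytes for a conditional jump. Below is a minimal sketch of that rule. `jump_size` is a hypothetical helper, and it ignores the iterative re-layout a real emitter needs when shrinking one jump changes other displacements.

```python
def jump_size(src_offset, dst_offset, conditional):
    """Encoded size in bytes of a jump at src_offset targeting dst_offset.

    The displacement is measured from the end of the jump instruction,
    so the short form is tried first using its own 2-byte length.
    """
    short_len = 2
    disp = dst_offset - (src_offset + short_len)
    if -128 <= disp <= 127:
        return short_len            # EB rel8 / 7x rel8
    return 6 if conditional else 5  # 0F 8x rel32 / E9 rel32


# IG07 in the earlier listing: jmp at offset 0x1D targeting IG05 at 0x18.
ig07_size = jump_size(0x1D, 0x18, conditional=False)
```

For the IG07 block in the earlier listing this yields 2, matching the `size=2` annotation there. Moving a target closer to its sources is what lets more jumps fit the 8-bit range.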

@AndyAyersMS AndyAyersMS left a comment

Member

LGTM.
