Revamp the generation of runtime division checks on ARM64 #111543
base: main
Fixes dotnet#64795

This patch introduces a new compilation phase that passes over the `GenTrees` looking for `GT_DIV`/`GT_UDIV` nodes on integral types, and morphs the code to introduce the necessary conformance checks (overflow/divide-by-zero) early in the compilation pipeline. Currently these checks are added during the Emit phase, so no optimizations run on the code they introduce.

The aim is to allow the compiler to make decisions on code position and instruction selection for these checks. For example, on ARM64 this lets certain scenarios choose the `cbz` instruction over `cmp`/`beq`, which can lead to more compact code. It also allows some of the comparisons in the checks to be hoisted out of loops.
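For readers unfamiliar with the checks in question, here is a minimal Python model of what the conformance checks guard against for 32-bit division. It is only a sketch of the semantics, not the JIT's code: the function names are hypothetical, and the Python exception types stand in for the .NET divide-by-zero and overflow exceptions. The explicit checks matter on ARM64 because its divide instructions do not trap in either case.

```python
INT32_MIN = -(2**31)

def checked_sdiv32(dividend: int, divisor: int) -> int:
    """Model of signed 32-bit division with explicit conformance checks."""
    # Divide-by-zero check: a single compare against zero, which is
    # the kind of test a cbz-style branch can express directly.
    if divisor == 0:
        raise ZeroDivisionError("attempted to divide by zero")
    # Overflow check: INT32_MIN / -1 is the only signed case whose
    # mathematical result (2**31) does not fit in 32 bits.
    if divisor == -1 and dividend == INT32_MIN:
        raise OverflowError("INT32_MIN / -1 overflows 32 bits")
    # Truncate toward zero (Python's // floors, so work on magnitudes).
    quotient = abs(dividend) // abs(divisor)
    return quotient if (dividend < 0) == (divisor < 0) else -quotient

def checked_udiv32(dividend: int, divisor: int) -> int:
    """Model of unsigned 32-bit division: only the zero check applies."""
    if divisor == 0:
        raise ZeroDivisionError("attempted to divide by zero")
    return dividend // divisor
```

Because both checks are ordinary comparisons on the operands, they are candidates for the usual optimizations once they exist in the IR: a check on a loop-invariant divisor, for instance, does not depend on the loop body.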
Tagging subscribers to this area: @JulieLeeMSFT, @jakobbotsch
This is WIP. I've taken a different approach to adding new nodes, instead adding a pass that modifies the HIR. The pass will run through all of the code in the function looking for The added HIR looks like this for the signed overflow check, for example. This is checking for
Here's the example @kunalspathak mentioned in #64795:
Before the change:
After the change:
The main difference is at label IG04: rather than a fixed sequence of compare and branch instructions chosen at the emit stage, the compiler has decided to build a logical expression for the overflow check and emit a

The approach is working well when:

It seems to have an adverse effect on MinOpts though, because splitting the tree will often spill and there aren't any optimization passes running to clean up these spills.

At the moment I haven't focused on the efficiency of the pass itself, but I believe it could be improved. I could borrow the recursive traversals in the earlier morph phase to build a work-list of where checks need to be added. Then the pass can be linear over a pre-built list of nodes rather than a search in a loop. I would just have to be careful to update all of the locations of the nodes after any trees are split, but I think this should be possible.

I've also had to make a temporary fix for a problem with the tree splitting code, where it wasn't correctly updating the node flags after splitting out side effects. After splitting the tree I traverse it post-order to update all of the flags. There might be a more efficient way of doing this.
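To make the two ideas above concrete, here is a toy Python sketch of a work-list built in one traversal and of a post-order flag update. Everything in it is hypothetical and for illustration only: the `Node` class, the flag constant, and the function names are not the JIT's `GenTree` types or its `GTF_*` flags, and it does not model the re-location of nodes after a tree is split.

```python
from dataclasses import dataclass, field

# Hypothetical flag bits for illustration; not the JIT's real values.
FLAG_NONE = 0
FLAG_EXCEPT = 1  # the node itself may throw

@dataclass
class Node:
    op: str
    children: list = field(default_factory=list)
    own_flags: int = FLAG_NONE  # effects produced by this node alone
    flags: int = FLAG_NONE      # own effects plus those of all operands

def collect_divisions(root: Node) -> list:
    """Visit each node once and record the division nodes in a work-list."""
    worklist = []
    stack = [root]
    while stack:
        node = stack.pop()
        if node.op in ("DIV", "UDIV"):
            worklist.append(node)
        stack.extend(node.children)
    return worklist

def update_flags_postorder(node: Node) -> int:
    """Recompute flags bottom-up: operands first, then the node itself."""
    flags = node.own_flags
    for child in node.children:
        flags |= update_flags_postorder(child)
    node.flags = flags
    return flags
```

With a pre-built list, the pass that inserts checks iterates over `collect_divisions(root)` once instead of searching the function repeatedly. The post-order walk guarantees that a parent's flags are only computed after all of its operands are up to date, at the cost of visiting the whole tree.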
I think the build is failing on Release mode due to use of
What do you need this for? Increasing the size of