[Mesa-dev] Value Range Propagation in NIR (GSoC)

Sat Apr 11 12:35:30 PDT 2015

2015-04-08 18:03 GMT+02:00 Jason Ekstrand <jason at jlekstrand.net>:
> On Tue, Apr 7, 2015 at 4:52 PM, Connor Abbott <cwabbott0 at gmail.com> wrote:
>> Hi Thomas,
>>
>> Thanks for submitting a proposal! Some comments/answers below.
>>
>> On Tue, Apr 7, 2015 at 3:34 PM, Thomas Helland
>> <thomashelland90 at gmail.com> wrote:
>>> Hi,
>>>
>>> For those that don't know I've submitted a proposal for this years GSoC.
>>> I've proposed to implement value range propagation and loop unrolling in
>>> NIR.
>>> Since I'm no expert on compilers I've read up on some litterature:
>>>
>>> I started with "Constant propagation with conditional branches"  (thanks
>>> Connor).
>>> This paper describes an algorithm, "sparse conditional constant
>>> propagation",
>>> that seems to be the defacto standard in compilers today.
>>>
>>> I also found the paper;
>>> "Accurate static branch prediction by value range propagation " (VRP).
>>> This describes a value range propagation implementation based on SCCP.
>>> (This also allows one to set heuristics to calculate educated guesses for
>>> the
>>> probability of a certain branch, but that's probably more than we're
>>> interested in.)
>>
>> Thanks for mentioning that... I had forgotten the name of that paper.
>> You're right in that the branch probability stuff isn't too useful for
>> us. Also, it raises an important issue about back-edges from phi
>> nodes; they present a more sophisticated method to handle it, but I
>> think that for now we can just force back edges to have an infinite
>> range unless they're constant.
>>
>>>
>>> There is also a GCC paper (with whatever licensing issues that may apply);
>>> "A propagation engine for GCC".
>>> They have a shared engine for doing all propagation passes.
>>> It handles the worklists, and the logic to traverse these.
>>> The implementing passes then supply callbacks to define the lattice rules.
>>> They reply back if the instruction was interesting or not,
>>> and the propagation engine basically handles the rest.
>>>
>>> Maybe that's an interesting solution? Or it might not be worth the hassle?
>>> We already have copy propagation, and with value range propagation
>>> we probably don't want separate constant propagation?
>>> (I'm hoping to write the pass so that it handles both constants and value
>>> ranges.)
>>
>> Yes, constant propagation probably won't be so useful once we have value
>> range propagation; the former is a special case of the latter. Note
>> that we have a nifty way of actually doing the constant folding
>> (nir_constant_expressions.py and nir_constant_expressions.h), which
>> you should still use if all the inputs are constant.
>
> When I started taking a stab at range propagation, I started by trying
> to extend the constant folding framework.  I had a patch
> (http://cgit.freedesktop.org/~jekstrand/mesa/log/?h=wip/nir-minmax)
> but it doesn't do nearly as much as I remembered.  I don't know if
> it's practical to try and extend it or if we're better off just
> hand-rolling whatever we do for range handling.
>

I have a feeling it might be easier to start anew based on SCCP.
While it is hard to get a understanding of how things work
when reading a research paper, reading and understanding
someone else's code is not that easy either.
I will probably get a better understanding of the code and all
its quirks if I implement if from the bottom up.
I'll take a look at the patch though, to get a understanding of
how such a pass would work in NIR.

>>> The GCC guys have used this engine to get copy propagation that propagates
>>> copies accross conditionals, maybe this makes such a solution more
>>> interesting?
>>
>> I'm not so sure how useful such a general framework will be. Constant
>> propagation that handles back-edges seems interesting, but I'm not
>> sure it's worth the time to implement something this general as a
>> first pass.
>
> Agreed.  Let's just get it working first.

That makes three then :)

>
>>>
>>> Connor: I just remembered you saying something about your freedesktop
>>> git repo, so I poked around some and found that you have already done
>>> some work on VRP based on SCCP? How far did you get?
>>
>> I started on it, but then I realized that the approach I was using was
>> too cumbersome/complicated so I don't think what I have is too useful.
>> Feel free to work on it yourself, although Jason and I have discussed
>> it so we have some ideas of how to do it. I've written a few notes on
>> this below that you may find useful.
>>
>> - I have a branch I created while working on VRP that you'll probably
>> find useful: http://cgit.freedesktop.org/~cwabbott0/mesa/log/?h=nir-worklist
>> . The first two commits are already in master, but the last two should
>> be useful for implementing SCCP/VRP (although they'll need to be
>> rebased, obviously).
>>
>> - There's a comment in the SCCP paper (5.3, Nodes versus Edges) that
>> says: "An alternative way of implementing this would be to add nodes
>> to the
>> graph and then associate an ExecutableFlag with each node. An
>> additional node must be inserted between any node that has more than
>> one immediate successor and any successor node that has more than one
>> immediate predecessor." I think this procedure is what's usually
>> called "splitting critical edges"; in NIR, thanks to the structured
>> control flow, there are never any critical edges except for one edge
>> case you don't really have to care about too much (namely, an infinite
>> loop with one basic block) and therefore you can just use the basic
>> block worklist that I added in the branch mentioned above, rather than
>> a worklist of basic block edges as the paper describes.
>>
>> - The reason my pass was becoming so cumbersome was because I was
>> trying to solve two problems at once. First, there's actually
>> propagating the ranges. Then, there's taking into account restrictions
>> on range due to branch predicates. For example, if I have something
>> like:
>>
>> if (x > 0) {
>>     y = max(x, 0);
>> }
>>
>> then since the use of x is dominated by the then-branch of the if, x
>> has to be greater than 0 and we can optimize it. This is a little
>> contrived, but we have seen things like:
>>
>> if (foo)
>>     break;
>>
>> /* ... */
>>
>> if (foo)
>>     break;
>>
>> in the wild, where we could get rid of the redundant break using this
>> analysis by recognizing that since the second condition is dominated
>> by the else-branch of the first, foo must be false there. I was trying
>> to handle this by storing multiple lattice entries for the same SSA
>> value, but it was becoming too messy. Instead, we can solve the first
>> problem normally, and then to solve the second problem we can create
>> new SSA values, using the standard SSA construction algorithm, any
>> time where after a certain point the range of a value is restricted
>> (namely, the condition of a branch or the index of an array
>> dereference). In the first example, we would create a new value x2:
>>
>> if (x > 0) {
>>     x2 = x;
>>     y = max(x2, 0);
>> }
>
> This should be easy enough to do.  We already have a
> nir_ssa_def_rewrite_uses function.  We would just have to extend it to
> nir_ssa_def_rewrite_uses_dominated_by to handle this case but it
> shouldn't be hard.
>
>> and the VRP pass will keep track of "special" copies like the one from
>> x to x2 that add restrictions on the range. After everything is
>> finished, copy propagation and DCE will clean up the extra copies
>> created. There's a paper on this somewhere, but I don't quite remember
>> the name of it. I'm not sure if you'll be able to get to this over the
>> summer, but I thought I'd explain it in case you were interested.
>
> When I was thinking about it, I had a convoluted scheme involving a
> stack of "contexts" to handle this situation.  I was also trying to
> handle propagating the range information implied by a select to its
> arguments.  I'd love to chat about it, but It never ended up being
> code so I'll leave it alone unless you really want to discuss my
> hair-brained ideas.
>

I think what you are describing is kinda the implementation
that is described in the VRP report?
It stores possible ranges for a value, and keeps control of the
condition that sets it and does a new pass around.
It uses this to iterate to a steady state for possibilities for each
branch in a conditional.

>>> If we just want to make an SCCP inspired VRP pass then Connor has work in
>>> progress.
>>> Finishing that, and loop unrolling, may not be enough work for GSoC?
>>
>> I'm not sure... on the one hand, there's enough here that it may take
>> the entire summer for you to do. On the other hand, it's possible that
>> you'll be able to finish it with enough time left over, and there are
>> plenty of other things you'd be able to do. It depends on several
>> factors, and no one has a crystal ball here. I'm not experienced
>> enough with GSoC to be able to give you a recommendation, so it would
>> be nice for someone more experienced to give their opinion.
>
> Yeah, let's just go with what we have.  I'm ok with calling range
> propagation + loop unrolling the entire GSoC project.  If you have
> extra time and want to work on something else, there's plenty to do
> and no one will stop you.

Affirmative =)

> --Jason