[Mesa-dev] [PATCH 4/4] RFC: nir: add lowering for idiv/udiv/umod
Matt Turner
mattst88 at gmail.com
Wed Apr 1 11:11:19 PDT 2015
On Tue, Mar 31, 2015 at 6:44 PM, Rob Clark <robdclark at gmail.com> wrote:
> On Tue, Mar 31, 2015 at 9:03 PM, Roland Scheidegger <sroland at vmware.com> wrote:
>> So if this is not enough precision, maybe should state how large the
>> error can be?
>>
>
> tbh, if I knew what the error for this approach was, I would have
> included it. I'm not the original author, but this is based on
> nouveau codegen code (as mentioned in the comment). I guess it is
> better than converting to float and dividing and converting back, but
> worse than an iterative (ie. looping, ie. divergent flow control)
> approach. It is apparently enough to keep piglit happy.
>
> The original algo in nv50 lowering code is from
> 322bc7ed68ed92233c97168c036d0aa50c11a20e (ie. 'nv50/ir: import nv50
> target') which doesn't really give more clue about the origin..
>
> if anyone knows, I'm all ears and will add relevant links/info to comment..
FWIW: The DEC Alpha doesn't have an integer divide instruction, and
its architecture reference manual [0] suggests in section A.4.2:
> For the remaining cases, a table lookup on about a 1000-entry table and
> a multiply can give a linear approximation to 1/divisor that is accurate to 16 bits.
>
> Using this approximation, a multiply and a back-multiply and a subtract can
> generate one 16-bit quotient digit plus a 48-bit new partial dividend. Three
> more such steps can generate the full quotient. Having prior knowledge of the
> possible sizes of the divisor and dividend, normalizing away leading bytes of
> zeros, and performing an early-out test can reduce the average number of
> multiplies to about five (compared to a best case of one and a worst case of
> nine).
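To make the multiply / back-multiply / subtract step concrete, here is a minimal C sketch. It is neither the Alpha manual's routine nor the nv50 code. The name udiv_recip_sketch is made up, a float reciprocal stands in for the table lookup, and the 0.999999f scale factor is my own assumption to keep the estimate from overshooting.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper, not taken from Mesa or the Alpha manual. */
static uint32_t udiv_recip_sketch(uint32_t n, uint32_t d)
{
    assert(d != 0);

    /* Approximate 1/d. The float divide stands in for a table lookup or
     * a hardware reciprocal. Scaling by 0.999999f (an assumption of this
     * sketch) biases the estimate low so that est * d never exceeds r. */
    float rcp = (1.0f / (float)d) * 0.999999f;

    uint32_t q = 0;
    uint32_t r = n;                 /* partial dividend */

    while (r >= d) {
        /* Multiply: estimate the next chunk of the quotient. */
        uint32_t est = (uint32_t)((float)r * rcp);
        if (est == 0)
            est = 1;                /* r >= d, so one more d always fits */
        q += est;
        /* Back-multiply and subtract: form the new partial dividend. */
        r -= est * d;
    }
    return q;
}
```

Each pass keeps q * d + r == n, so the result is exact once r < d. The loop count depends on the data, which is the divergent flow control concern raised above.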
I read on IRC that at least some (radeon?) hardware has an instruction
like recip_int that might make this a compelling option, since you
wouldn't even need the table. I can probably dig up an implementation
if this sounds interesting.
Otherwise, I think the fallback is a series of shifts and subtracts, producing one quotient bit per step.
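For reference, a minimal C sketch of the shift-and-subtract approach (plain restoring division). The function name and the optional remainder pointer are hypothetical, and this is not what the patch or the nv50 code does:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: restoring division, one quotient bit per step. */
static uint32_t udiv_shift_sub(uint32_t n, uint32_t d, uint32_t *rem)
{
    assert(d != 0);

    uint32_t q = 0;
    uint64_t r = 0;     /* 64-bit: the shifted remainder can need 33 bits */

    for (int i = 31; i >= 0; i--) {
        /* Shift in the next dividend bit, most significant first. */
        r = (r << 1) | ((n >> i) & 1u);
        /* If the divisor fits, subtract it and set this quotient bit. */
        if (r >= d) {
            r -= d;
            q |= (uint32_t)1 << i;
        }
    }

    if (rem != NULL)
        *rem = (uint32_t)r;     /* r < d here, so it fits in 32 bits */
    return q;
}
```

The trip count is fixed at 32, so the loop can be fully unrolled and the conditional turned into a select, at the cost of always paying for all 32 steps.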
[0] http://download.majix.org/dec/alpha_arch_ref.pdf