[Mesa-dev] [PATCH 10/10] i965/fs: Combine pixel center calculation into one inst.

Jason Ekstrand jason at jlekstrand.net
Fri Apr 17 11:59:31 PDT 2015


On Fri, Apr 17, 2015 at 11:58 AM, Matt Turner <mattst88 at gmail.com> wrote:
> On Fri, Apr 17, 2015 at 11:56 AM, Jason Ekstrand <jason at jlekstrand.net> wrote:
>> On Tue, Apr 14, 2015 at 4:15 PM, Matt Turner <mattst88 at gmail.com> wrote:
>>> The X and Y values come interleaved in g1 (.4-.11 inclusive), so we can
>>> calculate them together with a single add(32) instruction on some
>>> platforms like Broadwell and newer or in SIMD8 elsewhere.
>>>
>>> Note that I also moved the PIXEL_X/PIXEL_Y virtual opcodes from before
>>> LINTERP to after it. That's because the writes_accumulator_implicitly()
>>> function in backend_instruction tests for <= LINTERP for determining
>>> whether the instruction indeed writes the accumulator implicitly. The
>>> old FS_OPCODE_PIXEL_X/Y emitted ADD instructions, which did, but the new
>>> opcodes just emit MOVs, which don't. It doesn't matter, since we don't
>>> use these opcodes on Gen4/5 anymore, but in the case that we do...
>>>
>>> On Broadwell:
>>> total instructions in shared programs: 7192355 -> 7186224 (-0.09%)
>>> instructions in affected programs:     1190700 -> 1184569 (-0.51%)
>>> helped:                                6131
>>>
>>> On Haswell:
>>> total instructions in shared programs: 6155979 -> 6152800 (-0.05%)
>>> instructions in affected programs:     652362 -> 649183 (-0.49%)
>>> helped:                                3179
>>> ---
>>>  src/mesa/drivers/dri/i965/brw_defines.h        |  2 +
>>>  src/mesa/drivers/dri/i965/brw_fs_generator.cpp | 10 ++++
>>>  src/mesa/drivers/dri/i965/brw_fs_visitor.cpp   | 71 ++++++++++++++++++--------
>>>  3 files changed, 63 insertions(+), 20 deletions(-)
>>>
>>> diff --git a/src/mesa/drivers/dri/i965/brw_defines.h b/src/mesa/drivers/dri/i965/brw_defines.h
>>> index 1afa34e..f99da12 100644
>>> --- a/src/mesa/drivers/dri/i965/brw_defines.h
>>> +++ b/src/mesa/drivers/dri/i965/brw_defines.h
>>> @@ -925,6 +925,8 @@ enum opcode {
>>>     FS_OPCODE_DDY_FINE,
>>>     FS_OPCODE_CINTERP,
>>>     FS_OPCODE_LINTERP,
>>> +   FS_OPCODE_PIXEL_X,
>>> +   FS_OPCODE_PIXEL_Y,
>>>     FS_OPCODE_UNIFORM_PULL_CONSTANT_LOAD,
>>>     FS_OPCODE_UNIFORM_PULL_CONSTANT_LOAD_GEN7,
>>>     FS_OPCODE_VARYING_PULL_CONSTANT_LOAD,
>>> diff --git a/src/mesa/drivers/dri/i965/brw_fs_generator.cpp b/src/mesa/drivers/dri/i965/brw_fs_generator.cpp
>>> index dba3286..7bc97a6 100644
>>> --- a/src/mesa/drivers/dri/i965/brw_fs_generator.cpp
>>> +++ b/src/mesa/drivers/dri/i965/brw_fs_generator.cpp
>>> @@ -1909,6 +1909,16 @@ fs_generator::generate_code(const cfg_t *cfg, int dispatch_width)
>>>        case FS_OPCODE_LINTERP:
>>>          generate_linterp(inst, dst, src);
>>>          break;
>>> +      case FS_OPCODE_PIXEL_X:
>>> +         assert(src[0].type == BRW_REGISTER_TYPE_UW);
>>> +         src[0].subnr = 0 * type_sz(src[0].type);
>>> +         brw_MOV(p, dst, stride(src[0], 8, 4, 1));
>>
>> Why do we need the special opcode?  Can we not do that stride with
>> what we already have in the FS compiler?
>
> Nope, we only have horizontal stride in fs_reg (the 'stride' member).
Ok.  Good enough for me.


More information about the mesa-dev mailing list