[Mesa-dev] [PATCH 10/10] i965/fs: Combine pixel center calculation into one inst.
Jason Ekstrand
jason at jlekstrand.net
Fri Apr 17 11:59:31 PDT 2015
On Fri, Apr 17, 2015 at 11:58 AM, Matt Turner <mattst88 at gmail.com> wrote:
> On Fri, Apr 17, 2015 at 11:56 AM, Jason Ekstrand <jason at jlekstrand.net> wrote:
>> On Tue, Apr 14, 2015 at 4:15 PM, Matt Turner <mattst88 at gmail.com> wrote:
>>> The X and Y values come interleaved in g1 (.4-.11 inclusive), so we can
>>> calculate them together with a single add(32) instruction on some
>>> platforms like Broadwell and newer or in SIMD8 elsewhere.
>>>
>>> Note that I also moved the PIXEL_X/PIXEL_Y virtual opcodes from before
>>> LINTERP to after it. That's because the writes_accumulator_implicitly()
>>> function in backend_instruction tests for <= LINTERP for determining
>>> whether the instruction indeed writes the accumulator implicitly. The
>>> old FS_OPCODE_PIXEL_X/Y emitted ADD instructions, which did, but the new
>>> opcodes just emit MOVs, which don't. It doesn't matter, since we don't
>>> use these opcodes on Gen4/5 anymore, but in the case that we do...
>>>
>>> On Broadwell:
>>> total instructions in shared programs: 7192355 -> 7186224 (-0.09%)
>>> instructions in affected programs: 1190700 -> 1184569 (-0.51%)
>>> helped: 6131
>>>
>>> On Haswell:
>>> total instructions in shared programs: 6155979 -> 6152800 (-0.05%)
>>> instructions in affected programs: 652362 -> 649183 (-0.49%)
>>> helped: 3179
>>> ---
>>> src/mesa/drivers/dri/i965/brw_defines.h | 2 +
>>> src/mesa/drivers/dri/i965/brw_fs_generator.cpp | 10 ++++
>>> src/mesa/drivers/dri/i965/brw_fs_visitor.cpp | 71 ++++++++++++++++++--------
>>> 3 files changed, 63 insertions(+), 20 deletions(-)
>>>
>>> diff --git a/src/mesa/drivers/dri/i965/brw_defines.h b/src/mesa/drivers/dri/i965/brw_defines.h
>>> index 1afa34e..f99da12 100644
>>> --- a/src/mesa/drivers/dri/i965/brw_defines.h
>>> +++ b/src/mesa/drivers/dri/i965/brw_defines.h
>>> @@ -925,6 +925,8 @@ enum opcode {
>>> FS_OPCODE_DDY_FINE,
>>> FS_OPCODE_CINTERP,
>>> FS_OPCODE_LINTERP,
>>> + FS_OPCODE_PIXEL_X,
>>> + FS_OPCODE_PIXEL_Y,
>>> FS_OPCODE_UNIFORM_PULL_CONSTANT_LOAD,
>>> FS_OPCODE_UNIFORM_PULL_CONSTANT_LOAD_GEN7,
>>> FS_OPCODE_VARYING_PULL_CONSTANT_LOAD,
>>> diff --git a/src/mesa/drivers/dri/i965/brw_fs_generator.cpp b/src/mesa/drivers/dri/i965/brw_fs_generator.cpp
>>> index dba3286..7bc97a6 100644
>>> --- a/src/mesa/drivers/dri/i965/brw_fs_generator.cpp
>>> +++ b/src/mesa/drivers/dri/i965/brw_fs_generator.cpp
>>> @@ -1909,6 +1909,16 @@ fs_generator::generate_code(const cfg_t *cfg, int dispatch_width)
>>> case FS_OPCODE_LINTERP:
>>> generate_linterp(inst, dst, src);
>>> break;
>>> + case FS_OPCODE_PIXEL_X:
>>> + assert(src[0].type == BRW_REGISTER_TYPE_UW);
>>> + src[0].subnr = 0 * type_sz(src[0].type);
>>> + brw_MOV(p, dst, stride(src[0], 8, 4, 1));
>>
>> Why do we need the special opcode? Can we not do that stride with
>> what we already have in the FS compiler?
>
> Nope, we only have horizontal stride in fs_reg (the 'stride' member).
Ok. Good enough for me.
More information about the mesa-dev
mailing list