[Mesa-dev] [PATCH 10/10] i965/fs: Combine pixel center calculation into one inst.
Matt Turner
mattst88 at gmail.com
Fri Apr 17 11:58:45 PDT 2015
On Fri, Apr 17, 2015 at 11:56 AM, Jason Ekstrand <jason at jlekstrand.net> wrote:
> On Tue, Apr 14, 2015 at 4:15 PM, Matt Turner <mattst88 at gmail.com> wrote:
>> The X and Y values come interleaved in g1 (.4-.11 inclusive), so we can
>> calculate them together with a single add(32) instruction on some
>> platforms like Broadwell and newer or in SIMD8 elsewhere.
>>
>> Note that I also moved the PIXEL_X/PIXEL_Y virtual opcodes from before
>> LINTERP to after it. That's because the writes_accumulator_implicitly()
>> function in backend_instruction tests for <= LINTERP for determining
>> whether the instruction indeed writes the accumulator implicitly. The
>> old FS_OPCODE_PIXEL_X/Y emitted ADD instructions, which did, but the new
>> opcodes just emit MOVs, which don't. It doesn't matter, since we don't
>> use these opcodes on Gen4/5 anymore, but in the case that we do...
>>
>> On Broadwell:
>> total instructions in shared programs: 7192355 -> 7186224 (-0.09%)
>> instructions in affected programs: 1190700 -> 1184569 (-0.51%)
>> helped: 6131
>>
>> On Haswell:
>> total instructions in shared programs: 6155979 -> 6152800 (-0.05%)
>> instructions in affected programs: 652362 -> 649183 (-0.49%)
>> helped: 3179
>> ---
>> src/mesa/drivers/dri/i965/brw_defines.h | 2 +
>> src/mesa/drivers/dri/i965/brw_fs_generator.cpp | 10 ++++
>> src/mesa/drivers/dri/i965/brw_fs_visitor.cpp | 71 ++++++++++++++++++--------
>> 3 files changed, 63 insertions(+), 20 deletions(-)
>>
>> diff --git a/src/mesa/drivers/dri/i965/brw_defines.h b/src/mesa/drivers/dri/i965/brw_defines.h
>> index 1afa34e..f99da12 100644
>> --- a/src/mesa/drivers/dri/i965/brw_defines.h
>> +++ b/src/mesa/drivers/dri/i965/brw_defines.h
>> @@ -925,6 +925,8 @@ enum opcode {
>> FS_OPCODE_DDY_FINE,
>> FS_OPCODE_CINTERP,
>> FS_OPCODE_LINTERP,
>> + FS_OPCODE_PIXEL_X,
>> + FS_OPCODE_PIXEL_Y,
>> FS_OPCODE_UNIFORM_PULL_CONSTANT_LOAD,
>> FS_OPCODE_UNIFORM_PULL_CONSTANT_LOAD_GEN7,
>> FS_OPCODE_VARYING_PULL_CONSTANT_LOAD,
>> diff --git a/src/mesa/drivers/dri/i965/brw_fs_generator.cpp b/src/mesa/drivers/dri/i965/brw_fs_generator.cpp
>> index dba3286..7bc97a6 100644
>> --- a/src/mesa/drivers/dri/i965/brw_fs_generator.cpp
>> +++ b/src/mesa/drivers/dri/i965/brw_fs_generator.cpp
>> @@ -1909,6 +1909,16 @@ fs_generator::generate_code(const cfg_t *cfg, int dispatch_width)
>> case FS_OPCODE_LINTERP:
>> generate_linterp(inst, dst, src);
>> break;
>> + case FS_OPCODE_PIXEL_X:
>> + assert(src[0].type == BRW_REGISTER_TYPE_UW);
>> + src[0].subnr = 0 * type_sz(src[0].type);
>> + brw_MOV(p, dst, stride(src[0], 8, 4, 1));
>
> Why do we need the special opcode? Can we not do that stride with
> what we already have in the FS compiler?
Nope, we only have horizontal stride in fs_reg (the 'stride' member).
More information about the mesa-dev
mailing list