[Mesa-dev] [PATCH 4/7] i965: Perform basic optimizations on the FIND_LIVE_CHANNEL opcode.
Kenneth Graunke
kenneth at whitecape.org
Thu Apr 30 09:10:00 PDT 2015
On Thursday, April 30, 2015 04:59:49 PM Francisco Jerez wrote:
> Matt Turner <mattst88 at gmail.com> writes:
>
> > On Fri, Feb 20, 2015 at 11:49 AM, Francisco Jerez <currojerez at riseup.net> wrote:
> >> ---
> >> src/mesa/drivers/dri/i965/brw_fs.cpp | 49 ++++++++++++++++++++++++++++++
> >> src/mesa/drivers/dri/i965/brw_fs.h | 1 +
> >> src/mesa/drivers/dri/i965/brw_fs_cse.cpp | 1 +
> >> src/mesa/drivers/dri/i965/brw_vec4.cpp | 41 +++++++++++++++++++++++++
> >> src/mesa/drivers/dri/i965/brw_vec4.h | 1 +
> >> src/mesa/drivers/dri/i965/brw_vec4_cse.cpp | 1 +
> >> 6 files changed, 94 insertions(+)
> >>
> >> diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp b/src/mesa/drivers/dri/i965/brw_fs.cpp
> >> index d567c2b..4537900 100644
> >> --- a/src/mesa/drivers/dri/i965/brw_fs.cpp
> >> +++ b/src/mesa/drivers/dri/i965/brw_fs.cpp
> >> @@ -2734,6 +2734,54 @@ fs_visitor::compute_to_mrf()
> >> }
> >>
> >> /**
> >> + * Eliminate FIND_LIVE_CHANNEL instructions occurring outside any control
> >> + * flow. We could probably do better here with some form of divergence
> >> + * analysis.
> >> + */
> >> +bool
> >> +fs_visitor::eliminate_find_live_channel()
> >> +{
> >> + bool progress = false;
> >> + unsigned depth = 0;
> >> +
> >> + foreach_block_and_inst_safe(block, fs_inst, inst, cfg) {
> >> + switch (inst->opcode) {
> >> + case BRW_OPCODE_IF:
> >> + case BRW_OPCODE_DO:
> >> + depth++;
> >> + break;
> >> +
> >> + case BRW_OPCODE_ENDIF:
> >> + case BRW_OPCODE_WHILE:
> >> + depth--;
> >> + break;
> >> +
> >> + case FS_OPCODE_DISCARD_JUMP:
> >> + /* This can potentially make control flow non-uniform until the end
> >> + * of the program.
> >> + */
> >> + depth++;
> >> + break;
Why not just "return progress" at this point? depth++ just makes it
effectively never do anything again, but continue iterating over the
rest of the program.
Otherwise, this all seems good to me. This patch is
Reviewed-by: Kenneth Graunke <kenneth at whitecape.org>
> >> +
> >> + case SHADER_OPCODE_FIND_LIVE_CHANNEL:
> >> + if (depth == 0) {
> >> + inst->opcode = BRW_OPCODE_MOV;
> >> + inst->src[0] = fs_reg(0);
> >
> > How does this work if we're on a primitive edge and channel zero is
> > disabled? Does the EU not actually disable those channels except in
> > the framebuffer write or something?
> >
> Yes, exactly, samples outside the primitive (or samples discarded by the
> early depth/stencil test) have the corresponding execution mask bit
> enabled, otherwise it wouldn't be possible to calculate derivatives near
> the edge of a primitive. Whole subspans not covered by the primitive or
> dropped by early fragment tests are never dispatched to any PS thread,
> so channel "0" should always have valid contents except when disabled by
> control flow.
>
> >> + inst->sources = 1;
> >> + inst->force_writemask_all = true;
> >> + progress = true;
> >> + }
> >> + break;
> >> +
> >> + default:
> >> + break;
> >> + }
> >> + }
> >> +
> >> + return progress;
> >> +}
> >
> > I wouldn't mind if this was just part of opt_algebraic. Less code that
> > way and one fewer pass over the instruction list. What do you think,
> > Ken?
>
> I shortly considered doing that but finally decided not to because:
> - It's not an algebraic instruction.
> - The optimization is non-local unlike the other fully self-contained
> algebraic optimizations, as it needs to keep track of the control
> flow nesting level.
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 819 bytes
Desc: This is a digitally signed message part.
URL: <http://lists.freedesktop.org/archives/mesa-dev/attachments/20150430/564b1b74/attachment.sig>
More information about the mesa-dev
mailing list