[Mesa-dev] [PATCH] i965: Implement proper workaround for Gen4 GPU CONSTANT_BUFFER hangs.

Ben Widawsky ben at bwidawsk.net
Tue Apr 14 21:03:20 PDT 2015


On Mon, 13 Apr 2015 01:26:57 -0700
Kenneth Graunke <kenneth at whitecape.org> wrote:

> I finally managed to dig up some information on our mysterious GPU
> hangs. A wiki page from the Crestline validation team mentions that
> they found a GPU hang in "Serious Sam 2" (on Windows) with remarkably
> similar conditions to the ones we've seen in Google Chrome and
> glmark2.
> 
> Apparently, if WM_STATE has "PS Use Source Depth" enabled, CC_STATE
> has most depth state disabled, and you issue a CONSTANT_BUFFER
> command and immediately draw, the depth interpolator makes a small
> mistake that leads to hangs.
> 
> Most of the traces I looked at contained a CONSTANT_BUFFER packet
> immediately followed by 3DPRIMITIVE, or at least very few packets.
> It appears they also have "PS Use Source Depth" enabled - either at
> the hang, or a little before it.  So I think this is our bug.
> 
> The workaround is to emit a non-pipelined state packet after issuing a
> CONSTANT_BUFFER packet.  This is really similar to the workaround I
> developed in commit c4fd0c9052dd391d6f2e9bb8e6da209dfc7ef35b.
> 
> Signed-off-by: Kenneth Graunke <kenneth at whitecape.org>

Reviewed-by: Ben Widawsky <ben at bwidawsk.net> (you're welcome :D)

> ---
>  src/mesa/drivers/dri/i965/brw_curbe.c | 39
> +++++++++++++++++++++++------------ 1 file changed, 26 insertions(+),
> 13 deletions(-)
> 
> diff --git a/src/mesa/drivers/dri/i965/brw_curbe.c
> b/src/mesa/drivers/dri/i965/brw_curbe.c index e45e2ab..a773a89 100644
> --- a/src/mesa/drivers/dri/i965/brw_curbe.c
> +++ b/src/mesa/drivers/dri/i965/brw_curbe.c
> @@ -285,19 +285,6 @@ brw_upload_constant_buffer(struct brw_context
> *brw) */
>  
>  emit:
> -   /* Work around mysterious 965 hangs that appear to happen if you
> do
> -    * two 3DPRIMITIVEs with only a CONSTANT_BUFFER inbetween.  If we
> -    * haven't already flushed for some other reason, explicitly do
> so.
> -    *
> -    * We've found no documented reason why this should be necessary.
> -    */
> -   if (brw->gen == 4 && !brw->is_g4x &&
> -       (brw->ctx.NewDriverState & (BRW_NEW_BATCH | BRW_NEW_PSP)) ==
> 0) {
> -      BEGIN_BATCH(1);
> -      OUT_BATCH(MI_FLUSH);
> -      ADVANCE_BATCH();
> -   }
> -
>     /* BRW_NEW_URB_FENCE: From the gen4 PRM, volume 1, section 3.9.8
>      * (CONSTANT_BUFFER (CURBE Load)):
>      *
> @@ -317,6 +304,31 @@ emit:
>  		(brw->curbe.total_size - 1) +
> brw->curbe.curbe_offset); }
>     ADVANCE_BATCH();
> +
> +   /* Work around a Broadwater/Crestline depth interpolator bug.
> The following
> +    * sequence will cause GPU hangs:
> +    *
> +    * 1. Change state so that all depth related fields in CC_STATE
> are disabled,
> +    *    and in WM_STATE, only "PS Use Source Depth" is enabled.
> +    * 2. Emit a CONSTANT_BUFFER packet.
> +    * 3. Draw via 3DPRIMITIVE.
> +    *
> +    * The recommended workaround is to emit a non-pipelined state
> change after
> +    * emitting CONSTANT_BUFFER, in order to drain the windowizer
> pipeline.
> +    *
> +    * We arbitrarily choose 3DSTATE_GLOBAL_DEPTH_CLAMP_OFFSET (as
> it's small),
> +    * and always emit it when "PS Use Source Depth" is set.  We
> could be more
> +    * precise, but the additional complexity is probably not worth
> it.
> +    *
> +    * BRW_NEW_FRAGMENT_PROGRAM
> +    */
> +   const struct gl_program *fp = &brw->fragment_program->Base;
> +   if (brw->gen == 4 && !brw->is_g4x && (fp->InputsRead & (1 <<
> VARYING_SLOT_POS))) {
> +      BEGIN_BATCH(2);
> +      OUT_BATCH(_3DSTATE_GLOBAL_DEPTH_OFFSET_CLAMP << 16 | (2 - 2));
> +      OUT_BATCH(0);
> +      ADVANCE_BATCH();
> +   }
>  }
>  
>  const struct brw_tracked_state brw_constant_buffer = {
> @@ -324,6 +336,7 @@ const struct brw_tracked_state
> brw_constant_buffer = { .mesa = _NEW_PROGRAM_CONSTANTS,
>        .brw  = BRW_NEW_BATCH |
>                BRW_NEW_CURBE_OFFSETS |
> +              BRW_NEW_FRAGMENT_PROGRAM |
>                BRW_NEW_FS_PROG_DATA |
>                BRW_NEW_PSP | /* Implicit - hardware requires this,
> not used above */ BRW_NEW_URB_FENCE |



More information about the mesa-dev mailing list