[Mesa-dev] [PATCH 6/7] i965/cs: Implement brw_emit_gpgpu_walker

Kenneth Graunke kenneth at whitecape.org
Mon Apr 27 19:02:38 PDT 2015


On Friday, April 24, 2015 04:33:43 PM Jordan Justen wrote:
> Tested on Ivybridge, Haswell and Broadwell.
> 
> Signed-off-by: Jordan Justen <jordan.l.justen at intel.com>
> ---
>  src/mesa/drivers/dri/i965/brw_compute.c | 39 ++++++++++++++++++++++++++++++++-
>  src/mesa/drivers/dri/i965/brw_defines.h |  1 +
>  2 files changed, 39 insertions(+), 1 deletion(-)
> 
> diff --git a/src/mesa/drivers/dri/i965/brw_compute.c b/src/mesa/drivers/dri/i965/brw_compute.c
> index baed701..06ef448 100644
> --- a/src/mesa/drivers/dri/i965/brw_compute.c
> +++ b/src/mesa/drivers/dri/i965/brw_compute.c
> @@ -31,12 +31,49 @@
>  #include "brw_draw.h"
>  #include "brw_state.h"
>  #include "intel_batchbuffer.h"
> +#include "brw_defines.h"
>  
>  
>  static void
>  brw_emit_gpgpu_walker(struct brw_context *brw, const GLuint *num_groups)
>  {
> -   _mesa_problem(&brw->ctx, "TODO: implement brw_emit_gpgpu_walker");
> +   const struct brw_cs_prog_data *prog_data = brw->cs.prog_data;
> +
> +   const unsigned simd_size = prog_data->simd_size;
> +   unsigned group_size = prog_data->local_size[0] *
> +      prog_data->local_size[1] * prog_data->local_size[2];
> +   unsigned thread_width_max =
> +      (group_size + simd_size - 1) / simd_size;
> +
> +   uint32_t right_mask = (1u << simd_size) - 1;
> +   const unsigned right_non_aligned = group_size & (simd_size - 1);
> +   if (right_non_aligned != 0)
> +      right_mask >>= (simd_size - right_non_aligned);

I think this is equvalent to:

uint32_t right_mask = (1u << (simd_size - (group_size % simd_size))) - 1;

which might be a bit simpler...

> +   BEGIN_BATCH(dwords);
> +   OUT_BATCH(GPGPU_WALKER << 16 | (dwords - 2));

I was going to suggest splitting this into separate Gen8+ and Gen7
blocks, but now that I look at the code...these two are slightly
different indirect handling, and the later one is just a DWord of MBZ,
so...it's not really that different.  I think what you have is fine :)

> +   uint32_t dwords = brw->gen < 8 ? 11 : 15;
> +   OUT_BATCH(0);
> +   if (brw->gen >= 8) {
> +      OUT_BATCH(0);
> +      OUT_BATCH(0);
> +   }
> +   assert(thread_width_max <= brw->max_cs_threads);
> +   OUT_BATCH(((simd_size == 8) ? 0 : 1) << 30 |

You might want to write this as ((simd_size / 8) - 1).  That will work
for SIMD8/16/32.

Topi would probably suggest using SET_FIELD, i.e.

#define BRW_GPGPU_SIMD_SIZE_SHIFT     30
#define BRW_GPGPU_SIMD_SIZE_MASK      INTEL_MASK(31, 30)

SET_FIELD((simd_size / 8) - 1, BRW_GPGPU_SIMD_SIZE)

It's probably a good idea here too.

> +             (thread_width_max - 1));

Don't you need to set the thread height/depth maximums as well?
I'm not really sure how this works.

> +   OUT_BATCH(0);

It'd be nice to label the 0's, i.e.

   OUT_BATCH(0);             /* Thread Group ID Starting X */
   OUT_BATCH(num_groups[0]); /* Thread Group ID X Dimension */

With those changes, the whole series is:
Reviewed-by: Kenneth Graunke <kenneth at whitecape.org>

I haven't verified that these execution masks are really what you want.
You know more about this than I do. :)

> +   if (brw->gen >= 8)
> +      OUT_BATCH(0);
> +   OUT_BATCH(num_groups[0]);
> +   OUT_BATCH(0);
> +   if (brw->gen >= 8)
> +      OUT_BATCH(0);
> +   OUT_BATCH(num_groups[1]);
> +   OUT_BATCH(0);
> +   OUT_BATCH(num_groups[2]);
> +   OUT_BATCH(right_mask);
> +   OUT_BATCH(0xffffffff);
> +   ADVANCE_BATCH();
>  }
>  
>  
> diff --git a/src/mesa/drivers/dri/i965/brw_defines.h b/src/mesa/drivers/dri/i965/brw_defines.h
> index 36f46af..cd25511 100644
> --- a/src/mesa/drivers/dri/i965/brw_defines.h
> +++ b/src/mesa/drivers/dri/i965/brw_defines.h
> @@ -2451,5 +2451,6 @@ enum brw_wm_barycentric_interp_mode {
>  
>  #define MEDIA_VFE_STATE                         0x7000
>  #define MEDIA_INTERFACE_DESCRIPTOR_LOAD         0x7002
> +#define GPGPU_WALKER                            0x7105
>  
>  #endif
> 
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 819 bytes
Desc: This is a digitally signed message part.
URL: <http://lists.freedesktop.org/archives/mesa-dev/attachments/20150427/af80e576/attachment.sig>


More information about the mesa-dev mailing list