[Mesa-dev] i965: enable resource streamer gather constants for UBOs
Abdiel Janulgue
abdiel.janulgue at linux.intel.com
Tue Apr 28 13:07:57 PDT 2015
This patch series enables resource streamer gather constants for UBOs.
With this feature, UBO fetches are treated as push constants instead of
pull constants. The resource streamer hardware can gather non-contiguous
blocks of constant data from an arbitrary buffer object, as is the case
for UBO sources, and pack them with minimal overhead so that the push
constant state can treat the gathered constants as one GRF block. I've
initially targeted UBOs, but the same idea can in theory be applied to
any scattered uniform fetch as well, which I plan to focus on next.
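
To make the gather-and-pack idea concrete, here is a minimal software
sketch of what the paragraph above describes: scattered blocks of a
source buffer are copied into one contiguous destination block. It is
only an illustration. The struct, the function name, and the fixed
16-byte (vec4) block size are assumptions made for the example and are
not the hardware's gather entry format or any i965 driver API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical gather entry: names one 16-byte block of the source
 * buffer. This is NOT the hardware's real entry layout. */
struct gather_entry {
   uint32_t src_offset;   /* byte offset into the source buffer */
};

#define GATHER_BLOCK_SIZE 16   /* assumed block size: one vec4 */

/* Copy every referenced block into consecutive slots of dst, so the
 * result can be consumed as one contiguous block.
 * Returns the number of bytes written to dst. */
static size_t
gather_constants(const uint8_t *src, size_t src_size,
                 const struct gather_entry *entries, size_t count,
                 uint8_t *dst, size_t dst_size)
{
   size_t written = 0;

   for (size_t i = 0; i < count; i++) {
      /* Both the source block and the destination slot must fit. */
      assert(entries[i].src_offset + GATHER_BLOCK_SIZE <= src_size);
      assert(written + GATHER_BLOCK_SIZE <= dst_size);

      memcpy(dst + written, src + entries[i].src_offset,
             GATHER_BLOCK_SIZE);
      written += GATHER_BLOCK_SIZE;
   }

   return written;
}
```

With two entries at offsets 48 and 16, for example, the destination
holds the block from offset 48 followed by the block from offset 16,
packed back to back.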
This has mostly been tested on Haswell. v2 has been incubating for some
time, and I believe I've ironed out most of the major issues in the fs
backend. All piglit tests for fragment shaders pass. The vec4 backend
still needs some additional fine-tuning, but it also passes all vertex
and geometry shader piglit tests except gs-mat4x3. I've added a new
environment variable to select which shader stages to optimize.
The initial posting, for anyone who needs the original overview of the
series:
http://lists.freedesktop.org/archives/mesa-dev/2015-January/073594.html
Entire series lives here:
git://people.freedesktop.org/~abj/mesa:rs_gather_constants_NIR
Below are some real-world results from Unreal Engine 4 demos that make
heavy use of UBOs. The benchmark enabled gather constants only for
fragment shaders.
EffectsCave (NIR disabled):
x fs gather constants disabled
+ fs gather constants enabled
    N           Min           Max        Median           Avg        Stddev
x  10        4.6008       4.83961       4.80967      4.791587    0.06943449
+  10       5.05152       5.14954       5.11507      5.106432   0.031042147
Difference at 95.0% confidence
0.314845 ± 0.0505323
6.57079% ± 1.0546%
EffectsCave (NIR enabled):
x fs gather constants disabled
+ fs gather constants enabled
    N           Min           Max        Median           Avg        Stddev
x  10       3.99146       4.26072       4.19591      4.157199   0.093623634
+  10       4.51396       4.59149       4.58185      4.574359   0.022251777
Difference at 95.0% confidence
0.41716 ± 0.0639358
10.0346% ± 1.53795%
Reflections Subway (NIR disabled):
x fs gather constants disabled
+ fs gather constants enabled
    N           Min           Max        Median           Avg        Stddev
x  10       6.64539       7.28898       7.11371      7.083675    0.19290418
+  10       7.58844       7.66247       7.64003      7.632628   0.022702317
Difference at 95.0% confidence
0.548953 ± 0.129049
7.74955% ± 1.82178%
Reflections Subway (NIR enabled):
x fs gather constants disabled
+ fs gather constants enabled
    N           Min           Max        Median           Avg        Stddev
x  10       6.03644       6.19722       6.08858      6.097111   0.062671415
+  10       6.30447        6.4363       6.35115      6.358372   0.043168601
Difference at 95.0% confidence
0.261261 ± 0.0505605
4.285% ± 0.829254%
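
For readers checking the numbers, the percentage in each "Difference"
block is the difference of the two averages divided by the baseline
("x", gather constants disabled) average. The helper below is a small
sketch of that arithmetic; the function name is made up for this
example, and the confidence interval itself is not recomputed here.

```c
#include <assert.h>

/* Relative change of the "enabled" average over the "disabled"
 * (baseline) average, in percent. Hypothetical helper for checking
 * the tables by hand. */
static double
relative_gain_percent(double baseline_avg, double enabled_avg)
{
   assert(baseline_avg > 0.0);
   return (enabled_avg - baseline_avg) / baseline_avg * 100.0;
}
```

For EffectsCave with NIR disabled, the averages 4.791587 and 5.106432
differ by 0.314845, and 0.314845 / 4.791587 is roughly 6.57%, which
matches the figure reported above.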
What's changed since initial posting:
* Lots of squashed patches (~50 --> ~30)!
* Use the environment variable INTEL_UBO_GATHER=vs,fs,gs to select which
shader stages this feature optimizes.
* NIR support for the fs-backend.
* Remove unrelated fine-grained uniform support which I'll resubmit in a
separate patch series.
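
As an illustration of how a comma-separated value such as
INTEL_UBO_GATHER=vs,fs,gs could be turned into a per-stage bitmask,
here is a minimal sketch. The enum, the function names, and the choice
to ignore unknown tokens are assumptions for this example. This is not
the parsing code from the series.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical per-stage bits, defined only for this example. */
enum {
   GATHER_VS = 1 << 0,
   GATHER_FS = 1 << 1,
   GATHER_GS = 1 << 2,
};

/* Parse a comma-separated list such as "vs,fs,gs" into a bitmask.
 * Unknown tokens are ignored; a NULL value yields an empty mask. */
static unsigned
parse_gather_stages(const char *value)
{
   unsigned mask = 0;

   if (value == NULL)
      return 0;

   while (*value) {
      /* Length of the current token, up to the next comma or the end. */
      size_t len = strcspn(value, ",");

      if (len == 2 && strncmp(value, "vs", 2) == 0)
         mask |= GATHER_VS;
      else if (len == 2 && strncmp(value, "fs", 2) == 0)
         mask |= GATHER_FS;
      else if (len == 2 && strncmp(value, "gs", 2) == 0)
         mask |= GATHER_GS;

      value += len;
      if (*value == ',')
         value++;   /* skip the separator */
   }

   return mask;
}

/* Read the variable from the environment (unset means empty mask). */
static unsigned
gather_stages_from_env(void)
{
   return parse_gather_stages(getenv("INTEL_UBO_GATHER"));
}
```

Setting INTEL_UBO_GATHER=fs would then yield a mask with only the
fragment shader bit set, matching the configuration used for the
benchmarks above.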
Dependencies:
* You'll need an i915 kernel driver that enables the resource streamer. I
plan to submit this in a separate patch series to the i915 mailing list:
git://people.freedesktop.org/~abj/linux:intel_resource_streamer_2
* libdrm with updated headers:
git://people.freedesktop.org/~abj/libdrm:libdrm_rs
Patch overview:
Patches 1-5:   Enables core resource streamer functionality and
               hardware-generated binding tables
Patches 6-10:  Switches on the hardware bits for gather push constants
Patches 11-16: Core compiler support
Patches 17-20: Support for original i965 fs backend
Patch 19:      Support for NIR fs backend
Patches 21-23: Support for vec4 backend
Patches 24-26: Required state setup and workarounds
Patch 29:      Switch on push constants whenever we have UBO entries.
Signed-off-by: Abdiel Janulgue <abdiel.janulgue at linux.intel.com>
---
src/glsl/nir/nir_types.cpp | 11 ++
src/glsl/nir/nir_types.h | 4 +
.../drivers/dri/i965/brw_binding_tables.c | 180 +++++++++++++++++-
src/mesa/drivers/dri/i965/brw_context.c | 41 ++++
src/mesa/drivers/dri/i965/brw_context.h | 36 ++++
src/mesa/drivers/dri/i965/brw_defines.h | 47 +++++
src/mesa/drivers/dri/i965/brw_fs.cpp | 71 ++++++-
src/mesa/drivers/dri/i965/brw_fs.h | 6 +
src/mesa/drivers/dri/i965/brw_fs_nir.cpp | 59 ++++++
src/mesa/drivers/dri/i965/brw_fs_visitor.cpp | 86 ++++++++-
src/mesa/drivers/dri/i965/brw_gs.c | 15 ++
src/mesa/drivers/dri/i965/brw_program.c | 5 +
src/mesa/drivers/dri/i965/brw_shader.cpp | 4 +-
src/mesa/drivers/dri/i965/brw_shader.h | 11 ++
src/mesa/drivers/dri/i965/brw_state.h | 19 +-
src/mesa/drivers/dri/i965/brw_state_upload.c | 9 +-
src/mesa/drivers/dri/i965/brw_vec4.cpp | 62 ++++--
src/mesa/drivers/dri/i965/brw_vec4.h | 3 +
.../drivers/dri/i965/brw_vec4_visitor.cpp | 80 ++++++++
src/mesa/drivers/dri/i965/brw_vs.c | 18 ++
src/mesa/drivers/dri/i965/brw_wm.c | 18 ++
.../drivers/dri/i965/brw_wm_surface_state.c | 6 +
src/mesa/drivers/dri/i965/gen6_gs_state.c | 2 +-
src/mesa/drivers/dri/i965/gen6_vs_state.c | 39 +++-
src/mesa/drivers/dri/i965/gen6_wm_state.c | 2 +-
src/mesa/drivers/dri/i965/gen7_blorp.cpp | 1 +
src/mesa/drivers/dri/i965/gen7_disable.c | 4 +
src/mesa/drivers/dri/i965/gen7_vs_state.c | 73 ++++++-
src/mesa/drivers/dri/i965/gen7_wm_state.c | 2 +-
src/mesa/drivers/dri/i965/intel_batchbuffer.c | 7 +-
src/mesa/drivers/dri/i965/intel_reg.h | 3 +
31 files changed, 881 insertions(+), 43 deletions(-)