[Mesa-dev] [PATCH v3 00/13] TGSI: improved live range tracking, also including arrays
Benedikt Schemmer
ben at besd.de
Sun Apr 29 08:43:11 UTC 2018
Hi Gert,
I couldn't resist at least trying what happens when I enable register merge for radeonsi:
PERCENTAGE DELTAS       Shaders     SGPRs     VGPRs SpillSGPR SpillVGPR  PrivVGPR   Scratch  CodeSize  MaxWaves     Waits
piglit                    80732   -0.16 %   -0.02 %         .         .    0.87 %    0.86 %    0.04 %         .         .
-------------------------------------------------------------------------------------------------------------------------
All affected                513  -17.58 %   -2.30 %         .         .    4.12 %    5.87 %    1.73 %    0.10 %         .
-------------------------------------------------------------------------------------------------------------------------
Total                     80732   -0.16 %   -0.02 %         .         .    0.87 %    0.86 %    0.04 %         .         .
I had already removed the defines around the debug code, so that's also happily outputting data.
The run fails on two piglit shaders:
<code>
[require]
GLSL >= 3.30
[fragment shader]
// [config]
// expect_result: pass
// glsl_version: 3.30
// require_extensions: GL_ARB_bindless_texture GL_ARB_shader_image_load_store
// [end config]
#version 330
#extension GL_ARB_bindless_texture: require
#extension GL_ARB_shader_image_load_store: enable
#extension GL_ARB_arrays_of_arrays: enable
struct s {
    writeonly image2D img[3][2];
    int y;
};
void main()
{
    s a[2][4];
    imageStore(a[0][0].img[0][0], ivec2(0, 0), vec4(1, 2, 3, 4));
}
</code>
and
<code>
[require]
GLSL >= 3.30
[fragment shader]
// [config]
// expect_result: pass
// glsl_version: 3.30
// require_extensions: GL_ARB_bindless_texture
// [end config]
#version 330
#extension GL_ARB_bindless_texture: require
#extension GL_ARB_arrays_of_arrays: enable
struct s {
    sampler2D tex[3][2];
    int y;
};
out vec4 color;
void main()
{
    s a[2][4];
    color = texture2D(a[0][0].tex[0][0], vec2(0, 0));
}
</code>
Real-world shaders look a little different:
Max Increase:
SGPRS: 72 -> 96 (33.33 %) (in shaders/cat/1787.shader_test)
VGPRS: 64 -> 84 (31.25 %) (in shaders/dirtrally/0859b69789591d7046e211400b1edd9a7cfca734_742.shader_test)
Spilled SGPRs: 0 -> 16 (0.00 %) (in shaders/deusex_mankind/d64e2084204e29749639e8fbd9a1e507c7e5e1dd_6840.shader_test)
Spilled VGPRs: 0 -> 0 (0.00 %)
Private memory VGPRs: 24 -> 32 (33.33 %) (in shaders/deusex_mankind/28cac87049d8c833e72296a5a02ea6118f1144e5_5876.shader_test)
Scratch size: 28 -> 36 (28.57 %) dwords per thread (in shaders/deusex_mankind/28cac87049d8c833e72296a5a02ea6118f1144e5_5876.shader_test)
Code Size: 6988 -> 8036 (15.00 %) bytes (in shaders/cat/1847.shader_test)
LDS: 0 -> 0 (0.00 %) blocks
Max Waves: 5 -> 7 (40.00 %) (in shaders/ruiner/0967c5fce7fc456496b1cfa25fbb1d1c4dcf9bed_2958.shader_test)
Wait states: 0 -> 0 (0.00 %)
Max Decrease:
SGPRS: 104 -> 64 (-38.46 %) (in shaders/deusex_mankind/480ddf21b1076d36f9ffd9911389656b5d8e12cb_2878.shader_test)
VGPRS: 44 -> 36 (-18.18 %) (in shaders/ruiner/0967c5fce7fc456496b1cfa25fbb1d1c4dcf9bed_2958.shader_test)
Spilled SGPRs: 19 -> 0 (-100.00 %) (in shaders/deusex_mankind/0749c9ae23417f918c7286fe502ff5de4cb8e1a0_3276.shader_test)
Spilled VGPRs: 0 -> 0 (0.00 %)
Private memory VGPRs: 0 -> 0 (0.00 %)
Scratch size: 0 -> 0 (0.00 %) dwords per thread
Code Size: 17576 -> 17276 (-1.71 %) bytes (in shaders/ruiner/75b96ff36f5328b9ff9366f0d0fd58a1046f51bc_3053.shader_test)
LDS: 0 -> 0 (0.00 %) blocks
Max Waves: 8 -> 5 (-37.50 %) (in shaders/deusex_mankind/8dabec49e5b6c3b1cbcbaee194eff69f164d72f4_3968.shader_test)
Wait states: 0 -> 0 (0.00 %)
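For reading the two lists above: the percentages are relative changes against the "before" value. A minimal Python sketch of that arithmetic follows. The function name is made up for illustration, and the zero-baseline handling is only an assumption inferred from the "0 -> 16 (0.00 %)" entry, not taken from the shader-db report script.

```python
def percent_delta(before, after):
    """Relative change from before to after, in percent.

    Assumption: a zero baseline is reported as 0.0 instead of dividing
    by zero, which would match the "0 -> 16 (0.00 %)" line above.
    """
    if before == 0:
        return 0.0
    return (after - before) / before * 100.0


# SGPRS: 72 -> 96 from the "Max Increase" list
print(f"{percent_delta(72, 96):.2f} %")  # prints "33.33 %"
```

So a shader that goes from no spilling to 16 spilled SGPRs shows up as 0.00 % and is easy to miss in the percentage columns.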
PERCENTAGE DELTAS       Shaders     SGPRs     VGPRs SpillSGPR SpillVGPR  PrivVGPR   Scratch  CodeSize  MaxWaves     Waits
0ad                           6         .         .         .         .         .         .         .         .         .
aer                         590         .    0.26 %  -20.00 %         .         .         .    0.34 %         .         .
alien_isolation            1414         .         .         .         .         .         .         .         .         .
anholt                       10         .         .         .         .         .         .         .         .         .
bioshock_infinite          2581   -0.02 %    0.03 %         .         .         .         .    0.13 %         .         .
blackmesa                   584         .         .         .         .         .         .         .         .         .
cat                         573   -0.06 %   -0.12 %         .         .         .         .    0.20 %    0.05 %         .
csgo                       1392         .         .   -0.88 %         .         .         .   -0.03 %         .         .
deadisland_definitive      1776    0.06 %         .         .         .         .         .    0.15 %    0.01 %         .
deadisland_original       11602         .         .         .         .         .         .    0.05 %         .         .
deadisland_riptide_..       293   -0.06 %    0.06 %         .         .         .         .    0.32 %         .         .
deusex_mankind             5051    0.08 %         .   -6.14 %         .   33.33 %   28.57 %    0.19 %   -0.01 %         .
dirtrally                   787         .    0.64 %    0.62 %         .         .         .    0.30 %   -0.31 %         .
dolphin                      22         .         .         .         .         .         .         .         .         .
dyinglight                 4012         .    0.05 %         .         .         .         .    0.34 %   -0.01 %         .
eurotruck2                  216         .         .         .         .         .         .         .         .         .
f1_2015                     746   -0.04 %   -0.02 %    2.72 %         .         .         .    0.14 %         .         .
glamor                       16   -2.33 %         .         .         .         .         .    3.97 %         .         .
hl2ep1                      294         .         .         .         .         .         .         .         .         .
hl2ep2                      154         .         .         .         .         .         .         .         .         .
hl2lostcoast                 66         .         .         .         .         .         .         .         .         .
hlsl3                       582         .         .         .         .         .         .   -0.14 %         .         .
humus-celshading              4         .         .         .         .         .         .         .         .         .
humus-domino                  6         .         .         .         .         .         .         .         .         .
humus-dynamicbranching       24         .         .         .         .         .         .         .         .         .
humus-hdr                    10         .         .         .         .         .         .         .         .         .
humus-portals                 2         .         .         .         .         .         .         .         .         .
humus-volumetricfog..         6         .         .         .         .         .         .         .         .         .
kerbal                     1016         .    0.11 %         .         .         .         .    0.31 %         .         .
larago                      664         .         .         .         .         .         .    0.01 %         .         .
madmax                      354    0.04 %   -0.08 %         .         .         .         .   -0.02 %    0.04 %         .
metro2033redux             4410         .    0.05 %         .         .         .         .    0.06 %   -0.04 %         .
nexuiz                       80         .         .         .         .         .         .         .         .         .
ruiner                      685   -0.10 %   -0.09 %         .         .         .         .    0.09 %    0.04 %         .
sauerbraten                   7         .         .         .         .         .         .         .         .         .
serioussam2017              736    0.03 %   -0.07 %    7.09 %         .         .         .    0.05 %    0.06 %         .
soma                        436         .         .         .         .         .         .         .         .         .
specops                    1814         .         .         .         .         .         .    0.35 %         .         .
stellaris                   434         .         .         .         .         .         .    0.11 %         .         .
supertuxkart                  4         .         .         .         .         .         .         .         .         .
talos                       762   -0.02 %         .    0.09 %         .         .         .    0.01 %         .         .
tesseract                   430         .         .         .         .         .         .         .         .         .
tombraider                 1012    0.21 %    0.31 %         .         .         .         .    0.22 %   -0.16 %         .
total_war_shogun_2          176   -0.21 %         .   -2.10 %         .         .         .   -0.05 %         .         .
total_war_warhammer         218         .    0.06 %         .         .         .         .    0.72 %   -0.06 %         .
ubershaders                  54   -2.04 %    0.20 %         .         .         .         .    1.38 %         .         .
ug_gettysburg               149         .         .         .         .         .         .         .         .         .
unigine_heaven              226         .         .         .         .         .         .         .         .         .
unigine_superposition       733   -0.08 %    0.04 %         .         .         .         .    0.02 %         .         .
unigine_valley              288         .         .         .         .         .         .         .         .         .
unity                        72         .         .         .         .         .         .    0.04 %         .         .
w40kdawn2                   421         .         .         .         .         .         .   -0.20 %         .         .
w40kdawn3                   164    0.36 %         .         .         .         .         .         .         .         .
warsow                      176         .         .         .         .         .         .         .         .         .
warzone2100                   4         .         .         .         .         .         .         .         .         .
witcher2                    928   -0.07 %    0.06 %         .         .         .         .    0.04 %         .         .
x3_albion                   641         .         .         .         .         .         .         .         .         .
xblades                     208         .         .         .         .         .         .    0.42 %         .         .
xcom                       1020   -0.10 %         .         .         .         .         .    0.28 %         .         .
xcom2                      1439         .         .         .         .         .         .         .         .         .
yofrankie                    82         .         .         .         .         .         .         .         .         .
-------------------------------------------------------------------------------------------------------------------------
All affected               6394    0.04 %    0.16 %    0.46 %         .    7.41 %    6.67 %    0.51 %   -0.09 %         .
-------------------------------------------------------------------------------------------------------------------------
Total                     52662         .    0.03 %    0.26 %         .    1.34 %    1.09 %    0.13 %   -0.01 %         .
If there's an easy way to figure out when your code makes things worse and when it's an improvement, that would be really nice.
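One rough way to sort this out per shader could look like the sketch below. It assumes the before/after stats for a shader are available as plain dicts; the key names, the `classify` function and the "any change counts" rule are all hypothetical choices for illustration, not something shader-db or this series provides. The only domain assumption is that a higher MaxWaves is better, while lower is better for everything else.

```python
# Hypothetical metric key where a higher value is the better one;
# every other metric (registers, spills, scratch, code size) is
# treated as lower-is-better.
HIGHER_IS_BETTER = {"max_waves"}


def classify(before, after):
    """Label one shader as improved, regressed, mixed or unchanged."""
    better = worse = 0
    for key, old in before.items():
        new = after[key]
        if new == old:
            continue
        improved = new > old if key in HIGHER_IS_BETTER else new < old
        if improved:
            better += 1
        else:
            worse += 1
    if better and worse:
        return "mixed"
    if better:
        return "improved"
    if worse:
        return "regressed"
    return "unchanged"


# Numbers from the lists above (ruiner ..._2958 and deusex_mankind ..._3968)
print(classify({"vgprs": 44, "max_waves": 5}, {"vgprs": 36, "max_waves": 7}))
print(classify({"max_waves": 8}, {"max_waves": 5}))
```

The first call prints "improved" and the second "regressed". Shaders that come out as "mixed" would be the interesting ones to look at by hand.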
Really interesting.
Cheers, Benedikt
On 29.04.2018 at 09:55, Gert Wollny wrote:
> Hello Benedikt,
>
> On Sunday, 29.04.2018, at 00:06 +0200, Benedikt Schemmer wrote:
>> Hi Gert
>>
>> On 28.04.2018 at 23:51, Gert Wollny wrote:
>>> On Saturday, 28.04.2018, at 22:43 +0200, Benedikt Schemmer wrote:
>>>> The patches apply cleanly, however I just did a shader-db test run
>>>> and can't find a difference with your patch applied, am I doing
>>>> something wrong?
>>>
>>> AFAIK radeonsi doesn't use the register-merge optimizer in TGSI.
>>>
>>
>> Ah, ok. Was wondering why your debug code doesn't output anything.
>> Makes sense now ;)
> Not exactly, the reason there is no output is because -DNDEBUG is set.
> Without it the statistics should also be printed out on radeonsi, but
> thinking of it I should probably disable it when register_merge is not
> accessed, because without this the numbers will be inflated and don't
> have much meaning.
>
>> So is this useless on radeonsi?
> Indeed.
>
>> Seemed interesting to me.
> :) it certainly helps on r600
>
>
>>>> compile times went up though:
>>>
>>> This is strange, because "see above". Did you compile with debug
>>> information and c++11 or higher enabled?
> ...
>>>
>>>
>> not intentionally:
>
> Then you should not actually be running any code that this series adds
> to mesa. I checked again: apart from the debugging output, nothing will
> ever be run for drivers that report
> PIPE_SHADER_CAP_TGSI_SKIP_MERGE_REGISTERS != 0 (as radeonsi does).
>
> Best,
> Gert
>