[Mesa-dev] Fwd: [PATCH] mesa/math: Allocate memory for GLmatrix elements and its inverse contiguously

Thu Apr 19 18:08:25 UTC 2018

---------- Forwarded message ----------
From: Vlad Golovkin <vlad.golovkin.mail at gmail.com>
Date: 2018-04-19 21:06 GMT+03:00
Subject: Re: [Mesa-dev] [PATCH] mesa/math: Allocate memory for
GLmatrix elements and its inverse contiguously
To: Thomas Helland <thomashelland90 at gmail.com>

2018-04-17 8:55 GMT+03:00 Thomas Helland <thomashelland90 at gmail.com>:
> Hi, and thanks for the patch =)
>
> Have you done any performance testing on this to verify it
> gives us a speedup of any kind? I'm asking because it seems like
> this might be something that a decent compiler should be able to do.
> Performance related patches, at least in core mesa, usually have
> some justification with benchmark numbers in the commit message.

Hi,
I examined the resulting assembly for these 3 functions and it turns
out that compiler wasn't merging these two blocks of memory into one
(which compiler does that?).
gcc tried to unroll memcpys to a series of movs which may seem to
partially defeat the purpose of this patch, but after copying the
block corresponding to m->m it had to switch destination and source
registers to the next block resulting in 2 wasted movs.
As a result we can save malloc and free call (in _math_matrix_ctr and
_math_matrix_dtr) and 2 movs (when compiler tries to avoid memcpy -
best case) or 1 memcpy call (in the worst case). It may seem that 2nd
malloc can place m->inv in memory right after m->m but: 1) compiler
can't rely on that behaviour 2) allocator will insert some private
data before each block leading to more cache misses.
I made some testing with Torcs and Yo Frankie blender game and
according to perf in Yo Frankie _math_matrix_copy overhead reduced by
0.03% - 0.04% while Torcs didn't see any improvement.

Sorry for the duplicate emails.

> Some style comments below
>
> 2018-04-17 1:03 GMT+02:00 Vlad Golovkin <vlad.golovkin.mail at gmail.com>:
>> When GLmatrix elements and its inverse are stored contiguously in memory it is possible to
>> allocate, free and copy these fields with 1 function call instead of 2.
>> ---
>>  src/mesa/math/m_matrix.c | 15 +++++++++------
>>  1 file changed, 9 insertions(+), 6 deletions(-)
>>
>> diff --git a/src/mesa/math/m_matrix.c b/src/mesa/math/m_matrix.c
>> index 57a49533de..4ab78a1fb3 100644
>> --- a/src/mesa/math/m_matrix.c
>> +++ b/src/mesa/math/m_matrix.c
>> @@ -1438,8 +1438,7 @@ _math_matrix_is_dirty( const GLmatrix *m )
>>  void
>>  _math_matrix_copy( GLmatrix *to, const GLmatrix *from )
>>  {
>> -   memcpy(to->m, from->m, 16 * sizeof(GLfloat));
>> -   memcpy(to->inv, from->inv, 16 * sizeof(GLfloat));
>> +   memcpy(to->m, from->m, 16 * 2 * sizeof(GLfloat));
>>     to->flags = from->flags;
>>     to->type = from->type;
>>  }
>> @@ -1470,12 +1469,17 @@ _math_matrix_loadf( GLmatrix *mat, const GLfloat *m )
>>  void
>>  _math_matrix_ctr( GLmatrix *m )
>>  {
>> -   m->m = _mesa_align_malloc( 16 * sizeof(GLfloat), 16 );
>> +   m->m = _mesa_align_malloc( 16 * 2 * sizeof(GLfloat), 16 );
>>     if (m->m)
>> +   {
>
> Our style guides says to keep the curly bracket after an if on the same line.
>
>> +      m->inv = m->m + 16;
>>        memcpy( m->m, Identity, sizeof(Identity) );
>> -   m->inv = _mesa_align_malloc( 16 * sizeof(GLfloat), 16 );
>> -   if (m->inv)
>>        memcpy( m->inv, Identity, sizeof(Identity) );
>> +   }
>> +   else
>> +   {
>
> } else {
>
> Although I see that this file defaults to;
>
> {
> else {
>
> for some reason. Feel free to follow existing style, or adjust to my comments.
> Also, if we want to do this change it deserves a comment in the source.
>> +      m->inv = NULL;
>> +   }
>>     m->type = MATRIX_IDENTITY;
>>     m->flags = 0;
>>  }
>> @@ -1493,7 +1497,6 @@ _math_matrix_dtr( GLmatrix *m )
>>     _mesa_align_free( m->m );
>>     m->m = NULL;
>>
>> -   _mesa_align_free( m->inv );
>>     m->inv = NULL;
>>  }
>>
>> --
>> 2.14.1
>>
>> _______________________________________________
>> mesa-dev mailing list
>> mesa-dev at lists.freedesktop.org
>> https://lists.freedesktop.org/mailman/listinfo/mesa-dev