[PATCH 02/13] glamor: Add glamor_program based copy acceleration

Keith Packard keithp at keithp.com
Tue May 13 22:57:07 PDT 2014


Markus Wick <markus at selfnet.de> writes:

> On 2014-05-13 17:34, Keith Packard wrote:
>> Sure, if glsl had a 'round' function I'd use it in a second :-)
>
> It was added in glsl130. As you use uvec which was also added in 
> glsl130, it's fine.

Cool. I've done a bit of performance analysis with this change and the
results aren't conclusive yet. Obviously, I'm going to pick the code
which goes faster for me :-)

> I hope to save some framebuffer switching. As framebuffer switches need
> much more validation than texture binding or uniform updates, they should
> be moved to the outer loop.

I'm not too worried about framebuffer switching -- anything allocating
target surfaces larger than we can render to is probably doing something
wrong. I do see applications allocating large source images though, so
we should strive to make that reasonably fast.
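
For illustration, here is a minimal sketch of the loop-ordering trade-off
under discussion. It is not code from the patch: it assumes both source
and destination are split into tiles, and the function names are
hypothetical. It only counts how many framebuffer and texture binds each
nesting order would issue; the counters stand in for the real GL calls.

```c
#include <stddef.h>

/* Number of state changes a given loop order would issue. */
struct bind_counts {
    size_t fbo_binds;     /* stands in for framebuffer switches */
    size_t texture_binds; /* stands in for source texture binds */
};

/* Destination tiles in the outer loop: the framebuffer is bound once
 * per destination tile, the source texture once per (dst, src) pair. */
static struct bind_counts
count_binds_dst_outer(size_t ndst, size_t nsrc)
{
    struct bind_counts c = { 0, 0 };

    for (size_t d = 0; d < ndst; d++) {
        c.fbo_binds++;
        for (size_t s = 0; s < nsrc; s++)
            c.texture_binds++;
    }
    return c;
}

/* Source tiles in the outer loop: the texture is bound once per source
 * tile, the framebuffer once per (src, dst) pair. */
static struct bind_counts
count_binds_src_outer(size_t ndst, size_t nsrc)
{
    struct bind_counts c = { 0, 0 };

    for (size_t s = 0; s < nsrc; s++) {
        c.texture_binds++;
        for (size_t d = 0; d < ndst; d++)
            c.fbo_binds++;
    }
    return c;
}
```

With a single destination tile, which is the common case described above,
both orders bind the framebuffer a comparable number of times only when
the source is also a single tile; otherwise the destination-outer order
binds it once.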

Perhaps we should add an x11perf test case that draws from/to enormous
images and see how things look. I've only tested for correctness up to
this point.

> I'm more thinking about a box loop.

If you can write something that looks cleaner, that'd be awesome. I'm
all for making the code as readable as possible.

> So that's what the element buffer is for. Just emit 6 vertices as 
> triangles per quad and you'll get your quads :)
> 0 1 2  0 2 3   4 5 6  4 6 7   ...

Ah, ok. I was doing the easiest non-quad path I could come up with as I
really don't care about GLES :-)
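
For reference, a minimal sketch of how the index pattern quoted above
could be generated. The function name is hypothetical and this is not
taken from the patch; it assumes four vertices per quad, stored
consecutively, and 16-bit indices.

```c
#include <stddef.h>
#include <stdint.h>

/* Fill `indices` with six entries per quad so that each quad is drawn
 * as two triangles: (0 1 2) and (0 2 3), offset by 4 per quad.
 * `indices` must have room for nquads * 6 entries. With 16-bit indices
 * the caller must keep nquads * 4 within the uint16_t range. */
static void
fill_quad_indices(uint16_t *indices, size_t nquads)
{
    for (size_t q = 0; q < nquads; q++) {
        uint16_t base = (uint16_t) (q * 4);
        uint16_t *out = indices + q * 6;

        out[0] = base;
        out[1] = (uint16_t) (base + 1);
        out[2] = (uint16_t) (base + 2);
        out[3] = base;
        out[4] = (uint16_t) (base + 2);
        out[5] = (uint16_t) (base + 3);
    }
}
```

The resulting array would be uploaded once as an element buffer and
reused for every draw, since the pattern does not depend on the vertex
data.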

> I wanted to say that we don't have to discard the temp copy directly. We
> can still copy by fb from there. Maybe this has some advantages, but I
> doubt it.

I can't imagine that would be faster -- you'd have to wait for the copy
to complete before even starting the fallback, and presumably the
fallback won't be any faster this way...

> This comment doesn't describe why we have to call glTextureBarrierNV
> without overlapping copies at all. We only need it for multiple X11 copy
> calls.

Not just copy calls, we have to put a barrier before any operation using
the dest as source because *any* rendering occurring before the copy
would need to be correctly synchronized for this to work.

-- 
keith.packard at intel.com