[PATCH 02/13] glamor: Add glamor_program based copy acceleration
Markus Wick
markus at selfnet.de
Tue May 13 10:09:15 PDT 2014
Am 2014-05-13 17:34, schrieb Keith Packard:
> Sure, if glsl had a 'round' function I'd use it in a second :-)
It was added in glsl130. As you use uvec which was also added in
glsl130, it's fine.
>> I'm a bit worried about this loop order. We don't change the programm
>> at
>> all, so glamor_use_program will be almost a no-op. But we change the
>> FBO
>> in the inner loop. Large textures are quite uncommon, but this sounds
>> like a waste of time for me.
> I'm not sure what you'd hope to save; we need to bind the source
> texture
> and set some of the uniforms for each source FBO. The only work
> glamor_use_program does other than that is to call glUseProgram, which
> we're going to assume short-circuits setting the program to the
> current
> value.
I hope to save some framebuffer switching. As framebuffer switches needs
much more validating than texture binding or uniform updates, it should
be moved to the outer loop.
>> So there is still no util function for this scissor box handling?
> Do you want something that takes x/y/w/h?
I'm more thinking about a box loop.
>> iirc we always have an element buffer bound for quads, so we could use
>> glDrawElements instead of this for loop.
> Except that GLES doesn't have quads...
So that's what the element buffer is for. Just emit 6 vertices as
triangles per quad and you'll get your quads :)
0 1 2 0 2 3 4 5 6 4 6 7 ...
>> What would be the "last second"? I guess the first copy is always fine
>> and the second one will fail. So we always have to stall and to
>> readback
>> a texture. imo it doesn't matter that much to make a gpu based copy
>> first. In the end, it's likely faster for FB to copy without
>> overlapping.
> I'm not sure I understand your comment here; the 'last second' comment
> I
> stuck in there related to a copy under a plane mask, which will never
> work, and so we want to not create the temporary surface only to
> discard
> it and do an fb fallback for the planemask.
I wanted to say that we don't have to discard the temp copy directly. We
can still copy by fb from there. Maybe this has some advantages, but I
doubt.
>> I think a comment is required here why we have to call
>> glTextureBarrierNV at all. eg what happens when we get two copy calls
>> in
>> a row, both doesn't overlap, but they overlap each other. So the
>> second
>> call might want to read the result of the first copy which isn't
>> allowed
>> without this barrier.
>
> Eric provided an updated comment for this; does this have enough
> detail?
>
> /**
> * Returns TRUE if the copy has to be implemented with
> * glamor_copy_fbo_fbo_temp() instead of glamor_copy_fbo_fbo().
> *
> * If the src and dst are in the same pixmap, then
> glamor_copy_fbo_fbo()'s
> * sampling would give undefined results (since the same texture would
> be
> * bound as an FBO destination and as a texture source). However, if
> we
> * have GL_NV_texture_barrier, we can take advantage of the exception
> it
> * added:
> *
> * "- If a texel has been written, then in order to safely read the
> result
> * a texel fetch must be in a subsequent Draw separated by the
> command
> *
> * void TextureBarrierNV(void);
> *
> * TextureBarrierNV() will guarantee that writes have completed and
> caches
> * have been invalidated before subsequent Draws are executed."
> */
This commend doesn't describe why we have to call glTextureBarrierNV
without overlapping copys at all. We only need it for multiple X11 copy
calls.
More information about the xorg-devel
mailing list