[PATCH xserver 1/2] modesetting: Fix reverse prime partial update issues on secondary GPU outputs

Fri Sep 16 10:58:38 UTC 2016

Hans de Goede <hdegoede at redhat.com> writes:

> Hi,
>
> On 16-09-16 09:58, Michel Dänzer wrote:
>> On 16/09/16 04:18 PM, Hans de Goede wrote:
>>> On 16-09-16 04:00, Michel Dänzer wrote:
>>>> On 16/09/16 06:50 AM, Eric Anholt wrote:
>>>>> Hans de Goede <hdegoede at redhat.com> writes:
>>>>>
>>>>>> When using reverse prime we do 2 copies, 1 from the primary GPU's
>>>>>> framebuffer to a shared pixmap and 1 from the shared pixmap to the
>>>>>> secondary GPU's framebuffer.
>>>>>>
>>>>>> This means that on the primary GPU side the copy MUST be finished,
>>>>>> before we start the second copy (before the secondary GPU's driver
>>>>>> starts processing the damage on the shared pixmap).
>>>>>>
>>>>>> This fixes secondary outputs sometimes showing (some) old fb contents,
>>>>>> because of the 2 copies racing with each other. For an example of
>>>>>> what this looks like see:
>>>>>
>>>>> Is this working around the fact that the primary and secondary aren't
>>>>> cooperating on dmabuf fencing?  Should they be doing that instead?
>>>>>
>>>>> Or would glamor_flush be sufficient?
>>>>
>>>> Yes, glamor_flush is sufficient if the kernel drivers handle fences
>>>> correctly.
>>>
>>> I will admit that I'm not familiar with all the intricacies involved here,
>>> but I do not see how glamor_flush would be sufficient.
>>>
>>> We must guarantee that the first copy is complete before the second
>>> copy is started. I think that, taking fencing into account, this
>>> turns into: we must make sure the first copy has started, because once
>>> it has started the GPU doing the first copy owns the buffer until
>>> the copy is completed.
>>>
>>> But AFAIK flush does not guarantee that the copy has started, only
>>> that it will start real soon now.
>>
>> Section 2.3.2 of the OpenGL 4.3 specification says:
>>
>>  Coarse control over command queues is available using the command
>>
>> 	void Flush( void );
>>
>>  which causes all previously issued GL commands to complete in finite
>>  time (although such commands may still be executing when Flush
>>  returns).
>>
>> Which to me suggests that the GPU commands should have started executing
>> when glFlush returns. AFAIK that's the case with all Mesa drivers at least.
>
> Ok, so I just tried switching to flush() (the proof is in the pudding). That
> does seem to fix the race between the 2 copies, but it does not ensure that
> the copies happen in the right order! So sometimes I end up with old contents
> on the secondary GPU output.

Yeah, it may be with the driver multithreading these days that our old
assumptions are no longer valid.  One could imagine using
EGL_KHR_fence_sync to move a proper sync between the two screens, but I
don't want to block this fix on using that.

So, on further thought, this patch is:

Reviewed-by: Eric Anholt <eric at anholt.net>
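
For illustration only, the ordering problem discussed above can be sketched as a toy C model. This is not xserver, glamor, or Mesa code: every name below (gpu_queue, queue_copy, queue_flush, queue_finish, present_frame) is hypothetical, buffers are modeled as plain frame numbers, and the flush is assumed to behave in the worst case, where the queued copy has not run by the time the secondary GPU reads the shared pixmap.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model only: none of these names exist in xserver, glamor, or Mesa. */

enum { MAX_CMDS = 8 };

/* A queued "copy src into dst" command; a buffer is just a frame number. */
struct copy_cmd {
    const int *src;
    int *dst;
};

struct gpu_queue {
    struct copy_cmd cmds[MAX_CMDS];
    size_t count;    /* commands recorded but not yet executed */
    bool submitted;  /* set by flush: work will run "in finite time" */
};

/* Record a copy; nothing is executed yet. */
static void queue_copy(struct gpu_queue *q, const int *src, int *dst)
{
    assert(q->count < MAX_CMDS);
    q->cmds[q->count].src = src;
    q->cmds[q->count].dst = dst;
    q->count++;
}

/* Model of a flush: hands work to the GPU, returns before it has run. */
static void queue_flush(struct gpu_queue *q)
{
    q->submitted = true;
}

/* Model of a finish (or of waiting on a fence): returns only after
 * every queued copy has executed. */
static void queue_finish(struct gpu_queue *q)
{
    for (size_t i = 0; i < q->count; i++)
        *q->cmds[i].dst = *q->cmds[i].src;
    q->count = 0;
    q->submitted = false;
}

/* Run one reverse-prime update and return the frame number that the
 * secondary output ends up showing. */
int present_frame(int new_frame, bool wait_for_first_copy)
{
    struct gpu_queue primary = {0};
    int primary_fb = new_frame;
    int shared_pixmap = new_frame - 1;  /* still holds the previous frame */
    int secondary_fb = new_frame - 1;

    /* Copy 1: primary framebuffer -> shared pixmap (primary GPU). */
    queue_copy(&primary, &primary_fb, &shared_pixmap);
    if (wait_for_first_copy)
        queue_finish(&primary);
    else
        queue_flush(&primary);

    /* Copy 2: shared pixmap -> secondary framebuffer (secondary GPU),
     * modeled here as happening immediately. */
    secondary_fb = shared_pixmap;

    /* The primary GPU catches up eventually, too late for copy 2. */
    queue_finish(&primary);
    return secondary_fb;
}
```

In this model present_frame(5, true) leaves frame 5 on the secondary output, while present_frame(5, false) leaves the stale frame 4 there. A fence shared between the two screens, such as the EGL_KHR_fence_sync idea mentioned above, would play the role of queue_finish here, with the difference that the secondary side would wait instead of the primary side.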