Plumbing explicit synchronization through the Linux ecosystem
Michel Dänzer
michel at daenzer.net
Tue Mar 17 10:01:57 UTC 2020

On 2020-03-16 7:33 p.m., Marek Olšák wrote:
> On Mon, Mar 16, 2020 at 5:57 AM Michel Dänzer <michel at daenzer.net> wrote:
>> On 2020-03-16 4:50 a.m., Marek Olšák wrote:
>>> The synchronization works because the Mesa driver waits for idle (drains
>>> the GFX pipeline) at the end of command buffers and there is only 1
>>> graphics queue, so everything is ordered.
>>>
>>> The GFX pipeline runs asynchronously to the command buffer, meaning the
>>> command buffer only starts draws and doesn't wait for completion. If the
>>> Mesa driver didn't wait at the end of the command buffer, the command
>>> buffer would finish and a different process could start execution of its
>>> own command buffer while shaders of the previous process are still
>>> running.
>>>
>>> If the Mesa driver submits a command buffer internally (because it's
>>> full), it doesn't wait, so the GFX pipeline doesn't notice that a command
>>> buffer ended and a new one started.
>>>
>>> The waiting at the end of command buffers happens only when the flush is
>>> external (SwapBuffers, glFlush).
>>>
>>> It's a performance problem, because the GFX queue is blocked until the
>>> GFX pipeline is drained at the end of every frame at least.
>>>
>>> So explicit fences for SwapBuffers would help.
>>
>> Not sure what difference it would make, since the same thing needs to be
>> done for explicit fences as well, doesn't it?
>
> No. Explicit fences don't require userspace to wait for idle in the command
> buffer. Fences are signalled when the last draw is complete and caches are
> flushed. Before that happens, any command buffer that is not dependent on
> the fence can start execution. There is never a need for the GPU to be idle
> if there is enough independent work to do.

I don't think explicit fences in the context of this discussion imply
using that different fence signalling mechanism though. My understanding
is that the API proposed by Jason allows implicit fences to be used as
explicit ones and vice versa, so presumably they have to use the same
signalling mechanism.

Anyway, maybe the different fence signalling mechanism you describe
could be used by the amdgpu kernel driver in general, then Mesa could
drop the waits for idle and get the benefits with implicit sync as well?
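
To make the difference between the two signalling schemes concrete, here
is a toy timing model in Python. It is only a sketch and not amdgpu or
Mesa code: CommandBuffer, issue_time, drain_time and both functions are
made-up names, all command buffers are assumed to be independent of each
other, and overlapping shader work is assumed not to slow anything down,
which a real GPU does not guarantee.

```python
from dataclasses import dataclass


@dataclass
class CommandBuffer:
    # Hypothetical model of one submission to the single GFX queue.
    name: str
    issue_time: int  # time the queue spends launching the draws
    drain_time: int  # extra time until the launched shaders have finished


def run_with_wait_for_idle(buffers):
    """Each buffer ends with a wait for idle, so the queue stays blocked
    while the pipeline drains. Returns the completion time per buffer."""
    now = 0
    done = {}
    for cb in buffers:
        now += cb.issue_time + cb.drain_time
        done[cb.name] = now
    return done


def run_with_fences(buffers):
    """The queue moves on as soon as the draws are issued. A per-buffer
    fence signals once that buffer's shaders have finished. Assumes no
    buffer depends on another buffer's fence."""
    queue_free = 0
    signalled = {}
    for cb in buffers:
        queue_free += cb.issue_time
        signalled[cb.name] = queue_free + cb.drain_time
    return signalled


frames = [CommandBuffer("A", 2, 3), CommandBuffer("B", 2, 3), CommandBuffer("C", 2, 3)]
print(run_with_wait_for_idle(frames))  # {'A': 5, 'B': 10, 'C': 15}
print(run_with_fences(frames))         # {'A': 5, 'B': 7, 'C': 9}
```

In this model the gain comes only from no longer blocking the queue
during drain_time. Nothing in it depends on whether userspace sees the
fence as implicit or explicit.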
--
Earthling Michel Dänzer | https://redhat.com
Libre software enthusiast | Mesa and X developer

More information about the xorg-devel mailing list