Plumbing explicit synchronization through the Linux ecosystem
Daniel Vetter
daniel at ffwll.ch
Thu Mar 19 10:42:28 UTC 2020
On Tue, Mar 17, 2020 at 11:27:28AM -0500, Jason Ekstrand wrote:
> On Tue, Mar 17, 2020 at 10:33 AM Nicolas Dufresne <nicolas at ndufresne.ca> wrote:
> >
> > Le lundi 16 mars 2020 à 23:15 +0200, Laurent Pinchart a écrit :
> > > Hi Jason,
> > >
> > > On Mon, Mar 16, 2020 at 10:06:07AM -0500, Jason Ekstrand wrote:
> > > > On Mon, Mar 16, 2020 at 5:20 AM Laurent Pinchart wrote:
> > > > > Another issue is that V4L2 doesn't offer any guarantee on job ordering.
> > > > > When you queue multiple buffers for camera capture for instance, you
> > > > > don't know until capture completes in which buffer the frame has been
> > > > > captured.
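For context, the current capture-side flow looks roughly like the minimal
userspace sketch below (MMAP streaming assumed, error handling trimmed):
several buffers are queued up front, and only the blocking VIDIOC_DQBUF
tells userspace which buffer index a frame actually landed in.

#include <string.h>
#include <sys/ioctl.h>
#include <linux/videodev2.h>

/* Blocks until some capture completes; only then do we learn which
 * buffer (buf.index) the frame was written into. */
static int dequeue_captured_buffer(int video_fd)
{
        struct v4l2_buffer buf;

        memset(&buf, 0, sizeof(buf));
        buf.type = V4L2_BUF_TYPE_VIDEO_CAPTURE;
        buf.memory = V4L2_MEMORY_MMAP;

        if (ioctl(video_fd, VIDIOC_DQBUF, &buf) < 0)
                return -1;

        return buf.index;
}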
> > > >
> > > > Is this a Kernel UAPI issue? Surely the kernel driver knows at the
> > > > start of frame capture which buffer it's getting written into. I
> > > > would think that the kernel APIs could be adjusted (if we find good
> > > > reason to do so!) such that they return earlier and return a (buffer,
> > > > fence) pair. Am I missing something fundamental about video here?
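To illustrate the idea (purely a sketch, no such V4L2 interface exists
today): if queueing handed back a sync_file fd alongside the buffer,
userspace could wait for that specific capture with a plain poll() on the
fence fd.

#include <poll.h>

/* fence_fd is assumed to come from a hypothetical queue ioctl returning a
 * (buffer, sync_file) pair; a sync_file fd polls readable once the
 * underlying fence has signalled. */
static int wait_for_capture(int fence_fd)
{
        struct pollfd pfd = { .fd = fence_fd, .events = POLLIN };

        return poll(&pfd, 1, -1) == 1 ? 0 : -1;
}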
> > >
> > > For cameras I believe we could do that, yes. I was pointing out the
> > > issues caused by the current API. For video decoders I'll let Nicolas
> > > answer the question, he's way more knowledgeable than I am on that
> > > topic.
> >
> > Right now, there is simply no uAPI for supporting asynchronous error
> > reporting when fences are involved. That is true for both cameras and
> > CODECs. It's likely what all the previous attempts were missing; I don't
> > know enough myself to suggest something.
> >
> > Now, why stateless video decoders are special is another subject. In
> > CODECs, the decoding and the presentation order may differ. For
> > stateless CODECs, a bitstream is passed to the HW. We don't know
> > whether this bitstream is fully valid, since it is being parsed and
> > validated by the firmware. It's also the firmware's job to decide which
> > buffer should be presented first.
> >
> > In most firmware interfaces, that information is communicated back all
> > at once when the frame is ready to be presented (which may be quite
> > some time after it was decoded). So indeed, a fence model is not really
> > easy to add, unless the firmware was designed with that model in mind.
>
> Just to be clear, I think we should do whatever makes sense here and
> not try to slam sync_file in when it doesn't make sense just because
> we have it. The more I read on this thread, the less out-fences from
> video decode sound like they make sense unless we have a really solid
> plan for async error reporting. It's possible, depending on how many
> processes are involved in the pipeline, that async error reporting
> could help reduce latency a bit if it let the kernel report the error
> directly to the last process in the chain. However, I'm not convinced
> the potential for userspace programmer error is worth it. That said,
> I'm happy to leave that up to the actual video experts. (I just do 3D)
dma_fence has an error state which you can set when things go south. The
fence still completes (to guarantee forward progress).
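For a driver that's just a one-liner in the completion path, roughly like
the sketch below (my_job and the hw_status plumbing are made up for
illustration):

#include <linux/dma-fence.h>

struct my_job {
        struct dma_fence *out_fence;    /* fence exposed to consumers */
};

/* hw_status is 0 on success or a negative errno from the hw/firmware. */
static void my_job_complete(struct my_job *job, int hw_status)
{
        /* Record the error before signalling ... */
        if (hw_status)
                dma_fence_set_error(job->out_fence, hw_status);

        /* ... but signal either way, so waiters keep making progress. */
        dma_fence_signal(job->out_fence);
        dma_fence_put(job->out_fence);
}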
Currently that error code isn't really propagated anywhere (well, i915 iirc
does something like that since it tracks the dependencies internally in the
scheduler). Definitely not at the dma_fence level, since we don't track
the dependency graph there at all. We might want to add that; it would at
least be possible.
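A driver or scheduler that does track its dependencies could forward the
error by hand, something along these lines (sketch only, not existing core
code):

#include <linux/dma-fence.h>

/* Before running a job, check its dependency fences; if one of them
 * completed with an error, carry that error over to the job's own
 * out-fence instead of executing the job. */
static bool job_deps_failed(struct dma_fence **deps, unsigned int n,
                            struct dma_fence *out_fence)
{
        unsigned int i;

        for (i = 0; i < n; i++) {
                int status = dma_fence_get_status(deps[i]);

                if (status < 0) {
                        dma_fence_set_error(out_fence, status);
                        return true;
                }
        }

        return false;
}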
If we track the cascading dma_fence error state in the kernel, I do think
this could work. I'm still not sure whether it's actually a good/useful
idea, though.
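On the userspace side the error should already be visible once the fence
reaches a sync_file: SYNC_IOC_FILE_INFO reports a negative status for a
fence that signalled with an error. Rough sketch:

#include <string.h>
#include <sys/ioctl.h>
#include <linux/sync_file.h>

/* Returns 0 if the fence signalled cleanly (or is still pending), the
 * fence's negative error code if it signalled with an error, or -1 if
 * the ioctl itself failed. */
static int sync_file_error(int fence_fd)
{
        struct sync_file_info info;

        memset(&info, 0, sizeof(info));
        if (ioctl(fence_fd, SYNC_IOC_FILE_INFO, &info) < 0)
                return -1;

        return info.status < 0 ? info.status : 0;
}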
-Daniel
--
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch