Try to address the drm_debugfs issues

Daniel Vetter daniel at ffwll.ch
Thu Feb 16 19:57:55 UTC 2023


On Thu, Feb 16, 2023 at 05:31:50PM +0100, Christian König wrote:
> 
> 
> Am 16.02.23 um 12:34 schrieb Daniel Vetter:
> > On Thu, Feb 09, 2023 at 03:06:10PM +0100, Christian König wrote:
> > > Am 09.02.23 um 14:06 schrieb Maíra Canal:
> > > > On 2/9/23 09:13, Christian König wrote:
> > > > > Am 09.02.23 um 12:23 schrieb Maíra Canal:
> > > > > > On 2/9/23 05:18, Christian König wrote:
> > > > > > > Hello everyone,
> > > > > > > 
> > > > > > > the drm_debugfs has a couple of well known design problems.
> > > > > > > 
> > > > > > > Especially it wasn't possible to add files between
> > > > > > > initializing and registering
> > > > > > > of DRM devices since the underlying debugfs directory wasn't
> > > > > > > created yet.
> > > > > > > 
> > > > > > > The resulting necessity of the driver->debugfs_init()
> > > > > > > callback function is a
> > > > > > > mid-layering which is really frowned on since it creates a horrible
> > > > > > > driver->DRM->driver design layering.
> > > > > > > 
> > > > > > > The recent patch "drm/debugfs: create device-centered
> > > > > > > debugfs functions" tried
> > > > > > > to address those problem, but doesn't seem to work
> > > > > > > correctly. This looks like
> > > > > > > a misunderstanding of the call flow around
> > > > > > > drm_debugfs_init(), which is called
> > > > > > > multiple times, once for the primary and once for the render node.
> > > > > > > 
> > > > > > > So what happens now is the following:
> > > > > > > 
> > > > > > > 1. drm_dev_init() initially allocates the drm_minor objects.
> > > > > > > 2. ... back to the driver ...
> > > > > > > 3. drm_dev_register() is called.
> > > > > > > 
> > > > > > > 4. drm_debugfs_init() is called for the primary node.
> > > > > > > 5. drm_framebuffer_debugfs_init(), drm_client_debugfs_init() and
> > > > > > >      drm_atomic_debugfs_init() call drm_debugfs_add_file(s)()
> > > > > > > to add the files
> > > > > > >      for the primary node.
> > > > > > > 6. The driver->debugfs_init() callback is called to add
> > > > > > > debugfs files for the
> > > > > > >      primary node.
> > > > > > > 7. The added files are consumed and added to the primary
> > > > > > > node debugfs directory.
> > > > > > > 
> > > > > > > 8. drm_debugfs_init() is called for the render node.
> > > > > > > 9. drm_framebuffer_debugfs_init(), drm_client_debugfs_init() and
> > > > > > >      drm_atomic_debugfs_init() call drm_debugfs_add_file(s)()
> > > > > > > to add the files
> > > > > > >      again for the render node.
> > > > > > > 10. The driver->debugfs_init() callback is called to add
> > > > > > > debugfs files for the
> > > > > > >       render node.
> > > > > > > 11. The added files are consumed and added to the render
> > > > > > > node debugfs directory.
> > > > > > > 
> > > > > > > 12. Some more files are added through drm_debugfs_add_file().
> > > > > > > 13. drm_debugfs_late_register() add the files once more to
> > > > > > > the primary node
> > > > > > >       debugfs directory.
> > > > > > > 14. From this point on files added through
> > > > > > > drm_debugfs_add_file() are simply ignored.
> > > > > > > 15. ... back to the driver ...
> > > > > > > 
> > > > > > > Because of this the dev->debugfs_mutex lock is also
> > > > > > > completely pointless since
> > > > > > > any concurrent use of the interface would just randomly
> > > > > > > either add the files to
> > > > > > > the primary or render node or just not at all.
> > > > > > > 
> > > > > > > Even worse is that this implementation nails the coffin for
> > > > > > > removing the
> > > > > > > driver->debugfs_init() mid-layering because otherwise
> > > > > > > drivers can't control
> > > > > > > where their debugfs (primary/render node) are actually added.
> > > > > > > 
> > > > > > > This patch set here now tries to clean this up a bit, but
> > > > > > > most likely isn't
> > > > > > > fully complete either since I didn't audit every driver/call path.
> > > > > > I tested the patchset on the v3d, vc4 and vkms and all the files
> > > > > > are generated
> > > > > > as expected, but I'm getting the following errors on dmesg:
> > > > > > 
> > > > > > [    3.872026] debugfs: File 'v3d_ident' in directory '0'
> > > > > > already present!
> > > > > > [    3.872064] debugfs: File 'v3d_ident' in directory '128'
> > > > > > already present!
> > > > > > [    3.872078] debugfs: File 'v3d_regs' in directory '0' already
> > > > > > present!
> > > > > > [    3.872087] debugfs: File 'v3d_regs' in directory '128'
> > > > > > already present!
> > > > > > [    3.872097] debugfs: File 'measure_clock' in directory '0'
> > > > > > already present!
> > > > > > [    3.872105] debugfs: File 'measure_clock' in directory '128'
> > > > > > already present!
> > > > > > [    3.872116] debugfs: File 'bo_stats' in directory '0' already
> > > > > > present!
> > > > > > [    3.872124] debugfs: File 'bo_stats' in directory '128'
> > > > > > already present!
> > > > > > 
> > > > > > It looks like the render node is being added twice, since this
> > > > > > doesn't happen
> > > > > > for vc4 and vkms.
> > > > > Thanks for the feedback and yes that's exactly what I meant with
> > > > > that I haven't looked into all code paths.
> > > > > 
> > > > > Could it be that v3d registers it's debugfs files from the
> > > > > debugfs_init callback?
> > > > Although this is true, I'm not sure if this is the reason why the files
> > > > are
> > > > being registered twice, as this doesn't happen to vc4, and it also uses
> > > > the
> > > > debugfs_init callback. I believe it is somewhat related to the fact that
> > > > v3d is the primary node and the render node.
> > > I see. Thanks for the hint.
> > > 
> > > > Best Regards,
> > > > - Maíra Canal
> > > > 
> > > > > One alternative would be to just completely nuke support for
> > > > > separate render node debugfs files and only add a symlink to the
> > > > > primary node. Opinions?
> > > What do you think of this approach? I can't come up with any reason why we
> > > should have separate debugfs files for render nodes and I think it is pretty
> > > much the same reason you came up with the patch for per device debugfs files
> > > instead of per minor.
> > Yeah I think best is to symlink around a bit for compat. I thought we
> > where doing that already, and you can't actually create debugfs files on
> > render nodes? Or did I only dream about this?
> 
> No, we still have that distinction around unfortunately.
> 
> That's why this went boom for me in the first place.

I guess time to land that? Or should we do this as part of the conversion
and just change the new add_file helpers to only instantiate on the
primary node until all the old users are gone?
-Daniel

> 
> Christian.
> 
> > -Daniel
> > 
> > > Regards,
> > > Christian.
> > > 
> > > > > Regards,
> > > > > Christian.
> > > > > 
> > > > > > Otherwise, the patchset looks good to me, but maybe Daniel has
> > > > > > some other
> > > > > > thoughts about it.
> > > > > > 
> > > > > > Best Regards,
> > > > > > - Maíra Canal
> > > > > > 
> > > > > > > Please comment/discuss.
> > > > > > > 
> > > > > > > Cheers,
> > > > > > > Christian.
> > > > > > > 
> > > > > > > 
> 

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


More information about the dri-devel mailing list