[V2] modesetting: Fix X crash in ms_dirty_update()

Alex Goins agoins at nvidia.com
Tue Aug 21 00:06:32 UTC 2018


Hi JimQu,

I'm not sure that I fully understand the question, so I'll just try to cover all
the bases, apologies if I miss what you are asking.

The output master device (typically dGPU, also known as the 'source') is that
which is driving the true X screen, whereas the output slave device (typically
iGPU, also known as the 'sink') is associated with the GPU screen, which doesn't
do any rendering, just display. 'Reverse PRIME' refers to when a dGPU is the
slave/sink, although that terminology has been confusing to me in the past.

Your original description seems to claim that for the crashing config, the AMD
dGPU is actually the output slave/sink (GPU screen), and the Intel iGPU is the
output master/source (X screen). I suppose that would be considered reverse
PRIME.

The modesetting driver is meant to store PRIME parameters that are
pixmap-specific in the associated msPixmapPrivRec. The issue here is that there
is a different copy of the pixmap for the output master and the output slave,
respectively. You'll notice in the definition of msPixmapPrivRec that some
fields are designated for the sink, and others are designated for the source.
The sink fields should only be set/get from the slave copy of the pixmap, and
the source fields should only be set/get from the master copy of the pixmap.
Currently, they are all being set/get from the slave copy. This ends up working
out fine if the modesetting driver is being used for both master and slave (such
as Nouveau+Intel), or if the modesetting driver is being used only as the slave
(such as NVIDIA+Intel or AMD+Intel non-reverse PRIME), but not if the
modesetting driver is being used as a master with a non-modesetting slave (such
as with Intel+AMD reverse PRIME, what I assume is your config).

So, the core of the fix should just be making sure that in any case where the
modesetting driver sets/gets pixmap-specific PRIME parameters that are
associated with the master/source (as specified in the comments in the
msPixmapPrivRec definition), it does so via the master copy of the pixmap, as
you've done for ms_dirty_update() already.

Thanks,
Alex

On Mon, 20 Aug 2018, jimqu wrote:

> 
> Hi AlexG,
> 
> I just has a question.
> 
> Did you suppose the output master device(dGPU) is equivalent to master screen, output slave(iGPU) device
> is equivalent to slave screen in modesetting driver under PRIME(not reverse) case?
> 
> Thanks
> 
> JimQu
> 
> 
> On 2018年08月18日 02:55, Alex Goins wrote:
> 
> I think this patch is along the right lines, but misses part of the fix.
> Although I don't have access to a modesetting<->modesetting PRIME output slaving
> setup to test this theory right now, I think this will break it. Have you tested
> that this path is exercised?
> 
> It is true that SharedPixmapNotifyDamage() is a function implemented by the
> slave that should only be called by the master, and that ppriv->notify_on_damage
> is supposed to be set on master's ppriv. The current implementation is
> incorrect, as you've found, but it still worked just fine for
> modesetting<->modesetting PRIME output slaving. Why is that?
> 
> The control flow should be as follows (and only applies to PRIME Sync):
> 
> --
> 
> 1) Slave calls master->PresentSharedPixmap(), and it fails (see
> drmmode_SharedPixmapPresent).
> 
> 2) Slave calls master->RequestSharedPixmapNotifyDamage(pSlavePix) to request for
> the master to call slave->SharedPixmapNotifyDamage(pSlavePix) when the backing
> master pixmap receives damage. If the master is the modesetting driver, the
> master sets pMasterPixPriv->notify_on_damage = TRUE to know that it needs to
> notify the slave when damage is received.
> 
> 3) When the master receives damage, if pMasterPixPriv->notify_on_damage == TRUE,
> it calls slave->SharedPixmapNotifyDamage(pSlavePix) and then sets
> pMasterPixPriv->notify_on_damage = FALSE.
> 
> --
> 
> There are currently errors in both step 2 and step 3. In step 2, the modesetting
> driver as master currently sets pSlavePixPriv->notify_on_damage = TRUE instead
> of pMasterPixPriv->notify_on_damage = TRUE.  In step 3, the modesetting driver
> as master checks pSlavePixPriv->notify_on_damage == TRUE, and if it passes,
> calls pSlave->SharedPixmapNotifyDamage(pSlavePix). This currently works on
> modesetting<->modesetting PRIME because while both are wrong, they are
> consistently wrong, and on modesetting<->modesetting PRIME setups, pSlavePixPriv
> is accessible to both master and slave. All of the attributes are being stored
> in and accessed from pSlavePixPriv, whereas the master-related ones should
> actually be stored in pMasterPixPriv.
> 
> My understanding of your setup is that you have modesetting as the master, and
> AMD as the slave. Thus, when the master attempts to access pSlavePixPriv,
> problems result due to the fact that it's accessing AMD's 'ppriv' using the
> modesetting structure definition.
> 
> Your patch fixes step 3 by getting 'ppriv' from the master pixmap
> (pMasterPixPriv as above), rather than the slave. The !screen->isGPU check is
> probably good to have, but it's probably redundant because pixmap_dirty_list
> should be empty for GPU screens. I think that is all fine.
> 
> The patch also needs to fix step 2, again by getting 'ppriv' from the master
> pixmap rather than the slave, but this time for
> msRequestSharedPixmapNotifyDamage(). Without that,
> pMasterPixPriv->notify_on_damage will always be FALSE, breaking
> modesetting<->modesetting PRIME. Moreover, it could cause more problems like the
> original bug if a non-modesetting slave were to call
> master->RequestSharedPixmapNotifyDamage(pSlavePix).
> 
> I was also confused as to why slave->SharedPixmapNotifyDamage() is being called at
> all on your setup, as 'randr: falling back to unsynchronized pixmap sharing'
> indicates that PRIME Sync is disabled which should mean that
> ppriv->notify_on_damage should always be FALSE. I think that's just because
> ppriv is currently that of the AMD slave, so it's reading some random part of
> AMD's ppriv using modesetting's structure definition.
> 
> I think if you fix msRequestSharedPixmapNotifyDamage() as described above, this
> patch should be good to go, but as it stands, it will likely introduce a regression in
> modesetting<->modesetting PRIME. Please fix that, then verify that
> modesetting<->modesetting PRIME (with PRIME Sync) still works. I can also verify
> it myself once I have access to a test setup that will work with it.
> 
> Thanks,
> Alex
> 
> On Wed, 8 Aug 2018, Alex Deucher wrote:
> 
> On Sun, Aug 5, 2018 at 10:40 PM, Jim Qu <Jim.Qu at amd.com> wrote:
> 
> On some Intel iGPU + AMD dGPU platform, when connect extern display
> from dGPU, X will crash, show the log like:
> 
> randr: falling back to unsynchronized pixmap sharing
> (EE)
> (EE) Backtrace:
> (EE) 0: /usr/lib/xorg/Xorg (xorg_backtrace+0x4e)
> (EE) 1: /usr/lib/xorg/Xorg (0x55cb0151a000+0x1b5ce9)
> (EE) 2: /lib/x86_64-linux-gnu/libpthread.so.0 (0x7f1587a1d000+0x11390)
> (EE)
> (EE) Segmentation fault at address 0x0
> (EE)
> 
> There is NULL pointer accessing on ent->slave_dst->drawable.pScreen->
> SharedPixmapNotifyDamage.
> 
> On the platform, since the dGPU is GPU device, so that the iGPU is
> output master device. SharedPixmapNotifyDamage() should be called when
> current device is output master.
> 
> Change-Id: I8fa6922a4f75b5e068970fc4d362f778052379f2
> Signed-off-by: Jim Qu <Jim.Qu at amd.com>
> 
> Reviewed-by: Alex Deucher <alexander.deucher at amd.com>
> 
> 
> ---
>  hw/xfree86/drivers/modesetting/driver.c | 22 ++++++++++++----------
>  1 file changed, 12 insertions(+), 10 deletions(-)
> 
> diff --git a/hw/xfree86/drivers/modesetting/driver.c b/hw/xfree86/drivers/modesetting/driver.c
> index 9362370..37fafb1 100644
> --- a/hw/xfree86/drivers/modesetting/driver.c
> +++ b/hw/xfree86/drivers/modesetting/driver.c
> @@ -640,19 +640,21 @@ ms_dirty_update(ScreenPtr screen, int *timeout)
>      xorg_list_for_each_entry(ent, &screen->pixmap_dirty_list, ent) {
>          region = DamageRegion(ent->damage);
>          if (RegionNotEmpty(region)) {
> -            msPixmapPrivPtr ppriv =
> -                msGetPixmapPriv(&ms->drmmode, ent->slave_dst);
> +            if (!screen->isGPU) {
> +                    msPixmapPrivPtr ppriv =
> +                        msGetPixmapPriv(&ms->drmmode, ent->slave_dst->master_pixmap);
> 
> -            if (ppriv->notify_on_damage) {
> -                ppriv->notify_on_damage = FALSE;
> +                    if (ppriv->notify_on_damage) {
> +                        ppriv->notify_on_damage = FALSE;
> 
> -                ent->slave_dst->drawable.pScreen->
> -                    SharedPixmapNotifyDamage(ent->slave_dst);
> -            }
> +                        ent->slave_dst->drawable.pScreen->
> +                            SharedPixmapNotifyDamage(ent->slave_dst);
> +                    }
> 
> -            /* Requested manual updating */
> -            if (ppriv->defer_dirty_update)
> -                continue;
> +                    /* Requested manual updating */
> +                    if (ppriv->defer_dirty_update)
> +                        continue;
> +            }
> 
>              redisplay_dirty(screen, ent, timeout);
>              DamageEmpty(ent->damage);
> --
> 2.7.4
> 
> _______________________________________________
> xorg-devel at lists.x.org: X.Org development
> Archives: http://lists.x.org/archives/xorg-devel
> Info: https://lists.x.org/mailman/listinfo/xorg-devel
> 
> _______________________________________________
> xorg-devel at lists.x.org: X.Org development
> Archives: http://lists.x.org/archives/xorg-devel
> Info: https://lists.x.org/mailman/listinfo/xorg-devel
> 
> 
> 
> 


More information about the xorg-devel mailing list