DMA ring test failure on Radeon HD 8970M

Joe Julian joe at julianfamily.org
Tue Dec 10 13:26:09 PST 2013


On 12/10/2013 06:23 AM, Alex Deucher wrote:
> On Tue, Dec 10, 2013 at 1:16 AM, Joe Julian <joe at julianfamily.org> wrote:
>> On Wed, 2013-12-04 at 12:08 -0500, Alex Deucher wrote:
>>> On Wed, Dec 4, 2013 at 12:41 AM, Joe Julian <joe at julianfamily.org> wrote:
>>>> On Tue, 2013-12-03 at 09:28 -0500, Alex Deucher wrote:
>>>>> On Tue, Dec 3, 2013 at 8:20 AM, Joe Julian <joe at julianfamily.org> wrote:
>>>>>>
>>>>>> Alex Deucher <alexdeucher at gmail.com> wrote:
>>>>>>> On Fri, Nov 29, 2013 at 7:30 PM, Joe Julian <joe at julianfamily.org> wrote:
>>>>>>>> This MSI laptop has two crossfire connected video processors in it,
>>>>>>>> 00:01.0 has a HD 8650G that seems to initialize properly, and at 01:00.0
>>>>>>>> an HD 8970M that fails the ring 3 test, "radeon: ring 3 test failed
>>>>>>>> (0xDFCFFBFF)".
>>>>>>>
>>>>>>> It looks like there's a problem with the rom for the dGPU:
>>>>>>>
>>>>>>> [   61.008250] ACPI Error: Field [TEMP] at 524288 exceeds Buffer [TVGA] size 512000 (bits) (20130927/dsopcode-236)
>>>>>>> [   61.008749] ACPI Error: Method parse/execution failed [\_SB_.PCI0.VGA_.ATRM] (Node ffff880233ad1e30), AE_AML_BUFFER_LIMIT (20130927/psparse-536)
>>>>>>> [   61.009991] failed to evaluate ATRM got AE_AML_BUFFER_LIMIT
>>>>>>> [   61.010204] ATOM BIOS: MSI
>>>>>>> [   61.010270] [drm] GPU not posted. posting now...
>>>>>>> [   61.018737] radeon 0000:01:00.0: limiting VRAM
>>>>>>> [   61.018765] radeon 0000:01:00.0: VRAM: 1047552M 0x0000000000000000 - 0x000000FFBFFFFFFF (1047552M used)
>>>>>>> [   61.018814] radeon 0000:01:00.0: GTT: 1024M 0x000000FFC0000000 - 0x000000FFFFFFFFFF
>>>>>>> [   61.018853] [drm] Detected VRAM RAM=1047552M, BAR=256M
>>>>>>>
>>>>>>> 1047552M of vram is obviously wrong.  How much vram is supposed to be
>>>>>>> on the card?
>>>>>> According to Windows, 2GB:
>>>>>> Name    AMD Radeon(TM) HD8970M
>>>>>> PNP Device ID    PCI\VEN_1002&DEV_6801&SUBSYS_10F11462&REV_00\4&99EBB28&0&0018
>>>>>> Adapter Type    AMD Radeon Graphics Processor (0x6801), Advanced Micro Devices, Inc. compatible
>>>>>> Adapter Description    AMD Radeon(TM) HD8970M
>>>>>> Adapter RAM    (2,147,483,648) bytes
>>>>>> Installed Drivers    aticfx64.dll,aticfx64.dll,aticfx64.dll,aticfx32,aticfx32,aticfx32,atiumd64.dll,atidxx64.dll,atidxx64.dll,atiumdag,atidxx32,atidxx32,atiumdva,atiumd6a.cap,atitmm64.dll
>>>>>> Driver Version    13.200.11.0
>>>>>> INF File    oem17.inf (ati2mtag_R576B section)
>>>>>> Color Planes    Not Available
>>>>>> Color Table Entries    4294967296
>>>>>> Resolution    1920 x 1080 x 60 hertz
>>>>>> Bits/Pixel    32
>>>>>> Memory Address    0xD0000000-0xDFFFFFFF
>>>>>> Memory Address    0xFEAC0000-0xFEAFFFFF
>>>>>> I/O Port    0x0000EF00-0x0000EFFF
>>>>>> IRQ Channel    IRQ 4294967283
>>>>>>> Can you send me the output from this patch?
>>>>>>>
>>>>>>> diff --git a/drivers/gpu/drm/radeon/si.c b/drivers/gpu/drm/radeon/si.c
>>>>>>> index 6a64cca..84a7e26 100644
>>>>>>> --- a/drivers/gpu/drm/radeon/si.c
>>>>>>> +++ b/drivers/gpu/drm/radeon/si.c
>>>>>>> @@ -3884,6 +3884,7 @@ static int si_mc_init(struct radeon_device *rdev)
>>>>>>>         /* size in MB on si */
>>>>>>>         rdev->mc.mc_vram_size = RREG32(CONFIG_MEMSIZE) * 1024ULL * 1024ULL;
>>>>>>>         rdev->mc.real_vram_size = RREG32(CONFIG_MEMSIZE) * 1024ULL * 1024ULL;
>>>>>>> +       DRM_INFO("CONFIG_MEMSIZE: 0x%08x\n", RREG32(CONFIG_MEMSIZE));
>>>>>>>         rdev->mc.visible_vram_size = rdev->mc.aper_size;
>>>>>>>         si_vram_gtt_location(rdev, &rdev->mc);
>>>>>>>         radeon_update_bandwidth_info(rdev);
>>>>>>>
>>>>>>>
>>>>>> CONFIG_MEMSIZE=0X03800800
>>>>> Does this patch fix the issues?
>>>>>
>>>>> Alex
>>>>>
>>>>>
>>>>>>>> Throwing in some writel and readl tests before even trying the dma test,
>>>>>>>> I see that the memory isn't being changed with writel in the first
>>>>>>>> place.
>>>>>>>>
>>>>>>>> ----
>>>>>>>>          tmp = 0xDEADBEEF;
>>>>>>>>          writel(tmp, ptr);
>>>>>>>>          tmp = readl(ptr);
>>>>>>>>          if (tmp != 0xDEADBEEF)
>>>>>>>>                  DRM_ERROR("radeon: ring %d memory write failed (0x%08X)\n", ring->idx, tmp);
>>>>>>>> ----
>>>>>>>>
>>>>>>>> radeon: ring 3 memory write failed (0xDFCFFBFF)
>>>>>>>>
>>>>>>>> Looks to me like we're trying to write to a rom address, but I'm a
>>>>>>>> complete novice at this so I could be completely off.
>>>>>>>>
>>>>>>>>
>>>>>>>> I'm using kernel 3.13.0-0.rc1.git3
>>>>>>>>
>>>>>>>> What else could I look at?
>>>>>>>>
>>>>>>>>
>>>> No, that patch does not completely solve the problem. The memsize is
>>>> right, now, but the dma test still fails the same way.
>>>>
>>>> dmesg: http://paste.fedoraproject.org/58848/86134088
>>>>
>>> Something still seems problematic with the rom.  Does this help?
>>>
>>> diff --git a/drivers/gpu/drm/radeon/radeon_bios.c b/drivers/gpu/drm/radeon/radeon_bios.c
>>> index b3633d9..b2983f2 100644
>>> --- a/drivers/gpu/drm/radeon/radeon_bios.c
>>> +++ b/drivers/gpu/drm/radeon/radeon_bios.c
>>> @@ -173,7 +173,7 @@ static int radeon_atrm_call(acpi_handle atrm_handle, uint8_t *bios,
>>>   static bool radeon_atrm_get_bios(struct radeon_device *rdev)
>>>   {
>>>          int ret;
>>> -       int size = 256 * 1024;
>>> +       int size = 128 * 1024;
>>>          int i;
>>>          struct pci_dev *pdev = NULL;
>>>          acpi_handle dhandle, atrm_handle;
>>>
>>> You might also try 64 rather than 128 if 128 doesn't work.
>>>
>>> Alex
>> Those had no effect.
>>
>> I hacked the dsdt to fix the ATRM error. I don't know if this is
>> something that can be worked around or if I'll have to have MSI fix
>> their bios, but if it can, here's a successful init once the bios can be
>> read:
> How did you fix the dsdt?  What did you have to change?
>
>> http://paste.fedoraproject.org/60388/86655915
>>
In the following I first tried changing VROM to 512000, as in every
other version of this table I could find, but the field still
overflowed. I then changed the TVGA buffer to 0x10000 bytes (524288
bits, big enough to hold all of VROM), which cleared that error.
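
The sizes here are easy to mix up because the Buffer() size is given in
bytes while the field widths and the ACPI error message are in bits. A
minimal C sketch of that arithmetic follows. The helper names are made
up for illustration and are not part of any kernel or ACPICA API.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: convert a Buffer() size in bytes to bits. */
static uint64_t bytes_to_bits(uint64_t bytes)
{
    assert(bytes <= UINT64_MAX / 8); /* guard against overflow */
    return bytes * 8;
}

/* Hypothetical helper: does a field of field_bits fit in buf_bytes? */
static bool field_fits(uint64_t field_bits, uint64_t buf_bytes)
{
    return field_bits <= bytes_to_bits(buf_bytes);
}
```

With these, bytes_to_bits(0xFA00) is 512000 and bytes_to_bits(0x10000)
is 524288, so field_fits(524288, 0xFA00) is false and
field_fits(524288, 0x10000) is true. That matches the "exceeds Buffer
[TVGA] size 512000 (bits)" message in the log.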

I'm still confused about how we reach that code path: Arg0 is 0xf000
and Arg1 is 0x1000, which add up to 0x10000, and that is not <= 32.
Everything I've read says integers should be 64-bit, but the value
seems to be compared as a 16-bit word.

      Field (REVD, AnyAcc, NoLock, Preserve)
         {
             SROM,   32,
             VROM,   524288
         }

         Name (TVGA, Buffer (0xFA00)
         {
              0x00
         })
         Method (ATRM, 2, Serialized)
         {
             Add (Arg0, Arg1, Local0)
             If (LLessEqual (Local0, SROM))
             {
                 Multiply (Arg1, 0x08, Local1)
                 Multiply (Arg0, 0x08, Local2)
                 Store (VROM, TVGA)
                 CreateField (TVGA, Local2, Local1, TEMP)
                 Name (RETB, Buffer (Arg1) {})
                 Store (TEMP, RETB)
                 Return (RETB)
             }
             Else
             {
                 If (LLess (Arg0, SROM))
                 {
                     Subtract (SROM, Arg0, Local3)
                     Multiply (Local3, 0x08, Local1)
                     Multiply (Arg0, 0x08, Local2)
                     Store (VROM, TVGA)
                     CreateField (TVGA, Local2, Local1, TEM)
                     Name (RETC, Buffer (Local3) {})
                     Store (TEM, RETC)
                     Return (RETC)
                 }
                 Else
                 {
                     Name (RETD, Buffer (One) {})
                     Return (RETD)
                 }
             }
         }
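
The control flow of ATRM can be modelled in a few lines of C, which may
help when checking which branch a given call takes. This is only a
sketch under one assumption: that "SROM, 32" declares a 32-bit-wide
field (just as "VROM, 524288" declares a 524288-bit one), so the
comparison is against whatever value that field holds rather than the
literal 32. The real SROM value is not known here, so the caller passes
a guess. All names below are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Which branch of ATRM would run. */
enum atrm_branch { ATRM_FULL, ATRM_PARTIAL, ATRM_EMPTY };

struct atrm_result {
    enum atrm_branch branch;
    uint64_t field_start_bits; /* CreateField bit offset (Local2) */
    uint64_t field_len_bits;   /* CreateField bit length (Local1) */
    bool overflows_tvga;       /* field runs past the TVGA buffer */
};

/*
 * Model of the ATRM branches. srom is the assumed contents of the SROM
 * field, tvga_bytes is the size of the TVGA buffer in bytes.
 */
static struct atrm_result atrm_model(uint64_t arg0, uint64_t arg1,
                                     uint64_t srom, uint64_t tvga_bytes)
{
    struct atrm_result r = { ATRM_EMPTY, 0, 0, false };

    /* Keep all inputs small so the bit arithmetic cannot overflow. */
    assert(arg0 <= UINT32_MAX && arg1 <= UINT32_MAX);
    assert(srom <= UINT32_MAX && tvga_bytes <= UINT32_MAX);

    if (arg0 + arg1 <= srom) {            /* LLessEqual (Local0, SROM) */
        r.branch = ATRM_FULL;
        r.field_len_bits = arg1 * 8;      /* Multiply (Arg1, 0x08, Local1) */
    } else if (arg0 < srom) {             /* LLess (Arg0, SROM) */
        r.branch = ATRM_PARTIAL;
        r.field_len_bits = (srom - arg0) * 8;
    } else {
        return r;                         /* RETD path, no CreateField */
    }

    r.field_start_bits = arg0 * 8;        /* Multiply (Arg0, 0x08, Local2) */
    r.overflows_tvga =
        r.field_start_bits + r.field_len_bits > tvga_bytes * 8;
    return r;
}
```

For Arg0 = 0xF000 and Arg1 = 0x1000 with a guessed SROM value of
0x10000, the model takes the first branch and the field ends at bit
524288. That is past the 512000 bits of a 0xFA00-byte TVGA but within a
0x10000-byte one. If SROM really were 32, the same call would fall
through to the RETD return instead.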

