GARTSize option not documented on radeon and other problems

Zoltan Boszormenyi zboszor at dunaweb.hu
Fri May 4 01:02:16 PDT 2007


Jerome Glisse wrote:
> On 5/4/07, Zoltan Boszormenyi <zboszor at dunaweb.hu> wrote:
>   
>> Oliver McFadden wrote:
>>     
>>> On 5/3/07, Zoltan Boszormenyi <zboszor at dunaweb.hu> wrote:
>>>
>>>       
>>>> Hi,
>>>>
>>>> sorry for the crossposting, I don't know who to address.
>>>>
>>>> I am experimenting with the new CFS scheduler on Linux
>>>> and tried to start multiple glxgears to see whether
>>>> they really run smoothly and have an evenly
>>>> distributed framerate.
>>>>
>>>> At first I could only start two instances of glxgears;
>>>> the third didn't start, saying that it cannot allocate
>>>> GART memory and that I should try to increase GARTSize.
>>>>
>>>> First problem: man radeon doesn't say anything about
>>>> this option, although radeon_drv.so contains this keyword.
>>>> I guessed that the parameter is in MB and
>>>> set it to 128, but that disabled DRI because of some
>>>> out-of-memory condition. Setting it to 64 gave me working
>>>> DRI and I am able to start up some more instances of
>>>> glxgears.
>>>>
>>>> Second problem: if I start 16 of them, the last 3
>>>> behave very strangely, i.e. instead of the spinning gears
>>>> I get chaotic flickering triangles. As soon as the number
>>>> of glxgears goes down to 13, every window behaves
>>>> normally.
>>>>
>>>> Third problem: starting up 32 glxgears locked up the
>>>> machine almost instantly, but having only 16 of them
>>>> also locks up the machine after some time.
>>>>
>>>> The machine is x86-64, Athlon64 X2 4200,
>>>> PCIe Radeon X700 with 128MB onboard memory,
>>>> up-to-date Fedora Core 6.
>>>>
>>>> Best regards,
>>>> Zoltán Böszörményi
>>>>
>>>>         
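For reference, the setting that worked above would look roughly like this in xorg.conf. This is only a sketch: it assumes the value is in megabytes (a guess, since the man page says nothing about the option), and the Identifier is a made-up name.

```
Section "Device"
    Identifier "Radeon X700"        # hypothetical name, use your own
    Driver     "radeon"
    Option     "GARTSize" "64"      # assumed to be MB; 128 disabled DRI here
EndSection
```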
>>> This is interesting. I've occasionally seen my engine just display a mess of
>>> triangles, but if I was to kill it and start it again, it would work fine. This
>>> only happened very occasionally so I could never track down the problem.
>>>
>>> If running more than 13 glxgears shows the problem then maybe it would have a
>>> chance of getting fixed. :) You might want to open a bug report on the
>>> Freedesktop Bugzilla.
>>>
>>>       
>> BZ #10852
>>
>>     
>>> The lockups are nothing new, unfortunately. I think there is a bug somewhere in
>>> the R300 driver. You can also get deadlocks which are a little different... I
>>> think that these might go away after R300 uses TTM and thus doesn't grab the
>>> hardware lock anymore for texture upload, etc.
>>>
>>>       
>> Does the lockup happen only with multiple GL[X] clients?
>> I can play Diablo 2 :-) in Wine for hours and it doesn't lock up.
>>
>> Best regards,
>> Zoltán Böszörményi
>>     
>
> We believe lockups happen with high memory traffic (using big
> textures or sending megabytes of data to the card). I believe Diablo
> isn't an especially traffic-hungry app.
>   

But neither is glxgears. If I have a small number of them, say 2-3,
I don't experience any lockup.

> For your test of the CFS scheduler I don't think DRM is good
> to test with. From my understanding I would say that any
>   

But it seems that the mainstream scheduler in Linux cannot keep
even a small number of GL clients running smoothly under load.
My test was to run make -j4 on the kernel source while
running 12 glxgears. All 12 gears were running smoothly.
In the meantime, switching workspaces worked quickly,
i.e. repainting the large windows of Firefox and Thunderbird
was not instant but still quite quick despite the high load.
This kind of load makes the mainstream scheduler stall for
several seconds, which cannot be observed under Ingo Molnar's
new scheduler.
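
In case somebody wants to repeat the test described above, here is a rough Python sketch of it. It is not the exact procedure I used: the kernel source path is an assumption, and the function names are made up for illustration. It starts the glxgears clients, runs the kernel build, and stops the clients when the build finishes.

```python
import subprocess


def build_load_test(num_gears=12, make_jobs=4, kernel_dir="/usr/src/linux"):
    """Return the commands for the test without running anything.

    kernel_dir is an assumed location of a configured kernel tree.
    """
    gears = [["glxgears"] for _ in range(num_gears)]
    build = ["make", "-C", kernel_dir, f"-j{make_jobs}"]
    return gears, build


def run_load_test(num_gears=12, make_jobs=4, kernel_dir="/usr/src/linux"):
    """Start the GL clients, run the build, then stop the clients."""
    gears, build = build_load_test(num_gears, make_jobs, kernel_dir)
    # glxgears prints its frame rate to stdout; it is discarded here
    # because smoothness is judged by watching the windows.
    clients = [
        subprocess.Popen(argv, stdout=subprocess.DEVNULL) for argv in gears
    ]
    try:
        return subprocess.run(build).returncode
    finally:
        for client in clients:
            client.terminate()
        for client in clients:
            client.wait()
```

Calling run_load_test() with the defaults corresponds to 12 glxgears plus make -j4; raising num_gears towards 16 is what triggered the flickering and the lockups here, so save your work first.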

> apps that get the GPU lockup can basically starve other apps
> of it, i.e. we don't have any scheduling or fair sharing of the GPU.
> This isn't at all easy to solve and I think we kind of don't care
> if an app can DoS the GPU.
>
> best,
> Jerome Glisse
>   

The lockup I mentioned wasn't a complete machine lock-up.
Although NumLock didn't work and the mouse pointer disappeared
from the screen, the machine was still working, i.e. the kernel
responded to Alt+SysRq commands (sync, remount-readonly, poweroff).
So it must be some software locking problem.
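
I used the keyboard for this, but the same three SysRq actions can also be requested by writing their key letters to /proc/sysrq-trigger, for example from an SSH session if the machine is still reachable. A small sketch, assuming root privileges and SysRq enabled in the kernel; the helper names are made up:

```python
# SysRq key letters for the three actions mentioned above.
SYSRQ_KEYS = {
    "sync": "s",              # flush dirty buffers to disk
    "remount-readonly": "u",  # remount all filesystems read-only
    "poweroff": "o",          # power the machine off
}


def sysrq_sequence(names=("sync", "remount-readonly", "poweroff")):
    """Translate action names into the key letters, in order."""
    return [SYSRQ_KEYS[name] for name in names]


def trigger(names, path="/proc/sysrq-trigger"):
    """Write each key letter to the trigger file, one write per action."""
    for key in sysrq_sequence(names):
        with open(path, "w") as trigger_file:
            trigger_file.write(key)
```

Note that trigger(["sync", "remount-readonly", "poweroff"]) really powers the machine off, so it is only meant for the situation where X is already stuck.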

Best regards,
Zoltán Böszörményi



