May have bricked my GPU
Connor Behan
connor.behan at gmail.com
Mon Mar 26 10:33:52 PDT 2012
On 26/03/12 05:59 AM, Alex Deucher wrote:
> On Sun, Mar 25, 2012 at 5:39 PM, Connor Behan<connor.behan at gmail.com> wrote:
>> I really hope someone here is knowledgeable about this...
>>
>> My r128 card hangs indefinitely if any kind of acceleration is enabled.
>> Running startx results in a black screen and I have to reboot with
>> Alt+SysRq+REISUB. This was not caused by upgrading between any official
>> releases of linux, xorg or my driver. It was caused because I was trying to
>> hack my driver and I can tell you exactly how.
>>
>> I was working on adding EXA support to xf86-video-r128. I successfully got
>> as far as https://bugs.freedesktop.org/show_bug.cgi?id=47866 and I still
>> suggest that you accept this patch since it does not contain the dangerous
>> part. The dangerous part came when I tried to support compositing as this
>> needed some OUT_RING and OUTREG commands. I based this on the KAA approach:
>> http://cgit.freedesktop.org/xorg/xserver/tree/hw/kdrive/ati?id=0cd662ea80579c317d706ebe04971bb29d0f9b4f
>> r128_composite.c. When I was finished, I ended up with an r128_exa.c file
>> that looked like this: http://pastebin.com/HHsESs9q. Now I realize that I
>> may be asking a difficult question as many of you haven't worked on this
>> stuff for years, but here's how it went.
>>
>> I installed this new DDX and first tried to start the X server with EXA and
>> DRI but with the composite extension turned off, just to make sure that
>> everything that worked before still worked. However, it did not work. It
>> gave me this log file: http://pastebin.com/XWYbtTcL showing the lines:
>> R128(0): R128CCEWaitForIdle: (DEBUG) CCE idle took i = 1025 and R128(0):
>> Idle timed out, resetting engine...
>>
>> I then stupidly thought "okay, maybe the composite code that has been added
>> hangs if composite is disabled so I'll enable composite as well as EXA and
>> DRI". Doing that gave me the log file: http://pastebin.com/3RJBNhAx with the
>> lines: (**) R128(0): Idle timed out: 64 entries, stat=0x80400040,
>> probe=0x00200000, (EE) R128(0): Idle timed out, resetting engine..., (EE)
>> R128(0): R128WaitForIdle: CCE stop -22, (EE) R128(0): R128WaitForIdle: CCE
>> reset -22 and (EE) R128(0): R128WaitForIdle: CCE start -22
>>
>> I get the same error if I switch back to XAA with DRI on. If I use XAA with
>> DRI off, I get the log file: http://pastebin.com/VLEbsAfx with the line:
>> (EE) R128(0): Idle timed out, resetting engine... As of this morning, the
>> ONLY way for me to use X, regardless of what my package versions are, is to
>> specify "NoAccel" which is slow as hell. I've seen R128 infinite loop idling
>> errors posted to mailing lists in the past. They are mostly posted by PPC
>> users for whom the r128 driver never worked properly in the first place.
>> They all say that they "fix" it by turning off acceleration. I obviously
>> don't want this since my card worked perfectly with 2D and 3D acceleration
>> for years and it's only my foolishness right now that broke it.
>>
>> Does anyone know a way this can be fixed? Does ATI provide a tool that will
>> reflash a VBIOS and get rid of this error? Or does one of you have / know
>> how to write this kind of utility? Is it common to permanently brick a GPU
>> if you try writing the wrong bit to a register? Like the ones that are
>> written to in my r128_exa.c? If so I have even more respect for open source
>> driver hackers.
> There's not really an easy way to wreck your vbios from bad
> acceleration code. Moreover, all the vbios handles is modesetting and
> card posting. The vbios does not load the cce ucode or touch anything
> acceleration related. If you get something on the screen at boot,
> then most likely your vbios is fine. I would guess that your system
> is not using the ddx (xf86-video-r128) you think it is or your patches
> made some subtle change that broke XAA. Does reverting to the stock
> versions of the graphics stack provided by your distro help?
>
> Alex
At the time, reverting to the default versions didn't help, but the
problem has since been resolved: I had to unplug my computer for a few
hours. I guess there was corrupted memory that needed time to fade away.
So, no need to read all the pastes that I posted... yet.

The idiocy that caused this was shifting the DATATYPEs 16 bits to the
left. The KAA code does this shift explicitly, but the new r128_reg.h
already accounts for it in the definitions of the DATATYPEs, so my
values ended up shifted twice. I have now got the KAA code to work, but
I'm beginning to think it never worked very well. I will try making it
less restrictive about which pixman formats it allows and report back
with my findings.
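
To illustrate the double shift described above, here is a minimal sketch. The macro names and the value 6 are hypothetical and are not copied from r128_reg.h; only the 16-bit shift comes from the description above. It contrasts a raw field value that the caller must shift (the KAA-style convention) with a constant that is already shifted in its definition (the convention of the new header).

```c
#include <stdint.h>

/* Hypothetical names and values for illustration only;
 * they are NOT taken from r128_reg.h. */
#define EX_DATATYPE_SHIFT         16
#define EX_DATATYPE_RAW_ARGB8888  6u                         /* unshifted field value */
#define EX_DATATYPE_ARGB8888      (6u << EX_DATATYPE_SHIFT)  /* already shifted */

/* Correct when the header defines raw field values:
 * the caller moves the value into the datatype field. */
static uint32_t ex_cntl_from_raw(uint32_t raw)
{
    return raw << EX_DATATYPE_SHIFT;
}

/* Correct when the header defines pre-shifted constants:
 * the value is used as-is. */
static uint32_t ex_cntl_from_preshifted(uint32_t preshifted)
{
    return preshifted;
}

/* The mistake: applying the caller-side shift to a constant that
 * was already shifted, which pushes the bits out of the field. */
static uint32_t ex_cntl_double_shifted(uint32_t preshifted)
{
    return preshifted << EX_DATATYPE_SHIFT;
}
```

With these illustrative values, ex_cntl_from_raw(EX_DATATYPE_RAW_ARGB8888) and ex_cntl_from_preshifted(EX_DATATYPE_ARGB8888) give the same register word, while ex_cntl_double_shifted(EX_DATATYPE_ARGB8888) does not, so the hardware would be programmed with the wrong datatype.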