log file

Tue Sep 11 10:54:41 PDT 2012

The log file below was taken from my office machine,
banach.math.auburn.edu. The problem on that machine is a lockup,
apparently a kernel panic, when I attempt to exit from an X session. The
lockup blanks the screen completely, and the keyboard no longer responds
(that is, no command seems to work, including the Ctrl-Alt-Del request
for a reboot).

Following advice, I determined that the network was still up and I was 
able to log into the machine from a computer in another office in the 
building. The result is seen below, consisting of the tail end of 
/var/log/syslog.

I had just installed the stock Slackware64-current "huge" kernel, version
3.2.28, before trying this experiment. I have been having this lockup
problem since approximately kernel version 3.0, and have been running an
older kernel in order to avoid it.

AFAICT no other log file had anything in it which was in any way untoward
or strange. That includes /var/log/debug, /var/log/messages and
/var/log/dmesg, none of which recorded anything relevant to the event. The
Xorg.log does not seem to say anything either, other than that X has
exited. That is, the file ends exactly the same way as it does when things
are working normally.

So, as I said, the lines below are apparently the only record of a problem
having occurred.

---------- Forwarded message ----------
Date: Mon, 10 Sep 2012 20:38:14 +0000
From: Stephen Stuckwisch <STUCKSE at auburn.edu>
To: Theodore Kilgore <KILGOTA at auburn.edu>
Subject: log file

Sep 10 15:35:36 banach kernel: [ 2.647999] ACPI: Invalid passive threshold
Sep 10 15:35:36 banach kernel: [ 2.693757] k8temp 0000:00:18.3: Temperature readouts might be wrong - check erratum #141
Sep 10 15:35:36 banach kernel: [ 3.050928] SP5100 TCO timer: mmio address 0xfec000f0 already in use
Sep 10 15:35:36 banach kernel: [ 4.443877] [drm:r100_ring_test] *ERROR* radeon: ring test failed (scratch(0x15E4)=0xCAFEDEAD)
Sep 10 15:35:36 banach kernel: [ 4.443941] [drm:r100_cp_init] *ERROR* radeon: cp isn't working (-22).
Sep 10 15:35:36 banach kernel: [ 4.444482] radeon 0000:01:05.0: failed initializing CP (-22).
Sep 10 15:35:36 banach kernel: [ 4.444531] radeon 0000:01:05.0: Disabling GPU acceleration
Sep 10 15:35:36 banach kernel: [ 4.568683] [drm:r100_cp_fini] *ERROR* Wait for CP idle timeout, shutting down CP.
Sep 10 15:35:36 banach kernel: [ 4.692843] Failed to wait GUI idle while programming pipes. Bad things might happen.
Sep 10 15:35:36 banach kernel: [ 6.207192] reiserfs: enabling write barrier flush mode
Sep 10 15:35:36 banach kernel: [ 68.271542] reiserfs: using flush barriers
Sep 10 15:35:36 banach kernel: [ 68.365148] reiserfs: using flush barriers
Sep 10 15:35:36 banach kernel: [ 68.490421] reiserfs: using flush barriers
Sep 10 15:35:36 banach kernel: [ 68.585381] reiserfs: using flush barriers
Sep 10 15:35:36 banach kernel: [ 68.676188] reiserfs: using flush barriers
Sep 10 15:35:45 banach console-kit-daemon[1939]: WARNING: Failed to acquire org.freedesktop.ConsoleKit
Sep 10 15:35:45 banach console-kit-daemon[1939]: WARNING: Could not acquire name; bailing out

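The video chip in the office machine, recall, is a Radeon RS690. A warning
with nearly the same wording as the "Bad things might happen" kernel
message seen above (it says "MC idle" where the log says "GUI idle") is
found in

linux/drivers/gpu/drm/radeon/rs690.c

and similar messages appear in other files in the same directory, which
relate to other Radeon chips besides this one.

Specifically, we have from rs690.c: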

static void rs690_gpu_init(struct radeon_device *rdev)
{
        /* FIXME: is this correct ? */
        r420_pipes_init(rdev);
        if (rs690_mc_wait_for_idle(rdev)) {
                printk(KERN_WARNING "Failed to wait MC idle while "
                       "programming pipes. Bad things might happen.\n");
        }
}

Again, this information relates to banach.math.auburn.edu, which is an
older machine than my home machine. The home machine is the one on which
the panning feature fails to work after switching to a lower resolution.

Further comments:

In the older kernels the support for the Radeon RS690 chip seems to be
really broken, perhaps so broken that things just worked anyway. With the
older kernels, my log files filled up with spam about problems
implementing this or that aspect of the video, with the same messages
repeated ad nauseam. Here are some samples of such oft-repeated messages:

Sep  9 04:40:08 banach kernel: [12224272.021068]
Sep  9 04:40:08 banach kernel: [12224272.071617] [drm:drm_edid_block_valid] *ERROR* Raw EDID:
Sep  9 04:40:08 banach kernel: [12224272.071637]
Sep  9 04:40:08 banach kernel: [12224272.122021] [drm:drm_edid_block_valid] *ERROR* Raw EDID:
Sep  9 04:40:08 banach kernel: [12224272.122042]
Sep  9 04:40:08 banach kernel: [12224272.171957] [drm:drm_edid_block_valid] *ERROR* Raw EDID:
Sep  9 04:40:08 banach kernel: [12224272.171977]
Sep  9 04:40:08 banach kernel: [12224272.171981] radeon 0000:01:05.0: HDMI-A-1: EDID block 0 invalid.
Sep  9 04:40:08 banach kernel: [12224272.171984] [drm:radeon_dvi_detect] *ERROR* HDMI-A-1: probed a monitor but no|invalid EDID
Sep  9 04:40:18 banach kernel: [12224282.261150] [drm:drm_edid_block_valid] *ERROR* Raw EDID:
Sep  9 04:40:18 banach kernel: [12224282.261172]
Sep  9 04:40:18 banach kernel: [12224282.311168] [drm:drm_edid_block_valid] *ERROR* Raw EDID:
Sep  9 04:40:18 banach kernel: [12224282.311188]
Sep  9 04:40:18 banach kernel: [12224282.361084] [drm:drm_edid_block_valid] *ERROR* Raw EDID:

Et cetera, et cetera, et cetera.
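
For anyone who wants to gauge how often each of these messages repeats, a
minimal sketch in Python follows. It is only an illustration: the name
count_messages and the regular expression for the syslog prefix are
assumptions modeled on the lines quoted above, not part of any existing
tool, and the sketch expects each log entry on a single unwrapped line.

```python
import re
from collections import Counter

# Assumed syslog prefix, modeled on the quoted lines, e.g.
# "Sep  9 04:40:08 banach kernel: [12224272.071617] "
PREFIX = re.compile(
    r"^\w{3}\s+\d+ \d\d:\d\d:\d\d \S+ kernel: \[\s*\d+\.\d+\]\s*"
)


def count_messages(lines):
    """Count kernel messages, ignoring the date and uptime stamp.

    Lines that do not carry the kernel prefix, and entries whose
    message part is empty, are skipped.
    """
    counts = Counter()
    for line in lines:
        match = PREFIX.match(line)
        if match is None:
            continue
        message = line[match.end():].strip()
        if message:
            counts[message] += 1
    return counts
```

Feeding it the lines of /var/log/syslog and then calling most_common() on
the result would list the most frequently repeated messages first.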

With the 3.2.28 kernel, this spam has stopped. But yesterday, after
collecting that log excerpt from the tail end of syslog, I found some
complaints earlier in the boot sequence, too. For example, these lines
from /var/log/dmesg:

[    4.485049] [drm] radeon: ring at 0x00000000A0001000
[    4.620264] [drm:r100_ring_test] *ERROR* radeon: ring test failed (scratch(0x15E4)=0xCAFEDEAD)
[    4.620816] [drm:r100_cp_init] *ERROR* radeon: cp isn't working (-22).
[    4.620868] radeon 0000:01:05.0: failed initializing CP (-22).
[    4.620917] radeon 0000:01:05.0: Disabling GPU acceleration
[    4.621407] [drm] radeon: cp finalized

And there are some similar lines in syslog, too, which seem to have 
originated during boot of the new 3.2.28 kernel

Sep 10 15:58:38 banach kernel: [    4.620264] [drm:r100_ring_test] *ERROR* radeon: ring test failed (scratch(0x15E4)=0xCAFEDEAD)
Sep 10 15:58:38 banach kernel: [    4.620816] [drm:r100_cp_init] *ERROR* radeon: cp isn't working (-22).
Sep 10 15:58:38 banach kernel: [    4.620868] radeon 0000:01:05.0: failed initializing CP (-22).
Sep 10 15:58:38 banach kernel: [    4.620917] radeon 0000:01:05.0: Disabling GPU acceleration
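
The kernel uptime stamps in these syslog lines (for example 4.620264) are
the same as in the dmesg lines quoted just before, which is what suggests
that both sets of lines come from the same boot. A comparison of that kind
could be sketched in Python as follows. This is only a sketch under
assumptions: the names parse and common_entries are made up for
illustration, and the pattern assumes that each entry sits on a single
unwrapped line with a bracketed uptime stamp.

```python
import re

# Bracketed kernel uptime stamp followed by the message. The same
# pattern is used for dmesg lines and for syslog lines, which carry
# an extra date/host prefix in front of the stamp.
STAMP = re.compile(r"\[\s*(\d+\.\d+)\]\s*(.*)$")


def parse(lines):
    """Return the set of (uptime, message) pairs found in the lines."""
    entries = set()
    for line in lines:
        match = STAMP.search(line)
        if match is None:
            continue
        message = match.group(2).strip()
        if message:
            entries.add((match.group(1), message))
    return entries


def common_entries(dmesg_lines, syslog_lines):
    """Entries present in both logs, ordered by uptime stamp."""
    shared = parse(dmesg_lines) & parse(syslog_lines)
    return sorted(shared, key=lambda entry: float(entry[0]))
```

Entries that show up in both files with identical stamps were very likely
written during the same boot. Entries with the same message but different
stamps, such as the 4.443941 line in the forwarded excerpt, would belong to
a different boot.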

So it seems that I may have a kernel problem, and possibly not a problem
with the ATI or Radeon driver for X. But just in case it is relevant:

The version of X that I am running on banach.math.auburn.edu is 1.9.5, as
seen in the name of its main package, xorg-server-1.9.5-x86_64-1.txz.

I should remark, though, that in the last few months I have tried to use
more recent versions of X (Slackware-current is a "rolling" release of
updates), and my memory is that I had the same problem of an apparent
panic on exiting X. In addition, I had the problem of panning being
killed off after dropping down the resolution (the same problem as on the
home machine). That made for a very bad user experience and caused me to
avoid using the newer X packages.

So, I am hoping that the log file excerpts above will help in finding what 
the problem is.

Theodore Kilgore