log file
Theodore Kilgore
kilgota at banach.math.auburn.edu
Tue Sep 11 10:54:41 PDT 2012
The log file below was taken from my office machine,
banach.math.auburn.edu, on which the problem is a lockup and apparent
kernel panic when attempting to exit from an X session. The lockup
blanks the screen completely, and the keyboard stops responding as well
(that is, no command seems to work, including the Ctrl-Alt-Del request
for a reboot).
Following advice, I determined that the network was still up and I was
able to log into the machine from a computer in another office in the
building. The result is seen below, consisting of the tail end of
/var/log/syslog.
I had just installed the stock Slackware64-current "huge" kernel before
trying this experiment. The version of that kernel is 3.2.28. I have had
this lockup problem since approximately kernel version 3.0, and have been
running an older kernel in order to avoid it.
AFAICT no other log file had anything in it which was in any way
untoward or strange. That includes /var/log/debug, /var/log/messages,
and /var/log/dmesg, none of which recorded anything relevant to the
event. The Xorg.log says nothing either, other than that X has been
exited. That is, the file ends exactly the same way as it does when
things are working normally.
So, as I said, the lines below are apparently the only record of a
problem having occurred.
---------- Forwarded message ----------
Date: Mon, 10 Sep 2012 20:38:14 +0000
From: Stephen Stuckwisch <STUCKSE at auburn.edu>
To: Theodore Kilgore <KILGOTA at auburn.edu>
Subject: log file
Sep 10 15:35:36 banach kernel: [ 2.647999] ACPI: Invalid passive threshold
Sep 10 15:35:36 banach kernel: [ 2.693757] k8temp 0000:00:18.3: Temperature readouts might be wrong - check erratum #141
Sep 10 15:35:36 banach kernel: [ 3.050928] SP5100 TCO timer: mmio address 0xfec000f0 already in use
Sep 10 15:35:36 banach kernel: [ 4.443877] [drm:r100_ring_test] *ERROR* radeon: ring test failed (scratch(0x15E4)=0xCAFEDEAD)
Sep 10 15:35:36 banach kernel: [ 4.443941] [drm:r100_cp_init] *ERROR* radeon: cp isn't working (-22).
Sep 10 15:35:36 banach kernel: [ 4.444482] radeon 0000:01:05.0: failed initializing CP (-22).
Sep 10 15:35:36 banach kernel: [ 4.444531] radeon 0000:01:05.0: Disabling GPU acceleration
Sep 10 15:35:36 banach kernel: [ 4.568683] [drm:r100_cp_fini] *ERROR* Wait for CP idle timeout, shutting down CP.
Sep 10 15:35:36 banach kernel: [ 4.692843] Failed to wait GUI idle while programming pipes. Bad things might happen.
Sep 10 15:35:36 banach kernel: [ 6.207192] reiserfs: enabling write barrier flush mode
Sep 10 15:35:36 banach kernel: [ 68.271542] reiserfs: using flush barriers
Sep 10 15:35:36 banach kernel: [ 68.365148] reiserfs: using flush barriers
Sep 10 15:35:36 banach kernel: [ 68.490421] reiserfs: using flush barriers
Sep 10 15:35:36 banach kernel: [ 68.585381] reiserfs: using flush barriers
Sep 10 15:35:36 banach kernel: [ 68.676188] reiserfs: using flush barriers
Sep 10 15:35:45 banach console-kit-daemon[1939]: WARNING: Failed to acquire org.freedesktop.ConsoleKit
Sep 10 15:35:45 banach console-kit-daemon[1939]: WARNING: Could not acquire name; bailing out
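For anyone who wants to pull the graphics-related kernel lines out of a longer syslog, here is a rough sketch. It assumes each kernel entry sits on one line in the form "<date> <host> kernel: [<uptime>] <message>", as in the excerpt above; the function name `graphics_lines` and the keyword list are illustrative choices, not anything taken from the logs or from the driver.

```python
import re

# Assumed shape of a kernel entry in syslog:
#   "<date> <host> kernel: [<uptime>] <message>"
KERNEL_LINE = re.compile(r"kernel: \[\s*(?P<uptime>\d+\.\d+)\]\s*(?P<message>.*)")

# Hypothetical filter terms; extend as needed.
KEYWORDS = ("drm", "radeon")


def graphics_lines(lines):
    """Yield (uptime_seconds, message) for kernel lines that mention a keyword."""
    for line in lines:
        match = KERNEL_LINE.search(line)
        if match is None:
            continue  # not a kernel entry (e.g. console-kit-daemon)
        message = match.group("message")
        if any(word in message for word in KEYWORDS):
            yield float(match.group("uptime")), message


sample = [
    "Sep 10 15:35:36 banach kernel: [ 2.647999] ACPI: Invalid passive threshold",
    "Sep 10 15:35:36 banach kernel: [ 4.444531] radeon 0000:01:05.0: Disabling GPU acceleration",
    "Sep 10 15:35:45 banach console-kit-daemon[1939]: WARNING: Could not acquire name; bailing out",
]

for uptime, message in graphics_lines(sample):
    print(uptime, message)
```

A line such as the "Failed to wait GUI idle" warning carries neither keyword, so it would need its own entry in the list to be caught.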
The video chip in the office machine, recall, is a Radeon RS690, and a
kernel message of the kind seen above ("Failed to wait ... idle while
programming pipes") is found in
linux/drivers/gpu/drm/radeon/rs690.c
among other files in the same directory relating to other Radeon chips
besides this one.
Specifically, we have from rs690.c the following, which says "MC idle"
where the logged line says "GUI idle":
static void rs690_gpu_init(struct radeon_device *rdev)
{
        /* FIXME: is this correct ? */
        r420_pipes_init(rdev);
        if (rs690_mc_wait_for_idle(rdev)) {
                printk(KERN_WARNING "Failed to wait MC idle while "
                       "programming pipes. Bad things might happen.\n");
        }
}
Again, this information relates to banach.math.auburn.edu, which is an
older machine than my home machine. The home machine is the one on which
the panning feature fails to work after switching to a lower resolution.
Further comments:
In the older kernels the support for the Radeon RS690 chip seems to be
really broken, perhaps so broken that things just worked anyway. With the
older kernels, my log files filled up with spam about problems
implementing this or that thing about the video, with the same messages
repeated ad nauseam. Here are some samples of such oft-repeated messages:
Sep 9 04:40:08 banach kernel: [12224272.021068]
Sep 9 04:40:08 banach kernel: [12224272.071617] [drm:drm_edid_block_valid] *ERROR* Raw EDID:
Sep 9 04:40:08 banach kernel: [12224272.071637]
Sep 9 04:40:08 banach kernel: [12224272.122021] [drm:drm_edid_block_valid] *ERROR* Raw EDID:
Sep 9 04:40:08 banach kernel: [12224272.122042]
Sep 9 04:40:08 banach kernel: [12224272.171957] [drm:drm_edid_block_valid] *ERROR* Raw EDID:
Sep 9 04:40:08 banach kernel: [12224272.171977]
Sep 9 04:40:08 banach kernel: [12224272.171981] radeon 0000:01:05.0: HDMI-A-1: EDID block 0 invalid.
Sep 9 04:40:08 banach kernel: [12224272.171984] [drm:radeon_dvi_detect] *ERROR* HDMI-A-1: probed a monitor but no|invalid EDID
Sep 9 04:40:18 banach kernel: [12224282.261150] [drm:drm_edid_block_valid] *ERROR* Raw EDID:
Sep 9 04:40:18 banach kernel: [12224282.261172]
Sep 9 04:40:18 banach kernel: [12224282.311168] [drm:drm_edid_block_valid] *ERROR* Raw EDID:
Sep 9 04:40:18 banach kernel: [12224282.311188]
Sep 9 04:40:18 banach kernel: [12224282.361084] [drm:drm_edid_block_valid] *ERROR* Raw EDID:
Et cetera, et cetera, et cetera.
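To put a number on how often each of these messages repeats, one could tally the "[drm:function]" tags in a log. The following is only a sketch: it assumes one log entry per line, and the name `count_drm_tags` is made up for illustration.

```python
import re
from collections import Counter

# Matches tags such as "[drm:drm_edid_block_valid]" and captures the function name.
DRM_TAG = re.compile(r"\[drm:(\w+)\]")


def count_drm_tags(lines):
    """Return a Counter of DRM function tags found in the given log lines."""
    counts = Counter()
    for line in lines:
        counts.update(DRM_TAG.findall(line))
    return counts


sample = [
    "Sep 9 04:40:08 banach kernel: [12224272.071617] [drm:drm_edid_block_valid] *ERROR* Raw EDID:",
    "Sep 9 04:40:08 banach kernel: [12224272.122021] [drm:drm_edid_block_valid] *ERROR* Raw EDID:",
    "Sep 9 04:40:08 banach kernel: [12224272.171984] [drm:radeon_dvi_detect] *ERROR* HDMI-A-1: probed a monitor but no|invalid EDID",
]

for tag, count in count_drm_tags(sample).most_common():
    print(count, tag)
```

Run over a whole syslog, the most common tags would show at a glance which complaints dominate.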
With the 3.2.28 kernel, this has stopped. But yesterday, after collecting
that log excerpt from the tail end of syslog, I found some complaints
earlier in the boot sequence, too. For example, these lines from
/var/log/dmesg:
[ 4.485049] [drm] radeon: ring at 0x00000000A0001000
[ 4.620264] [drm:r100_ring_test] *ERROR* radeon: ring test failed (scratch(0x15E4)=0xCAFEDEAD)
[ 4.620816] [drm:r100_cp_init] *ERROR* radeon: cp isn't working (-22).
[ 4.620868] radeon 0000:01:05.0: failed initializing CP (-22).
[ 4.620917] radeon 0000:01:05.0: Disabling GPU acceleration
[ 4.621407] [drm] radeon: cp finalized
And there are some similar lines in syslog, too, which seem to have
originated during boot of the new 3.2.28 kernel:
Sep 10 15:58:38 banach kernel: [ 4.620264] [drm:r100_ring_test] *ERROR* radeon: ring test failed (scratch(0x15E4)=0xCAFEDEAD)
Sep 10 15:58:38 banach kernel: [ 4.620816] [drm:r100_cp_init] *ERROR* radeon: cp isn't working (-22).
Sep 10 15:58:38 banach kernel: [ 4.620868] radeon 0000:01:05.0: failed initializing CP (-22).
Sep 10 15:58:38 banach kernel: [ 4.620917] radeon 0000:01:05.0: Disabling GPU acceleration
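The (-22) in these messages looks like a negated errno value, which is the usual convention for kernel return codes; on Linux, errno 22 is EINVAL. A minimal sketch for decoding such a code follows. The helper name `describe_kernel_errno` is hypothetical, and reading the number as a negated errno is an assumption about these particular messages.

```python
import errno
import os


def describe_kernel_errno(code):
    """Map a (possibly negated) errno value to its symbolic name and description."""
    number = abs(code)  # kernel code conventionally returns -errno on failure
    name = errno.errorcode.get(number, "unknown")
    return name, os.strerror(number)


name, text = describe_kernel_errno(-22)
print(name, "-", text)
```

The lookup uses the errno tables of the machine running the script, so it is best run on a Linux system to match the kernel's numbering.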
So it seems that I may have a kernel problem, possibly not an ATI or
Radeon driver problem. But just in case it is relevant:
The version of X that I am running on banach.math.auburn.edu is seen in
the name of the main package for it, xorg-server-1.9.5-x86_64-1.txz
I should remark, though, that I have in the last few months tried to use
more recent versions of X (Slackware-current is a "rolling" release of
updates), and my memory is that I had the same problem of an apparent
panic on exiting X. In addition, panning stopped working after dropping
down the resolution (the same problem as on the home machine), and the
resulting very bad user experience caused me to avoid the newer X
packages.
So, I am hoping that the log file excerpts above will help in finding what
the problem is.
Theodore Kilgore