libpciaccess ROM read is thrice broken (was: X Hangs at "Initializing int10") try 2
Alex Villacís Lasso
a_villacis at palosanto.com
Fri Dec 5 14:08:13 PST 2008
Alex Villacís Lasso wrote:
> Resending as previous attempt complains attachments are too big.
>
> All of the following applies to the stock linux 2.6 kernel from a fresh
> installation of Fedora 10.
>
> I have been looking into the int10 hang when initializing the BIOS of
> a secondary card. This is what I found:
> - The function responsible for reading the ROM of the PCI video card
> is pci_device_linux_sysfs_read_rom() for the Fedora 10 case.
> - This function pci_device_linux_sysfs_read_rom() is *not* exercised
> at all when using the primary display, even when an option such as
> UseBIOS is in effect. So this function could well be broken without
> anybody who uses a single display ever noticing.
> - pci_device_linux_sysfs_read_rom() is exercised when initializing a
> secondary display (using "vesa" in my case)
>
> I introduced a bit of logging in the patch
> libpciaccess-partial-fix-with-debug.patch that outputs messages to a
> file in /tmp . The basic problem is that, despite all the sysfs dance
> to enable the ROM, the kernel terminates the read with 0 bytes when
> trying to read the ROM:
>
> Reading ROM from /sys/bus/pci/devices/0000:00:09.0/rom into address
> 0xb7f4a008
> ROM size for /sys/bus/pci/devices/0000:00:09.0/rom is 32768 using 32768
> Reading ROM from /sys/bus/pci/devices/0000:00:09.0/rom reached 0-sized
> read (EOF?) at offset 0
> Dump of ROM from /sys/bus/pci/devices/0000:00:09.0/rom (0 bytes):
> Reading ROM failed with short read, using /dev/mem to read from
> 0xdffe0000
>
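For readers following along, the "sysfs dance" and the short-read check can be sketched roughly as below. This is a minimal illustration, not the libpciaccess code: the names read_until_eof() and read_rom_sysfs() are made up for this example. It assumes the kernel's documented protocol for the sysfs rom file (write "1" to enable, read, write "0" to disable).

```c
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: read up to `size` bytes from fd, looping over
 * partial reads. Returns the number of bytes actually read (0 on an
 * immediate EOF, as in the log above), or -1 on a read error. */
static ssize_t read_until_eof(int fd, unsigned char *buf, size_t size)
{
    size_t total = 0;

    while (total < size) {
        ssize_t n = read(fd, buf + total, size - total);
        if (n < 0) {
            if (errno == EINTR)
                continue;       /* interrupted, retry */
            return -1;
        }
        if (n == 0)
            break;              /* EOF before the buffer was filled */
        total += (size_t)n;
    }
    return (ssize_t)total;
}

/* Hypothetical wrapper: enable the ROM through sysfs, read it, then
 * disable it again. Returns 0 only if all `size` bytes arrived, and -1
 * otherwise, so that the caller never uses a partially filled buffer. */
static int read_rom_sysfs(const char *rom_path, unsigned char *buf, size_t size)
{
    ssize_t got;
    int fd = open(rom_path, O_RDWR);

    if (fd < 0)
        return -1;
    if (write(fd, "1", 1) != 1) {       /* enable ROM decoding */
        close(fd);
        return -1;
    }
    lseek(fd, 0, SEEK_SET);
    got = read_until_eof(fd, buf, size);
    lseek(fd, 0, SEEK_SET);
    if (write(fd, "0", 1) != 1) {
        /* best effort: nothing useful to do if disabling fails */
    }
    close(fd);

    return (got >= 0 && (size_t)got == size) ? 0 : -1;
}
```

The point of returning -1 on a short read is that a caller can then try another method (or give up cleanly) instead of handing an uninitialized buffer to int10.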
> I introduced an attempt at a fallback that calls
> pci_device_linux_devmem_read_rom() when the total amount read is less
> than the expected ROM size. In current git for libpciaccess, the
> buffer remains uninitialized and hangs the machine. I hoped that the
> fallback would be enough to read the ROM and fix this problem.
> Instead, I ran into another problem. The attempted fallback ends up
> using pread() on /dev/mem at the offset matching the one reported for
> the ROM, but this failed with EINVAL (Invalid argument). By using
> strace on the stock X server and the modified libpciaccess library, I
> saw that the pread implementation calls into pread64() with a very
> big offset of 18446744073172549632 (0xffffffffdffe0000), which is the
> required offset, sign-extended into 64 bits instead of zero-extended
> as required. This might point to a bug in glibc headers or code, but I
> worked around this by replacing the call with a pread64() call, as
> seen in libpciaccess-partial-fix-without-debug.patch
>
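The arithmetic behind that offset can be reproduced in isolation. The sketch below is only an illustration of sign extension versus zero extension for a 32-bit address; the function names are invented for this example, and it makes no claim about where in glibc or libpciaccess the widening actually happens.

```c
#include <stdint.h>

/* Widen a 32-bit address by way of a signed 32-bit type. The top bit is
 * copied into the upper 32 bits, which is the faulty behaviour seen in
 * the strace. (Converting a value above INT32_MAX to int32_t is
 * implementation-defined; gcc wraps it to a negative number.) */
static uint64_t widen_sign_extended(uint32_t addr)
{
    int32_t as_signed = (int32_t)addr;
    return (uint64_t)(int64_t)as_signed;
}

/* Widen the same address while keeping it unsigned. The upper 32 bits
 * stay zero, which is what a /dev/mem offset needs. */
static uint64_t widen_zero_extended(uint32_t addr)
{
    return (uint64_t)addr;
}
```

With the ROM address 0xdffe0000, the first function yields 0xffffffffdffe0000 (18446744073172549632) and the second yields 0xdffe0000. Addresses below 0x80000000 come out the same either way, which may explain why the problem only shows up for ROMs mapped high in the 32-bit space.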
> Now, here comes the third problem: the passed address makes pread64()
> return EFAULT (Bad address). I did not have time to find out
> whether this address is intended or not. However
> libpciaccess-partial-fix-without-debug.patch is enough to replace the
> hang with a graceful exit that allows the user to sort-of regain
> control of the machine. Final strace is attached, search for EFAULT in
> the text.
>
> Please comment on this. Details already posted at bug #18160
>
The failure to read the PCI ROM as documented seems like a kernel bug. I
opened http://bugzilla.kernel.org/show_bug.cgi?id=12168 for it.
--
perl -e '$x=2.4;print sprintf("%.0f + %.0f = %.0f\n",$x,$x,$x+$x);'