Weak symbols that aren't

Joseph Parmelee jparmele at wildbear.com
Fri Aug 11 16:07:18 PDT 2006


Hello again:

I am not an X developer and have my own problems to work on which
have nothing to do with X.  But I needed to get my system running
again after closing the latest freetype vulnerability, and I will
share with you the results obtained, hopefully so that others can
accomplish this in less time.

The freetype problem appears to be solved by libXfont-1.2.0, but
libXfont has at least two significant issues which are waiting to
bite developers/users:

1.  The use of weak symbols in libXfont, however elegant in
concept, will not work.  The weak attribute on definitions is
honored only by the static linker.  You won't get the desired
effect between modules from the runtime dynamic linker.  From
Ulrich Drepper in 2000:

     I just checked in a patch which brings the handling of weak
     definitions at runtime in line with the official[*] ELF
     interpretation.  That is weak symbol definitions are treated like
     normal definitions during dynamic linking.  The weak attribute is
     only used in the static linker (and for references).

     This came up some time ago during the ELF discussions HJ and I
     are participating in.  I cannot think of a program which would
     have problems because of this change.  In case there is one the
     user can switch back to the old behaviour by having
     LD_DYNAMIC_WEAK in the environment.

     The reason I've made this change is not really to be more
     conformant (it's nice but no reason since nobody complained about
     the old behaviour).  *The real reason is that some more features
     (like lazy loading) depend on this interpretation of weak
     definitions.*

Emphasis mine.  This policy makes sense when you consider that
implementing static weak symbol semantics at runtime would largely
defeat the advantages of late binding.
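
For anyone who has not used the attribute, here is a minimal sketch of
a weak stub definition, assuming GCC or Clang on an ELF system.  The
names demo_ErrorF and demo_report are made up for illustration and are
not taken from libXfont.  Within a single static link, a strong
definition of demo_ErrorF in another object file would replace the
stub; with none present, the stub is what gets called.  Between shared
objects at runtime no such preference is applied.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical counter so the stub's effect can be observed. */
static int demo_stub_calls = 0;

/* Weak definition: the static linker uses it only when no strong
 * definition of demo_ErrorF is present in the same link.  The dynamic
 * linker treats it like any normal definition. */
__attribute__((weak)) void demo_ErrorF(const char *msg)
{
    (void)msg;              /* the stub silently drops the message */
    demo_stub_calls++;
}

/* Calls demo_ErrorF and returns how often the stub has run so far. */
int demo_report(const char *msg)
{
    assert(msg != NULL);
    demo_ErrorF(msg);
    return demo_stub_calls;
}
```

Built on its own, every call to demo_report() lands in the stub and
the message is lost, which is the same silent behaviour described
below for the real ErrorF.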

Incidentally, LD_DYNAMIC_WEAK alone does not appear to work, at
least on my system.  Perhaps it does if one also disables lazy
loading.  But that is clearly not what you want to have to do to
the system in order to run X.  It has only a possible diagnostic
value which is provided in a vastly more convenient form by ldd -r.

I think the way to get a standalone debugging stub framework
included in each module is to add a configure option that links in
the stub libraries and whatever instrumentation you want for
testing, but which defaults off, so the production library has its
externally linked global symbols properly undefined.
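
As a rough sketch of what the source side of such an option could look
like: the macro DEMO_ENABLE_TESTING_STUBS and the function
demo_log_error below are hypothetical names, standing in for whatever
the configure option would define and whatever symbols need stubbing.
This is only one possible arrangement, not a description of the
libXfont build.

```c
#include <assert.h>
#include <stdio.h>

#ifdef DEMO_ENABLE_TESTING_STUBS
/* Test build only: a local stand-in so the module links by itself. */
void demo_log_error(const char *msg)
{
    assert(msg != NULL);
    fprintf(stderr, "[stub] %s\n", msg);
}
#define DEMO_HAS_STUBS 1
#else
/* Default build: declaration only.  The symbol stays undefined in the
 * library and must be supplied by the program that loads it. */
void demo_log_error(const char *msg);
#define DEMO_HAS_STUBS 0
#endif

/* Lets a test harness check which kind of build it is running. */
int demo_has_stubs(void)
{
    return DEMO_HAS_STUBS;
}
```

With the macro left undefined, which is the default, no definition of
demo_log_error is emitted at all, so there is nothing for the dynamic
linker to pick by mistake.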


Unfortunately, exporting weak test stub symbols has only the
practical effect of creating multiple definitions that are not
detected at runtime.  Because the dynamic linker simply ignores the
weak attribute, it is generally impossible to predict which version
of the symbol will end up getting loaded as it depends on the order
in which sections of the code execute.  This can lead to some quite
subtle bugs which may well not be reproducible from one system to
another.  This is the price one pays for lazy loading.

I found this while chasing a crash bug which turned out to be in
the tdfx driver (see my previous post for a patch).  X crashes
after a while due to late binding of an undefined symbol (getsecs). 
This would have been easy to find, except that the null stub for
ErrorF in the libXfont package was loaded instead of the real
ErrorF function defined in the xserver, so no error message
appeared in either the log file or on stderr.

I eventually found the problem (after many many many system
reboots) by redirecting the stderr from a stripped-down test
version of xinit to a file and finally catching the missing symbol
error message from libdl itself.  I uncovered the multiple
definition problem caused by the weak symbols when I sought to
understand why I didn't get a proper error message in the X log
file.


2.  As I reported earlier, there is a version number regression
with respect to the libXfont in the previous xorg6.9.0 release. 
Because the proper links are installed by the build, this problem
doesn't show up until you run ldconfig with the old library still
in place.  Then you will revert to the older version and continue
to have crashes due to the freetype-2.2.x incompatibilities.


Both of these problems are at best producing unnecessary confusion
and wasted time.  If the practice of using weak symbols is
continued, it will surely eventually build in a nasty hard-to-find
bug, if it has not already done so.


I can submit my patch to fix up the build of libXfont, but it will
be minimal just to get my system running.  The libXfont in 6.9.0
has version 1.5.  Unless I hear otherwise, I will submit the patch
against libXfont-1.2.0 to change its shared library version to
1.5.1, and to remove all the debugging stub files from the link. 
This is certainly not a real fix, but I simply do not have the time
right now to add --enable-testing-mode (or a better name) to the
configure options.


Incidentally, other packages also have a similar version regression
problem with respect to previous xorg and Xfree86 releases.

Current (Fri Aug 11 2006) xorg libraries with version regressions:

Library       From package              Version   Latest previous

libXfont      libXfont-1.2.0            1.4.1     1.5   xorg-6.9.0
libICE        libICE-X11R7.1-1.0.1      6.3.0     6.4   xorg-6.9.0
libxrx        xrx-1.0.1                 0.0.0     6.8   xorg-6.9.0
libxrxnest    xrx-1.0.1                 0.0.0     6.8   xorg-6.9.0
libXfontcache libXfont-1.2.0            1.0.0     1.2   Xfree86-4.5.0
libXxf86dga   libXxf86dga-X11R7.1-1.0.1 1.0.0     1.1   Xfree86-4.5.0

Because the major version numbers are different, the library
symlinks for libxrx and libxrxnest are not disturbed by ldconfig. 
For the others however, running ldconfig resets the links to the
previous version, if present.
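
To make the ordering concrete, here is a simplified model of comparing
two dotted version strings numerically, component by component.  It is
only an illustration: demo_version_cmp is a made-up helper, and I am
not claiming this is the exact comparison ldconfig performs.

```c
#include <assert.h>
#include <stdlib.h>

/* Compare dotted version strings such as "1.4.1" and "1.5".
 * Returns -1, 0 or 1.  Missing components count as 0. */
int demo_version_cmp(const char *a, const char *b)
{
    assert(a != NULL && b != NULL);
    while (*a != '\0' || *b != '\0') {
        char *end_a, *end_b;
        long va = strtol(a, &end_a, 10);
        long vb = strtol(b, &end_b, 10);

        if (va != vb)
            return va < vb ? -1 : 1;
        if (end_a == a && end_b == b)
            break;          /* no digits left to parse on either side */

        /* Step past the component and its trailing dot, if any. */
        a = (*end_a == '.') ? end_a + 1 : end_a;
        b = (*end_b == '.') ? end_b + 1 : end_b;
    }
    return 0;
}
```

Under this model demo_version_cmp("1.4.1", "1.5") is negative, so the
old 1.5 library from 6.9.0 sorts above the new 1.4.1 one, which
matches the libXfont row in the table.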


Regards,

Joseph




