[Xcb] [ANNOUNCE] xcb-util 0.3.9

Tue Jun 5 18:35:24 PDT 2012

[Side note: your mails show up with ridiculously long unwrapped lines,
which makes them harder to reply to.]

On Tue, Jun 05, 2012 at 10:37:45AM -0700, Jeremy Huddleston wrote:
> On Jun 4, 2012, at 11:22 PM, Josh Triplett <josh at joshtriplett.org> wrote:
> > On Mon, Jun 04, 2012 at 04:24:10PM -0700, Jeremy Huddleston wrote:
> >> On Jun 4, 2012, at 3:25 PM, Josh Triplett <josh at joshtriplett.org> wrote:
> >>> On Mon, Jun 04, 2012 at 02:52:51PM -0700, Jeremy Huddleston wrote:
> >>>> On Jun 4, 2012, at 2:19 PM, Adam Jackson <ajax at redhat.com> wrote:
> >>>>> On Mon, 2012-06-04 at 14:03 -0700, Jeremy Huddleston wrote:
> >>>>>> On Jun 4, 2012, at 1:34 PM, Julien Cristau <jcristau at debian.org> wrote:
> >>>>>> Think about this from the libc perspective.  libc *may have* strlcat
> >>>>>> or not, but they're named the same because all functions in libc have
> >>>>>> consistent signatures.
> >>>>> 
> >>>>> A libc that had strlcat once, and then removed it, would no longer have
> >>>>> the same ABI.  An application that had successfully linked against the
> >>>>> old libc's strlcat would reasonably expect it to be present at runtime
> >>>>> too.
> >>>> 
> >>>> That argument breaks down when you reverse it.  The "rules" state that the SONAME should not change when adding APIs.  If all you're basing this on is SONAME, then there is absolutely no difference between the adding and removing case.  If I link against a "newer" libc which has strlcat, then by your argument, I'd expect strlcat to be present on any libc matching that SONAME.  When I run my application with the older libc without strlcat, it will fail to find it.
> >>> 
> >>> That represents the difference between major and minor version changes.
> >>> When you add a new function (or otherwise extend the ABI, such as by
> >>> adding new flags to a flags parameter), you increase the minor version,
> >>> so that applications built against the new library won't run with the
> >>> old one, but applications built against the old one (and thus not
> >>> expecting the new function) will still work with the new library.
> >>> However, when you *remove* a function, applications built against the
> >>> old library will not work with the new one, so you have to bump the
> >>> major version.
> >> 
> >> I guess this is where the "OS X" paradigm and the GNU paradigm just
> >> break down.  Is there actually annotation done to specify that a
> >> specific function was added for a given minor version bump of a
> >> library?  Does the loader just require that the runtime version be >=
> >> the linktime version (that seems particularly dangerous to me)?  How
> >> is this actually enforced in practice?  My understanding was that the
> >> minor version was nothing more than extra bits as a guide to the user
> >> or packager and that there wasn't actually any "real" mechanism in
> >> place to deal with this properly (ie weak linking the new symbols).
> > 
> > The dynamic linker has a couple of independent mechanisms to handle
> > this: the library major/minor version, and symbol versioning.
> > 
> > The major/minor version have a simple rule: a program or library linked
> > against version $linkmajor.$linkminor of libfoo can run with any libfoo
> > with $runtimemajor == $linkmajor and $runtimeminor >= $linkminor.  With
> > ELF on Linux, the minor number requirement just exists as a convention,
> > not a dynamic linker requirement, though some libraries explicitly
> > ask the dynamic linker to enforce it using symbol versioning.
> > 
> > The dynamic linker *does* enforce the availability of symbols, though.
> > If your program or library links against libfoo and expects to find
> > foo_blah, and a newer libfoo removes foo_blah, the linker will complain
> > and not load your program or library.
> > 
> > If you want to do something more complicated, you can use symbol
> > versioning, which lets you say things like "a program expecting the
> > foo_blah symbol from libfoo.so.1.2 should use old_foo_blah instead; a
> > program expecting foo_blah from libfoo.so.1.3 should use foo_blah".
> > glibc does a ton of that, to give it much more fine-grained versioning
> > so that programs built against one version of glibc will run against the
> > widest possible selection of glibc versions.  More simple examples
> > include enforcing the minor version rule, which almost every library
> > using symbol versioning enforces: programs linking against libfoo.so.1.3
> > will want various foo_* symbols from libfoo.so.1.3, and only
> > libfoo.so.1.3 and newer minor versions will have mappings for those
> > symbols; older minor versions won't have those mappings, so the linker
> > knows to fail early.
> 
> Is this actually done in practice (outside of glibc)?  As you describe it, it's essentially exactly the same mechanism we use in OS X, but I haven't actually seen this done in X.org (or many other autotools/glibtool) projects.

A few common libraries do this; a quick glance on my system suggests
that ALSA, libbsd, openssl, and curl do.

> Take libXi for example, since it's probably fresh in most of our minds.  With Xi2, libXi added a ton of new functionality and remained backwards compatible.  Thus, the minor version of the library was bumped (http://cgit.freedesktop.org/xorg/lib/libXi/commit/?id=0d19a3ec942aedf5432a9bda1e80f29f7186ce5b) from 6.0 to 6.1.
> 
> As far as I can tell, though, symbols are not annotated with "available in 6.1 and later" so the only difference is the minor version bump, and as you said, Linux/ELF essentially ignores this ... so in this case, there is no *practical* difference between going forward with an added symbol and going backwards with a removed symbol.  To make this a bit more clear (words fail me, sorry), consider:
> 
> case 1:
> libABC.0.0 contains symbols a, b, c
> libABC.1.0 contains symbols a, b
> libABC.1.1 contains symbols a, b, c (c was added back in but not annotated as new in 1.1)
> 
> case 2:
> libABC.0.0 contains symbols a, b, c
> libABC.0.1 contains symbols a, b
> 
> In case 1, libABC.1.1 and libABC.0.0 are identical.  Since Linux/ELF
> ignores minor version, there is no effective difference between
> linking against libABC.1.1 and running with libABC.1.0 in case 1
> compared to linking against libABC.0.0 and running with libABC.0.1 in
> case 2.

I agree with your statement that from a functional standpoint this holds
true: the linker doesn't seem to enforce the minor version rule, so you
can build against a newer library and run with an older one, or vice
versa, as long as the major version matches.  The linker will complain
if you use a symbol that it can't resolve, though.
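
To make that concrete, here is a rough sketch in Python using your
libABC example.  It is only a toy model that treats a library as the
set of symbols it exports; the helper names (unresolved, try_load) are
made up for illustration, and this is not how ld.so is implemented.

```python
# Toy model: a library is just the set of symbols it exports.
# Library names and symbols follow the libABC example quoted above.
CASE_1 = {
    "libABC.0.0": {"a", "b", "c"},
    "libABC.1.0": {"a", "b"},
    "libABC.1.1": {"a", "b", "c"},
}
CASE_2 = {
    "libABC.0.0": {"a", "b", "c"},
    "libABC.0.1": {"a", "b"},
}


def unresolved(needed, exported):
    """Return the symbols a program needs that the library lacks."""
    return sorted(set(needed) - set(exported))


def try_load(needed, exported):
    """Mimic the loader: refuse to load if any symbol is unresolved."""
    missing = unresolved(needed, exported)
    if missing:
        raise RuntimeError("undefined symbol(s): " + ", ".join(missing))
    return True


# A hypothetical program that used all three symbols at link time.
program_needs = {"a", "b", "c"}
```

In this model, try_load(program_needs, CASE_1["libABC.1.0"]) and
try_load(program_needs, CASE_2["libABC.0.1"]) fail in exactly the same
way: the version numbers never enter into it, only the missing symbol
does.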

Linux distributions primarily want to avoid the scenario in which the
dynamic linker considers a library version acceptable but the program
requiring that library crashes or otherwise breaks at runtime.  That
doesn't happen with the addition or removal of symbols, because the
dynamic linker will attempt to resolve those symbols and complain if it
can't.  In that scenario, the rule of bumping the major version when
removing functions represents a "don't do that" convention: it keeps
users from being surprised by programs that suddenly refuse to run with
a library that claims compatibility.  Upgrading to a newer version of a
library with the same major number should never cause existing programs
to stop working.

In particular, the minor version serves as a hint to the programmer that
if they link against libABC.so.1.1, they might or might not successfully
run against libABC.so.1.0, depending on what symbols they used.  On the
other hand, a programmer expects that a program linked against
libABC.so.1.0 should *always* work with libABC.so.1.1, but won't
necessarily work with libABC.so.2.0.  Removing or changing symbols
breaks that assumption; adding symbols doesn't.
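
Written as a check, the expectation a programmer gets from the version
numbers looks roughly like this.  The function name is made up for
illustration, and a False result only means "no promise", not "will
fail"; as noted, a program linked against 1.1 may still happen to run
with 1.0.

```python
def expected_to_work(link_version, runtime_version):
    """Return True if convention promises the runtime library will work.

    Versions are (major, minor) tuples.  False means the convention makes
    no promise, not that loading is certain to fail.
    """
    link_major, link_minor = link_version
    run_major, run_minor = runtime_version
    return run_major == link_major and run_minor >= link_minor
```

So expected_to_work((1, 0), (1, 1)) holds, while a program linked
against (1, 1) gets no promise about (1, 0), and one linked against
(1, 0) gets none about (2, 0).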

Libraries typically introduce symbol versioning to handle more complex
cases that the linker can't catch on its own, namely a *change* to an
existing symbol, such as by adding a new parameter to a function, or
more subtly by changing the layout of a data structure used by a
function.  In that scenario, the linker will see no problem, but the
program will break at runtime, precisely as if you cast a function
pointer to a different type with different parameters and tried to call
it (because, effectively, you did).  Symbol versioning lets you maintain
ABI (though often not API) compatibility in this case, by having old
programs use a compatibility symbol that knows how to handle the old
calling signature.
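
A toy model of that lookup, again in Python: the exports are keyed by
(symbol name, version) rather than by name alone.  The version node
names (LIBFOO_1.2, LIBFOO_1.3), the extra "flags" parameter, and the
resolve helper are all invented for this sketch; a real library would
express this with a linker version script rather than a dictionary.

```python
def foo_blah(x, flags):
    """Current implementation, with a (hypothetical) added parameter."""
    return (x, flags)


def old_foo_blah(x):
    """Compatibility entry point that keeps the old one-argument signature."""
    return foo_blah(x, flags=0)


# (symbol name, version node) -> implementation.  Node names are made up.
LIBFOO_EXPORTS = {
    ("foo_blah", "LIBFOO_1.2"): old_foo_blah,
    ("foo_blah", "LIBFOO_1.3"): foo_blah,
}


def resolve(name, version, exports=LIBFOO_EXPORTS):
    """Look up a versioned symbol, failing early if the version is absent."""
    try:
        return exports[(name, version)]
    except KeyError:
        raise LookupError(
            "version %s of symbol %s not found" % (version, name)
        ) from None
```

A program built against the old library binds to
resolve("foo_blah", "LIBFOO_1.2") and keeps calling it with one
argument; a program built against the new one binds to the LIBFOO_1.3
entry and passes two.  An older library whose table lacks the
LIBFOO_1.3 entry makes resolve fail up front instead of letting the
call go wrong later.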

However, the details of how much the dynamic linker enforces and how
much just occurs by convention don't change the general rules for how to
set library major and minor versions: always bump the major version if
you remove symbols or change the meaning of existing symbols.
Exceptions to that rule lead to unexpected breakage, even when
attempting to claim that a symbol doesn't represent public API, and such
exceptions should only occur with careful consideration of the breakage
they'll introduce.
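
For what it's worth, the maintainer's side of the convention fits in a
few lines.  This is only a sketch with made-up names: it compares
exported symbol sets, and since a change in a symbol's meaning can't be
detected from names alone, the caller has to supply that as a flag.

```python
def next_version(version, old_symbols, new_symbols, changed_meaning=False):
    """Pick the next (major, minor) under the convention described above."""
    major, minor = version
    removed = set(old_symbols) - set(new_symbols)
    added = set(new_symbols) - set(old_symbols)
    if removed or changed_meaning:
        # Existing programs may stop working: new major version.
        return (major + 1, 0)
    if added:
        # Purely additive change: old programs still work.
        return (major, minor + 1)
    return (major, minor)
```

Dropping symbol c from a library at (0, 0) gives (1, 0) here, which is
your case 1 rather than your case 2.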

- Josh Triplett