[Xcb] [ANNOUNCE] xcb-util 0.3.9

Thu Jun 7 01:22:47 PDT 2012

On Jun 6, 2012, at 15:13, Josh Triplett <josh at joshtriplett.org> wrote:

> On Wed, Jun 06, 2012 at 02:10:47PM -0700, Jeremy Huddleston wrote:
>> On Jun 6, 2012, at 4:04 AM, Josh Triplett <josh at joshtriplett.org> wrote:
>>> On Tue, Jun 05, 2012 at 10:03:44PM -0700, Jeremy Huddleston wrote:
>>>> On Jun 5, 2012, at 6:35 PM, Josh Triplett <josh at joshtriplett.org> wrote:
>>>>> I agree with your statement that from a functional standpoint this holds
>>>>> true: the linker doesn't seem to enforce the minor version rule, so you
>>>>> can build against a newer library and run with an older one, or vice
>>>>> versa, as long as the major version matches.  The linker will complain
>>>>> if you use a symbol that it can't resolve, though.
>>>> 
>>>> As it should (well unless the symbol is weak and can be checked for at
>>>> runtime).
>>> 
>>> True, though for most symbols that doesn't make sense, since you'd have
>>> to write things like "if (!xcb_useful_function) aieee();". :)
>> 
>> Well, if aieee() is really needed, then you wouldn't check for it, you'd
>> just use it and bail out with the dynamic linker complaining that it
>> couldn't resolve.
> 
> You don't necessarily have that option with a weak symbol.  Unless you
> mean that the program can, on a symbol-by-symbol basis, choose whether
> to use the symbol as weak or non-weak?  That seems feasible, but not
> compatible with how most library header files normally define function
> prototypes.

Well, it's how we do it on OS X.  For example:

ssize_t getdelim(char ** __restrict, size_t * __restrict, int, FILE * __restrict)
__OSX_AVAILABLE_STARTING(__MAC_10_7, __IPHONE_4_3);

getdelim will be weak if the developer sets OS X 10.6 as a deployment target.
If 10.7 or later is used as a deployment target, the symbol will not be weak.
__OSX_AVAILABLE_STARTING is essentially a macro that evaluates to nothing or
__attribute__((weak_import)).
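
As a rough illustration of the idea (not Apple's actual headers), a library
could ship a macro like the one below.  Every name here (MYLIB_MIN_REQUIRED,
MYLIB_AVAILABLE_1_1, mylib_new_feature, feature_or_fallback) is hypothetical,
and the sketch assumes GCC or Clang on an ELF system, where the attribute is
spelled "weak" rather than "weak_import":

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical version constants for an imaginary library "mylib". */
#define MYLIB_VERSION_1_0 100
#define MYLIB_VERSION_1_1 110

/* The oldest library version the program promises to run against
 * (the analogue of a deployment target). Defaults to the oldest. */
#ifndef MYLIB_MIN_REQUIRED
#define MYLIB_MIN_REQUIRED MYLIB_VERSION_1_0
#endif

static_assert(MYLIB_MIN_REQUIRED >= MYLIB_VERSION_1_0,
              "unsupported minimum library version");

/* Symbols introduced in 1.1 are weak only when an older version
 * must still be supported; otherwise the macro expands to nothing. */
#if MYLIB_MIN_REQUIRED < MYLIB_VERSION_1_1
#define MYLIB_AVAILABLE_1_1 __attribute__((weak))
#else
#define MYLIB_AVAILABLE_1_1
#endif

/* Hypothetical function that first appeared in mylib 1.1. */
extern int mylib_new_feature(int x) MYLIB_AVAILABLE_1_1;

/* Doubles x, using the new entry point only if it resolved at load time. */
int feature_or_fallback(int x)
{
    if (mylib_new_feature != NULL)
        return mylib_new_feature(x);
    return x + x;   /* older codepath */
}
```

Building with -DMYLIB_MIN_REQUIRED=110 would make the declaration non-weak,
so a missing symbol becomes a link or load error instead of a NULL pointer.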

> 
>> If you can avoid using it, you would do something like:
>> 
>> if (strlcpy) {
>>   strlcpy(...);
>> } else {
>>   strncpy(...);
>>   ...;
>> }
> 
> In that case, I'd suggest just using strncpy unconditionally, or writing
> your own version of strlcpy with a compatible interface and linking it
> in if libc doesn't have one.

strlcpy is a contrived example, and yes, in that case you probably want to
just use strncpy instead.  But consider something like:

if (optimized_new_codepath) {
   optimized_new_codepath(...);
} else {
   less_optimal_codepath(...);
}

>  I tend to subscribe to the Linux kernel's
> style of never including #ifdef in .c files, and I consider code like
> the above gross for similar reasons; it strongly suggests the need for
> an abstraction layer to not have to deal with that at each call site.

Well then your abstraction layer will need to do that exact same logic.
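
A minimal sketch of what such a layer might look like, assuming GCC or Clang
on an ELF system.  The two codepath names come from the snippet above;
sum_values and the summing behaviour are made up for illustration.  The
NULL check has not gone away, it has only moved into one place:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical: exported only by newer versions of the library, so it is
 * declared weak and may resolve to NULL at load time. */
extern long optimized_new_codepath(const int *v, size_t n)
    __attribute__((weak));

/* Always available: plain loop that sums the array. */
static long less_optimal_codepath(const int *v, size_t n)
{
    long sum = 0;
    for (size_t i = 0; i < n; i++)
        sum += v[i];
    return sum;
}

/* The abstraction layer: call sites use this unconditionally. */
long sum_values(const int *v, size_t n)
{
    assert(v != NULL || n == 0);
    if (optimized_new_codepath != NULL)
        return optimized_new_codepath(v, n);
    return less_optimal_codepath(v, n);
}
```

Callers just write sum_values(v, n) and never look at the function pointer.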

>>> Better to use symbol versioning or similar to let the dynamic linker
>>> tell you at the start of your program that a symbol doesn't exist.  
>> 
>> Why?  If you can do something in the case that it doesn't exist, that
>> should be an option.
> 
> That falls under the case I mentioned below ("Or, if you really have
> written your program so that you can cope with the absence of some
> functionality,").  That doesn't represent the common case, though.

It's a very common case for OS X developers that want to support new
technologies on newer OS versions but still want to run on older OS
versions.  For example, when Snow Leopard first came out, developers
were able to use this functionality to use their new dispatch codepath
if GCD was available and pthreads if it was not.

>>> Or,
>>> if you really have written your program so that you can cope with the
>>> absence of some functionality, consider either using dlopen/dlsym to
>>> make that explicit or otherwise having a way to easily verify the
>>> functionality you need without having to test every symbol for NULLness.
>> 
>> No, that's a horrible solution.  The code snippet above (for strlcpy)
>> makes it easily accessible for developers.  Doing something with dlsym
>> is ugly in comparison (and IMO would just cause developers to NOT use
>> the new functionality):
>> 
>> size_t (*strlcpy_maybe)(char *, const char *, size_t);
>> strlcpy_maybe = dlsym(RTLD_DEFAULT, "strlcpy");
>> if (strlcpy_maybe) {
>>   strlcpy_maybe(...);
>> } else {
>>   strncpy(...);
>>   ...;
>> }
> 
> People expect that dlsym might fail.  For the most part, people *don't*
> expect that a function defined in a header file might point to NULL;
> they'll just call it, and segfault when it points to NULL.  Plus, if you
> use dlopen/dlsym, you can cope with the complete absence of a library on
> the system.

Well then if they want to use a function unconditionally that is first
provided in libA.1.2, then they can't run on libA.1.1.  If they want to use
that functionality *AND* be able to run on libA.1.1, they need to do
something, and developers seem to have been fine with that case for the past
decade or so...
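
For comparison, here is a fuller sketch of the dlsym route discussed above.
It uses the POSIX dlopen(NULL, ...) handle instead of RTLD_DEFAULT so that it
does not depend on _GNU_SOURCE; copy_string and lookup_strlcpy are
hypothetical names, and older glibc versions may need -ldl when linking:

```c
#include <assert.h>
#include <dlfcn.h>
#include <stddef.h>
#include <string.h>

typedef size_t (*strlcpy_fn)(char *, const char *, size_t);

/* Look strlcpy up in the running program and its loaded libraries.
 * Returns NULL if the C library does not provide it. */
static strlcpy_fn lookup_strlcpy(void)
{
    void *self = dlopen(NULL, RTLD_LAZY);
    if (self == NULL)
        return NULL;
    return (strlcpy_fn)dlsym(self, "strlcpy");
}

/* Copy src into dst (at most size - 1 bytes, always NUL-terminated when
 * size > 0) and return strlen(src), whether or not strlcpy exists. */
size_t copy_string(char *dst, const char *src, size_t size)
{
    assert(dst != NULL && src != NULL);

    strlcpy_fn strlcpy_maybe = lookup_strlcpy();
    if (strlcpy_maybe != NULL)
        return strlcpy_maybe(dst, src, size);

    /* Fallback with the same semantics as strlcpy. */
    size_t len = strlen(src);
    if (size > 0) {
        size_t n = len < size - 1 ? len : size - 1;
        memcpy(dst, src, n);
        dst[n] = '\0';
    }
    return len;
}
```

Either branch gives the caller the same result; the difference is only in
how much boilerplate sits between the caller and the real symbol.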

> If you define an entirely new interface, you can define it using weak
> symbols, but programmers will still trip over it unless you provide a
> convenient way to say "no, really, I don't want to do runtime
> detection, I just want to refuse to run if the functionality I expect
> doesn't exist".

Yes, that is why we use a macro to determine if it's weak or not.  If the
developer says, no I want to use this and can't run without it, they set
the deployment target to a newer OS version.  If they want to support OS
versions without the functionality, they set it appropriately, and the
symbols are weak.

> Among other things, I'd rather have an interface like the syscall
> interface, where calling a non-existent syscall *works* and produces
> ENOSYS.

>  Then, code that always says "die if the syscall fails" will
> die, and code that uses the syscall for optional functionality will
> gracefully fall back.  More importantly, I then don't need a conditional
> at every callsite.

Well you should be checking the return value and acting appropriately.  In
your case, you still need to do:

if (syscall(...) == -1 && errno == ENOSYS) {
    do_fallback();
}
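
Spelled out a little more, assuming Linux and glibc's syscall() wrapper,
which returns -1 and reports the error through errno.  The helper names
(do_fallback, pid_via_syscall) are invented for this sketch:

```c
#define _GNU_SOURCE          /* make sure syscall() is declared */
#include <assert.h>
#include <errno.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Stand-in fallback: get the same answer through the libc wrapper. */
static long do_fallback(void)
{
    return (long)getpid();
}

/* Invoke a no-argument syscall by number; if the kernel does not
 * implement it (ENOSYS), take the fallback path instead. */
long pid_via_syscall(long number)
{
    assert(number >= 0);     /* syscall numbers are non-negative */

    errno = 0;
    long ret = syscall(number);
    if (ret == -1 && errno == ENOSYS)
        return do_fallback();
    return ret;
}
```

The call itself is unconditional, but the caller still has to inspect the
result to decide whether the fallback is needed.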

>>>>> In particular, the minor version serves as a hint to the programmer that
>>>>> if they link against libABC.so.1.1, they might or might not successfully
>>>>> run against libABC.so.1.0, depending on what symbols they used.
>>>> 
>>>> IMO, that should be annotated in header files in a way that allows those
>>>> symbols to be weak linked and checked for at runtime (and thus go down an
>>>> alternative codepath if unavailable).
>>> 
>>> Not unless that gets wrapped in some kind of interface that avoids the
>>> need to check all used symbols against NULL before using them; I'd
>>> prefer to make that the dynamic linker's job.
>> 
>> Yes, the dynamic linker will bail on you if you try to actually *use* the
>> symbol, but you still need to check it.  I'm not sure what kind of interface
>> you want.  This seems rather straightforward to me:
>> 
>> #if I_WANT_TO_SUPPORT_OLD_LIBRARY_VERSIONS_WITHOUT_STRLCPY
>> if (strlcpy) {
>>   strlcpy(...);
>> } else
>> #endif
>> {
>>   strncpy(...);
>>   ...;
>> }
> 
> No, I'd rather the interface looked like this:
> 
> strlcpy(...);
> 
> Or, if I don't want to count on that, and I don't want to provide a
> compatible strlcpy replacement via autoconf or similar:
> 
> strncpy(...);
> extra_pile_of_ugly(...);
> 
> That applies to the case where I need something with strlcpy's
> functionality unconditionally, and only the implementation varies.  In
> the case of something like XInput 2 support, a library that wants to use
> XInput 2 iff available could use dlsym to use it conditionally (in which
> case they work even with the library unavailable).

dlsym is a horrible interface for clients of a library.  They should be
able to just use the real symbols, and the dlsym-cruft should be "done" by
the dynamic linker.

> Such a library could
> also use weak symbols, though that seems both more error-prone and more
> difficult to specify in a header file (you'd need either separate
> headers for weak and non-weak usage or some kind of #define WEAK_XI2
> before including the header file).

You don't need two header files, you just need something like the OS X
availability macros which handle it based on the version the symbol
entered in, and the version that we want to support running on.

> Personally, I'd rather have a wrapper interface that looks like
> unconditional function calls with error handling, rather than function
> calls conditional on the function pointer itself.  Any approach that
> makes every program and library author write their *own* wrappers seems
> like a problem; why force everyone to write duplicate code for the
> common case?

I would argue that it's not the common case.  The common case is probably
supporting the most recent version of the library and not caring about
running on older versions.  Also, the error-case is not always trivial, and
I prefer to not have "code" in header files.

> But in preference to all of those approaches, I'd rather just require
> XI2 support in the library, and only conditionally handle the case where
> the server doesn't have it.

Yeah...

> In any case, this seems like a far tangent from the issue of removing
> symbols from a library. :)

Yeah, but I often get on tangents.  You often get on tangents... I guess
we're rather obtuse individuals, although Nancy sometimes calls me acute...

Ok, stopping now...

>> Symbol versioning is very useful for dealing with the flat namespace
>> problem.  For example, consider an application that links against libA
>> and libB.  libA links against libC.1 and libB links against libC.2. 
>> Both libC.1 and libC.2 provide different versions of funC().  In a flat
>> namespace without versioning, this situation would not work. funC@LIBC
>> would collide.  ELF solves this by versioning the symbols in the global
>> symbol list.  On OS X, we use a 2-level namespace, so versioning isn't
>> necessary.
> 
> Interesting.  So, libA will reference "funC in libC.1" and libB will
> reference "funC in libC.2", using a namespacing mechanism orthogonal to
> variants?

Yes.  We support flat namespace, but it's not default, and it's not
recommended.

When resolving a symbol in libB, it will only see symbols from its
dependencies.  In fact, the Mach-O load commands essentially specify the
full path to the dylib that provides the symbol, so even if I have
/usr/lib/libB.1.dylib and /usr/local/lib/libB.1.dylib, they can both be
used at the same time by different libraries.  There can be problems if
clients assume that they can pass objects from one into the other (eg, a
client of libB passing an objBPtr managed by /usr/lib/libB.1.dylib to
another client of libB which was using /usr/local/lib/libB.1.dylib), but
that isn't something hit too much in practice.