COMPOUND_TEXT versus UTF8_STRING

mcnichol at austin.ibm.com
Wed Sep 22 08:02:32 PDT 2004


> Date: Wed, 22 Sep 2004 14:26:56 +0100
> From: Markus Kuhn <Markus.Kuhn at cl.cam.ac.uk>
> Subject: Re: COMPOUND_TEXT versus UTF8_STRING
> 
> Sebastien wrote on 2004-09-22 09:07 UTC:
> > Where can I find a converter function or library which supports the
> > following string conversions:
> > - COMPOUND_TEXT to local encoding (defined by $LANG)
> > - local encoding (defined by $LANG) to COMPOUND_TEXT
> > - COMPOUND_TEXT to UTF8
> > - UTF8 to COMPOUND_TEXT
> 
> Which reminds me to bring up the underlying more fundamental question,
> namely the future of COMPOUND_TEXT.
> 
> COMPOUND_TEXT is an implementation of ISO 2022, a horrendously complex
> and impractical way of switching between multiple character sets within
> the same string, that clearly failed on the market place, and is no
> longer used today except for some CJK email. Mule Emacs used something
> similar for a while, but they are now moving to UTF-8 as the sole
> internal encoding for Emacs 23. All major web browsers have done the
> same long ago.
> 
> COMPOUND_TEXT is in my opinion obsolete, and we should start thinking
> about a way to smoothly deprecate it from the standard, and make the way
> free for universally replacing it with the so much simpler and more
> practical UTF8_STRING. ISO 2022 is dead, and so should COMPOUND_TEXT be.


I can't say I disagree with this, at least in theory.
However, AIX, and I suspect others, still use COMPOUND_TEXT.

> 
> At present, UTF8_STRING is allocated in the X.Org registry, but none of the
> X Standards mention it yet. Some start was made a while ago in this direction
> in XFree86, most notably
> 
>   http://www.pps.jussieu.fr/~jch/software/UTF8_STRING/
>   http://www.cl.cam.ac.uk/~mgk25/unicode.html#x11
> 
> and it would be nice to see this taken up in the X11 standards.
> 
> In particular, one question I am interested in is:
> 
> Can we simply allow the use of UTF8_STRING in properties such as
> WM_NAME, WM_ICON_NAME and WM_CLIENT_MACHINE in a future version of the
> ICCCM?


NO!! NO!! NO!!

Unless EVERYBODY changes at the exact same time, things will be very broken.
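
As a rough illustration of the breakage, here is a small Python sketch of what
might happen when a newer client stores UTF-8 in WM_NAME and an older reader
still assumes the ICCCM STRING type (Latin-1). The function name
legacy_read_wm_name is hypothetical and does not model any particular window
manager; it only assumes that the reader decodes the bytes as Latin-1.

```python
def legacy_read_wm_name(data: bytes) -> str:
    """Hypothetical older reader: assumes WM_NAME holds Latin-1 (STRING) text."""
    return data.decode("latin-1")


# A newer client writes the title as UTF-8 into the old property.
title = "Café"
stored = title.encode("utf-8")

# The older reader decodes each UTF-8 byte as a separate Latin-1 character.
shown = legacy_read_wm_name(stored)
print(shown)  # CafÃ©
```

Pure ASCII titles survive this unchanged, which is why the problem can go
unnoticed until someone uses a non-ASCII title.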


> 
> Or is it necessary to introduce separate properties along the lines of
> the _NET_WM_NAME, _NET_WM_ICON_NAME, etc. suggested in
> 
>   http://freedesktop.org/Standards/wm-spec/1.3/ar01s05.html
> 
> for reasonable backwards compatibility? What is the best practice here?
> 
> https://freedesktop.org/bugzilla/show_bug.cgi?id=271

New property names are definitely the way to go.
Both the new and the old properties will probably be required for some time to ensure compatibility.
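
A minimal sketch of that dual-property approach, in Python, might look like the
following. It only builds the values a client would hand to the X server; it
does not talk to a display. The helper name title_properties is hypothetical,
and the legacy fallback is an assumption made to keep the sketch short: it
emits STRING (Latin-1) with "?" for anything that does not fit, where a real
client would more likely fall back to COMPOUND_TEXT.

```python
def title_properties(title: str) -> dict:
    """Return {property name: (type atom name, value bytes)} for a window title."""
    return {
        # New property, as in the freedesktop.org wm-spec: always UTF-8.
        "_NET_WM_NAME": ("UTF8_STRING", title.encode("utf-8")),
        # Old ICCCM property, kept for readers that predate UTF8_STRING.
        # Assumption for this sketch: Latin-1 only, unencodable characters
        # become "?" instead of being sent as COMPOUND_TEXT.
        "WM_NAME": ("STRING", title.encode("latin-1", errors="replace")),
    }


for name, (type_atom, value) in title_properties("Café").items():
    print(name, type_atom, value)
```

Older readers keep looking at WM_NAME and see what they always saw, while
newer ones can prefer _NET_WM_NAME when it is present.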


Dan


