Problem with mapping a key to multiple characters (Unicode + diacritic symbol)

Thu Mar 14 19:53:33 UTC 2019

On Thu, Mar 14, 2019 at 07:42:14PM +0100, Pierre-Luc Angles wrote:
> Dear Ilya, dear all,
> 
> Thanks again.
> 
> My full xkb mapping is the following. I have not added all the Compose keys
> but just i_breve_below to test it (note that it is not an i without a dot but a
> normal i with a breve below, and that the i without dot is combined with a
> half ring above).
> 
> partial alphanumeric_keys
> xkb_symbols "oss" {
> 
>     include "latin"
>     include "level3(ralt_switch)"
>     include "nbsp(level4n)"
>     include "keypad(oss)"
> 
>     name[Group1]="translitt";
> 
>     // First row
>     key <TLDE>	{ [ U1E95, U1E94, U021D, U021C ] }; // ẕ Ẕ ȝ Ȝ
>     key <AE01>	{ [ ampersand, 1, i_breve_below, U032F ] }; // & 1 i+̯
>     // U0069+U032F: i and Combining Inverted Breve Below, and dead inverted
>     // breve below; previously this was deadUnderscoreacute instead of U032F

Dear Pierre-Luc,

I see lots of cool and unusual glyphs in your keymap. It also
appears that combining characters join to the previous character,
while "conventional" dead keys seem to apply to the following
character (at least for latin text).
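To illustrate the point about combining characters joining to the
previous character, here is a minimal Python sketch using the standard
unicodedata module. The helper name split_clusters is only for
illustration, and the sample string assumes U+032F (the code point
from your keymap) typed after each base letter.

```python
import unicodedata

def split_clusters(text):
    """Group each base character with the combining marks that follow it."""
    clusters = []
    for ch in text:
        # A non-zero combining class means the mark attaches to what came before.
        if clusters and unicodedata.combining(ch) > 0:
            clusters[-1] += ch
        else:
            clusters.append(ch)
    return clusters

# "i" + U+032F, then "u" + U+032F: the mark is stored AFTER its base letter.
sample = "i\u032fu\u032f"
for cluster in split_clusters(sample):
    print([f"U+{ord(ch):04X}" for ch in cluster])
```

So a key that emits a combining character has to be pressed after the
letter, whereas a dead key is pressed before it.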

One of the "fun" things of programming is what we learn of our data,
and particularly the need from time to time to try a different
approach.  AFAICS, you have not yet said what sort of keyboard you
and your users will be using (German qwertz? American qwerty?
French azerty?) - that may make a difference about how
users are accustomed to accessing certain things, and also about
what keys could be conveniently used for accessing compose
sequences.

More importantly, we do not know which letters and diacriticals you
actually *need*.  In your first post you wrote:

<i_breve_below> : "i̯"
<u_breve_below> : "u̯"
<ı_ring_above> : "ı͗"
<I_ring_above> : "I͗"
<č_dot_below> : "č̣"
<Č_dot_below> : "Č̣"
<s_macron_below> : "s̱"
<S_macron_below> : "S̱"
<H_macron_below> : "H̱"
<h_circumflex_below> : "h̭"
<H_circumflex_below> : "H̭"

but I think we have established that the breve below is actually an
inverted breve and the ring is a right partial ring (no idea of the
correct name, but it looks different from the precomposed hook).
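One way to settle the names is to ask the Unicode database directly.
The following is only a sketch, again with Python's unicodedata
module; describe and has_precomposed are illustrative helper names.
U+032F is taken from your keymap, while U+0331 for the "macron below"
is my assumption about which code point you intend.

```python
import unicodedata

def describe(text):
    """Return (code point, Unicode name) for every character in text."""
    return [(f"U+{ord(ch):04X}", unicodedata.name(ch, "<no name>")) for ch in text]

def has_precomposed(text):
    """True if NFC normalization folds the sequence into one code point."""
    return len(unicodedata.normalize("NFC", text)) == 1

# The sequence from the keymap: i followed by U+032F.
print(describe("i\u032f"))

# Assumed code point for "macron below": U+0331.
for seq in ("h\u0331", "H\u0331"):
    print(describe(seq), has_precomposed(seq))
```

Pasting any of the sequences above into describe() shows which
combining mark was really typed, and has_precomposed() shows which
ones could be a single keysym in the keymap instead of needing a
Compose sequence.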

When I read those in my mail reader (urxvt, with selected TTF
monospace fonts, and vim for editing), combining diacriticals are
either below the initial double-quote, or between that and the
letter, but when I paste them into LibreOffice Writer to view them at
a much larger size, the positions vary depending on the font - DejaVu
Sans looks good, DejaVu Serif too (apart from a very thin "ring"),
but TeX Gyre Heros and Droid Sans Fallback put some of them at the
right of the letter.

And that is before trying to work out "which font is providing
this glyph?" in LibreOffice.

So, for combining characters you will need to agree on a font ;-)

But there is a more basic question, prompted by your use of Yogh.
I have come across that on Wikipedia in relation to Anglo-Saxon and
Scots, and I'm sure there are other uses - but it doesn't seem
likely to be used for transcribing ancient Egyptian.

No one keymap can *conveniently* map everything.  Exactly which
characters do you need to use for the Egyptian, and with which
modern language(s) ?

ĸen
-- 
  It is said that there are two great unsolved problems in computer
  science: naming, cache invalidation, and off-by-one errors.
                         -- Ben Bullock