Help with making keyboard for French (Togo) including symbols from local languages

Fri Jul 24 13:53:45 PDT 2015

Ken,

I worked with Singhala, which is categorized as Indic. It is the language
spoken on the small island of Sri Lanka -- a mix of pure Singhala and
Sanskrit. My entire effort was to make it useful for the people and
natural to type. It is wonderful to have all sorts of esoteric letters,
but how many programs accept them? Almost everyone is familiar with
English letters. Besides, the Latin-1 code set that extends English has
a special place in the digital world. Whether we like it or not,
programs are written for the developed countries whose people can
afford to pay for them. Each Latin-1 code is one byte in size. Anything
beyond that takes two bytes or more.

Singhala has a double-byte Unicode code set. Its use is limited to web
pages and emails. People rely on specialists to type it. Double-byte
codes are not accepted by older but well-developed programs that improve
the lives of ordinary people and help businesses in many ways.
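
To make the byte sizes concrete, here is a small sketch that counts the
bytes one character needs in a few encodings. The sample characters are
picked only for illustration. Note that "double-byte" matches UTF-16; in
UTF-8 the Latin-1 letters beyond ASCII take two bytes and a Sinhala-block
letter takes three.

```python
# Compare how many bytes one character needs in different encodings.
# The sample characters below are illustrative picks, not from any standard list.

def encoded_sizes(ch):
    """Return {encoding: byte count}; None where the encoding cannot hold ch."""
    sizes = {}
    for enc in ("latin-1", "utf-8", "utf-16-le"):
        try:
            sizes[enc] = len(ch.encode(enc))
        except UnicodeEncodeError:
            sizes[enc] = None
    return sizes

if __name__ == "__main__":
    # a = plain ASCII, U+00FE (þ) and U+00E6 (æ) = Latin-1,
    # U+0D9A = a letter from the Unicode Sinhala block
    for ch in ("a", "\u00fe", "\u00e6", "\u0d9a"):
        print("U+%04X" % ord(ch), encoded_sizes(ch))
```

A program that assumes one byte per character can still store þ or æ in
Latin-1, but it has no single-byte slot at all for the Sinhala block.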

If French is the Western language your groups are familiar with, I'd 
modify the AZERTY layout. We want them to be able to use their familiar 
keyboard, right? I'd next look at the speech.

With Singhala, I first assigned letters to the native sounds that
correspond to the commonly understood sounds of English letters, e.g.
't' as in get or tip, not as in the -tion ending. Then you will be left
with the other sounds that do not relate to French sounds. There may
also be French keys that do not relate to the native language. Singhala
does not have sounds for q, w, z and x, so some Singhala sounds were
assigned to them. Singhala has dental t and d. I took þ and ð for them,
borrowing from Old English and Icelandic. Then the 'a' in hat is a
distinct sound in Singhala. That matched OE æ. 'ch' as in church got c.
The palatal nasal, like Spanish eñe, got ç (because the 'ch' = c sound
is palatal).

Then Singhala has long and short vowels. In Latin, the length of a single
short vowel is a mora. The Dutch just double the letter to indicate a
two-mora vowel. That works well for Singhala (e.g. Sanskrit maaþra =
Latin mora).

The 'other' letters:
Singhala has post-nasalized vowels like those in English (e.g. hung,
sing). Shifting the vowel gets these with an acute accent (e.g.
Shift-U = ú, as in hú; Shift-I = í, as in sí). There are 'other, other'
letters: vowels that add a glottal postfix. Those get the umlaut (e.g.
Alt-U = ü, Alt-I = ï). We got rid of the Western convention of
capitalizing and used the capitals for pre-nasals (like the 't' in
American English printer).
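
The vowel scheme can be sketched as a small lookup. This is only an
illustration: the function names are made up, and only the U and I
examples come from the description above; applying the same rule to the
other vowels is an assumption.

```python
import unicodedata

# Combining marks used by the scheme described in the text.
ACUTE = "\u0301"      # Shift + vowel: post-nasalized vowel
DIAERESIS = "\u0308"  # Alt + vowel: vowel with glottal postfix

def vowel_key(vowel, modifier=None):
    """Return the letter produced by a vowel key (hypothetical helper).

    modifier is None, "shift" or "alt".
    """
    if modifier is None:
        return vowel
    if modifier == "shift":
        mark = ACUTE
    elif modifier == "alt":
        mark = DIAERESIS
    else:
        raise ValueError("unknown modifier: %r" % (modifier,))
    # NFC folds vowel + mark into the single Latin-1 letter (ú, ï, ...).
    return unicodedata.normalize("NFC", vowel + mark)

def long_vowel(vowel):
    """Long (two-mora) vowels are written by doubling the letter."""
    return vowel * 2

if __name__ == "__main__":
    for v in "aeiou":
        print(v, long_vowel(v), vowel_key(v, "shift"), vowel_key(v, "alt"))
```

Every result for a, e, i, o, u is a single Latin-1 letter, which is the
point of the scheme: nothing leaves the one-byte code set.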

The entire effort was to let the user work with the regular keyboard
with only a few changes and to fit the complete phonology into that
framework. We have a compromise solution for Singhala that works on all
three platforms.

So, that's what I did.

Good luck with your wonderful project to save languages.

On 7/24/2015 12:05 PM, Ken Moffat wrote:
> On Fri, Jul 24, 2015 at 12:30:24PM +0200, Mats Blakstad wrote:
>> Hi
>>
>> I've just started to learn about XKB. Sorry I post my question here if it
>> is the wrong place, I couldn't find any forum!
>>
>> I work on a project for local languages in Togo, we're working on 25
>> languages there (en.globalbility.org). We want to make a new keyboard that
>> is based on the French one and includes symbols for local languages. I've
>> tested with the XKB system inside Ubuntu and made a first test keyboard
>> (really not a final version, just to test it).
>>
>> I have a problem, as several of the languages use tone marks on their
>> symbols, like acute accent ( ´ ), grave accent ( ` ), circumflex ( ˆ ),
>> tilde ( ~ ) and macron ( ¯ ) on top of them.
>>
>> Here are the extra symbols I've added to the keyboard:
>> ʒ ɛ ǝ ƴ ʋ ɩ ɔ ʊ ɗ ɖ ƒ ɣ ɦ ɲ ɓ ŋ
>>
>> It should be possible to write them like this (examples from texts we have
>> produced):
>> ί ɔ́ ɛ́
>> ɛ̀ ɔ̀
>> ɔ̃ ɛ̃ ɔ̃ ĩ
>> ɛ̄
>> î
>>
>> How can I modify these kinds of combination symbols?
>>
>> I try to press AltGr+` and then AltGr+ɛ to get ɛ̀, but then nothing
>> happens, and afterwards I only get ɛ (without grave accent). Why? Is it
>> because the keyboard misunderstands when I press AltGr two times? How can
>> I solve this?
>>
>> Here is the small file I made to test it:
>>
>> default  partial alphanumeric_keys
>> xkb_symbols "basic" {
>>
>>      // First we include the whole French keyboard
>>      include "fr"
>>
>>      // We give our new keyboard a name
>>      name[Group1]="French (Togo)";
>>
>>      // Then we start to change the keys on the French keyboard.
>>      // Each key has a unique number.
>>      // Each key has 4 values: default, shift, altgr, shift+altgr
>>      // (altgr is the right side 'alt' key)
>>
>>      // First row
>>
>>
>>      // Second row
>>      key <AD02>    { [    z,    Z,    U0292,    U01B7 ] };    // U0292 = ʒ (small), U01B7 = Ʒ (capital)
>>      key <AD03>    { [    e,    E,    U025B,    U0190 ] };    // U025B = ɛ (small), U0190 = Ɛ (capital)
>>      key <AD04>    { [    r,    R,    U01DD,    U018E ] };    // U01DD = ǝ (small), U018E = Ǝ (capital)
>>      key <AD06>    { [    y,    Y,    U01B4,    U01B3 ] };    // U01B4 = ƴ (small), U01B3 = Ƴ (capital)
>>      key <AD07>    { [    u,    U,    U028B,    U01B2 ] };    // U028B = ʋ (small), U01B2 = Ʋ (capital)
>>      key <AD08>    { [    i,    I,    U0269,    U0196 ] };    // U0269 = ɩ (small), U0196 = Ɩ (capital)
>>      key <AD09>    { [    o,    O,    U0254,    U0186 ] };    // U0254 = ɔ (small), U0186 = Ɔ (capital)
>>
>>      // Third row
>>      key <AC01>    { [    q,    Q,    U028A,    U01B1 ] };    // U028A = ʊ (small), U01B1 = Ʊ (capital)
>>      key <AC02>    { [    s,    S,    U0257,    U018A ] };    // U0257 = ɗ (small), U018A = Ɗ (capital)
>>      key <AC03>    { [    d,    D,    U0256,    U0189 ] };    // U0256 = ɖ (small), U0189 = Ɖ (capital)
>>      key <AC04>    { [    f,    F,    U0192,    U0191 ] };    // U0192 = ƒ (small), U0191 = Ƒ (capital)
>>      key <AC05>    { [    g,    G,    U0263,    U0194 ] };    // U0263 = ɣ (small), U0194 = Ɣ (capital)
>>      key <AC06>    { [    h,    H,    U0266,    U0124 ] };    // U0266 = ɦ (small), U0124 = Ĥ (capital)
>>      key <AC10>    { [    m,    M,    U0272,    U019D ] };    // U0272 = ɲ (small), U019D = Ɲ (capital)
>>
>>
>>      // Fourth row
>>      key <AB05>    { [    b,    B,    U0253,    U0181 ] };    // U0253 = ɓ (small), U0181 = Ɓ (capital)
>>      key <AB06>    { [    n,    N,    U014B,    U014A ] };    // U014B = ŋ (small), U014A = Ŋ (capital)
>>
>>
>> };
> I may well be wrong, but I think you need to add new "combinations"
> to XCompose - not just the "weird" things where the Compose key
> (Multi_key) is used for a new combination of letters in the same way
> that I can compose o e to œ, but also adding existing dead key accents
> to letters.  My experience is that any UTF-8 locale which does not
> have its own entry in /usr/share/X11/locale will use the
> en_US.UTF-8/Compose definitions.  I have my own keyboard maps
> ("just because I can"), here is a short extract to show what I mean.
> Unfortunately the lines are long.
>
>
> # First pull in the standard en_US.UTF-8 sequences
> include "%L"
>
> # descender on cedilla (right hook like ogonek supposedly preferred, but all fonts are left hook!)
> <dead_cedilla> <Cyrillic_ZHE>					: "Җ"	U496	# ZHE with descender
> <dead_cedilla> <Cyrillic_zhe>					: "җ"	U497	# zhe with descender
> <dead_cedilla> <Cyrillic_ZE>					: "Ҙ"	U498	# ZE with descender
> <dead_cedilla> <Cyrillic_ze>					: "ҙ"	U499	# ze with descender
>
> <dead_greek> <dead_acute> <A>          : "Ά" U0386 # ALPHA with tonos
> <dead_greek> <dead_acute> <a>           : "ά" U03AC # alpha with tonos
>
>
> No, you won't have a dead_greek key in your map, it's just an
> example.  I use urxvt, so it is easy for me to type the numbers to
> get the representations in the strings such as Җ and Ά.
>
> However, there are some limitations to this -
>
> 1. If the accented letter is not available as a pre-composed letter,
> you need to use a combining accent.  I'm not sure if XCompose can
> translate dead_accent some_letter to combining_accent some_letter,
> but if not you could generate combining accents in the keymap, and
> then compose combining_acute <A> to pre-composed A with acute.
>
> 2. For gtk apps, the Compose tables were set into stone years ago
> and they will not recognise recent changes, let alone local
> additions.  In my .xinitrc or .xsession I use
>
> export GTK_IM_MODULE="xim"
>
> 3. I have not recently examined how well, or badly, this works in
> kde (I don't have _much_ use for my additions).
>
>
> I recommend that you begin by looking at the en_US Compose file to
> get a feeling for how things are done.  Don't be fooled if you can't
> see anything at first, there are a lot of blank lines at the start
> of that file.
>
> Good luck!
>
> ĸen
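
Ken's first limitation (a letter with no pre-composed accented form
needs a combining accent) can be checked and worked around with a short
script. What follows is a sketch under assumptions, not a tested setup:
the helper names are made up, the base letters are a subset of those in
the question, and it assumes that the Compose file accepts `<Uxxxx>`
keysym names on the left-hand side and a two-character string as the
result. The keymap would also have to produce the dead_* keysyms for
these sequences to fire.

```python
import unicodedata

# Dead-key keysym name -> combining mark for the tone marks in the question.
ACCENTS = {
    "dead_acute": "\u0301",
    "dead_grave": "\u0300",
    "dead_circumflex": "\u0302",
    "dead_tilde": "\u0303",
    "dead_macron": "\u0304",
}

# A few of the extra letters from the question: ɛ ɔ ɩ ʊ (illustrative subset).
BASES = "\u025b\u0254\u0269\u028a"

def has_precomposed(base, mark):
    """True if Unicode has a single code point for base + combining mark."""
    return len(unicodedata.normalize("NFC", base + mark)) == 1

def keysym(ch):
    """Unicode keysym name (Uxxxx); only meant for code points above U+00FF."""
    if ord(ch) <= 0xFF:
        raise ValueError("use the named keysym for %r instead" % ch)
    return "U%04X" % ord(ch)

def compose_line(dead, base):
    """One Compose entry: dead key + base letter -> letter with the accent."""
    result = unicodedata.normalize("NFC", base + ACCENTS[dead])
    return '<%s> <%s> : "%s"' % (dead, keysym(base), result)

if __name__ == "__main__":
    print('include "%L"')  # keep the standard sequences, as in Ken's extract
    for base in BASES:
        for dead in ACCENTS:
            print(compose_line(dead, base))
```

The output could be saved as ~/.XCompose. For example, i with tilde has a
pre-composed form (ĩ), while ɛ with grave does not, so its entry produces
ɛ followed by the combining grave accent. As Ken notes, GTK applications
may ignore such local additions unless GTK_IM_MODULE is set to "xim".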