[PATCH] UTF-8 Compose table not symmetric

Nicolas George nicolas.george at normalesup.org
Fri Apr 25 05:54:23 PDT 2008


Hi.

The compose tables for the legacy encodings are mostly symmetric with regard
to Multi_key. For example, for ISO-8859-1, there are both:

<Multi_key> <e> <apostrophe>            : "\351"        eacute
<Multi_key> <apostrophe> <e>            : "\351"        eacute

The UTF-8 table, on the other hand, has only the second entry, so the
postfix keystroke, Multi_key-e-apostrophe, produces nothing.
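
To make the asymmetry concrete, here is a minimal sketch (not part of the
attached script) that lists the two-key Multi_key sequences whose reversed
order is missing from a table. The helper name missing_twins is hypothetical,
and the "UTF-8 style" sample is an assumption that simply reuses the second
entry quoted above.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (illustration only): return "k2 k1" for every
# "<Multi_key> <k1> <k2>" entry whose reversed sequence is absent.
sub missing_twins {
  my @lines = @_;
  my %seen;
  for my $l (@lines) {
    # Record each ordered pair of keys that follows Multi_key.
    $seen{$1}{$2} = 1
      if $l =~ /^<Multi_key>\s+<(\w+)>\s+<(\w+)>\s*:/;
  }
  my @missing;
  for my $k1 (sort keys %seen) {
    for my $k2 (sort keys %{ $seen{$k1} }) {
      # Checking both levels with exists avoids creating new hash entries.
      push @missing, "$k2 $k1"
        unless exists $seen{$k2} && exists $seen{$k2}{$k1};
    }
  }
  return @missing;
}

# Both orders present, as in the ISO-8859-1 excerpt above.
my @legacy = (
  '<Multi_key> <e> <apostrophe>            : "\351"        eacute',
  '<Multi_key> <apostrophe> <e>            : "\351"        eacute',
);
# Only the prefix order, mimicking the UTF-8 table (assumed sample).
my @utf8_like = ($legacy[1]);

my @none = missing_twins(@legacy);
print "legacy table, missing sequences: ", scalar(@none), "\n";
print "UTF-8 style table, missing: $_\n" for missing_twins(@utf8_like);
```

Run on a real compose file, the same check would show how many postfix
sequences are absent before the patch is applied.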

Some people are accustomed to the postfix keystrokes and therefore find the
transition to UTF-8 difficult.

In the past, some versions of the UTF-8 compose table were symmetric, but
this changed as other versions were committed.

The attached patch adds the symmetric combinations. I have also attached the
script I used to generate them.

There is one conflict: minus-d yields LATIN SMALL LETTER D WITH STROKE while
d-minus yields DONG SIGN. The patch leaves both entries unchanged.
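
For illustration, a conflict of this kind can be detected with a sketch like
the following. It is a simplified stand-in for the check in the attached
script, not a copy of it: the helper name find_conflicts is hypothetical, the
sample lines assume the usual trailing "# NAME" comment, and the quoted result
strings are placeholders rather than the real characters.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (illustration only): report pairs of keys for which
# both orders exist but the trailing "# NAME" comments differ.
sub find_conflicts {
  my @lines = @_;
  my %name;
  for my $l (@lines) {
    # Capture the two keys and the character name after the last '#'.
    $name{$1}{$2} = $3
      if $l =~ /^<Multi_key>\s+<(\w+)>\s+<(\w+)>\s*:.*#\s*(.*?)\s*$/;
  }
  my @conflicts;
  for my $k1 (sort keys %name) {
    for my $k2 (sort keys %{ $name{$k1} }) {
      next unless $k1 lt $k2;    # visit each unordered pair once
      next unless exists $name{$k2} && exists $name{$k2}{$k1};
      next if $name{$k1}{$k2} eq $name{$k2}{$k1};
      push @conflicts,
        "$k1 $k2 -> $name{$k1}{$k2}; $k2 $k1 -> $name{$k2}{$k1}";
    }
  }
  return @conflicts;
}

# Placeholder result strings ("?"); only the names come from the text above.
my @sample = (
  '<Multi_key> <minus> <d> : "?" # LATIN SMALL LETTER D WITH STROKE',
  '<Multi_key> <d> <minus> : "?" # DONG SIGN',
);
print "$_\n" for find_conflicts(@sample);
```

Pairs that map to the same result in both orders are skipped, so only genuine
disagreements are reported.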

Regards,

-- 
  Nicolas George
-------------- next part --------------
A non-text attachment was scrubbed...
Name: x11-compose-symmetric.diff.gz
Type: application/x-gunzip
Size: 32246 bytes
Desc: not available
URL: <http://lists.x.org/archives/xorg/attachments/20080425/9d5eb4e2/attachment.bin>
-------------- next part --------------
#!/usr/bin/perl

use strict;
use warnings;

my @lines = <>;

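# Split a compose line into its key sequence and the remainder
# (whitespace, colon, result, trailing newline). Returns undef for lines
# that do not contain a sequence of at least two keys.
sub parse_line($) {
  my ($l) = @_;
  my ($keys, $result) = $l =~ /(<\w+>(?:\s<\w+>)+)(\s+: .*)/s
    or return undef;
  my @keys = split " ", $keys;
  return { k => \@keys, r => $result };
}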

# First pass: index every "<Multi_key> <k1> <k2>" entry by its two keys.
my %compose;

for my $l (@lines) {
  my $lp = parse_line $l or next;
  my @k = @{$lp->{k}};
  if(@k == 3 && $k[0] eq "<Multi_key>") {
    $compose{$k[1]}{$k[2]} = $lp->{r};
  }
}

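# Second pass: copy the input through and, after each eligible entry,
# emit the reversed sequence unless the table already defines it.
for my $l (@lines) {
  print $l;
  my $lp = parse_line $l or next;
  my @k = @{$lp->{k}};
  if(@k == 3 && $k[0] eq "<Multi_key>") {
    my $k1 = $k[1];
    my $k2 = $k[2];
    # Only mirror entries whose first key is a multi-character keysym name
    # starting with a lowercase letter (<apostrophe>, <minus>, ...).
    next unless $k1 =~ /<[[:lower:]]\w+>/;
    my $cr = $compose{$k2}{$k1};
    my $r = $lp->{r};
    if(defined $cr) {
      # Both orders exist: reduce each result to its "# NAME" comment
      # for the warning message.
      my $scr = $cr;
      my $sr = $r;
      for ($scr, $sr) {
        s/.*#\s*//;
        s/\s*\z//;
      }
      warn "Conflict between $k1 $k2 -> $sr\n",
        "             and $k2 $k1 -> $scr\n\n"
        if $cr ne $r;
    } else {
      print "<Multi_key> $k2 $k1$r";
    }
  }
}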