Some notes and ramblings on Chinese translations (and the fun of
maintaining both Big5 and GB pages while hoping that all the characters
show up properly).  :-)

Note: This document may contain GB2312 code.

Content Negotiation:
~~~~~~~~~~~~~~~~~~~
        lang    charset
------  ------  -------
zh-CN   .zh-cn  GB2312
zh-TW   .zh-tw  Big5
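
As a rough illustration of what the negotiation table expresses, here is a
minimal Python sketch, assuming the usual pairing of zh-CN with GB2312 and
zh-TW with Big5. The names VARIANTS and pick_variant are hypothetical, and
the sketch ignores q-values, so it is not how the web server itself
performs negotiation.

```python
from typing import Optional, Tuple

# Language tag -> (file suffix, charset). Assumed pairing: zh-CN pages are
# GB2312 (simplified), zh-TW pages are Big5 (traditional).
VARIANTS = {
    "zh-cn": (".zh-cn", "GB2312"),
    "zh-tw": (".zh-tw", "Big5"),
}


def pick_variant(accept_language: str) -> Optional[Tuple[str, str]]:
    """Return (suffix, charset) for the first listed tag found in VARIANTS.

    Simplification: tags are tried in the order listed; q-values are ignored.
    """
    for item in accept_language.split(","):
        tag = item.split(";", 1)[0].strip().lower()
        if tag in VARIANTS:
            return VARIANTS[tag]
    return None


if __name__ == "__main__":
    print(pick_variant("zh-TW, en;q=0.5"))
```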


Big5<->GB...  Arrgghh!

Big5 is *bad*!!  Its relationship to Unicode is _not_ one-to-one,
and is giving me a lot of headaches.

The following is from
ftp://ftp.unicode.org/Public/MAPPINGS/EASTASIA/OTHER/BIG5.TXT

# WARNING!  It is currently impossible to provide round-trip compatibility
# between BIG5 and Unicode.
#
# A number of characters are not currently mapped because
# of conflicts with other mappings.  They are as follows:
#
#     BIG5      Description                     Comments
#
#     0xA15A    SPACING UNDERSCORE              duplicates A1C4
#     0xA1C3    SPACING HEAVY OVERSCORE         not in Unicode
#     0xA1C5    SPACING HEAVY UNDERSCORE        not in Unicode
#     0xA1FE    LT DIAG UP RIGHT TO LOW LEFT    duplicates A2AC
#     0xA240    LT DIAG UP LEFT TO LOW RIGHT    duplicates A2AD
#     0xA2CC    HANGZHOU NUMERAL TEN            conflicts with A451 mapping
#     0xA2CE    HANGZHOU NUMERAL THIRTY         conflicts with A4CA mapping
#
# We currently map all of these characters to U+FFFD REPLACEMENT CHARACTER

Another reference is the Big5+ standard tables.  At least it won't
leave any Big5+ codes dangling.  :-)  It does include a Big5+ to GBK
table, but then, we want GB, not GBK.  Hmm...
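
To find which Big5 codes cannot round-trip through a given mapping table,
something like the following Python sketch may help. It assumes the table
uses the two-column "0xBIG5 0xUNICODE # comment" layout; the sample rows
are an illustrative excerpt, not a copy of the file, and the function
names are hypothetical. When two Big5 codes share a Unicode target, the
sketch assumes the lower code wins the reverse mapping.

```python
from typing import Dict, Iterable, List

# Illustrative excerpt in the assumed column layout (not the real file).
SAMPLE = """\
# BIG5    Unicode   name
0xA140    0x3000    # IDEOGRAPHIC SPACE
0xA15A    0xFFFD    # unmapped, sent to REPLACEMENT CHARACTER
0xA4A4    0x4E2D    # <CJK>
""".splitlines()


def parse_mapping(lines: Iterable[str]) -> Dict[int, int]:
    """Parse 'big5 unicode # comment' rows into {big5_code: code_point}."""
    table = {}
    for line in lines:
        line = line.split("#", 1)[0].strip()  # drop comments and blank rows
        if not line:
            continue
        big5_hex, ucs_hex = line.split()[:2]
        table[int(big5_hex, 16)] = int(ucs_hex, 16)
    return table


def non_round_trip(table: Dict[int, int]) -> List[int]:
    """List Big5 codes that cannot survive Big5 -> Unicode -> Big5."""
    seen = {}
    bad = []
    for code in sorted(table):
        ucs = table[code]
        # U+FFFD carries no information; a repeated target is ambiguous.
        if ucs == 0xFFFD or ucs in seen:
            bad.append(code)
        else:
            seen[ucs] = code
    return bad


if __name__ == "__main__":
    print([hex(c) for c in non_round_trip(parse_mapping(SAMPLE))])
```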


Converter
~~~~~~~~~
* Don't bother with tcs.  Due to the traditional/simplified character
  issue, tcs simply doesn't work well at all.

* utf-converter works, but needs more tweaking to get everything
  translated properly.
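
The traditional/simplified issue can be seen with a minimal Python sketch.
It uses Python's built-in big5 and gb2312 codecs, which are an assumption
for illustration and not the tools discussed in these notes; the function
name is hypothetical. A pure charset conversion only succeeds for
characters that exist in both sets, so traditional-only forms are reported
as unconvertible rather than being turned into their simplified forms.

```python
from typing import List, Tuple


def big5_to_gb2312(raw: bytes) -> Tuple[bytes, List[str]]:
    """Convert Big5 bytes to GB2312 via Unicode.

    Returns the converted bytes plus the characters that have no GB2312
    code; each of those is replaced with b"?" in the output.
    """
    text = raw.decode("big5")
    out = bytearray()
    missing = []
    for ch in text:
        try:
            out += ch.encode("gb2312")
        except UnicodeEncodeError:
            # No traditional-to-simplified folding happens here.
            missing.append(ch)
            out += b"?"
    return bytes(out), missing


if __name__ == "__main__":
    # U+4E2D exists in both sets; U+570B is a traditional-only form.
    print(big5_to_gb2312("\u4e2d\u570b".encode("big5")))
```

Handling the unconvertible characters needs a separate traditional-to-
simplified mapping step, which this sketch deliberately leaves out.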


 -- Anthony Fok <foka@debian.org>, Fri, 16 Apr 1999 05:11:03 -0600