All,
it might be useful if the UA community were
more aware of the continuing efforts to request a disunification
of Assamese from Bengali on a script basis. The forwarded
message contains a link to the proposal so you can read along.
(The proposal calls for the "inclusion of the Assamese script"
in Unicode/ISO 10646, which sidesteps the fact that the script currently
encoded in Unicode, despite the name "Bengali", in fact
covers both the Bengali and Assamese languages.)
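To make the naming point concrete, here is a small Python sketch using only the standard `unicodedata` module. The two letters picked, U+09F0 and U+09F1, are my own choice of example (they are the ones usually cited as used for Assamese and not for Bengali); they are not taken from the proposal.

```python
import unicodedata

# Assumption of this sketch: U+09F0 and U+09F1 serve as sample
# Assamese letters. They are not quoted from the proposal document.
ASSAMESE_LETTERS = ["\u09F0", "\u09F1"]

def describe(ch: str) -> str:
    """Return 'U+XXXX NAME' for a single character."""
    return f"U+{ord(ch):04X} {unicodedata.name(ch)}"

def in_bengali_block(ch: str) -> bool:
    """True if the code point lies in the Bengali block (U+0980..U+09FF)."""
    return 0x0980 <= ord(ch) <= 0x09FF

for ch in ASSAMESE_LETTERS:
    # Prints the formal character name and whether it sits in the Bengali block.
    print(describe(ch), in_bengali_block(ch))
```

Both letters carry "BENGALI" in their formal names and sit in the Bengali block, so the name reflects the label chosen for the script, not a restriction to one language.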
As noted by the person commenting on it on the Unicode list, the issues cited are all common to cases where more than one language shares the same script.
If you look at the proposal document, you will see that the two lists of characters are not really distinct from each other; so it's not the case that the languages use different forms of the same basic letters (unlike European languages, where German historically used different letter shapes than French).
The consequences of splitting the script for IDNs would be pretty drastic, as every single label would have to be a blocked variant of some label in the "other" script; worse, users seeing a label in print would (in many cases) not be able to tell which script to use to enter it.
Even for regular text, the situation would
be chaotic, as you could type either language in either script
and it would largely "look" OK, but sort differently. Add to
that the issue that decades of existing data would continue to
exist in what would then be the "wrong" script.
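A rough Python sketch of that effect follows. Everything about the second block is hypothetical: no separate Assamese block exists, so a Private Use Area range stands in for one, and the function name and sample words are mine. In a real disunification the two encodings would presumably render alike; here only the comparison and sorting behaviour is being illustrated.

```python
BENGALI_START, BENGALI_END = 0x0980, 0x09FF
# HYPOTHETICAL: a Private Use Area offset stands in for a
# disunified block. No such block exists in Unicode.
HYPOTHETICAL_START = 0xE000

def to_hypothetical_block(text: str) -> str:
    """Re-encode Bengali-block characters at the stand-in offset."""
    out = []
    for ch in text:
        cp = ord(ch)
        if BENGALI_START <= cp <= BENGALI_END:
            cp = HYPOTHETICAL_START + (cp - BENGALI_START)
        out.append(chr(cp))
    return "".join(out)

# Two sample words, written with escapes so the code points are explicit.
asam = "\u0985\u09B8\u09AE"   # begins with U+0985
kalam = "\u0995\u09B2\u09AE"  # begins with U+0995

# Same letters, different encoding: the strings no longer compare equal.
asam_dup = to_hypothetical_block(asam)
print(asam == asam_dup)

# Within one encoding the word starting with U+0985 sorts first ...
print(sorted([kalam, asam]) == [asam, kalam])
# ... but once one word is in the "other" block, the order flips.
print(sorted([kalam, asam_dup]) == [kalam, asam_dup])
```

The same mismatch is what would force every IDN label to carry a blocked variant in the other script, and what would leave existing data comparing unequal to newly typed text.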
What is of concern here is not that there is
a high likelihood of Unicode accepting a proposal like that, but
the level of activity in the community being geared up in
support of it.
Let's hope it doesn't ever get there,
A./
| Subject: | L2/18-181 |
|---|---|
| Date: | Wed, 16 May 2018 13:46:22 -0700 |
| From: | Doug Ewell via Unicode <unicode@unicode.org> |
| Reply-To: | Doug Ewell <doug@ewellic.org> |
| To: | Unicode Mailing List <unicode@unicode.org> |
http://www.unicode.org/L2/L2018/18181-n4947-assamese.pdf
This is a fascinating proposal to disunify the Assamese script from Bengali on the following bases:

1. The identity of Assamese as a script distinct from Bengali is in jeopardy.
2. Collation is different between the Assamese and Bengali languages, and code point order should reflect collation order.
3. Keyboard design is more difficult because consonants like ক্ষ are encoded as conjunct forms instead of atomic characters.
4. The use of a single encoded script to write two languages forces users to use language identifiers to identify the language.
5. Transliteration of Assamese into a different script is problematic because letters have different phonological value in Assamese and Bengali.

It will be interesting to see where this proposal goes. Given that all or most of these issues can be claimed for English, French, German, Spanish, and hundreds of other languages written in the Latin script, if the Assamese proposal is approved we can expect similar disunification of the Latin script into language-specific alphabets in the future.

--
Doug Ewell | Thornton, CO, US | ewellic.org
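For readers who want to see what "encoded as conjunct forms" in point 3 means in practice, here is a minimal Python sketch. It assumes the ক্ষ in the message is the usual three-code-point sequence, which is written out with escapes below; the helper name is mine.

```python
import unicodedata

# Assumption: the conjunct from point 3 is the usual sequence
# KA + VIRAMA + SSA, spelled with escapes to avoid ambiguity.
KSSA = "\u0995\u09CD\u09B7"

def code_points(text: str) -> list:
    """Return (U+XXXX, character name) pairs for each code point."""
    return [(f"U+{ord(ch):04X}", unicodedata.name(ch)) for ch in text]

for cp, name in code_points(KSSA):
    # One line per code point in the sequence.
    print(cp, name)
```

What a user perceives as one letter is stored as three code points. A keyboard layout can still emit the whole sequence from a single key, so this is an input-method matter and not an encoding one.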