For any program that works with dates, times, or numbers, ensure that the formatting is flexible enough to allow for cultural variations: 3/5 can be either May 3 or March 5, and 1,000.00 and 1.000,00 are used for the same value in different cultures.

One of my favorite examples of the complexity of writing UI strings for code that can be used worldwide is formatting a number of items. English is an oddity in that we consider zero and 2 or more to be plural but 1 to be singular (1 item, 0 items, 2 items); that rule varies quite a bit across languages. This one is a great student exercise if you give them two or three language models to write around (English and French perhaps, the latter with singular for zero and 1 but plural otherwise).

A more general point is that all UI strings and images should be kept as separate resources grouped by language and referenced from the main code in such a way that no assumptions are made about language order. Ideally, no assumptions are made about reading direction (left-to-right versus right-to-left), but realistically many programs are developed only for American (North and South) and European markets.

-----Original Message-----
From: ua-discuss-bounces@icann.org [mailto:ua-discuss-bounces@icann.org] On Behalf Of Andre Schappo
Sent: Thursday, June 29, 2017 7:43 AM
To: ua-discuss@icann.org
Subject: Re: [UA-discuss] Computer Science/ICT/IT Curricula Internationalization

Thank you. That is just the sort of input needed for Computer Science/ICT/IT Curricula internationalization.

The unfortunate reality is that currently the vast majority of students graduate knowing only ASCII text processing/storage/transformation/transmission because that is all they are taught. So, no chance of graduates understanding Unicode or normalization forms.

André Schappo
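
As an illustration of the item-count exercise described in the top reply, here is a minimal sketch in Python. It assumes only the two language models mentioned there (English, and French with singular for zero and 1). The names PLURAL_RULES, MESSAGES, and item_count, the "one"/"other" category labels, and the French wording are hypothetical choices for this sketch; a real program would more likely take its rules from a maintained locale database than hard-code them.

```python
# Sketch: choose a singular/plural UI string per language.
# All names below are illustrative assumptions, not from the thread.

def english_category(n: int) -> str:
    # English: only exactly 1 is singular; 0 and 2+ are plural.
    return "one" if n == 1 else "other"

def french_category(n: int) -> str:
    # French (as described in the thread): 0 and 1 are singular.
    return "one" if n in (0, 1) else "other"

# Plural rules are looked up by language code, never hard-wired into the caller.
PLURAL_RULES = {"en": english_category, "fr": french_category}

# UI strings live in a separate resource table grouped by language.
MESSAGES = {
    "en": {"one": "{n} item", "other": "{n} items"},
    "fr": {"one": "{n} élément", "other": "{n} éléments"},
}

def item_count(lang: str, n: int) -> str:
    """Return the localized 'n items' string for a non-negative integer n."""
    category = PLURAL_RULES[lang](n)
    return MESSAGES[lang][category].format(n=n)

if __name__ == "__main__":
    for lang in ("en", "fr"):
        for n in (0, 1, 2):
            print(lang, item_count(lang, n))
```

Keeping the rule functions and the string table apart from item_count means a third language can be added without touching the calling code, which is the point of the separate-resources advice.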
On 29 Jun 2017, at 15:23, Andrew Sullivan <ajs@anvilwalrusden.com> wrote:
On Thu, Jun 29, 2017 at 02:06:31PM +0000, Andre Schappo wrote:
Which internationalization topics would you like covered in Computer Science/ICT/IT Curricula?
I think it would be really good if graduates had a clear idea of how Unicode worked, what the differences are between (e.g.) a character and a code point or sequence of them, what the properties are, how to access them, normalization, and so on. Even in this group we sometimes struggle because people forget that the Unicode properties are what determine a given code point, and have stumbled over normalization forms. It's amazing to me, for instance, that we have to keep telling people to normalize user-generated text input before storage. (A later-year student would get a failing grade if s/he didn't check input before blindly handing it to the database, and yet we don't have the same reaction when NF* isn't immediately used on the same input.)
Best regards,
A
-- Andrew Sullivan ajs@anvilwalrusden.com
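
To make the point about normalizing user-generated text before storage concrete, here is a minimal sketch using Python's standard unicodedata module. The function name normalize_for_storage and the choice of NFC as the default form are assumptions made for this illustration; the thread itself only says "NF*" and does not recommend a particular form.

```python
import unicodedata

def normalize_for_storage(text: str, form: str = "NFC") -> str:
    """Normalize user input to one Unicode normalization form before storing it.

    The default of NFC is an assumption for this sketch, not a recommendation
    taken from the thread.
    """
    return unicodedata.normalize(form, text)

# The same visible word, entered two different ways:
composed = "caf\u00e9"      # 'é' as a single code point (U+00E9)
decomposed = "cafe\u0301"   # 'e' followed by a combining acute accent (U+0301)

if __name__ == "__main__":
    # One user-perceived character, but a different number of code points.
    print(len(composed), len(decomposed))
    # Without normalization, the two strings do not compare equal.
    print(composed == decomposed)
    # After normalization, they do.
    print(normalize_for_storage(composed) == normalize_for_storage(decomposed))
```

The example also shows the character versus code point distinction raised in the same message: both strings display as four characters, yet one holds four code points and the other five.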