"They have a different purpose than what's been discussed here."


Mark

— Il meglio è l’inimico del bene —



On Fri, Jun 8, 2012 at 12:31 PM, Ian Abbott <abbotti@mev.co.uk> wrote:
On 2012-06-08 18:10, Mark Davis ☕ wrote:
Just FYI.

In Unicode CLDR we needed to define a set of abbreviations for timezone
IDs. They have a different purpose than what's been discussed here; they
are purely internal ids, and only required because of restrictions in
BCP47 (so they all needed to be sequences of 3 to 8 ASCII alphanumerics
- case not significant).

What we did was use the United Nations LOCODE values whenever available,
which are all 5 characters long and start with the country code. When
there wasn't one available, we used values that were not of length 5 so
that they wouldn't collide with future values. So America/Los_Angeles
gets "uslax", while Etc/GMT-1 gets "utce01".
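
The two rules above (the BCP47 length and character restriction, and the reserved 5-character shape for LOCODE-based values) can be sketched as a small check. This is only an illustration under assumptions: the function names are hypothetical and are not part of CLDR, and only the two example IDs quoted in the message are used.

```python
import re

# BCP47 restriction as described in the message:
# a sequence of 3 to 8 ASCII alphanumerics, case not significant.
_SHORT_TZ_ID = re.compile(r"[A-Za-z0-9]{3,8}")


def is_valid_short_tz_id(short_id: str) -> bool:
    """Return True if short_id satisfies the 3-to-8 ASCII alphanumeric rule."""
    return _SHORT_TZ_ID.fullmatch(short_id) is not None


def has_locode_shape(short_id: str) -> bool:
    """Return True if short_id has the 5-character length reserved for LOCODE values.

    This checks the shape only; it does not look the value up in the
    actual UN/LOCODE list.
    """
    return is_valid_short_tz_id(short_id) and len(short_id) == 5


def normalize(short_id: str) -> str:
    """Case is not significant, so compare IDs in lowercase."""
    return short_id.lower()


if __name__ == "__main__":
    for candidate in ("uslax", "utce01", "America/Los_Angeles"):
        print(candidate, is_valid_short_tz_id(candidate), has_locode_shape(candidate))
```

Keeping non-LOCODE values at a length other than 5 means the shape check alone is enough to tell the two families apart.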

http://unicode.org/repos/cldr/tags/release-21-0-2/common/bcp47/timezone.xml

Those are abbreviations for the zone names, but a particular zone may need different abbreviations at different times, depending on whether daylight saving time is in effect. We probably don't want more than 6 characters for the abbreviations, which is the SUSv3 value of {_POSIX_TZNAME_MAX}.

Incidentally, Microsoft use the older value 3 for _POSIX_TZNAME_MAX and the value 10 for TZNAME_MAX, but their tzname[] values are typically longer than that, e.g. "GMT Standard Time" and "GMT Daylight Time". They don't make for easy parsing either!
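
As a rough illustration of the length limits mentioned above, the following sketch checks candidate abbreviations against a maximum length. The constant and function names are hypothetical; the limit of 6 is the SUSv3 figure quoted in the message, and the sample strings are the ones given above plus two common short abbreviations.

```python
# SUSv3 value of {_POSIX_TZNAME_MAX} as quoted in the message.
POSIX_TZNAME_MAX = 6


def fits_tzname_limit(abbreviation: str, limit: int = POSIX_TZNAME_MAX) -> bool:
    """Return True if the abbreviation is no longer than limit bytes.

    The length is measured in bytes of the UTF-8 encoding, which equals
    the character count for plain ASCII abbreviations.
    """
    return len(abbreviation.encode("utf-8")) <= limit


if __name__ == "__main__":
    for name in ("GMT", "BST", "GMT Standard Time", "GMT Daylight Time"):
        print(f"{name!r}: fits={fits_tzname_limit(name)}")
```

Passing `limit=3` models the older value mentioned above, under which a four-letter abbreviation would already be too long.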


--
-=( Ian Abbott @ MEV Ltd.    E-mail: <abbotti@mev.co.uk>        )=-
-=( Tel: +44 (0)161 477 1898   FAX: +44 (0)161 718 3587         )=-