Stable fixed length identifiers for IANA time zones
Steven, Mark,

I checked the latest timezone.xml contained in core.zip linked from http://cldr.unicode.org/index/bcp47-extension

Are the values for the names that are not /[a-z]{5}/, e.g. <type name="mxstis" ... >, already permanently fixed? What happens if Santa Isabel gets a UN/LOCODE in the future?

I have been using UN/LOCODEs since at least 2003: http://en.wikipedia.org/w/index.php?title=UN/LOCODE&diff=prev&oldid=1414613

In my system the first two letters always map to a set of ISO 3166-1 countries (current and former) in which at least one member is or was the country of the location. A code like "jeruslm" violates this: its first two letters, JE, stand for Jersey, which was never a country in which Jerusalem was located.

If there is "usnavajo", placing "navajo" below "us", then

<type name="jeruslm" alias="Asia/Jerusalem Asia/Tel_Aviv Israel" description="Jerusalem"/>

is inconsistent, since Asia/Jerusalem is used for Israel, which would lead to "iljerusalem" or so, and instead of "hebron" and "gaza" it would be "pshebron" and "psgaza".

Since UN/LOCODE doesn't use the numbers 0 and 1, I created private codes using "1" in third position, so for Santa Isabel I would use MX1SI, or in lower case mx1si; for Hebron PS1HB, Gaza PS1GZ.

That way the codes can all be of the same length, namely 5 characters.

Tobias

On Fri, May 25, 2012 at 5:10 PM, Steven R. Loomis <srl@icu-project.org> wrote:
On 05/24/2012 04:27 PM, Tobias Conradi wrote: ... E.g. CLDR has created UN/LOCODE based codes: http://cldr.unicode.org/development/development-process/design-proposals/bcp...
This is a draft design document proposing what is now the 'tz' field of IETF BCP 47 Extension U <https://tools.ietf.org/html/rfc6067>, referencing UTS35 and data <http://unicode.org/reports/tr35/#Unicode_Language_and_Locale_Identifiers>. However, it is not relevant simply to the discussion of "." and "-"; instead, the UN/LOCODE based codes (with some additions) were needed due to length restrictions.
-- Tobias Conradi Rheinsberger Str. 18 10115 Berlin Germany http://tobiasconradi.com/
Below is a complete mapping of identifiers from timezone.xml into a 5-char set of identifiers that produces strings that are distinct from UN/LOCODEs, per section 3.2.1 of http://www.unece.org/fileadmin/DAM/cefact/locode/unlocode_manual.pdf:

"However, where all permutations available for a country have been exhausted, the numerals 2-9 may also be used."

On Sat, May 26, 2012 at 9:59 PM, Tobias Conradi <tobias.conradi@gmail.com> wrote:
Steven, Mark,
I checked the latest timezone.xml contained in core.zip linked from http://cldr.unicode.org/index/bcp47-extension ... Since UN/LOCODE doesn't use the numbers 0 and 1, I created private codes using "1" in third position, so for Santa Isabel I would use
MX1SI or in lower case mx1si
for Hebron PS1HB, Gaza PS1GZ
That way the codes all can be of the same length, namely 5 characters.
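The no-clash argument above can be sketched as a pair of shape checks. This is only an illustration: the function names and both regular expressions are assumptions made for this sketch, derived from the rule that UN/LOCODE location parts use letters and, where needed, the numerals 2-9, while the private codes put "1" in third position.

```python
import re

# Assumed UN/LOCODE shape: two-letter country prefix, then three characters
# drawn from a-z and the numerals 2-9 (never 0 or 1).
UNLOCODE_SHAPE = re.compile(r"[a-z]{2}[a-z2-9]{3}")

# Assumed private shape from this thread: "1" in third position.
PRIVATE_SHAPE = re.compile(r"[a-z]{2}1[a-z0-9]{2}")


def could_be_unlocode(code: str) -> bool:
    """True if the code has a shape that UN/LOCODE could assign."""
    return UNLOCODE_SHAPE.fullmatch(code.lower()) is not None


def is_private(code: str) -> bool:
    """True if the code follows the private "1 in third position" pattern."""
    return PRIVATE_SHAPE.fullmatch(code.lower()) is not None


if __name__ == "__main__":
    for code in ("MX1SI", "PS1HB", "PS1GZ"):
        print(code, is_private(code), could_be_unlocode(code))
```

Because the two patterns differ in the third character, no string can match both, which is the property the private codes rely on.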
The utc based codes could be converted to 5 char too, replacing utc with zz:

utce01 -> zze01
utcw12 -> zzw12

UTC itself could be:

utc -> zz000

Unknown could be:

unk -> zzunk or zz1un

The use of 0 and 1 ensures there is no clash with UN/LOCODEs.

Here are some more possible mappings for identifiers that are not 5 char long:

usndnsl -> usnqy (UN/LOCODE USNQY)
usndcnt -> uszt8 (UN/LOCODE USZT8)

Handmade codes using "1" and, as of assignment, the correct ISO 3166-1 alpha-2 code:

gaza -> ps1gz
gldkshvn -> gl1dm
hebron -> ps1hb
jeruslm -> il1jr
mxstis -> mx1si
usnavajo -> us1nv
usinvev -> us1vv

That would leave only four US specific codes:

cst6cdt
est5edt
mst7mdt
pst8pdt

In case they could be changed, they could be:

us1c6
us1e5
us1m7
us1p8

--
Tobias Conradi
Rheinsberger Str. 18
10115 Berlin
Germany
http://tobiasconradi.com/
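The mappings proposed in this message could be collected into one lookup, roughly as follows. This is a sketch, not anything CLDR ships: the names PROPOSED and to_fixed_length are made up for illustration, "zzunk" is picked arbitrarily over the alternative "zz1un", and passing existing 5-character codes through unchanged is an assumption.

```python
import re

# Proposed replacements for identifiers that are not 5 characters long,
# copied from the lists in this message.
PROPOSED = {
    "usndnsl": "usnqy",   # existing UN/LOCODE USNQY
    "usndcnt": "uszt8",   # existing UN/LOCODE USZT8
    "gaza": "ps1gz",
    "gldkshvn": "gl1dm",
    "hebron": "ps1hb",
    "jeruslm": "il1jr",
    "mxstis": "mx1si",
    "usnavajo": "us1nv",
    "usinvev": "us1vv",
    "cst6cdt": "us1c6",
    "est5edt": "us1e5",
    "mst7mdt": "us1m7",
    "pst8pdt": "us1p8",
    "utc": "zz000",
    "unk": "zzunk",       # assumption: "zz1un" was offered as an alternative
}


def to_fixed_length(code: str) -> str:
    """Map a timezone.xml short identifier to a 5-character identifier."""
    code = code.lower()
    if code in PROPOSED:
        return PROPOSED[code]
    # UTC offset codes such as utce01 or utcw12: replace "utc" with "zz".
    if re.fullmatch(r"utc[ew]\d{2}", code):
        return "zz" + code[3:]
    # Assumption: codes that already have 5 characters are kept as they are.
    if re.fullmatch(r"[a-z2-9]{5}", code):
        return code
    raise ValueError(f"no 5-character mapping proposed for {code!r}")


if __name__ == "__main__":
    for name in ("utce01", "utcw12", "utc", "mxstis", "jeruslm"):
        print(name, "->", to_fixed_length(name))
```

Any identifier not covered by the table, the utc rule, or the 5-character pass-through raises an error instead of being guessed.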