-----Original Message-----
From: Mark Davis [mailto:mark.davis@icu-project.org]
Sent: Wednesday, December 07, 2005 1:05 PM
To: Arthur David Olson
Cc: tz@lecserver.nci.nih.gov
Subject: Re: FW: Re: Timezone name translations

Addressing the message:

I would recommend using the TZIDs as is -- not try to invent new ones. You mention stability as being a concern: what we do in CLDR is to canonicalize the TZIDs for translation by choosing a particular one out of all the equivalent names (a choice we guarantee to be stable).

Translating all of the TZIDs for all the languages is onerous, so for the zone identifiers in countries that only have a single zone, we leverage the country translations that we already have. We then can prioritize the translation of TZIDs that are in multizone countries. This information is then used in formatting the names, as per the following:

http://www.unicode.org/draft/reports/tr35/tr35.html#Time_Zone_Fallback

(This is the working draft of the next version, so modified text is in yellow).

As an example of the data, see the translations in http://unicode.org/cldr/data/common/main/el.xml. Search for <territories> to see the translated country names. Search for <timeZoneNames> to see the translations for multizone countries (or particular other cases).

Mark

P.S. The abbreviations are ambiguous -- and not readily translated without introducing even further ambiguities -- so I would not recommend them for translation.

Arthur David Olson wrote:
This message failed to make it out to the list.
--ado
From: Chuck Soper [chucks2@veladg.com]
At 8:01 PM +0700 12/3/05, Stuart Bishop wrote:
Paul Eggert wrote:
One problem with gettext format is that there might be multiple translations for the same English-language abbreviation. For example, "IST" is short for either Israel Standard Time or for India Standard Time, and it's possible that the acronyms in (say) Russian would be different. Hence gettext("IST") might not work as the user would expect, in a Russian locale.
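The collision described here can be illustrated with a minimal Python sketch. It is only an illustration: a plain dict stands in for a message catalog (this is not the real gettext API), the translated strings are placeholders rather than actual Russian text, and the context labels "Israel" and "India" are hypothetical.

```python
# A plain dict stands in for a message catalog: one msgid, one translation.
# The values are placeholders, not real Russian translations.
catalog = {}
catalog["IST"] = "<abbreviation for Israel Standard Time>"
catalog["IST"] = "<abbreviation for India Standard Time>"  # overwrites the first entry


def translate(msgid):
    """Return the catalog entry for msgid, or msgid itself if untranslated."""
    return catalog.get(msgid, msgid)


# Only one of the two meanings survives under the bare key "IST".
print(translate("IST"))

# Keying on (context, msgid) keeps both entries apart.
# The context labels are hypothetical, chosen only for this sketch.
contextual = {
    ("Israel", "IST"): "<abbreviation for Israel Standard Time>",
    ("India", "IST"): "<abbreviation for India Standard Time>",
}
```

With the bare key, whichever entry is stored last wins, so a lookup of "IST" cannot return both meanings. Some extra piece of context has to be part of the key.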
This is a problem even without localization. But I won't be tackling abbreviations anyway - it would involve first mapping and translating each historical period in each timezone to an English sentence to cope with the duplicate abbreviations and the pathological cases like Australian Eastern Standard Time and Australian Eastern Daylight Savings Time (still breaking code to this day).
I'm interested in the display of time zone names both in English and localized. It seems like trying to provide a time zone name for each tzID (std/dst) might be fairly laborious because there are almost 400 tzIDs (in the zone.tab file). I'm considering building a time zone name table based on abbreviations and UTC offsets. Each row in the table could have a tz abbreviation, a UTC offset and a time zone name. For example, one row could contain CET, UTC+1 and 'Central European Standard Time'. There are 34 tzIDs that use CET at UTC+1 during some time of the year. Instead of trying to maintain 34 tz names for 34 tzIDs, why not maintain one tz name for an abbreviation and a UTC offset? Another example is Argentina. Doesn't Argentina have two tz names for its 10 tzIDs? For localized tz names, additional tables would be created.
I understand that an abbreviation by itself is not unique (e.g. IST, EST, etc.), but the combination of an abbreviation and a UTC offset might be unique. Does anyone know if the abbreviation/UTC offset combination is unique? Clearly, tzIDs are unique, but they're not very stable and there are a lot of them.
I believe that the abbreviation/UTC offset combination actually holds more information than a tzID. For example, PWT/UTC-7 (during World War II) could display Pacific War Time. The tzID by itself does not contain that information. Also, using only the tzID might produce some unneeded names if names are created for tzID/dst combinations that do not exist.
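A rough Python sketch of the table described above, keyed on the abbreviation plus the UTC offset. The rows are a small illustrative sample, not a complete or authoritative list, and the function names (build_table, collisions, lookup) are made up for this sketch. The collisions() helper is one way to check, for whatever rows are loaded, whether the abbreviation/offset combination is in fact unique.

```python
from collections import defaultdict

# Rows of (abbreviation, UTC offset in minutes, English name).
# Illustrative sample only, not a complete or authoritative list.
ROWS = [
    ("CET", 60, "Central European Standard Time"),
    ("IST", 120, "Israel Standard Time"),
    ("IST", 330, "India Standard Time"),
    ("PWT", -420, "Pacific War Time"),
]


def build_table(rows):
    """Group names by (abbreviation, offset) so collisions stay visible."""
    table = defaultdict(set)
    for abbr, offset_min, name in rows:
        table[(abbr, offset_min)].add(name)
    return table


def collisions(table):
    """Return the keys that map to more than one name."""
    return {key: names for key, names in table.items() if len(names) > 1}


def lookup(table, abbr, offset_min):
    """Return the single name for a key, or None if missing or ambiguous."""
    names = table.get((abbr, offset_min), set())
    return next(iter(names)) if len(names) == 1 else None


table = build_table(ROWS)
print(lookup(table, "IST", 330))
print(collisions(table))
```

An empty result from collisions() only shows that the loaded rows are unambiguous. It says nothing about rows that have not been entered yet, so the uniqueness question would still need to be checked against the full data.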
I'm interested to find out if people think this approach might be effective for building and maintaining a list of time zone names.
Chuck