I find this discussion of time zone names interesting, as well as the example of the country code qualifier. It highlights a larger issue, namely the unambiguous identification and referencing of "time zones" in an IT-enabled manner. In addition, e-commerce is likely to make increasing use of date/time stamps in business transactions (e.g. the closing date/time for a bid is XXX local time, or in an identified/referenced time zone), and the same applies to digital signatures, certificates, key management, etc.

Attached is a Canadian contribution to ISO standardization work that is relevant to this discussion and related issues. Normally I would have provided the URL, but access to the ISO/IEC JTC1/SC32 web site requires a user ID and password, so I attach a copy as a Word doc.

Let me summarize three key points:

1. In standardization work relating to Open-edi, e-commerce, geomatics, etc., one is becoming increasingly aware of the need for unambiguous, unique and linguistically neutral identification and referencing of "objects". As found in ISO 1087, "name" = "designation of an object by a linguistic expression". Consequently, any object will (1) have multiple names, and (2) many of the "names" used to designate a real-world "object" will be linguistic expressions that use non-Latin characters (e.g. Arabic, Chinese, Hebrew, Japanese, Cyrillic, etc.). This is one reason why ISO/IEC 10646 (a.k.a.) will become a key IT-infrastructure standard. Thus, while "names" can be and are used to designate an object, they are not that useful for unambiguous identification, since a name, i.e. a character string (or alpha/numeric string), is often not unique. "Paris" is an example of several different real-world objects having the same name. [A problem known as "polysemy".] We can add qualifiers, and likely will, but here I suggest a cross-sectorial approach to referencing rather than each area doing its own thing.

2.
It is my understanding that the problem of multiple linguistic expressions for the same geographic place name (e.g. London, Londres; Köln, Keulen, Cologne; Germany/Deutschland/Allemagne, etc.) is being addressed by toponymy experts world-wide through the UN (I need to check this, but maybe one of you knows more about it). It is my understanding that in the end there will be only one official name for each place in each country, in the language of that country, as decided by the toponymy commission of that country. Where the language of that country uses a non-Latin character set, the country is also expected to provide a single Latin-alphabet equivalent name. In Canada, which is a bilingual country (English/French), existing bilingual place names are disappearing in favour of unilingual ones, i.e. English or French (e.g. Trois Rivières/Three Rivers is now only Trois Rivières).

Further, in Canada (and likely in other countries as well), each place name, whether it designates a point, an area/polygon or a linear feature (e.g. a river), is assigned a single lat/long coordinate for referencing purposes. (For larger cities, lakes, provinces, etc. this is an "arbitrary" point; for a river it is a lat/long coordinate taken from the headwater, etc.) Where map surrounds/polygons delineate the boundaries, these can then be referenced. In conclusion, there is/should be a linkage here to the manner in which one references time zones. (See further the National Atlas of Canada and its Canadian Geographic Names Data Base.)

3. The ISO 3166 standard, Parts 1 and 2, is not IT-enabled; the two-letter alpha codes are not stable and can and do change. (The IANA and Internet domain names also have to address this problem, but then underneath all the Internet names are the numeric codes used for the actual routing and addressing among the ISPs, IT systems, etc.) The ISO 3166 three-digit numeric country code is the most stable and is also unique.
The two-letter alpha country codes often get mixed up with the two-letter language codes, and the three-letter alpha country codes with the three-letter currency codes. They are not the same and often not unique. All these and related problems, which human beings used to filter out or write special programmes to handle, are now coming out in spades through everyone's access to and use of the Internet. A systematic and IT-enabled approach to resolving this increasing mess is needed.

Added note: There is Paris the city and there is Paris the "département"; there is New York the city and there is New York the state. The problem is resolvable and may require less work than one would expect.

Enough said. I am doing some work in this field from an EDI and e-commerce perspective as well as that of the environment. If you think this is important, have some ideas, know of other areas where addressing this would be of use, etc., please let me know.

Trust this is of some help.

Regards - Jake Knoppers

P.S. I also agree that continent/city names are not that useful. They provide neither the unambiguity nor the uniqueness required for referencing purposes. A better solution would be along the following lines:

1. Reference the ISO 3166 standard.
2. Always start with a level 1 country code.
3. Then a level 2 administrative subdivision code (e.g. state, province, Länder, canton, etc.).
4. The official name of the place according to the authoritative source (i.e. Wien and not Vienna or Vienne; Köln (or Koln) but not Keulen or Cologne).
5. Optionally, the associated lat/long coordinate as assigned by the authoritative source.
6. For human interface needs and localization, one would be free to provide different linguistically equivalent designations, i.e. "names".

While such a solution requires a bit more data, it compensates by being useful to many other application areas as well as by facilitating cross-sectorial electronic data interchange.

----------
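The six-part scheme in the P.S. could be held in a small record. The following Python sketch is only one possible reading of it: the `PlaceRef` class, its field names, the `/` separator in the key, and the approximate coordinate are all assumptions made for illustration, not part of any standard or of the attached contribution.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Tuple


@dataclass
class PlaceRef:
    """Hypothetical record for the six-part place reference sketched in the P.S."""

    country: str                                    # ISO 3166-1 alpha-2 code
    official_name: str                              # name per the authoritative source
    subdivision: Optional[str] = None               # ISO 3166-2 subdivision part, if any
    lat_long: Optional[Tuple[float, float]] = None  # optional reference point
    display_names: Dict[str, str] = field(default_factory=dict)  # language -> name

    def key(self) -> str:
        """Join country, optional subdivision and official name into one key."""
        parts = [self.country]
        if self.subdivision:
            parts.append(self.subdivision)
        parts.append(self.official_name)
        return "/".join(parts)

    def label(self, language: str) -> str:
        """Return a localized name, falling back to the official name."""
        return self.display_names.get(language, self.official_name)


# Example values; the coordinate is a rough illustrative figure, not an official one.
vienna = PlaceRef(
    country="AT",
    subdivision="9",
    official_name="Wien",
    lat_long=(48.2, 16.4),
    display_names={"en": "Vienna", "fr": "Vienne"},
)
```

Here `vienna.key()` gives the language-neutral reference, while `vienna.label("en")` is what a localized user interface would show.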
From: Markus Kuhn <Markus.Kuhn@cl.cam.ac.uk>
To: tz@elsie.nci.nih.gov
Subject: Re: Time Zone Names
Date: September 30, 1998 4:25 AM
Ken Pizzini wrote on 1998-09-30 00:22 UTC:
> Markus Kuhn wrote:
> > As I have pointed out before, I don't like the continent prefix in the Olson-style names, therefore I do not want to make this particular syntax immortal in an ISO standard (I much prefer just ":Paris" over "Europe/Paris").
>
> Interesting example. Is ``:Paris'' "Paris, France", or "Paris, Texas", or perhaps one of the other half-dozen or so cities named Paris?
Remember that time zone names refer to the most populated area within a region with a common time zone history. This rule should already resolve practically all ambiguities. If there really are two places named Paris that are both candidates for TZ entries, because each is the most populated area in a different time zone region, then they should get qualifiers added (or at least all but the largest one should). This way, all those little Paris clones in the US are not of concern any more, and "Paris" would be guaranteed to refer to the real big one under the Eiffel Tower.
To remove any ambiguity, we have the coordinates of the place, and a GUI TZ selector tool can easily indicate on a map what region we are talking about.
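As a rough illustration of using coordinates to pin down a place, here is a minimal Python sketch that converts a signed degrees-and-minutes string in the compact ISO 6709 style (±DDMM±DDDMM) into decimal degrees, which a map-based selector could then plot. The function name `parse_ddmm` and the sample string are assumptions made for this example and do not come from the messages in this thread.

```python
import re
from typing import Tuple

# Signed degrees and minutes: latitude as +-DDMM, longitude as +-DDDMM.
_COORD = re.compile(r"([+-])(\d{2})(\d{2})([+-])(\d{3})(\d{2})")


def parse_ddmm(text: str) -> Tuple[float, float]:
    """Convert a string such as "+4852+00220" to (latitude, longitude) in degrees."""
    match = _COORD.fullmatch(text)
    if match is None:
        raise ValueError(f"not a +-DDMM+-DDDMM coordinate: {text!r}")
    lat_sign, lat_deg, lat_min, lon_sign, lon_deg, lon_min = match.groups()
    latitude = int(lat_deg) + int(lat_min) / 60
    longitude = int(lon_deg) + int(lon_min) / 60
    if lat_sign == "-":
        latitude = -latitude
    if lon_sign == "-":
        longitude = -longitude
    return latitude, longitude


# Illustrative value for a point in Paris, France (48 deg 52 min N, 2 deg 20 min E).
paris = parse_ddmm("+4852+00220")
```

Two places that share a name but differ in coordinates can then be told apart without relying on any name qualifier at all.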
> I'm not particularly thrilled with the continent name either, but it does serve a purpose.
But not very well. How many Paris are there in the US alone?
An ISO 3166-1 country code, or where necessary an ISO 3166-2 country/region code for those hypothetical cases where an ambiguity could occur, would serve this purpose much better. The continent names come from the file organization of the Olson DB, and this implementation detail should IMHO not leak through to the name space. That's why I am not particularly happy with seeing iCalendar people making these continent/city names more permanent by quoting them in their standards.
If I had to design proper tz names from scratch, they might look more like
    fr.paris
    us.tx.paris
Markus
--
Markus G. Kuhn, Security Group, Computer Lab, Cambridge University, UK
email: mkuhn at acm.org, home page: <http://www.cl.cam.ac.uk/~mgk25/>
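The dotted names suggested above (fr.paris, us.tx.paris) would be easy to take apart mechanically. Below is a minimal Python sketch of such a parser. It assumes exactly two or three dot-separated parts with a two-letter country code first; the `TzName` type and `parse_tz_name` function are hypothetical and exist only for this illustration, since no such naming scheme has been specified.

```python
from typing import NamedTuple, Optional


class TzName(NamedTuple):
    country: str           # two-letter country code, as in "fr"
    region: Optional[str]  # optional subdivision, as in "tx"
    city: str              # place name, as in "paris"


def parse_tz_name(name: str) -> TzName:
    """Split a hypothetical dotted name into country, optional region and city."""
    parts = name.split(".")
    if len(parts) == 2:
        country, city = parts
        region = None
    elif len(parts) == 3:
        country, region, city = parts
    else:
        raise ValueError(f"expected 2 or 3 dot-separated parts: {name!r}")
    if len(country) != 2 or not country.isalpha():
        raise ValueError(f"country part is not a two-letter code: {country!r}")
    if not city or region == "":
        raise ValueError(f"empty component in {name!r}")
    return TzName(country, region, city)
```

With this, `parse_tz_name("us.tx.paris")` yields country "us", region "tx" and city "paris", while "fr.paris" simply has no region.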
From: "Canaglobe International Inc." <mpereira@istar.ca>
Date: Fri, 2 Oct 1998 12:49:40 -0400

> 3. The ISO 3166 standard, Parts 1 and 2, is not IT-enabled; the two-letter alpha codes are not stable and can and do change. (The IANA and Internet domain names also have to address this problem, but then underneath all the Internet names are the numeric codes used for the actual routing and addressing among the ISPs, IT systems, etc.) The ISO 3166 three-digit numeric country code is the most stable and is also unique.

True, but the 2-letter country codes are _much_ better known (especially now that they're part of Internet domain names), and they are much more common in most application areas served by POSIX and the ISO C standard. If you're going to use an ISO-based approach for locations, then the 2-letter codes are clearly the way to go.

The 3-digit codes are not stable either, as locations change hands (the breakup of the Soviet Union being a recent example of wholesale reassignments). Since complete stability is impossible, we have to judge whether the 3-digit codes' slight increase in stability compensates for the increased number of user (and/or programmer) errors that are inevitable with numerical codes. In my view, the tradeoff is decisively in favor of the familiar alphabetic codes.

> P.S. I also agree that continent/city names are not that useful. They provide neither the unambiguity nor the uniqueness required for referencing purposes.

Actually, the current setup is unambiguous and the names are unique. This is because there is a central registry of names. You are arguing for the substitution of a different central registry, one that is blessed by the ISO. Obviously this substitution will also work, and it might be more politically acceptable; but it is not strictly necessary to obtain uniqueness.

> 1. Reference the ISO 3166 standard
> 2. Always start with a level 1 country code
> 3. Then a level 2 administrative subdivision code (e.g. state, province, Länder, canton, etc.)

Surely (3) is optional. Pitcairn doesn't have provinces. I assume that (3) is in US-ASCII.

> 4. The official name of the place according to the authoritative source, i.e. Wien and not Vienna or Vienne

For interoperability reasons, names should be encoded in US-ASCII. Perhaps the standard could be extended to UTF-8 later, but I'm not sure the world is ready for UTF-8 here; and besides, even UTF-8 can't handle some place names.

> 5. Optionally, the associated lat/long coordinate as assigned by the authoritative source.

In practice, there are many place names (even in the current tz database) for which there is no authoritative source. What is the authoritative source for the name of the South Pole Station, for example? (The people who live in the station use different names at different times, and would view a request for a canonical name with some amusement.) It will take the bureaucrats some time to catch up to reality; in the meantime we need something that works now. Therefore, (4) should be optional, and (5) should be a best guess in case there's no authority.

> 6. For human interface needs and localization, one would be free to provide different linguistically equivalent designations, i.e. "names"

I suggest that there be a standard way for one of these designations to be an Olson-style name if one exists. This will encourage compatibility with existing practice, and will accommodate a gradual transition to the new order of things.
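Two of the points above, US-ASCII names and an Olson-style name kept as one of the designations, can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `ascii_fold` strips accents via Unicode decomposition, which is one possible convention among several and silently drops letters that have no decomposition; the `OLSON_ALIASES` table and its dotted key are hypothetical, and only "Europe/Paris" is an existing Olson-style name.

```python
import unicodedata
from typing import Dict, Optional


def ascii_fold(name: str) -> str:
    """Reduce a name to US-ASCII by decomposing characters and dropping the accents."""
    decomposed = unicodedata.normalize("NFKD", name)
    # Characters with no ASCII base letter after decomposition are simply dropped.
    return decomposed.encode("ascii", "ignore").decode("ascii")


# Hypothetical table mapping a new-style reference to its Olson-style name.
OLSON_ALIASES: Dict[str, str] = {
    "fr.paris": "Europe/Paris",
}


def olson_alias(reference: str) -> Optional[str]:
    """Return the Olson-style name registered for a reference, if there is one."""
    return OLSON_ALIASES.get(reference)
```

For example, `ascii_fold("Köln")` gives "Koln", and `olson_alias("fr.paris")` gives "Europe/Paris", so existing software could keep using the familiar name during a transition.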
Paul Eggert said:
> > 3. The ISO 3166 standard, Parts 1 and 2, is not IT-enabled; the two-letter alpha codes are not stable and can and do change. (The IANA and Internet domain names also have to address this problem, but then underneath all the Internet names are the numeric codes used for the actual routing and addressing among the ISPs, IT systems, etc.) The ISO 3166 three-digit numeric country code is the most stable and is also unique.
>
> The 3-digit codes are not stable either, as locations change hands (the breakup of the Soviet Union being a recent example of wholesale reassignments).
Not just that - a 3-digit code will change if there's a territorial adjustment to the country. The most obvious case is that DE changed numbers at Reunification, but there have been plenty of more subtle ones.
> In my view, the tradeoff is decisively in favor of the familiar alphabetic codes.
Agreed.

--
Clive D.W. Feather       | Work: <clive@linx.org>   | Tel: +44 1733 705000
Regulation Officer       | or:   <clive@demon.net>  | or:  +44 973 377646
London Internet Exchange | Home: <clive@davros.org> | Fax: +44 1733 353929
(on secondment from Demon Internet)
Paul Eggert wrote on 1998-10-02 20:51 UTC:
> True, but the 2-letter country codes are _much_ better known (especially now that they're part of Internet domain names), and they are much more common in most application areas served by POSIX and the ISO C standard. If you're going to use an ISO-based approach for locations, then the 2-letter codes are clearly the way to go.
>
> The 3-digit codes are not stable either, as locations change hands (the breakup of the Soviet Union being a recent example of wholesale reassignments). Since complete stability is impossible, we have to judge whether the 3-digit codes' slight increase in stability compensates for the increased number of user (and/or programmer) errors that are inevitable with numerical codes. In my view, the tradeoff is decisively in favor of the familiar alphabetic codes.
The numeric and alpha codes have very different purposes: the alpha code identifies more or less the name of a country, while the numeric code identifies its territory. Identification of the territory is highly relevant for statistical applications, because per-country statistics become incomparable if the territory changes (see German reunification as a good example). The ISO 3166 numeric codes are just the codes used by the UN statistics office. Identification of the territory of countries might actually be slightly more relevant to tz applications than identification of the names of countries (see also the hacks related to "mainland France").

Markus

--
Markus G. Kuhn, Security Group, Computer Lab, Cambridge University, UK
email: mkuhn at acm.org, home page: <http://www.cl.cam.ac.uk/~mgk25/>
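To make the name-versus-territory distinction concrete, here is a minimal Python sketch in which the alpha-2 code stays fixed while the numeric code follows the territory. It uses Germany, whose numeric code changed from 280 to 276 at reunification in 1990. The `CodeEntry` record, the `numeric_for` helper and the year-granularity validity ranges are simplifications invented for this example and are not part of ISO 3166.

```python
from typing import List, NamedTuple, Optional


class CodeEntry(NamedTuple):
    alpha2: str                # identifies (roughly) the name of the country
    numeric: str               # identifies the territory
    first_year: Optional[int]  # None means the start is not modelled here
    end_year: Optional[int]    # exclusive; None means still in force


ENTRIES: List[CodeEntry] = [
    CodeEntry("DE", "280", None, 1990),  # Federal Republic before reunification
    CodeEntry("DE", "276", 1990, None),  # reunified Germany
]


def numeric_for(alpha2: str, year: int) -> Optional[str]:
    """Return the numeric code in force for an alpha-2 code in a given year."""
    for entry in ENTRIES:
        if entry.alpha2 != alpha2:
            continue
        started = entry.first_year is None or year >= entry.first_year
        not_ended = entry.end_year is None or year < entry.end_year
        if started and not_ended:
            return entry.numeric
    return None
```

A statistics application would key its per-country series on the numeric code, so that figures for "DE" before and after 1990 are not silently compared as if they covered the same territory.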