Re: [tz] Proposal for a modern 'Collapsed' Namespace

Oct. 12, 2012

      ...
|  - The much referenced issue of 1.4 billion+ people on Beijing time
|    being semantically mismatched to their timezone entry, which for lack
|    of meaningful alternative timezone identification strings winds up
|    getting displayed to users.
The problem of user selection (and localised identification) of time
zones is a real one, and worth working on - though personally I'd
prefer that you set up a new project for this, and get the involvement
of people experienced in international UI issues, rather than people
who know about time and (perhaps) care about little else...
That's an understandable take; however, consider the other side:
 - ICANN's mandate is global, and whilst its operating language may
very well be English, and tz has hobbled along on a broken identifier
scheme for some time, it seems somewhat difficult to dismiss i18n
requirements, especially those affecting a huge percentage of
humanity.
 - Because this is the de-facto library for so many systems, it seems
a really good place to solve common issues encountered when building
real systems dealing with this data.  Some of those issues that tz is
failing to solve right now are:
     - Grouping of historical timezones into single logical entities
     - Timezone (or timezone group) names
     - i18n of the above
 - The tz data set has already overstepped the raw tz data purpose and
branched out into providing useful, if arguably less closely related,
information such as associated lat/long and city names.  In doing so,
it has broadened its scope beyond raw tz data to closely tz-related
data that is useful in implementing tz data based systems. Basically,
tz is a database, and the names of the timezones themselves should be
a core feature of that database.
...
That is, this group might not be the best place to achieve a good
result for that worthwhile aim.
If a result were achieved here, however, it would be helpful to many
more people in the sense that it would be more likely to become
available to all related libraries and systems, and provide a common
point for internet-wide maintenance in the public interest.  Is this
not in line with ICANN's "mission of technical coordination"?*

* http://www.icann.org/en/about/welcome
...
|  - The lack of a functional entry for the widely used but unofficial
|    Xinjiang or Wulumuqi time of western China.
This one has been discussed before - my memory is poor, and I haven't
gone back through the archives to check, but I think the only real issue
was some doubt as to just how much those timezones are actually used.
If we get any good information that there's a timezone that is in use,
but we don't have, there's essentially never any problem adding it.
OK. I would be worried, however, that this would cause issues with
existing systems utilizing the database -- because the tz database has
apparently not provided enough structure within the zone data to
clearly delineate between different time zones simultaneously in use
within the same geographic region.  It seems to me that there is some
kind of breakdown between three things: cities, as geographic entities
serving as principals for time zone affected regions (unsuitable for
presentation to the end user, but apparently sometimes used for want
of an alternative); the zone identifiers themselves (unsuitable for
presentation to the end user, but often used for want of an
alternative); and the actual time zone names as used by normal people,
which are apparently almost entirely missing!
...
|  - General accrual of crufty old timezones.
That's a mistake.  You're making assumptions about the way people use
the data that are not always correct.   Sure, if all the users ever care
about is "what is the time now" then zones that are different only wrt
times in the past seem superfluous - but when you need to look at a
historical timestamp, which people sometimes do, having the wrong zone
causes errors.
I see the use case and certainly don't mean to devalue in any way the
tremendous work that's gone into compiling the tz resource. I just
think that, on balance, historic timezones that few people have even
heard of are a virtually academic edge case relative to the 1.399
billion people who use tz data for normal computing purposes in China
and couldn't care less about 20th century regulatory hiccups.  They
don't have something that says "Beijing time", nor is there even a
means to link the five (!) disparate historic timezones that may be
useful for academics and specialists into a single timezone, which is
the modern reality for 1.4 billion people.  They simply can't be
presented with an effective user interface based upon the tz data.

That's clearly a bug, any way you look at it.  As seen in an earlier
post on this thread, other zone lists have apparently taken some
initiative here. Why can't tz?

(In addition to China, it may be safe to assume that there are many
other areas of the world with now-unified timezones of purely historic
interest, presenting both translation overheads and a UI impediment to
non-academic developers and end users.)
...
What needs to be done is for the UI to better educate people and guide
them to the correct timezone selection for their needs, which is all
part of the UI issue, which I don't believe belongs here (let this project
collect the data, and someone else figure out how to present it, each
needs experts from their own fields, which are quite distinct.)
I'm not advocating that the tz database care too much about UI.  I am
merely advocating that it provide the fundamental requirement for any
timezone related program - a human-readable name for the time zone in
question.

Where the human readable name crosses multiple historic timezones,
some form of grouping such as that proposed (and apparently adopted
elsewhere) should also, quite necessarily, be provided.
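
To make the grouping idea concrete, here is a minimal sketch in Python
of what such a record could look like. Everything in it is an
assumption for illustration only: the ZoneGroup type, the
"beijing_time" key, the display names and the choice of member
identifiers are hypothetical and are not part of the tz data or of any
existing proposal.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple


@dataclass
class ZoneGroup:
    """Hypothetical record tying several tz identifiers to one named zone."""
    key: str                  # made-up group key, not a tz identifier
    members: Tuple[str, ...]  # tz identifiers collapsed into this group
    names: Dict[str, str]     # language tag -> human-readable name

    def display_name(self, lang: str, fallback: str = "en") -> str:
        # Use the requested language if present, else the fallback language.
        return self.names.get(lang, self.names[fallback])


# Illustrative data only: the five mainland China identifiers grouped
# under one modern name, with assumed English and Chinese labels.
GROUPS = [
    ZoneGroup(
        key="beijing_time",
        members=(
            "Asia/Shanghai",
            "Asia/Harbin",
            "Asia/Chongqing",
            "Asia/Urumqi",
            "Asia/Kashgar",
        ),
        names={"en": "Beijing Time", "zh": "北京时间"},
    ),
]

# Reverse index so that any member identifier resolves to its group.
_GROUP_BY_ZONE = {zone: group for group in GROUPS for zone in group.members}


def group_for(zone_id: str) -> Optional[ZoneGroup]:
    """Return the group containing zone_id, or None if it is ungrouped."""
    return _GROUP_BY_ZONE.get(zone_id)
```

With something like this, a program holding "Asia/Harbin" could call
group_for("Asia/Harbin").display_name("zh") and show the user a single
modern name, while the historic identifiers stay intact for anyone who
needs them.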

Right now, people use the identifier (eg: Asia/Shanghai) despite
problems with its use for this purpose. That's because there's no
alternative provided except for the zone.tab comments, which are less
than uniformly suitable for presentation to (and translation for) end
users.
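
As a rough illustration of what a program has to work with today, the
sketch below splits one tab-separated line in the style of zone.tab
into its columns. The sample line is written from memory of the file's
layout (country code, coordinates, identifier, optional comment) and
should be treated as an assumption, as should the ZoneTabEntry and
parse_zone_tab_line names, which are made up for this example.

```python
from typing import NamedTuple, Optional


class ZoneTabEntry(NamedTuple):
    country: str      # ISO 3166 two-letter country code
    coordinates: str  # latitude/longitude in the file's compact form
    tz_id: str        # tz identifier, e.g. Asia/Shanghai
    comment: str      # free-text comment, empty when the column is absent


def parse_zone_tab_line(line: str) -> Optional[ZoneTabEntry]:
    """Parse one zone.tab-style line; return None for comments and blanks."""
    line = line.rstrip("\n")
    if not line or line.startswith("#"):
        return None
    fields = line.split("\t")
    if len(fields) < 3:
        return None
    comment = fields[3] if len(fields) > 3 else ""
    return ZoneTabEntry(fields[0], fields[1], fields[2], comment)


# Assumed sample line, for illustration only.
SAMPLE = "CN\t+3114+12128\tAsia/Shanghai\teast China - Beijing, Guangdong, Shanghai, etc."

entry = parse_zone_tab_line(SAMPLE)
# The only candidates for a user-facing label are the identifier and the
# comment; neither is a name, and neither is translated.
label_candidates = (entry.tz_id, entry.comment)
```

Neither element of label_candidates is something one would want to
translate and show to an end user as the name of their time zone.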

There should at least be a name. And if there's a name, in this day
and age, it should be multilingual.

Right now the tz dataset, whilst successful, apparently remains a
database of identifiers for entities that cannot be presented to end
users, for want of human-readable names.

Regards,
Walter