Re: [tz] Why did you rename Russian zone name abbreviations
On 11/02/2016 08:50 AM, turchanov@vl.ru wrote:
... having the "date" command output "Wed Nov 2 09:54:58 +10 2016" makes it very difficult to understand where the system is.
That's fine, as the "date" command is supposed to be about *when*, not about *where*.
2. Since the inception of tzdata Olson stated rules for abbreviated names clearly and unambiguously (https://github.com/eggert/tz/blob/master/Theory):
That old guideline has been deprecated, as mentioned in the Theory file.
3. I have an impression that you (un-)intentionally confuse an abbreviated name with a GMT offset.
No, the idea is that if there is no common English-language abbreviation for a time zone, tzdata uses a numeric abbreviation instead.
if one needed a GMT offset one would have used %z-family flags.
The problem is not that software needs a GMT offset. It's that software unwisely requires a time zone abbreviation even when no such abbreviation exists in common English-language practice. Although we could simply leave the abbreviation empty in this case, it's more useful to fill it with a placeholder that conveys information to users of applications that unwisely output the abbreviation as-is.
By changing the abbreviated zone name you in effect change the original full zone name. Why didn't you change Asia/Vladivostok to Asia/+10 as well?!
That wouldn't work, as Vladivostok was at +11 before 2014, and at +09 before 2011. Even in the US, where alphabetic time zone abbreviations are common, we wouldn't link America/Indiana/Petersburg to US/Eastern, because Petersburg observed Central Time before 2007.
5. You justify your decision to remove abbreviated names by declaring them as an "invention".
Yes. I invented these abbreviations. I didn't use any formal procedure. The (now-deprecated) guideline about them in the Theory file was written after the fact, partly in an attempt to justify them. When I invented those abbreviations, POSIX supported only alphabetic-only abbreviations. Also, I was naive about Russia: I thought that it kept time zones like the US does and that US-style alphabetic abbreviations therefore made sense for it. Nowadays, though, POSIX allows numeric abbreviations, and I know more about Russian practice.

From an English-speaking point of view, abbreviations like VLAT are misleading because they imply that there is a time zone at a fixed offset from UTC called "VLAT", which is how abbreviations like "PST", "GMT", and "CET" work. However, abbreviations like "VLAT" do not correspond to fixed UTC offsets, and this is something that English-speakers are typically not expecting or accustomed to. So even if "VLAT" were widely used by Russian speakers, it would still be dubious as an abbreviation in English-language tzdata.

I agree that the current situation could be improved, and that there could be a better way to localize time zone abbreviations. I expect this is an effort that should be under the CLDR umbrella, as they're the localization gurus. It may be that we need a better API for getting at localized abbreviations. Certainly the tzcode strftime.c is incomplete here, as it doesn't even support Russian month names, much less genitive-case Russian dates (does CLDR handle that correctly? if not, it should...).
On Nov 2, 2016, at 1:17 PM, Paul Eggert <eggert@cs.ucla.edu> wrote:
Certainly the tzcode strftime.c is incomplete here, as it doesn't even support Russian month names, much less genitive-case Russian dates (does CLDR handle that correctly? if not, it should...).
The CLDR has:

    <calendar type="gregorian">
      <months>
        <monthContext type="format">
          ...
          <monthWidth type="wide">
            <month type="1">января</month>
            <month type="2">февраля</month>
            <month type="3">марта</month>
            <month type="4">апреля</month>
            <month type="5">мая</month>
            <month type="6">июня</month>
            <month type="7">июля</month>
            <month type="8">августа</month>
            <month type="9">сентября</month>
            <month type="10">октября</month>
            <month type="11">ноября</month>
            <month type="12">декабря</month>
          </monthWidth>
        </monthContext>
        <monthContext type="stand-alone">
          ...
          <monthWidth type="wide">
            <month type="1">январь</month>
            <month type="2">февраль</month>
            <month type="3">март</month>
            <month type="4">апрель</month>
            <month type="5">май</month>
            <month type="6">июнь</month>
            <month type="7">июль</month>
            <month type="8">август</month>
            <month type="9">сентябрь</month>
            <month type="10">октябрь</month>
            <month type="11">ноябрь</month>
            <month type="12">декабрь</month>
          </monthWidth>
        </monthContext>
      </months>

so they appear to handle that by using the genitive in the "format" context and the nominative in the "stand-alone" context.
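The split between the "format" and "stand-alone" contexts can be illustrated with a small sketch. The month strings are copied from the CLDR excerpt above (two months only, for brevity); the helper names `month_name` and `format_date`, and the rule "use the format context whenever a day number is present", are assumptions made for illustration and are not part of CLDR or of any library API.

```python
# Month names copied from the CLDR excerpt (only October and November shown).
FORMAT_WIDE = {10: "октября", 11: "ноября"}      # genitive: used inside a full date
STANDALONE_WIDE = {10: "октябрь", 11: "ноябрь"}  # nominative: used on its own


def month_name(month, day=None):
    """Hypothetical helper: pick the CLDR context from whether a day is given."""
    table = FORMAT_WIDE if day is not None else STANDALONE_WIDE
    return table[month]


def format_date(day, month, year):
    """Hypothetical helper: a day-month-year date using the 'format' context."""
    return f"{day} {month_name(month, day)} {year}"


full_date = format_date(2, 11, 2016)   # month appears after a day number
heading = f"{month_name(11)} 2016"     # month stands alone, e.g. a calendar title
print(full_date)
print(heading)
```

A real implementation would presumably read these tables from the CLDR data rather than hard-coding them.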
|The CLDR has:
|
|  <calendar type="gregorian">

There is also CLDR-as-JSON now, on, well, at least Github[1]. Officially that is! Thanks to the Unicode people for that, even though it doesn't seem i will ever need that as a programmer, i once asked them for JSON, and now there is. E.g.,

    "dates": {
      "calendars": {
        "gregorian": {
          "months": {
            "format": {
              "abbreviated": {
                "1": "янв.",
                "2": "февр.",
                "3": "мар.",
                "4": "апр.",
                "5": "мая",
                "6": "июн.",
                "7": "июл.",
                "8": "авг.",
                "9": "сент.",
                "10": "окт.",
                "11": "нояб.",
                "12": "дек."
              },

[1] https://github.com/unicode-cldr/cldr-dates-full/tree/master/main

--steffen
On Wed, Nov 2, 2016 at 4:17 PM, Paul Eggert <eggert@cs.ucla.edu> wrote:
However, abbreviations like "VLAT" do not correspond to fixed UTC offsets, and this is something that English-speakers are typically not expecting or accustomed to.
There may be exceptions historically, but as a rule Russian timezones are defined as fixed offsets from Moscow, so when Moscow Time changes, so do all the timezones. So VLAT stays at MSK+7 even though it may not have always been UTC+10. For an average user in Russia, the meaning of MSK+7 and VLAT is fairly clear, but +10 does not make much sense. Since you kept MSK and users are expected to know that historically it has been at different UTC offsets, it does not make sense to reject zones that are at a fixed offset from it.
Yes. I invented these abbreviations. I didn't use any formal procedure. The (now-deprecated) guideline ...
On a lighter note, it reminded me of this scene: [image: Inline image 1]
On 11/02/2016 01:58 PM, Alexander Belopolsky wrote:
There may be exceptions historically, but as a rule Russian timezones are defined as fixed offsets from Moscow, so when Moscow Time changes, so do all the timezones.
This depends on what one means by "historical". Of course if one goes back before 1930 things get weird by modern standards. However, even recently, Vladivostok time has not been at a fixed offset from Moscow time. For example, in 2014 (the last time Russian clocks changed in a big way) Vladivostok changed seven hours earlier than Moscow did. Given this typical practice, we cannot simply use "MSK+07" as the abbreviation for Vladivostok time, as this abbreviation would be incorrect for several hours whenever Russians change their clocks.

Another practical objection to "MSK+07" is that it would likely confuse users into setting the POSIX TZ environment variable to "MSK+07", which would not work as desired as it would use US Mountain Standard Time and call it "MSK".

Although abbreviations like "VLAT" avoid these problems, they run into other issues. Because "VLAT" is not a fixed offset from UTC it departs from the usual English-language semantics for time zone abbreviations, misleading English-language readers. Also, what do we do with locations like Asia/Barnaul, which switched from +06 to +07 in March of this year? Should Barnaul use the abbreviation "OMST" (Omsk time) before March, and "NOVT" (Novosibirsk time) after? Or should it now use "BART" for all dates? Neither alternative is satisfactory and neither reflects common practice in English. In contrast, using numeric abbreviations is simple and requires no arbitrary inventions on our part.
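The POSIX TZ pitfall mentioned above can be checked directly. This is a minimal sketch, assuming a Unix-like system where Python's `time.tzset()` is available (it is not on Windows). In a POSIX-style TZ string the number after the abbreviation counts hours west of UTC, so "MSK+07" is read as a zone named "MSK" that is seven hours behind UTC, the same offset as US Mountain Standard Time.

```python
import os
import time

# Assumption: Unix-like platform; time.tzset() does not exist on Windows.
# POSIX syntax: "MSK" is only a label, and "+07" means 7 hours WEST of UTC.
os.environ["TZ"] = "MSK+07"
time.tzset()

now = time.localtime()
abbreviation = time.tzname[0]            # label taken from the TZ string
utc_offset_hours = now.tm_gmtoff / 3600  # seconds east of UTC, converted to hours

print(abbreviation, utc_offset_hours)
```

A user in Vladivostok who sets TZ this way would therefore get clocks many hours away from local time while still seeing the label "MSK".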
On Thu, Nov 3, 2016 at 11:12 AM, Paul Eggert <eggert@cs.ucla.edu> wrote:
For example, in 2014 (the last time Russian clocks changed in a big way) Vladivostok changed seven hours earlier than Moscow did. Given this typical practice, we cannot simply use "MSK+07" as the abbreviation for Vladivostok time, as this abbreviation would be incorrect for several hours whenever Russians change their clocks.
Isn't this problem worse with the UTC offset? While Vladivostok may deviate from MSK+07 for a few hours around an MSK transition, the UTC offset changes permanently. Still, I am not advocating for MSK+HH abbreviations. Use of the abbreviations should be deprecated and any change to them be considered gratuitous. I still believe a change from VLAT to +10 does not solve any real problem and it will inconvenience many users.
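The "few hours" of deviation around the 2014 change can be worked through with plain arithmetic. The sketch below does not read tzdata. It assumes that both cities switched at 02:00 local time on 2014-10-26, that Vladivostok went from +11 to +10, and that Moscow went from +04 to +03; these values and all helper names are assumptions for illustration and should be checked against tzdata.

```python
from datetime import datetime, timedelta, timezone

UTC = timezone.utc


def local_to_utc(local_naive, offset_hours):
    """Interpret a naive local time at the given UTC offset and convert it to UTC."""
    zone = timezone(timedelta(hours=offset_hours))
    return local_naive.replace(tzinfo=zone).astimezone(UTC)


# Assumed: both cities switched at 02:00 local time on 2014-10-26.
SWITCH_LOCAL = datetime(2014, 10, 26, 2, 0)
vladivostok_switch = local_to_utc(SWITCH_LOCAL, 11)  # old offset assumed +11
moscow_switch = local_to_utc(SWITCH_LOCAL, 4)        # old offset assumed +04


def offset_at(instant, switch, before, after):
    """UTC offset in hours at a given instant, for a single-transition model."""
    return before if instant < switch else after


def vladivostok_minus_moscow(instant):
    """Hours by which Vladivostok is ahead of Moscow at the given instant."""
    return (offset_at(instant, vladivostok_switch, 11, 10)
            - offset_at(instant, moscow_switch, 4, 3))


window = moscow_switch - vladivostok_switch
before = vladivostok_minus_moscow(datetime(2014, 10, 25, 12, 0, tzinfo=UTC))
during = vladivostok_minus_moscow(datetime(2014, 10, 25, 18, 0, tzinfo=UTC))
after = vladivostok_minus_moscow(datetime(2014, 10, 26, 0, 0, tzinfo=UTC))
print(window, before, during, after)
```

Under these assumptions the two transitions are seven hours apart, and during that window Vladivostok is six hours ahead of Moscow rather than seven.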
On 11/03/2016 08:27 AM, Alexander Belopolsky wrote:
Isn't this problem worse with the UTC offset?
Only if one assumes that a time zone abbreviation reflects the user's location, so that VLAT means "Vladivostok time, whatever that happens to be". In English the usual assumption is that a time zone abbreviation reflects the user's UTC offset, so that CST means "US Central Standard Time, 6 hours behind UTC", even if one's location is Mérida (which is not in the central US or even in central North America). A numeric abbreviation matches this usual assumption better than VLAT does.
A theme in everything Paul is saying is that time zone abbreviations are generally a bad idea and you ideally should just not use them. They were never particularly well-defined, and people always think they mean things other than they actually do.

ISO 8601 recommends expressing times with just the offset from UTC and avoiding all of these abbreviations, including the new-style ones. Either date -R (for an RFC 2822 Date), date -Iseconds (for an ISO 8601 date and time) or any strftime string with %z will give you that offset. This is really how times should be expressed everywhere. These time zone abbreviations have to be provided because POSIX requires the field exist for backward compatibility reasons, but as with two-digit years (also supported in numerous places by POSIX), they're best avoided completely.

If you want to convey someone's physical location, using the time zone for this is ambiguous, problematic, and mostly doesn't work. There are numerous other ways of designating this, from simply stating the location to using actual coordinates. The time should not be doing double-duty.

--
Russ Allbery (eagle@eyrie.org) <http://www.eyrie.org/~eagle/>
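The three offset-only forms mentioned above have close counterparts in Python's standard library. This is a minimal sketch that reuses the timestamp from the start of the thread with a fixed +10 offset; the variable names are illustrative, and the fixed offset stands in for whatever zone the system actually uses.

```python
from datetime import datetime, timedelta, timezone
from email.utils import format_datetime

# The timestamp quoted at the start of the thread, at a fixed +10 offset.
plus10 = timezone(timedelta(hours=10))
stamp = datetime(2016, 11, 2, 9, 54, 58, tzinfo=plus10)

iso_form = stamp.isoformat()                     # comparable to `date -Iseconds`
rfc_form = format_datetime(stamp)                # comparable to `date -R`
z_form = stamp.strftime("%Y-%m-%d %H:%M:%S %z")  # any strftime string with %z

print(iso_form)
print(rfc_form)
print(z_form)
```

None of these forms contains an alphabetic abbreviation, so the output does not depend on which abbreviation tzdata happens to use.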
    Date: Thu, 3 Nov 2016 08:47:25 -0700
    From: Paul Eggert <eggert@cs.ucla.edu>
    Message-ID: <2aab18c5-9c7b-ce94-e5cb-41865705d7bd@cs.ucla.edu>

  | In English the usual assumption is that a time zone abbreviation
  | reflects the user's UTC offset,

I'm not sure that's true, even in the US. When I get to watch live US sport, there are frequently (useless to me) ads for following programs, which are typically stated as to start at (something like) 9 ET / 8 CT where the time and abbreviation just means "9 o'clock in the eastern timezone, or 8 in the central timezone" (and I assume that either people in the western half of the US are getting a different telecast, or they simply don't matter...)

The same abbreviation is used whether it is winter or summer, as in reality, (almost) no-one cares what the offset from UTC might happen to be, just what the clock should show, locally, if you happen to want to watch.

I'm also still 100% happy to keep using EST for Eastern Standard Time (+10) in the 4 eastern states of Australia during winter (and in Queensland in summer), and EST for Eastern Summer time (+11) in the other 3 eastern states in summer - and this was true when Tasmania shifted to summer time earlier than Vic and NSW.

That said, best would be to delete all the abbreviations (or make them all the same, just "zzz" or something, if we fear breaking code if there is nothing there) - ALL of them (not excluding ones that some people believe are non-controversial, with only perhaps UTC excepted.)

kre
On 11/03/2016 09:39 AM, Robert Elz wrote:
When I get to watch live US sport, there are frequently (useless to me) ads for following programs, which are typically stated as to start at (something like) 9 ET / 8 CT
Good point, and I imagine that Australian sports broadcasts do something similar. However, abbreviations like "ET" would have introduced ambiguity in the original 7th Edition Unix implementation that inspired tzdata, and I expect such abbreviations were avoided in Unix for that reason. tzdata has kept to the Unix tradition in its abbreviations. I just now looked for counterexamples, and all I found were typos in old Palestinian time stamps, for which I will shortly send out a proposed patch.
best would be to delete all the abbreviations
Sometimes I'm tempted to do that. There's quite a bit of practice dating back to the 1970s, though, and we're having plenty of trouble deleting just the invented abbreviations which are considerably more recent. Perhaps my successor can take on the larger task....
On Thu, Nov 3, 2016 at 11:47 AM, Paul Eggert <eggert@cs.ucla.edu> wrote:
On 11/03/2016 08:27 AM, Alexander Belopolsky wrote:
Isn't this problem worse with the UTC offset?
Only if one assumes that a time zone abbreviation reflects the user's location, so that VLAT means "Vladivostok time, whatever that happens to be". In English the usual assumption is that a time zone abbreviation reflects the user's UTC offset, so that CST means "US Central Standard Time, 6 hours behind UTC", even if one's location is Mérida (which is not in the central US or even in central North America). A numeric abbreviation matches this usual assumption better than VLAT does.
Yes, I've been inconvenienced by the fact that MSK does not always mean +03 (or was it +04 or +02 back then?), but does the convenience of having an unambiguous many to one mapping of abbreviations to UTC offsets outweigh the problem of date command output from older systems being no longer parseable? I think the disruption could be minimized by postulating the current values and only changing the historical data where the postulated values are not correct. For VLAT it means postulating that VLAT=UTC+10 and changing VLAT to +11 for times prior to the last reform of 2014.

In the good old times Russia (USSR) had a way to deal with varying UTC offsets. We had the notion of the "decree time". When Moscow transitioned from UTC+02 to UTC+03 in the early XX century, the new timezone was still called the 2nd timezone, but the new time was distinguished as Moscow Decree Time (and if anyone would come up with an abbreviation back then - it would probably be MDT or in the contemporary style MosDeTi :-). Unfortunately with a later introduction of the Summer Time (and briefly Double Summer Time) the term Decree Time became ambiguous. I remember that when Decree time was reintroduced in the 90s together with the Summer time, some people incorrectly referred to the Summer time as Decree time and for this reason the term Decree time was disfavored.

In any case, while I agree that in theory the new scheme is somewhat better than the old one, changes like this that are not driven by the actual change in timekeeping practices should not be made without soliciting input from users in the affected regions first.
participants (6):
- Alexander Belopolsky
- Guy Harris
- Paul Eggert
- Robert Elz
- Russ Allbery
- Steffen Nurpmeso