OpenJDK/CLDR/ICU/Joda issues with Ireland change
Let me try to summarize why the negative SAVE (Ireland change) matters to Java-based projects.

There are two basic streams of code - old and new. The older OpenJDK time-zone code and ICU Java code both derive from the same original 20-year-old source. The newer OpenJDK time-zone code derives indirectly from Joda-Time, which is still based on many principles from the older OpenJDK code. The ThreeTen-Backport project is an early version of the newer OpenJDK time-zone code. Thus, the vast majority of Java time-zone code is linked and follows a similar approach.

The basic data stored is:
- a "raw" offset that changes over time, also known as "standard"
- a "daylight" offset that changes over time (either relative to the "raw" or absolute)
- time-zone changes available from 1970 or earlier
- only the *current* names of time-zones are available

Some problems:

1) The older code has limits on the difference between the "raw" and "daylight" offset in `GregorianCalendar`. Specifically, the "DST_OFFSET" can only be from 0 to 2 hours. Negative SAVE values are not expected or supported. Note that the limits of 0 to 2 hours are publicly visible via the API `GregorianCalendar.getMinimum()`, `getMaximum()` and friends.

2) The current names of the time-zones are accessed via an array of a fixed order, roughly [long-std, short-std, long-daylight, short-daylight]. Whether the std or daylight text form should be returned is determined by a parameter - a boolean, named "daylight". The basic way this is determined is whether the offset and the "raw" offset are the same or not. This is also expressed in `TimeZone.inDaylightTime()` and `ZoneRules.isDaylightSavings()`, making any switch in the boolean user-visible.

The problem here is that the Ireland change flips when the boolean is true from summer to winter, whereas the content of the array has been stable for 20 years. In the problematic case I looked at, this means that the wrong textual description is returned, i.e. in winter, the code has always accessed array elements 0 and 1, whereas if the boolean flag is swapped it will access elements 2 and 3. Remember that only the *current* names of the time-zones are available - so if the Ireland change happens, the time-zone name will necessarily be wrong either before the change occurs or after it.

It is a similar problem to the one Yoshito Umaoka from CLDR expressed: a clash between data and code. Here the data on names would clash with the boolean "daylight" flag. Fixing the code doesn't help in this scenario - there is no viable approach to fixing it that can work, as Java has a long history of backwards compatibility over decades, not a year (as is being talked about). Specifically, some of these libraries (Joda-Time, ThreeTen-Backport) run with new tzdb data on older JDK versions. It is the incompatibility in the boolean, where false no longer means winter, that cannot be accepted.

So, to summarize, even if those in the CLDR/ICU/Java world thought this change was meaningful and positive, backwards compatibility rules out making it. But, in fact, the majority of the feedback on the topic from software library maintainers has been negative.

I'm pretty sure that negative SAVE is going to be rejected by CLDR, ICU, OpenJDK, Joda-Time and ThreeTen-Backport permanently, and probably Android and Apple too based on their responses so far. As I've outlined, there is no way to meet the backwards compatibility requirements with it. A one year stay of execution won't make a difference in the practicality of the change.

FWIW, I understand why some are attached to trying to express data in a "pure" form, but that was never the original intent of the project IMO. The project needs to get back to serving its downstream consumers who just want to know the time without continuous instability in the data format.

Stephen

PS, Microsoft .NET APIs also have an IsDaylightSavingTime() method: https://msdn.microsoft.com/en-us/library/bb460642(v=vs.110).aspx#Anchor_3
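For readers who want to see the limits from problem 1 for themselves, here is a minimal sketch that reads the DST_OFFSET range through the public `Calendar` range methods. It assumes that `getMinimum` and `getMaximum` are the accessors meant by the "and friends" remark; the class and helper names are made up for illustration, and the values printed depend on the JDK in use.

```java
import java.util.Calendar;
import java.util.GregorianCalendar;

final class DstOffsetLimits {
    // Smallest DST_OFFSET (in milliseconds) the calendar declares as legal.
    static int minDstOffsetMillis() {
        return new GregorianCalendar().getMinimum(Calendar.DST_OFFSET);
    }

    // Largest DST_OFFSET (in milliseconds) the calendar declares as legal.
    static int maxDstOffsetMillis() {
        return new GregorianCalendar().getMaximum(Calendar.DST_OFFSET);
    }

    public static void main(String[] args) {
        System.out.println("min DST_OFFSET (ms): " + minDstOffsetMillis());
        System.out.println("max DST_OFFSET (ms): " + maxDstOffsetMillis());
    }
}
```

A minimum of zero is what leaves no room for a negative SAVE in this part of the API.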
On 01/22/2018 09:19 AM, Stephen Colebourne wrote:
1) The older code has limits on the difference between the "raw" and "daylight" offset in `GregorianCalendar`. Specifically, the "DST_OFFSET" can only be from 0 to 2 hours. Negative SAVE values are not expected or supported.
By "older code" I assume you mean Java 8 and earlier, along with ICU Java code. Does newer Java code (Java 9, ThreeTen-Backport) also have these limitations? If so, this should probably be fixed in newer code regardless of the Irish time issue, as 0-2 hours is a pretty-restrictive limit and it's plausible that some government somewhere will go outside that window in the positive direction as well as in the negative.
The problem here is that the Ireland change flips when the boolean is true from summer to winter. Whereas the content of the array has been stable for 20 years. In the problematic case I looked at, this means that the wrong textual description is returned. ie. in winter, the code has always accessed array elements 0 and 1, whereas if the boolean flag is swapped it will access elements 2 and 3.
Why would that be the "wrong textual description"? If the data are changed so that Irish standard time is most of the year and winter time uses a negative DST offset, then [long-std, short-std, long-daylight, short-daylight] will be ["Irish Standard Time", "IST", "Greenwich Mean Time", "GMT"] and it will be correct to access elements 2 and 3 in winter.
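The lookup under discussion can be sketched with the legacy `java.util.TimeZone` API, where a single boolean selects the standard or daylight name. This is only an illustration: the class name is hypothetical, and the strings printed for Europe/Dublin depend on which tzdb and locale data the running JDK ships, so no particular output is claimed.

```java
import java.time.Instant;
import java.util.Date;
import java.util.Locale;
import java.util.TimeZone;

final class DaylightFlagNames {
    // Picks the long display name the way legacy callers typically do:
    // the "daylight" boolean alone decides which name comes back.
    static String longName(TimeZone tz, Date when) {
        boolean daylight = tz.inDaylightTime(when);
        return tz.getDisplayName(daylight, TimeZone.LONG, Locale.ENGLISH);
    }

    public static void main(String[] args) {
        TimeZone dublin = TimeZone.getTimeZone("Europe/Dublin");
        Date january = Date.from(Instant.parse("2018-01-22T00:00:00Z"));
        Date july = Date.from(Instant.parse("2018-07-10T00:00:00Z"));
        System.out.println("January: " + longName(dublin, january));
        System.out.println("July:    " + longName(dublin, july));
    }
}
```

If the rules and the names come from different releases, the boolean and the name table can disagree, which is the case both sides are arguing about.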
Remember that only the *current* names of the time-zones are available - so if the Ireland change happens, the time-zone name will necessarily be wrong either before the change occurs or after it.
I'm not quite following as some details are omitted in your summary, but isn't it possible that the traditional behavior gets old Irish timestamps wrong that the proposed behavior will fix? If I understand you correctly then with the old approach (in 2017c, say), Java code mishandles Irish time stamps from 1968 to 1971 because they use Irish Standard Time (IST) and are at UTC+01 with tm_isdst==0, but Java calls them "Greenwich Mean Time" or "GMT" because their DST offset is zero. In contrast, the proposed behavior should cause new Java code to correctly call them "Irish Standard Time" or "IST".
I'm pretty sure that negative SAVE is going to be rejected by CLDR, ICU, OpenJDK, Joda-Time and ThreeTen-Backport permanently
That's too bad, as there doesn't seem to be any significant technical reason (other than inertia) as to why Java code couldn't support negative DST in the future. This feature has been supported in tzcode and in other downstream implementations for decades, and it's easy to support.
That being said, we do need a way to support implementations that don't support negative DST as well as those that do, and I'll look into doing that after 2018c comes out.
On 22 January 2018 at 18:16, Paul Eggert <eggert@cs.ucla.edu> wrote:
On 01/22/2018 09:19 AM, Stephen Colebourne wrote:
1) The older code has limits on the difference between the "raw" and "daylight" offset in `GregorianCalendar`. Specifically, the "DST_OFFSET" can only be from 0 to 2 hours. Negative SAVE values are not expected or supported.
By "older code" I assume you mean Java 8 and earlier, along with ICU Java code.
Yes
Does newer Java code (Java 9, ThreeTen-Backport) also have these limitations?
No. The biggest limit is that offsets are constrained to -18 to +18 hours.
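The -18 to +18 hour constraint can be checked directly against `java.time.ZoneOffset`. The sketch below is illustrative only; the class and the `isSupported` helper are made-up names, and -20 hours is simply an example of an offset outside the range.

```java
import java.time.DateTimeException;
import java.time.ZoneOffset;

final class OffsetLimits {
    // Returns true if java.time accepts the given whole-hour UTC offset.
    static boolean isSupported(int hours) {
        try {
            ZoneOffset.ofHours(hours);
            return true;
        } catch (DateTimeException e) {
            // Thrown when the offset lies outside the supported range.
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println("MIN = " + ZoneOffset.MIN + ", MAX = " + ZoneOffset.MAX);
        System.out.println("-18h supported: " + isSupported(-18));
        System.out.println("-20h supported: " + isSupported(-20));
    }
}
```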
In the problematic case I looked at, this means that the wrong textual description is returned. ie. in winter, the code has always accessed array elements 0 and 1, whereas if the boolean flag is swapped it will access elements 2 and 3.
Why would that be the "wrong textual description"? If the data are changed so that Irish standard time is most of the year and winter time uses a negative DST offset, then [long-std, short-std, long-daylight, short-daylight] will be ["Irish Standard Time", "IST", "Greenwich Mean Time", "GMT"] and it will be correct to access elements 2 and 3 in winter.
This will happen in the perfect scenario where everything is updated at once. The problem is that the perfect scenario is the anomaly. In many perfectly reasonable scenarios, one will be updated without the other being updated, causing nonsense output. Switching the meaning of a boolean flag is like doing a U-turn on a freeway. Fine if everyone else does the U-turn at the same instant as you, but if not you'll get flattened by the truck behind you. So, this simply isn't like zic and the OS sitting at the bottom of the stack. Those objecting live higher in the stack where different projects/data get updated at different speeds, and thus we really, really care about keeping things stable.
I'm pretty sure that negative SAVE is going to be rejected by CLDR, ICU, OpenJDK, Joda-Time and ThreeTen-Backport permanently
That's too bad, as there doesn't seem to be any significant technical reason (other than inertia) as to why Java code couldn't support negative DST in the future. This feature has been supported in tzcode and in other downstream implementations for decades, and it's easy to support.
That being said, we do need a way to support implementations that don't support negative DST as well as those that do, and I'll look into doing that after 2018c comes out.
To paraphrase your response, that's too bad, as there doesn't seem to be any good reason (other than a misguided notion of purity) as to why TZDB should be changed at all. Moreover, having the operating system (via zic) run on different rules to the applications that run it seems destined to end in tears.

As per Arthur Olson's recent email - TZDB has grown far beyond the data and zic. There are many, many consumers of the data collected here. Proceeding with this change, *in any form*, is unwise and unhealthy for the project given how much opposition there clearly is (and how little support).

Stephen
On Mon 2018-01-22T18:47:07+0000 Stephen Colebourne hath writ:
Does newer Java code (Java 9, ThreeTen-Backport) also have these limitations?
No. The biggest limit is that offsets are constrained to -18 to +18 hours.
The traditional calendar for observing at Lick Observatory has always had days begin at local noon. This means that the time zone for the Lick calendar is 20 hours behind Greenwich.

I implore all implementors to accommodate local time zones which are not limited to the list that is part of tzdb.

--
Steve Allen <sla@ucolick.org> WGS-84 (GPS)
UCO/Lick Observatory--ISB 260 Natural Sciences II, Room 165 Lat +36.99855
1156 High Street Voice: +1 831 459 3046 Lng -122.06015
Santa Cruz, CA 95064 http://www.ucolick.org/~sla/ Hgt +250 m
On 01/22/2018 10:55 AM, Steve Allen wrote:
The traditional calendar for observing at Lick Observatory has always had days begin at local noon. This means that the time zone for the Lick calendar is 20 hours behind Greenwich.
You should be OK with tzcode or with any other POSIX-conforming implementation, since POSIX requires support for UTC offsets up to 25 hours. I don't offhand know what tzcode's limits are in this area; at some point integer overflow kicks in, if nothing else does.
On Jan 22, 2018, at 1:55 PM, Steve Allen <sla@ucolick.org> wrote:
On Mon 2018-01-22T18:47:07+0000 Stephen Colebourne hath writ:
Does newer Java code (Java 9, ThreeTen-Backport) also have these limitations?
No. The biggest limit is that offsets are constrained to -18 to +18 hours.
The traditional calendar for observing at Lick Observatory has always had days begin at local noon. This means that the time zone for the Lick calendar is 20 hours behind Greenwich. I implore all implementors to accommodate local time zones which are not limited to the list that is part of tzdb.
There’s a proposal before the C++ standardization committee:

https://wg21.link/p0355

<Disclaimer> I’m the author. </Disclaimer>

which supports _multiple_ types of time zones. It only has one type that is supplied by the standard, and that is a wrapper around the IANA time zone database. However, users can relatively easily supply alternative time zones and have them work relatively seamlessly with the std::lib. For example, here is a custom time zone based on POSIX time zones:

https://github.com/HowardHinnant/date/blob/master/include/date/ptz.h

My point is to affirm Steve Allen’s comment above, that at least in C++, we are aiming to allow for things such as the Lick calendar, even if only as user-written customizations.

Howard
On 22/01/18 19:22, Howard Hinnant wrote:
There’s a proposal before the C++ standardization committee: https://wg21.link/p0355 <Disclaimer> I’m the author. </Disclaimer>
Still no means of identifying if a change of version of TZ data results in a change to a local time? At the end of the day, whichever software is being used, if I provide a UTC normalised time then it also needs to identify the rule set used to create it. If different software uses different ways of storing the rules it is somewhat irrelevant as long as they all produce the same local time, so any improvements to the rules should include version information. Something that has been missing for too long now.

--
Lester Caine - G8HFL
-----------------------------
Contact - http://lsces.co.uk/wiki/?page=contact
L.S.Caine Electronic Services - http://lsces.co.uk
EnquirySolve - http://enquirysolve.com/
Model Engineers Digital Workshop - http://medw.co.uk
Rainbow Digital Media - http://rainbowdigitalmedia.co.uk
On Jan 22, 2018, at 2:44 PM, Lester Caine <lester@lsces.co.uk> wrote:
On 22/01/18 19:22, Howard Hinnant wrote:
There’s a proposal before the C++ standardization committee: https://wg21.link/p0355 <Disclaimer> I’m the author. </Disclaimer>
Still no means of identifying if a change of version of TZ data results in a change to a local time? At the end of the day which ever software is being used, if I provide a UTC normalised time then it also needs to identify the rule set used to create it. If different software uses different ways of storing the rules it is somewhat irrelevant as long as they all produce the same local time so any improvements to the rules should include version information. Something that has been missing for too long now.
    #include "date/tz.h"
    #include <iostream>

    int
    main()
    {
        std::cout << date::get_tzdb().version << '\n';
    }

Output:

    2018b

:-)

Howard
On 1/22/18 14:44, Lester Caine wrote:
On 22/01/18 19:22, Howard Hinnant wrote:
There’s a proposal before the C++ standardization committee: https://wg21.link/p0355 <Disclaimer> I’m the author. </Disclaimer>
Still no means of identifying if a change of version of TZ data results in a change to a local time? At the end of the day which ever software is being used, if I provide a UTC normalised time then it also needs to identify the rule set used to create it. If different software uses different ways of storing the rules it is somewhat irrelevant as long as they all produce the same local time so any improvements to the rules should include version information. Something that has been missing for too long now.
We can make a start by adding VERSION as a valid property to the iCalendar format and populating it from the data.
However, I'd like to see a more fine-grained value than the current release. We probably also need a SOURCE property.
On 22/01/18 20:00, Michael Douglass wrote:
On 1/22/18 14:44, Lester Caine wrote:
On 22/01/18 19:22, Howard Hinnant wrote:
There’s a proposal before the C++ standardization committee: https://wg21.link/p0355 <Disclaimer> I’m the author. </Disclaimer>
Still no means of identifying if a change of version of TZ data results in a change to a local time? At the end of the day which ever software is being used, if I provide a UTC normalised time then it also needs to identify the rule set used to create it. If different software uses different ways of storing the rules it is somewhat irrelevant as long as they all produce the same local time so any improvements to the rules should include version information. Something that has been missing for too long now. We can make a start by adding VERSION as a valid property to the iCalendar format and populating it from the data.
However, I'd like to see a more fine-grained value than the current release. We probably also need a SOURCE property also
tzdist covers most of the points on identifying the rule sets being used ... and where the current offset differs from that stored previously.

--
Lester Caine - G8HFL
-----------------------------
Contact - http://lsces.co.uk/wiki/?page=contact
L.S.Caine Electronic Services - http://lsces.co.uk
EnquirySolve - http://enquirysolve.com/
Model Engineers Digital Workshop - http://medw.co.uk
Rainbow Digital Media - http://rainbowdigitalmedia.co.uk
On 2018-01-22 18:55, Steve Allen wrote:
The traditional calendar for observing at Lick Observatory has always had days begin at local noon. This means that the time zone for the Lick calendar is 20 hours behind Greenwich. I implore all implementors to accommodate local time zones which are not limited to the list that is part of tzdb.
One could just use TZ='LickT20'. The extremal offsets for standard POSIX TZ strings are +-24:59:59 h. Michael Deckers.
On Mon, Jan 22, 2018, at 13:55, Steve Allen wrote:
The traditional calendar for observing at Lick Observatory has always had days begin at local noon. This means that the time zone for the Lick calendar is 20 hours behind Greenwich.
Wouldn't that make local noon midnight? It seems like a timezone is not the right level of abstraction to solve this problem.
On Jan 22, 2018, at 2:42 PM, Random832 <random832@fastmail.com> wrote:
On Mon, Jan 22, 2018, at 13:55, Steve Allen wrote:
The traditional calendar for observing at Lick Observatory has always had days begin at local noon. This means that the time zone for the Lick calendar is 20 hours behind Greenwich.
Wouldn't that make local noon midnight?
Yes, that's the whole point. Astronomers (the non-solar kind) like it when the date doesn't change in the middle of their work day. If you use am/pm style timestamps, it gets somewhat confusing (unless you read "m" as "media nocte" rather than "meridiem" :-) ) but with 24 hour timestamps it's pretty reasonable. paul
On 2018-01-22 11:55, Steve Allen wrote:
On Mon 2018-01-22T18:47:07+0000 Stephen Colebourne hath writ:
Does newer Java code (Java 9, ThreeTen-Backport) also have these limitations?
No. The biggest limit is that offsets are constrained to -18 to +18 hours.
The traditional calendar for observing at Lick Observatory has always had days begin at local noon. This means that the time zone for the Lick calendar is 20 hours behind Greenwich. I implore all implementors to accommodate local time zones which are not limited to the list that is part of tzdb.
In a similar vein, there may still be some legacy code in some PPoE(s), which for scheduling treats legal 09.00 C{S|D}T as the start of the day, various periods as subdivisions of that day, Nov 1 as the start of the year, Nov-Mar and Apr-Oct as seasons, with appropriately adjusted treatment of weeks and months.

--
Take care. Thanks, Brian Inglis, Calgary, Alberta, Canada
On 01/22/2018 10:47 AM, Stephen Colebourne wrote:
This will happen in the perfect scenario where everything is updated at once. The problem is that the perfect scenario is the anomaly. In many perfectly reasonable scenarios, one will be updated without the other being updated, causing nonsense output.
Could you be more precise about what the "one" is, and what the "other" is? Is "one" the Java 9 code, and the other the timezone data being supplied to the Java 9 runtime? For example, does the Java 9 code have time zone abbreviations wired into it? and if not, then what sort of discrepancies would be observed at the API level if (say) the Java code is updated but the Java timezone data are not? I'm not worried about the old Java code here; it can continue to use old-format Java tables if that is needed. I'm worried about what would be the problems when using Java code that has been updated to support negative DST, when this new code is given updated timezone data.
Moreover, having the operating system (via zic) run on different rules to the applications that run it seems destined to end in tears.
That bridge was crossed long ago, it seems, if Java-based applications have long been reporting information for old Irish timestamps that disagrees with what GNU/Linux, FreeBSD etc. report for the same timestamps. If we could fix future Java implementations to support negative DST offsets, it appears that we could remove some of these longstanding minor discrepancies between Java and other implementations.
On 22 January 2018 at 19:18, Paul Eggert <eggert@cs.ucla.edu> wrote:
On 01/22/2018 10:47 AM, Stephen Colebourne wrote:
This will happen in the perfect scenario where everything is updated at once. The problem is that the perfect scenario is the anomaly. In many perfectly reasonable scenarios, one will be updated without the other being updated, causing nonsense output.
Could you be more precise about what the "one" is, and what the "other" is?
Remember, there are 2 different data elements here:
- tzdb data
- CLDR-driven text data

Java time-zone data is updated using the tzupdater tool (http://www.oracle.com/technetwork/java/javase/tzupdater-readme-136440.html). This will update the tzdb data, but not the CLDR-driven data that drives the text. Were the change to proceed, anyone running tzupdater with the Ireland change would invert the meaning of inDaylightTime() and access the wrong array element in the CLDR-driven data - a bug. And code changes don't help, as we'll see below.
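The partial-update scenario can be sketched without any real time-zone data. The toy model below is an assumption-laden illustration, not how any JDK is implemented: offsets are whole hours, the name table is fixed as [standard, daylight], and `nameFor` is a hypothetical helper that applies the "raw differs from actual" test.

```java
final class StaleNamesDemo {
    // Name table that was NOT updated: index 0 = standard, index 1 = daylight.
    static final String[] NAMES = {"Greenwich Mean Time", "Irish Standard Time"};

    // Any difference between the raw and the actual offset counts as "daylight".
    static String nameFor(int rawOffsetHours, int actualOffsetHours) {
        boolean daylight = rawOffsetHours != actualOffsetHours;
        return NAMES[daylight ? 1 : 0];
    }

    public static void main(String[] args) {
        // Positive-SAVE rules: raw offset 0, SAVE 0 in winter and +1 in summer.
        System.out.println("old rules, winter: " + nameFor(0, 0));
        System.out.println("old rules, summer: " + nameFor(0, 1));
        // Negative-SAVE rules: raw offset +1, SAVE -1 in winter and 0 in summer.
        System.out.println("new rules, winter: " + nameFor(1, 0));
        System.out.println("new rules, summer: " + nameFor(1, 1));
    }
}
```

With the old rules the winter lookup yields "Greenwich Mean Time"; with the new rules and the same unchanged table it yields "Irish Standard Time" while the clocks are at UTC+0, which is the inversion described above.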
If we could fix future Java implementations to support negative DST offsets, it appears that we could remove some of these longstanding minor discrepancies between Java and other implementations.
There is no possible fix to Java, as this is primarily an issue between CLDR and TZDB. The two have a subtle API linkage which has perhaps never been clearly spelled out here.

CLDR provides textual names for time-zones, as an array [winter, summer]. As CLDR is a much larger project with considerable history, the order of that array is not going to change. (I'm using winter and summer for CLDR in this email to aid clarity; CLDR refers to them as standard and daylight.)

TZDB provides the offsets, SAVE values and a short text string. This text string - GMT/IST or IST/GMT - is not directly linkable to the data CLDR provides. Although it may seem that you can use the text from TZDB as a key to look up the correct value in CLDR, I know from painful experience that approach fails (as the TZDB text varies over time, has the same text in winter and summer, or isn't even text). Thus, the only reliable way to pick which piece of CLDR data is needed is from the offsets.

For 20 years, this has been done in a simple and straightforward way - if (raw-offset != actual-offset) then CLDR uses summer text and array element 1. This provides the necessary glue to link the two projects:

  boolean inSummerTime(instant) {
    return getRawOffset(instant) != getActualOffset(instant)
  }
  zoneName = inSummerTime(instant) ? cldr-summer-time-text : cldr-winter-time-text

TZDB has always had the raw and actual offsets the same in winter and different in summer, so this has always worked. It has become the API between the two projects without anyone really noticing. The Ireland proposal breaks this, with (raw-offset != actual-offset) meaning winter, instead of summer.

It is fair for TZDB to complain that CLDR is inflexible with its definitions, but the reality is that this was and is the only way to connect two separately developed projects (where API stability is vital).

In order for TZDB and CLDR to co-exist, it is *required* that the raw offset equals the actual offset in winter, and that they differ in summer. This fact *requires* positive SAVE values and blocks negative ones.

This isn't a change that can be delayed for a year. This interpretation of inSummerTime() relies on positive SAVE values, and is part of the public API of TZDB just as much as the source code file format is. In fact, it is the only way that TZDB and CLDR communicate.

In summary, negative SAVE values break the long-standing API with CLDR, and thus break any project that relies on both, such as Java. Negative SAVE values simply cannot exist without breaking the much broader ecosystem of which TZDB is only a very small part. It's time to close the door on negative SAVE values in TZDB permanently.

Stephen
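For concreteness, the glue described in this message can be written against the `java.time` API roughly as follows. This is a sketch under assumptions rather than the actual code of any library: `SummerTimeGlue`, `inSummerTime` and `zoneName` are illustrative names, the two-element name array stands in for the CLDR data, and America/Los_Angeles is used because its rules are not in dispute.

```java
import java.time.Instant;
import java.time.ZoneId;
import java.time.zone.ZoneRules;

final class SummerTimeGlue {
    // True when the offset in force differs from the zone's standard ("raw") offset.
    static boolean inSummerTime(ZoneId zone, Instant instant) {
        ZoneRules rules = zone.getRules();
        return !rules.getStandardOffset(instant).equals(rules.getOffset(instant));
    }

    // names[0] is the winter (standard) text, names[1] the summer (daylight) text.
    static String zoneName(ZoneId zone, Instant instant, String[] names) {
        return inSummerTime(zone, instant) ? names[1] : names[0];
    }

    public static void main(String[] args) {
        ZoneId la = ZoneId.of("America/Los_Angeles");
        String[] names = {"Pacific Standard Time", "Pacific Daylight Time"};
        System.out.println(zoneName(la, Instant.parse("2018-01-22T12:00:00Z"), names));
        System.out.println(zoneName(la, Instant.parse("2018-07-22T12:00:00Z"), names));
    }
}
```

The comparison works only while "offset differs from standard" and "summer" mean the same thing, which is the assumption a negative SAVE removes.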
On Jan 23, 2018, at 10:42 AM, Stephen Colebourne <scolebourne@joda.org> wrote:
There is no possible fix to Java, as this is primarily an issue between CLDR and TZDB. The two have a subtle API linkage which has perhaps never been clearly spelled out here.
Perhaps the contract between the tzdb and its users needs to be made more detailed here, with the tzdb explicitly saying what it guarantees and what it doesn't guarantee; the tzdb doesn't change what it guarantees; no users ever depend on what it doesn't guarantee; and with some stuff that the tzdb currently doesn't guarantee perhaps becoming guaranteed because there are clients that can't just stop depending on it.
CLDR provides textual names for time-zones, as an array [winter, summer]. As a much larger project with considerable history the order of that array is not going to change. (I'm using winter and summer for CLDR for this email to aid clarity, they refer to them as standard and daylight).
TZDB provides the offsets, SAVE values and a short text string. This text string - GMT/IST or IST/GMT - is not directly linkable to the data CLDR provides. Although it may seem that you can use the text from TZDB as a key to lookup the correct value in CLDR, I know from painful experience that approach fails (as the TZDB text varies over time, has the same text in winter and summer, or isn't even text). Thus, the only reliable way to pick which piece of CLDR data is needed is from the offsets.
For 20 years, this has been done in a simple and straightforward way - if (raw-offset != actual-offset) then CLDR uses summer text and array element 1. This provides the necessary glue to link the two projects:
  boolean inSummerTime(instant) {
    return getRawOffset(instant) != getActualOffset(instant)
  }
  zoneName = inSummerTime(instant) ? cldr-summer-time-text : cldr-winter-time-text
OK, so "instant" isn't passed to localtime() or localtime_r(), or to code in CLDR that does the same thing that those functions do, to get tm_isdst or the equivalent information? How does CLDR determine those offsets?
CLDR provides textual names for time-zones, as an array [winter, summer]. As a much larger project with considerable history the order of that array is not going to change. (I'm using winter and summer for CLDR for this email to aid clarity, they refer to them as standard and daylight).
TZDB provides the offsets, SAVE values and a short text string. This text string - GMT/IST or IST/GMT - is not directly linkable to the data CLDR provides. Although it may seem that you can use the text from TZDB as a key to lookup the correct value in CLDR, I know from painful experience that approach fails (as the TZDB text varies over time, has the same text in winter and summer, or isn't even text). Thus, the only reliable way to pick which piece of CLDR data is needed is from the offsets.
For 20 years, this has been done in a simple and straightforward way - if (raw-offset != actual-offset) then CLDR uses summer text and array element 1. This provides the necessary glue to link the two projects:
  boolean inSummerTime(instant) {
    return getRawOffset(instant) != getActualOffset(instant)
  }
  zoneName = inSummerTime(instant) ? cldr-summer-time-text : cldr-winter-time-text
OK, so "instant" isn't passed to localtime() or localtime_r(), or to code in CLDR that does the same thing that those functions do, to get tm_isdst or the equivalent information?
How does CLDR determine those offsets?
CLDR does not determine offsets. CLDR just maintains an array of names by category.

In CLDR, we define several different types of names for a zone (and localized names in various locales):
1. Long standard (e.g. Pacific Standard Time)
2. Long daylight (e.g. Pacific Daylight Time)
3. Long generic (e.g. Pacific Time)
4. Short standard (e.g. PST)
5. Short daylight (e.g. PDT)
6. Short generic (e.g. PT)

And the set of names may change from time to time for a single location.

The problem is that CLDR currently uses "Irish Standard Time" for 2. Long daylight, and "Greenwich Mean Time" for 1. Long standard. CLDR consumers, such as Java, ICU, node, etc., rely on the labeling, but handle the actual UTC offset separately. If CLDR strictly follows the 2018a Dublin rules, then consumer code without the change suddenly flips the summer/winter names.

As Stephen Colebourne mentioned, this is the most difficult part for library/platform maintainers. CLDR downstream consumers usually maintain code (calculating clocks) and localized time zone name data separately, from different sources. Usually, localized time zone name data is assumed to be more stable, and consumers of CLDR assume it does not require frequent updates. So they usually don't have a mechanism for updating name data only.

-Yoshito
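As an illustration of the six name types, `java.time.format.DateTimeFormatter` exposes specific names through the pattern letter `z` and generic names through `v` (the latter needs Java 9 or later). The sketch below is only a demonstration with a made-up class name; the exact strings depend on the locale data bundled with the running JDK, so none are asserted here.

```java
import java.time.ZoneId;
import java.time.ZonedDateTime;
import java.time.format.DateTimeFormatter;
import java.util.Locale;

final class ZoneNameTypes {
    // Formats only the zone name of the given date-time with the given pattern.
    static String format(String pattern, ZonedDateTime when) {
        return DateTimeFormatter.ofPattern(pattern, Locale.US).format(when);
    }

    public static void main(String[] args) {
        ZoneId la = ZoneId.of("America/Los_Angeles");
        ZonedDateTime winter = ZonedDateTime.of(2018, 1, 22, 12, 0, 0, 0, la);
        ZonedDateTime summer = ZonedDateTime.of(2018, 7, 22, 12, 0, 0, 0, la);
        System.out.println("long standard:  " + format("zzzz", winter));
        System.out.println("long daylight:  " + format("zzzz", summer));
        System.out.println("long generic:   " + format("vvvv", winter));
        System.out.println("short standard: " + format("z", winter));
        System.out.println("short daylight: " + format("z", summer));
        System.out.println("short generic:  " + format("v", winter));
    }
}
```

The formatter is handed only an instant and a zone; whether the standard or the daylight name is printed is decided by the runtime from the offsets, not by the name data.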
On Jan 23, 2018, at 11:55 AM, Yoshito Umaoka <yoshito_umaoka@us.ibm.com> wrote:
CLDR does not determine offsets.
Stephen Colebourne claimed that CLDR determines whether to use the standard or daylight time strings by comparing the "raw offset" (presumably meaning "the offset during standard time") with the "actual offset" (presumably meaning "the offset during daylight savings time"). Therefore, it *must* know those offsets, otherwise it cannot compare them. So let me rephrase the question: How does CLDR obtain those offsets?
CLDR does not determine offsets.
Stephen Colebourne claimed that CLDR determines whether to use the standard or daylight time strings by comparing the "raw offset" (presumably meaning "the offset during standard time") with the "actual offset" (presumably meaning "the offset during daylight savings time").
Therefore, it *must* know those offsets, otherwise it cannot compare them.
So let me rephrase the question:
How does CLDR obtain those offsets?
CLDR only maintains names for each type in XML format. CLDR XML (or JSON) data is consumed by other projects such as ICU and Java, and these external projects know those offsets. CLDR only specifies that the daylight saving time name used for Europe/Dublin is "Irish Standard Time". ICU/Java imports zoneinfo from the tz database, obtains the offset at a given time, then decides whether it's in standard time or daylight time. When ICU/Java displays a date with a time zone, it just loads the display name separately imported from CLDR - with time zone ID Europe/Dublin. (The logic explained here is a little bit simplified. CLDR also provides historic name mapping changes, and ICU utilizes that data as well.)

Because runtime code such as ICU and Java detects offsets, and separately imported data from CLDR maintains the display names, they must be updated at the same time.

-Yoshito
On Jan 24, 2018, at 1:17 PM, Yoshito Umaoka <yoshito_umaoka@us.ibm.com> wrote:
CLDR XML (or JSON) data is consumed by other projects such as ICU and Java, and these external projects know those offsets. CLDR only specifies daylight saving time name used for Europe/Dublin is "Irish Standard Time". ICU/Java imports zoneinfo from tz database, and obtain offset at a given time, then decide whether it's in standard time or daylight time.
The tz binary database has, for all transition times, an indication of whether, after the transition, you are in "DST". If the tz binary database is what Java time zone code imports, it doesn't need to look at offsets to determine whether the times are standard or "DST", it can just use those values. (I say "DST" because that's used to set tm_isdst.)

It does *not* contain any offsets other than, for each transition, what the offset from UTC is. Thus, it provides no notion of "raw-offset" vs. "actual-offset", and you can't determine both a "raw-offset" and an "actual-offset" from the tz binary database without either 1) additional data or 2) some possibly-incorrect assumptions being made, such as "the only reason why an entry in the table of transitions has a different tt_gmtoff value is that the transition represents starting or ending DST" (that latter assumption has been false for a very very very very very long time for some tzdb regions, as a given region might switch from one time zone to another).

The tzdb *source* files, however, give the "standard" offset from UTC in zone lines and the "amount to save", to be added to the "standard" offset, in rule lines, so code that parses those files independently, rather than relying on the binary files produced by zic parsing the files, can get both the "standard" and "current" offset from UTC.

Which of those two things does the Java code that "imports zoneinfo from tz database" do? Does it read the binary data, or independently read the source data (or read binary data produced by a parser other than zic)?
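Whatever the import path turns out to be, the `java.time` API does expose both a standard offset and a total offset around each transition, which suggests its data carries more than the per-transition UTC offset described above. The following sketch shows one way to inspect that; `TransitionOffsets` and `describeNext` are illustrative names, and the choice of America/Los_Angeles is arbitrary.

```java
import java.time.Instant;
import java.time.ZoneId;
import java.time.zone.ZoneOffsetTransition;
import java.time.zone.ZoneRules;

final class TransitionOffsets {
    // Describes the first transition after the given instant, if there is one.
    static String describeNext(ZoneId zone, Instant from) {
        ZoneRules rules = zone.getRules();
        ZoneOffsetTransition t = rules.nextTransition(from);
        if (t == null) {
            return zone + ": no further transitions";
        }
        Instant at = t.getInstant();
        return zone + " at " + at
                + ": total offset " + t.getOffsetBefore() + " -> " + t.getOffsetAfter()
                + ", standard offset " + rules.getStandardOffset(at)
                + ", savings " + rules.getDaylightSavings(at);
    }

    public static void main(String[] args) {
        Instant start = Instant.parse("2018-01-01T00:00:00Z");
        System.out.println(describeNext(ZoneId.of("America/Los_Angeles"), start));
        System.out.println(describeNext(ZoneId.of("Europe/Dublin"), start));
    }
}
```

For Europe/Dublin the standard offset and savings reported will depend on how the JDK's bundled tzdb release encodes the Irish rules.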
Guy Harris <guy@alum.mit.edu> wrote on 01/24/2018 05:17:24 PM:
From: Guy Harris <guy@alum.mit.edu>
To: Yoshito Umaoka <yoshito_umaoka@us.ibm.com>
Cc: Stephen Colebourne <scolebourne@joda.org>, Time Zone Mailing List <tz@iana.org>, tz <tz-bounces@iana.org>
Date: 01/24/2018 05:17 PM
Subject: Re: [tz] OpenJDK/CLDR/ICU/Joda issues with Ireland change
On Jan 24, 2018, at 1:17 PM, Yoshito Umaoka <yoshito_umaoka@us.ibm.com> wrote:
CLDR XML (or JSON) data is consumed by other projects such as ICU and Java, and these external projects know those offsets. CLDR only specifies daylight saving time name used for Europe/Dublin is "Irish Standard Time". ICU/Java imports zoneinfo from tz database, and obtain offset at a given time, then decide whether it's in standard time or daylight time.
The tz binary database has, for all transition times, an indication of whether, after the transition, you are in "DST". If the tz binary database is what Java time zone code imports, it doesn't need to look at offsets to determine whether the times are standard or "DST", it can just use those values. (I say "DST" because that's used to set tm_isdst.)
I cannot speak for Java. ICU does not use the tz binaries - ICU generates its own binary resources from the tzdata source files. The information equivalent to tm_isdst is stored in the ICU binary format. In addition to this, ICU also stores the raw offset and DST saving amount, which are not available in the tz binaries. ICU preserves this information to support some legacy APIs - getRawOffset, etc.
It does *not* contain any offsets other than, for each transition, what the offset from UTC is. Thus, it provides no notion of "raw-offset" vs. "actual-offset", and you can't determine both a "raw-offset" and an "actual-offset" from the tz binary database without either 1) additional data or 2) some possibly-incorrect assumptions being made, such as "the only reason why an entry in the table of transitions has a different tt_gmtoff value is that the transition represents starting or ending DST" (that latter assumption has been false for a very very very very very long time for some tzdb regions, as a given region might switch from one time zone to another).
The tzdb *source* files, however, give the "standard" offset from UTC in zone lines and the "amount to save", to be added to the "standard" offset, in rule lines, so code that parses those files independently, rather than relying on the binary files produced by zic parsing the files, can get both the "standard" and "current" offset from UTC.
Correct. As I explained above, ICU's modified zic also stores the raw (standard) offset and DST amount.
Which of those two things does the Java code that "imports zoneinfo from tz database" do? Does it read the binary data, or independently read the source data (or read binary data produced by a parser other than zic)?
CLDR and ICU are two separate projects, although CLDR was historically part of the ICU project.
Our biggest issue with the change in 2018a/b was not actually the negative DST offset. The bigger issue is the swapping of the standard/daylight saving names. (Adopting such a rule is still a problem, though, because we have a bug in our code that invalidates a negative DST saving amount in all ICU versions released in the past, and we need to distribute a patch to handle such a case.)
At this moment, the TZ database project does nothing with i18n. The names used for displaying time zones are pretty much US-centric. But there are many other external projects that want to utilize the rules for clock changes. CLDR is trying to provide localized expressions of time zone names in various languages.
CLDR assumes that zone names are very stable. For example, "Pacific Standard Time" represents the standard time used on the US Pacific coast, and the name itself does not change from time to time. However, transition rules change much more frequently, so there are many releases of the tz database. To localize time zone display names, CLDR needs to assign a unique key to each translatable text, and CLDR uses a combination of zone ID and standard/daylight distinction. Because names are assumed to be very stable, a consumer of CLDR usually does not provide a mechanism to distribute updated names.
Of course, if CLDR and ICU were one project and the data were consumed only by ICU, it would be relatively easy to adopt such a change: we would just need to update the zone name data and the code handling the clock at the same time. But they are two separate projects, and CLDR is consumed by a number of other projects, with no control over clock calculation. So such a change could easily break downstream consumers who utilize the TZ database.
I'm not sure what we want to do in CLDR if this change is brought back to the TZ database at this moment. The CLDR technical committee may decide not to make the corresponding change; instead, we might just change the definition of the keys assigned to each zone name string.
Thanks, Yoshito (ICU/CLDR)
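The keying scheme described in this message (zone ID plus a standard/daylight distinction) can be sketched as follows. This is an illustration only, not CLDR's data format or ICU's lookup code: the class `ZoneNameKeySketch`, the `|` key separator, and both methods are hypothetical, and the use of "Greenwich Mean Time" as the standard-time name for Europe/Dublin is an assumption made for the example.

```java
import java.util.HashMap;
import java.util.Map;

class ZoneNameKeySketch {
    // Hypothetical name table: each translatable string is keyed by zone ID plus standard/daylight.
    static final Map<String, String> NAMES = new HashMap<>();
    static {
        NAMES.put("Europe/Dublin|standard", "Greenwich Mean Time"); // assumed for the example
        NAMES.put("Europe/Dublin|daylight", "Irish Standard Time");
    }

    /** Looks up a display name the way a consumer holding only a boolean flag would. */
    static String displayName(String zoneId, boolean daylight) {
        return NAMES.get(zoneId + "|" + (daylight ? "daylight" : "standard"));
    }

    /** With a positive SAVE, summer is "daylight"; with a negative SAVE, winter is. */
    static boolean daylightFlag(boolean summer, boolean negativeSave) {
        return negativeSave ? !summer : summer;
    }

    public static void main(String[] args) {
        // Same summer instant, same name table, different clock data.
        System.out.println(displayName("Europe/Dublin", daylightFlag(true, false)));
        System.out.println(displayName("Europe/Dublin", daylightFlag(true, true)));
    }
}
```

With unchanged name data, flipping the flag in the clock data alone makes a summer timestamp pick up the winter name, which is the downstream breakage the message is concerned about.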
On 01/24/2018 05:28 PM, Yoshito Umaoka wrote:
CLDR sets an assumption that name of zones are very stable. For example, "Pacific Standard Time" represents standard time used on US Pacific coast and the name itself does not change time to time.
Could you clarify how CLDR currently works for Ireland, without the proposed tzdb changes? tzdb's current data (which is the same as what it was in 2017c) has three types of Irish timestamps that use the abbreviation "IST". The first type is for UT+00:34:39 and was observed in summer 1916; it has tm_isdst=1. The second type is for UT+01 and was observed in summers from 1922 through late 1940, then continuously until late 1948, then in summers through 1968, and then in summers from 1972 through today; it also has tm_isdst=1. The third kind is also for UT+01 and was observed from late 1968 through late 1971; it has tm_isdst=0.
Are all three types of IST called "Irish Standard Time" in CLDR now? If not, then what does CLDR call them and how is this determined? And if so, we have a problem since the correct full name for IST is "Irish Summer Time" for timestamps before late 1968, and is "Irish Standard Time" for timestamps thereafter, and there's nothing in the tzdb data proper that specifies the transition date between the two full names.
CLDR is not cast in stone: CLDR called IST "Irish Summer Time" until CLDR 26 came out in 2014 - this fixed a bug with post-1968 timestamps at the cost of introducing a bug for pre-1968 timestamps. I'm hoping that there is some way that we can fix this problem, a problem that exists regardless of whether negative DST offsets are used. Perhaps CLDR could be extended somehow, so that it reports the proper full names for time zones even if that info is not always deducible from the tzdb data proper.
CLDR sets an assumption that name of zones are very stable. For example, "Pacific Standard Time" represents standard time used on US Pacific coast and the name itself does not change time to time.
Could you clarify how CLDR currently works for Ireland, without the proposed tzdb changes? tzdb's current data (which is the same as what it was in 2017c) has three types of Irish timestamps that use the abbreviation "IST". The first type is for UT+00:34:39 and was observed in summer 1916; it has tm_isdst=1. The second type is for UT+01 and was observed in summers from 1922 through late 1940, then continuously until late 1948, then in summers through 1968, and then in summers from 1972 through today; it also has tm_isdst=1. The third kind is also for UT+01 and was observed from late 1968 through late 1971; it has tm_isdst=0.
Are all three types of IST called "Irish Standard Time" in CLDR now? If not, then what does CLDR call them and how is this determined? And if so, we have a problem since the correct full name for IST is "Irish Summer Time" for timestamps before late 1968, and is "Irish Standard Time" for timestamps thereafter, and there's nothing in the tzdb data proper that specifies the transition date between the two full names.
CLDR does not have time zone names for dates before 1990. Historic zone names not used in the last few decades are not included. This is mainly to reduce the overhead of managing localized names. Our primary focus is to provide good localized display names for modern software, not to provide every name used historically. CLDR suggests that implementers use a UTC offset format as the fallback, for example UTC+01:00, when a name is not available. (CLDR also provides localized fallback format patterns in various locales.)
BTW, CLDR localized names are also based on ordinary people's expectations in each locale. For example, while people in Ireland most likely recognize "IST" as "Irish Standard Time", people in other countries usually do not recognize what "IST" is. In CLDR, these zone abbreviations are managed as "short" names, and the coverage of short names is sparse in each locale. In this example, locale en-US does not have a short name for Irish Standard Time. "IST" is used as the short name only in locale en-IE (IE = Ireland). Again, when a zone name is missing, CLDR suggests that implementers use the fallback format.
CLDR is not cast in stone: CLDR called IST "Irish Summer Time" until CLDR 26 came out in 2014 - this fixed a bug with post-1968 timestamps at the cost of introducing a bug for pre-1968 timestamps. I'm hoping that there is some way that we can fix this problem, a problem that exists regardless of whether negative DST offsets are used. Perhaps CLDR could be extended somehow, so that it reports the proper full names for time zones even if that info is not always deducible from the tzdb data proper.
So, the issue between Irish Summer Time and Irish Standard Time is irrelevant to CLDR with our current scope. -Yoshito
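The fallback behaviour described in this message (use an offset string when a locale has no name) might look roughly like the sketch below. It is not CLDR's or ICU's implementation: the class `OffsetFallbackSketch`, the key layout, and the fixed `UTC+hh:mm` pattern are assumptions for illustration, whereas CLDR itself supplies localized fallback patterns.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

class OffsetFallbackSketch {
    /** Formats a total UTC offset in seconds as, for example, "UTC+01:00". */
    static String formatOffset(int totalSeconds) {
        char sign = totalSeconds < 0 ? '-' : '+';
        int abs = Math.abs(totalSeconds);
        return String.format(Locale.ROOT, "UTC%c%02d:%02d", sign, abs / 3600, (abs % 3600) / 60);
    }

    /** Returns the stored name if the table has one, otherwise the offset fallback. */
    static String displayName(Map<String, String> names, String key, int totalSeconds) {
        String name = names.get(key);
        return name != null ? name : formatOffset(totalSeconds);
    }

    public static void main(String[] args) {
        // Hypothetical short-name table: only en-IE carries "IST", as described in the message.
        Map<String, String> shortNames = new HashMap<>();
        shortNames.put("en-IE|Europe/Dublin|daylight", "IST");

        System.out.println(displayName(shortNames, "en-IE|Europe/Dublin|daylight", 3600));
        System.out.println(displayName(shortNames, "en-US|Europe/Dublin|daylight", 3600));
    }
}
```

The second lookup has no entry, so the caller sees the offset string instead of an abbreviation it would not recognize.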
On 25/01/18 02:48, Yoshito Umaoka wrote:
CLDR does not have time zone names for dates before 1990. Historic zone names never used in a last few decades are not included. This is mainly because reducing overhead of managing localized names. Our primary focus is to provide good localized display names in modern software, not trying to provide every possible names historically used.
Which is only exacerbating the problem ... if CLDR is unable to support pre 1990 dates properly then we do need another service that can cope! Genealogical data and history in general can't simply be swept under the carpet ... TZ while not promoting pre 1970 data does at least have it available. Now I know CLDR does not support historic material I can make sure I avoid using it at all!
--
Lester Caine - G8HFL
-----------------------------
Contact - http://lsces.co.uk/wiki/?page=contact
L.S.Caine Electronic Services - http://lsces.co.uk
EnquirySolve - http://enquirysolve.com/
Model Engineers Digital Workshop - http://medw.co.uk
Rainbow Digital Media - http://rainbowdigitalmedia.co.uk
CLDR does not have time zone names for dates before 1990. Historic zone names never used in a last few decades are not included. This is mainly because reducing overhead of managing localized names. Our primary focus is to provide good localized display names in modern software, not trying to provide every possible names historically used.
Which is only exacerbating the problem ... if CLDR is unable to support pre 1990 dates properly then we do need another service that can cope! Genealogical data and history in general can't simply be swept under the carpet ... TZ while not promoting pre 1970 data does at least have it available. Now I know CLDR does not support historic material I can make sure I avoid using it at all!
Technically, CLDR is able to support names used for pre-1990 dates. If the CLDR project really wants to handle the distinction between "Irish Summer Time" and "Irish Standard Time", it's doable. However, the CLDR technical committee does not want to make the effort to localize zone names not used in the last 30 years for hundreds of locales. It's just a policy issue. -Yoshito
On 01/25/2018 06:28 AM, Yoshito Umaoka wrote:
If CLDR project really wants to handle the distinction between "Irish Summer Time" and "Irish Standard Time", it's doable.
However, CLDR technical committee does not want to take an effort to localize zone names never used in last 30 years for hundreds of locales.
I can understand that. However, it does make sense for the CLDR committee to fix the problem for Irish time now, even if it doesn't fix similar problems in the other hundreds of locales. Partly this is because the issue has come up for Ireland when combining CLDR with tzdb. But more importantly, if the issue is coming up for Ireland's past now, then it's quite plausible that it will come up for some other country in the not-too-distant future when they change their rules. It would be better to have some practical experience with such situations as opposed to armchair theorizing, and one way to do that would be to add support for historical Irish time now, and to make sure that this support works when combined with tzdb.
On 2018-01-25 06:00, Lester Caine wrote:
On 25/01/18 02:48, Yoshito Umaoka wrote:
CLDR does not have time zone names for dates before 1990. Historic zone names never used in a last few decades are not included. This is mainly because reducing overhead of managing localized names. Our primary focus is to provide good localized display names in modern software, not trying to provide every possible names historically used.
Which is only exacerbating the problem ... if CLDR is unable to support pre 1990 dates properly then we do need another service that can cope! Genealogical data and history in general can't simply be swept under the carpet ... TZ while not promoting pre 1970 data does at least have it available. Now I know CLDR does not support historic material I can make sure I avoid using it at all!
CLDR does not do data, only names, and like recent tzdb reversals on support of invented abbreviations, if CLDR can not provide a name, it provides a UTC offset string. -- Take care. Thanks, Brian Inglis, Calgary, Alberta, Canada
On 01/24/2018 06:48 PM, Yoshito Umaoka wrote:
CLDR does not have time zone names for dates before 1990.
Thanks for the info. That simplifies things!
In this example, locale en-US does not have short name for Irish Standard Time. "IST" is used as the short name only in locale en-IE (IE = Ireland). Again, when a zone name is missing, CLDR suggests code implementators to use the fallback format.
This is also helpful. It suggests that we can fix the short names at the CLDR level by removing the text "<daylight>IST</daylight>" from the en-IE locale. That way, implementers can fall back for Irish time in Ireland the same way that they fall back for Irish time in the US (or for US time in the US, for that matter).
This leaves the long names, but there is a simple workaround for that, too. We can change the long names for standard and daylight saving time in Ireland to be just "Irish Time". This string will work regardless of whether UTC+00 or UTC+01 is considered to be standard time in Ireland, so it will be portable to both the current and the proposed tzdb data. Although it is not ideal for either the old or the new approaches, it is a reasonable compromise that is compatible with both.
So: how about the attached patch to CLDR? I have not tested this: I'm just suggesting it as an idea for moving forward. If this patch does not work, I hope that a similar patch would work.
This is also helpful. It suggests that we can fix the short names at the CLDR level by removing the text "<daylight>IST</daylight>" from the en-IE locale. That way, implementers can fall back for Irish time in Ireland the same way that they fall back for Irish time in the US (or for US time in the US, for that matter).
This leaves the long names, but there is a simple workaround for that, too. We can change the long names for standard and daylight saving time in Ireland to be just "Irish Time". This string will work regardless of whether UTC+00 or UTC+01 is considered to be standard time in Ireland, so it will be portable to both the current and the proposed tzdb data. Although it is not ideal for either the old or the new approaches, it is a reasonable compromise that is compatible with both.
So: how about the attached patch to CLDR? I have not tested this: I'm just suggesting it as an idea for moving forward. If this patch does not work, I hope that a similar patch would work.
CLDR assumes that standard and daylight names must be different, so the patch does not work. (BTW, you might disagree with our making such an assumption. I think the TZ database does not prohibit the same abbreviation being used for standard and daylight at the same time, although it is unlikely.)
Paul, I really appreciate your thoughtful consideration of this issue. I read through the suggestions from various people on this mailing list. In my honest opinion, the best option for us (as CLDR project contributors and consumers) is for the TZ database to give up the change that results in the standard/daylight names being swapped for Ireland. But I also understand that the TZ database wants to fix a mistake introduced a long time ago.
If the TZ database project insists on making the change, the CLDR project has to decide whether we should swap the standard/daylight names in CLDR and, if we do, when the right time is (CLDR has 2 official releases every year, although time zone related metadata (not display names) is updated in the repository when necessary). Since this issue came up, we have not really had a chance to discuss this topic in depth in the CLDR project, because we're currently busy cleaning up updated locale data (not just for time zones) for the next release (CLDR 33, March).
Please let me bring this topic to the CLDR project team. We want to discuss possible options and impacts there. I cannot simply tell you what we're going to do without the agreement of the other technical committee members.
Thanks, Yoshito
Yoshito Umaoka wrote:
CLDR set an assumption that standard and daylight names must be different.
That assumption is easy enough to satisfy. How about the attached patch instead? (Where is the assumption documented, by the way?)
(BTW, you might disagree that we set such assumption. I think TZ database does not prohibit a same abbreviation used for standard and daylight at a time, although it is unlikely.)
Not only does tzdb not prohibit it, that feature was long used in Australian timestamps, as it matched more-traditional Australian practice. One can still follow this more-traditional practice by using POSIX TZ settings like TZ='EST-10EST,M10.1.0,M4.1.0/3' which uses "EST" for both Eastern Standard Time and Eastern Summer Time. If CLDR assumes that names or abbreviations must be unique, that's a problem that should get fixed somehow.
If TZ database project insists to make the change, CLDR project has to decide whether we should swap standard/daylight name in CLDR, and if we do, when is the right time
I'm proposing a patch so that the CLDR project doesn't need to make such an abrupt swap. CLDR can have a transition period as long as you like, during which CLDR will work with both current and proposed tzdb.
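The POSIX TZ setting quoted in this message can be taken apart with a short sketch to show that both abbreviations really are the same string. The class `PosixTzNamesSketch` and its methods are hypothetical, and the regular expression only covers the simple alphabetic-name form used here (no quoted `<...>` names, and the rule part after the comma is ignored).

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class PosixTzNamesSketch {
    // std name, std offset, optional dst name; anything after that (dst offset, rules) is ignored.
    private static final Pattern HEAD =
            Pattern.compile("^([A-Za-z]{3,})([+-]?\\d{1,2}(?::\\d{2}){0,2})([A-Za-z]{3,})?");

    private static Matcher match(String tz) {
        Matcher m = HEAD.matcher(tz);
        if (!m.find()) {
            throw new IllegalArgumentException("unsupported TZ string: " + tz);
        }
        return m;
    }

    /** Returns {standard abbreviation, daylight abbreviation or null}. */
    static String[] names(String tz) {
        Matcher m = match(tz);
        return new String[] {m.group(1), m.group(3)};
    }

    /** POSIX offsets count hours west of UTC, so "EST-10" means UTC+10. */
    static int stdUtcOffsetHours(String tz) {
        String hours = match(tz).group(2).split(":")[0];
        return -Integer.parseInt(hours);
    }

    public static void main(String[] args) {
        String tz = "EST-10EST,M10.1.0,M4.1.0/3";
        String[] n = names(tz);
        System.out.println(n[0] + " / " + n[1] + " at UTC+" + stdUtcOffsetHours(tz));
    }
}
```

Any consumer that tries to map the abbreviation back to a standard/daylight flag has nothing to go on in this configuration.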
On 26 January 2018 at 18:29, Paul Eggert <eggert@cs.ucla.edu> wrote:
I'm proposing a patch so that the CLDR project doesn't need to make such an abrupt swap. CLDR can have a transition period as long as you like, during which CLDR will work with both current and proposed tzdb.
This merely replaces correct text with incorrect text. Why should the citizens of Ireland have worse textual zone names than everyone else merely to satisfy TZDB's need for purity? Why is TZDB so important that every other part of the software stack should be worse? Stephen
On 2018-01-26 18:56:04 (+0000), Stephen Colebourne wrote:
On 26 January 2018 at 18:29, Paul Eggert <eggert@cs.ucla.edu> wrote:
I'm proposing a patch so that the CLDR project doesn't need to make such an abrupt swap. CLDR can have a transition period as long as you like, during which CLDR will work with both current and proposed tzdb.
This merely replaces correct text with incorrect text.
Why should the citizens of Ireland have worse textual zone names than everyone else merely to satisfy TZDB's need for purity? Why is TZDB so important that every other part of the software stack should be worse?
It's not a matter of "purity". It's a matter of "correctness". Software that cannot cope with correct data is simply called "broken". As pointed out elsethread: people will keep messing with time. The fact that broken software has been able to cope until now is merely luck. It's unrealistic to expect the tzdb to get increasingly incorrect over time because broken software cannot cope with correct data. If tzdb is correct and other parts of the software stack are incorrect, those other parts are indeed "worse", as you say. Philip -- Philip Paeps Senior Reality Engineer Ministry of Information
On 01/26/2018 10:56 AM, Stephen Colebourne wrote:
This merely replaces correct text with incorrect text.
No it doesn't. The proposed text is not incorrect. It's merely more generic; enough to get us through a transition.
Why should the citizens of Ireland have worse textual zone names than everyone else
It's not worse than everyone else. Lots of zones in CLDR have worse textual zone names, e.g., "Cook Islands Half Summer Time" (I mean, c'mon). Admittedly the proposal is a compromise and like any compromise there is something for everyone to dislike about it: but it is a way forward to more-accurate names in the future.
Some sort of compromise is helpful because OpenJDK+CLDR cannot gracefully handle a transition to an environment where "Irish Standard Time" is standard time. The idea is to ease the transition by temporarily using names that are still accurate albeit more-generic. Users by and large won't notice or care about any temporary glitches, just as they by and large didn't notice or care when CLDR used the incorrect text "Irish Summer Time" for IST.
(BTW, you might disagree that we set such assumption. I think TZ database does not prohibit a same abbreviation used for standard and daylight at a time, although it is unlikely.)
Not only does tzdb not prohibit it, that feature was long used in Australian timestamps, as it matched more-traditional Australian practice. One can still follow this more-traditional practice by using POSIX TZ settings like TZ='EST-10EST,M10.1.0,M4.1.0/3' which uses "EST" for both Eastern Standard Time and Eastern Summer Time. If CLDR assumes that names or abbreviations must be unique, that's a problem that should get fixed somehow.
In this example, if "EST" cannot distinguish between standard time and summer time, then we don't use the abbreviation. The goal of CLDR is to provide names that people in a locale can reasonably understand, and standard/daylight names require that people can distinguish one from the other. In this case, "EST" can be a "generic" name in CLDR. The concept of a "generic" name is not in the TZ database. Anyway, CLDR tends to exclude many short names, because these abbreviations are not understood by people outside the regions.
If TZ database project insists to make the change, CLDR project has to decide whether we should swap standard/daylight name in CLDR, and if we do, when is the right time
I'm proposing a patch so that the CLDR project doesn't need to make such an abrupt swap. CLDR can have a transition period as long as you like, during which CLDR will work with both current and proposed tzdb. [attachment "cldr.diff" deleted by Yoshito Umaoka/Westford/IBM]
I personally think we don't want to introduce artificial names here. If "Irish Standard Time" is the official term, also recognized by people in Ireland, specifying time in summer there, CLDR should not change it just for this purpose. -Yoshito
From: "Yoshito Umaoka" <yoshito_umaoka@us.ibm.com>
Date: Fri, 26 Jan 2018 14:32:49 -0500
Subject: Re: [tz] OpenJDK/CLDR/ICU/Joda issues with Ireland change
| The goal of CLDR is to provide names that people in a locale
| can reasonably understand - and standard/daylight
| names requires that people can distinguish one from another.
Why? What is the motivation for that?
In real life, I see just the opposite - for better or worse, I get to see a fair amount of US sport on TV, and during that (never to ignore a commercial opportunity) there are often ads for other programs (which are generally not available here...) and from what I can see they are always advertised as being at 9 ET / 8 CT (or whatever) - that is, just "eastern time" with no mention of an S or a D.
"Why would that be?" someone might ask...
Because in practice no-one cares - all that matters is what time should a viewer switch on the TV to the relevant channel if they want to watch. That is, a sync between the event and wall clock time. What offset it happens to be from UTC this week, and whether it will be the same offset next week, is irrelevant.
This is not to say that knowing the offsets is useless, there are applications for that .. it is just that the end user mostly does not care, and the long/short names don't seem to have any other use than to be presented to end users. After all, CLDR, one application you'd think would make use of the names that exist (without its versions) doesn't even bother to use tzdb's names - choosing instead to simply ignore them. If this application prefers to find some other way, how could we expect that any other would be different?
The time zone is only mentioned in US ads because the US is a country with multiple zones, and with synchronised broadcasts, the clock time will be different in different regions. In countries without that, there is normally (from my observation) nothing more than the time. I suspect it is probably like that in Japan too, isn't it?
That is why I asked for an example of something real, where the timezone information is actually used for some real practical purpose (just displaying it because we have it does not count.)
So far (and I know it has not been very long) there has been nothing.
| "generic" name is not in TZ db database.
You're right, it isn't, and perhaps should be. For all zones, the relevant (English) string is probably "Time" (abbreviation, "T").
kre
On 2018-01-26 13:32, Robert Elz wrote:
That is why I asked for an example of something real, where the timezone information is actually used for some real practical purpose (just displaying it because we have it does not count.)
Calendar/scheduling apps have to use zones to set and adjust times, and email it in various ?cal formats. I believe those apps also use the ICU/CLDR Windows<->tz zone list for compatibility, and Outlook must also. Thunderbird's Lightning calendar/scheduling extension requires a selection from the tz zone list when selecting alternate times. I expect other unixy calendar and scheduling apps do the same, as they don't have the commercial systems budget for a gazetteer xref or fancy selection UI. Times show as America/Edmonton for me, except when I select another for travel or other purposes. -- Take care. Thanks, Brian Inglis, Calgary, Alberta, Canada
| The goal of CLDR is to provide names that people in a locale
| can reasonably understand - and standard/daylight
| names requires that people can distinguish one from another.
Why? What is the motivation for that?
In real life, I see just the opposite - for better or worse, I get to see a fair amount of US sport on TV, and during that (never to ignore a commercial opportunity) there are often ads for other programs (which are generally not available here...) and from what I can see they are always advertised as being at 9 ET / 8 CT (or whatever) - that is, just "eastern time" with no mention of an S or a D.
"Why would that be?" someone might ask...
Because in practice no-one cares - all that matters is what time should a viewer switch on the TV to the relevant channel if they want to watch. That is, a sync between the event and wall clock time. What offset it happens to be from UTC this week, and whether it will be the same offset next week, is irrelevant.
This is not to say that knowing the offsets is useless, there are applications for that .. it is just that the end user mostly does not care, and the long/short names don't seem to have any other use than to be presented to end users. After all, CLDR, one application you'd think would make use of the names that exist (without its versions) doesn't even bother to use tzdb's names - choosing instead to simply ignore them. If this application prefers to find some other way, how could we expect that any other would be different?
That's why CLDR maintains "generic" names, such as "ET" and "CT". When someone really needs the distinction between standard and daylight, they would use "EST", "EDT", etc. So the point is whether someone wants to distinguish one from the other when a region uses two alternative offsets within a year. CLDR "standard" and "daylight" names are used for this use case, and therefore they should be different. If such a distinction is not necessary, then CLDR suggests using the "generic" name.
The time zone is only mentioned in US ads because the US is a country with multiple zones, and with synchronised broadcasts, the clock time will be different in different regions. In countries without that, there is normally (from my observation) nothing more than the time. I suspect it is probably like that in Japan too, isn't it?
Right. The majority of countries in the world use only a single offset each, and it's not a common convention to put a zone label/name along with the time.
That is why I asked for an example of something real, where the timezone information is actually used for some real practical purpose (just displaying it because we have it does not count.)
The "generic" names were introduced in CLDR after standard/daylight. Library code like the JDK does not support such a concept. Because APIs in the JDK and ICU (and some others) implement a "parse" function, and a non-negligible number of consumers depend on the round-trip capability of the date formatting function, these libraries need the distinction between standard time and daylight saving time. (I know this is fragile, and consumers cannot expect it to always round-trip.) But if such a feature has existed for a number of years, we cannot simply break it silently, as a library provider.
So far (and I know it has not been very long) there has been nothing.
| "generic" name is not in TZ db database.
You're right, it isn't, and perhaps should be. For all zones, the relevant (English) string is probably "Time" (abbreviation, "T").
kre
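The round-trip concern raised in this reply can be sketched in a few lines. This is not how the JDK or ICU parsers are written: the class `NameRoundTripSketch` and the method `parseDaylightFlag` are hypothetical, and they only model the step where a parsed zone name has to be mapped back to a standard/daylight flag.

```java
import java.util.Optional;

class NameRoundTripSketch {
    /**
     * Maps a parsed zone name back to a daylight flag.
     * Returns empty when the text matches both names or neither, so no flag can be recovered.
     */
    static Optional<Boolean> parseDaylightFlag(String text, String standardName, String daylightName) {
        boolean matchesStandard = text.equals(standardName);
        boolean matchesDaylight = text.equals(daylightName);
        if (matchesStandard == matchesDaylight) {
            return Optional.empty();
        }
        return Optional.of(matchesDaylight);
    }

    public static void main(String[] args) {
        // Distinct names: the flag survives a format/parse round trip.
        System.out.println(parseDaylightFlag("Pacific Daylight Time",
                "Pacific Standard Time", "Pacific Daylight Time"));
        // One shared name for both halves of the year: the flag is lost.
        System.out.println(parseDaylightFlag("Irish Time", "Irish Time", "Irish Time"));
    }
}
```

A generic name is fine for display; it is only the parse direction that needs the two strings to differ.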
On 01/26/2018 11:32 AM, Yoshito Umaoka wrote:
If "Irish Standard Time" is the official term, also recognized by people in Ireland, specifying time in summer there, CLDR should not change it just for this purpose.
If the CLDR insists on not changing that string, we can accommodate the transition in a more-complicated way, by adding a zone Europe/Dublin-ndst which is like Europe/Dublin except that it uses negative DST offsets. For forward compatibility we also add a zone Europe/Dublin-pdst that is like Europe/Dublin except that it uses only positive DST offsets; this will be helpful when tzdb's Europe/Dublin switches to negative DST offsets. Something like the attached patch to CLDR should do the trick. The idea is that this patch should not cause any incompatibilities with existing usage, because it doesn't change any existing data; it only adds new data intended to be useful during the transition.
On Jan 24, 2018, at 5:28 PM, Yoshito Umaoka <yoshito_umaoka@us.ibm.com> wrote:
Guy Harris <guy@alum.mit.edu> wrote on 01/24/2018 05:17:24 PM:
From: Guy Harris <guy@alum.mit.edu>
To: Yoshito Umaoka <yoshito_umaoka@us.ibm.com>
Cc: Stephen Colebourne <scolebourne@joda.org>, Time Zone Mailing List <tz@iana.org>, tz <tz-bounces@iana.org>
Date: 01/24/2018 05:17 PM
Subject: Re: [tz] OpenJDK/CLDR/ICU/Joda issues with Ireland change
On Jan 24, 2018, at 1:17 PM, Yoshito Umaoka <yoshito_umaoka@us.ibm.com> wrote:
CLDR XML (or JSON) data is consumed by other projects such as ICU and Java, and these external projects know those offsets. CLDR only specifies that the daylight saving time name used for Europe/Dublin is "Irish Standard Time". ICU/Java imports zoneinfo from the tz database, obtains the offset at a given time, and then decides whether it's in standard time or daylight time.
The tz binary database has, for all transition times, an indication of whether, after the transition, you are in "DST". If the tz binary database is what Java time zone code imports, it doesn't need to look at offsets to determine whether the times are standard or "DST", it can just use those values. (I say "DST" because that's used to set tm_isdst.)
I cannot speak for Java. ICU does not use the tz binaries - ICU generates its own binary resources from the tzdata source files. The information equivalent to tm_isdst is stored in the ICU binary format.
So ICU then presumably has no need to check the raw and actual offsets to determine whether "DST" is in effect; it could just use the "is this DST?" information - and, if it could, it *should*. If that information is available to the Java code in question, it should do so as well. If not, it should be *made* available to the Java code.
In addition to this, ICU also stores the raw offset and the DST saving amount, which are not available in the tz binaries. ICU preserves this information to support some legacy APIs - getRawOffset, etc.
So by "legacy" I assume those APIs are deprecated.
Note that class TimeZone:
http://icu-project.org/apiref/icu4c/classicu_1_1TimeZone.html
http://icu-project.org/apiref/icu4j/com/ibm/icu/util/TimeZone.html
has, in both C++ and Java, a getRawOffset() member that returns "the TimeZone's raw GMT offset (i.e., the number of milliseconds to add to GMT to get local time, before taking daylight savings time into account)". [sic - it really says "GMT"]
If a member of class OlsonTimeZone, which is a subclass of TimeZone, corresponds to a tzdb region as specified by a tzid such as "Europe/Berlin", how does it handle the tzdb region with the tzid "America/North_Dakota/New_Salem", given that, to quote the northamerica file:
# Morton County, ND, switched from mountain to central time on
# 2003-10-26, except for the area around Mandan which was already central time.
# See <http://dmses.dot.gov/docimages/p63/135818.pdf>.
# Officially this switch also included part of Sioux County, and
# Jones, Mellette, and Todd Counties in South Dakota;
# but in practice these other counties were already observing central time.
# See <http://www.epa.gov/fedrgstr/EPA-IMPACT/2003/October/Day-28/i27056.htm>.
Zone America/North_Dakota/New_Salem -6:45:39 - LMT 1883 Nov 18 12:14:21
            -7:00 US M%sT 2003 Oct 26 2:00
            -6:00 US C%sT
and therefore its "raw GMT offset" was -7 hours prior to 02:00 local time, 2003-10-26, and -6 hours after that point, and given that getRawOffset() does *not* take a time as an argument.
I.e., if a member of class OlsonTimeZone, which is a subclass of TimeZone, corresponds to a tzdb region, and if getRawOffset() is supposed to return "the" raw offset from UTC, deprecating getRawOffset is a Very Good Idea, as there are tzdb regions that simply *do not have a raw offset from UTC independent of time*.
Is getDisplayName() also deprecated, for the same reason?
For example, in the OlsonTimeZone for "America/North_Dakota/New_Salem", the values it returns would be based on Mountain Time prior to 2003-10-26 02:00 local time and Central Time after that. In ICU4C, useDaylightTime() "works" because it returns an "observes DST" indication for "the current (Gregorian) calendar year", although that won't work if a region chooses to start or stop observing DST at some time other than midnight, as, in the year when they switch, it "observes DST" for part of the year and doesn't "observe DST" for the rest of the year. In ICU4J, useDaylightTime() doesn't work in the general case, but observesDaylightTime() presumably works, as it checks whether "this time zone is in daylight saving time or will observe daylight saving time at any future time" - if "is in daylight savings time" means "is in daylight savings time at this instant in time" (the "any future time" part handles the time-of-check/time-of-use issue), then in a year where a switch is done, observesDaylightTime() would return true. (is that what the ICU4C useDaylightTime() does, too?)
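The objection to a time-independent getRawOffset() can be illustrated with a small sketch. It assumes a hypothetical, hand-written table built from the Zone lines quoted above; `NEW_SALEM_STANDARD_OFFSETS` and `raw_offset_at` are illustrative names, not ICU or JDK APIs. The UTC instant of the 2003 switch (08:00 UTC) is derived on the assumption that the 2:00 wall-clock time was still Mountain Daylight Time.

```python
from datetime import datetime, timedelta, timezone

UTC = timezone.utc

# Hypothetical table: (UTC instant from which the offset applies, standard offset).
# Hand-copied from the quoted Zone lines; the LMT period before 1883 is omitted.
NEW_SALEM_STANDARD_OFFSETS = [
    (datetime(1883, 11, 18, 19, 0, tzinfo=UTC), timedelta(hours=-7)),  # Mountain
    (datetime(2003, 10, 26, 8, 0, tzinfo=UTC), timedelta(hours=-6)),   # Central (assumed instant)
]


def raw_offset_at(instant):
    """Return the standard ("raw") offset in effect at a UTC instant."""
    current = None
    for starts_at, offset in NEW_SALEM_STANDARD_OFFSETS:
        if instant >= starts_at:
            current = offset
    if current is None:
        raise ValueError("instant precedes the table (LMT period not modelled)")
    return current


before = raw_offset_at(datetime(2003, 10, 26, 7, 59, tzinfo=UTC))
after = raw_offset_at(datetime(2003, 10, 26, 8, 0, tzinfo=UTC))
print(before.total_seconds() / 3600, after.total_seconds() / 3600)  # -7.0 -6.0
```

A single raw offset for the region can only be one of these two values, so any API without a time argument has to pick one, typically the current one.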
Our biggest issue with the change in 2018a/b was not actually negative DST offset. The bigger issue is swapping standard/daylight saving names.
The issue that started this whole thing off is:
http://www.irishstatutebook.ie/eli/1968/act/23/enacted/en/print
"The time for general purposes in the State (to be known as standard time) shall be one hour in advance of Greenwich mean time throughout the year, and any reference in any enactment or any legal document (whether passed or made before or after the passing of this Act) to a specified point of time shall be construed accordingly unless it is otherwise expressly provided."
Then there's
http://www.irishstatutebook.ie/eli/1971/act/17/enacted/en/print
"Notwithstanding section 1 (1) of the Standard Time Act, 1968, the time for general purposes in the State shall during a period of winter time be Greenwich mean time, and during such a period any reference in any enactment or any legal document (whether passed or made before or after the passing of this Act) to a specified point of time shall be construed accordingly unless it is otherwise expressly provided."
which I read as saying "during Winter, we're one hour *behind* standard time".
So "standard time" in Ireland doesn't mean the same thing it means in most other countries, and, presumably, a properly internationali{s,z}ed program would, in the middle of summer, report that Ireland is on Irish Standard Time or something such as that.
Is that what currently happens, if the current locale is an Irish locale and the current tzdb region is the "Europe/Dublin" region, with code using the ICU and the CLDR?
And a question for somebody familiar with Irish conventions - what do they call the time that's in effect during non-summer-time - "Greenwich Mean Time"?
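The name swap described in this thread can be reproduced with a minimal sketch. It assumes the older lookup works roughly as summarized at the start of the thread: a fixed [standard, daylight] name array, indexed by a boolean that is inferred from whether the total offset differs from the raw offset. The function and variable names are hypothetical.

```python
from datetime import timedelta

# Name array in the fixed order the older code is described as using:
# [standard, daylight]. Contents unchanged between the two encodings.
DUBLIN_NAMES = ["Greenwich Mean Time", "Irish Standard Time"]

ZERO = timedelta(0)
HOUR = timedelta(hours=1)


def display_name(raw_offset, total_offset, names):
    """Infer 'daylight' from total offset != raw offset, then index the array."""
    daylight = total_offset != raw_offset
    return names[1] if daylight else names[0]


# Old encoding: raw offset 0, SAVE +1:00 in summer.
old_summer = display_name(ZERO, HOUR, DUBLIN_NAMES)
old_winter = display_name(ZERO, ZERO, DUBLIN_NAMES)

# New encoding: raw offset +1:00, SAVE -1:00 in winter.
new_summer = display_name(HOUR, HOUR, DUBLIN_NAMES)
new_winter = display_name(HOUR, ZERO, DUBLIN_NAMES)

print(old_summer, "|", old_winter)
print(new_summer, "|", new_winter)
```

Under the new encoding and an unchanged name array, the two labels come out the wrong way round, which is the breakage being discussed.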
Guy Harris said: [...]
and therefore its "raw GMT offset" was -7 hours prior to 02:00 local time, 2003-10-26, and -6 hours after that point, and given that getRawOffset() does *not* take a time as an argument.
Again, this happened in the UK when we went on to and then off of British Standard Time. And it's the origin of the Irish Problem.
In ICU4C, useDaylightTime() "works" because it returns an "observes DST" indication for "the current (Gregorian) calendar year", although that won't work if a region chooses to start or stop observing DST at some time other than midnight, as, in the year when they switch, it "observes DST" for part of the year and doesn't "observe DST" for the rest of the year.
Do you mean "midnight of New Year" by that? UK/Ireland had that as well.
So "standard time" in Ireland doesn't mean the same thing it means in most other countries, and, presumably, a properly internationali{s,z}ed program would, in the middle of summer, report that Ireland is on Irish Standard Time or something such as that.
Exactly.
--
Clive D.W. Feather | If you lie to the compiler,
Email: clive@davros.org | it will get its revenge.
Web: http://www.davros.org | - Henry Spencer
Mobile: +44 7973 377646
On 01/24/2018 07:56 PM, Guy Harris wrote:
And a question for somebody familiar with Irish conventions - what do they call the time that's in effect during non-summer-time - "Greenwich Mean Time"?
I presume they call it that because that's what it is. :-) The Irish statutes in question call it "Greenwich mean time", without capitalizing the "mean" or "time". While we're on the subject of names, Ireland's Standard Time Act, 1968 gives no name to what we're calling "Irish Standard Time"; it merely uses the phrase "the time for general purposes in the State (to be known as standard time)".
CLDR doesn't formally have offsets. What it has is data like:
en.xml
<metazone type="America_Eastern">
  <long>
    <generic>Eastern Time</generic>
    <standard>Eastern Standard Time</standard>
    <daylight>Eastern Daylight Time</daylight>
  </long>
  <short>
    <generic>ET</generic>
    <standard>EST</standard>
    <daylight>EDT</daylight>
  </short>
</metazone>
de.xml
<metazone type="America_Eastern">
  <long>
    <generic>Nordamerikanische Ostküstenzeit</generic>
    <standard>Nordamerikanische Ostküsten-Normalzeit</standard>
    <daylight>Nordamerikanische Ostküsten-Sommerzeit</daylight>
  </long>
</metazone>
(Note that abbreviations are only included if they'd be commonly recognized.)
Each metazone represents a set of names that can be used across multiple TZ ids. There can be overrides by TZ id, such as the current:
<zone type="Europe/London">
  <long>
    <daylight>British Summer Time</daylight>
  </long>
</zone>
<zone type="Europe/Dublin">
  <long>
    <daylight>Irish Standard Time</daylight>
  </long>
</zone>
Both of these inherit
<standard>Greenwich Mean Time</standard>
So a current implementation, with no changes, will get for Europe/Dublin:
<standard>Greenwich Mean Time</standard>
<daylight>Irish Standard Time</daylight>
Different clients will use the data in different ways. The source for ICU, for example, reformats to a key-value map:
"meta:America_Eastern"{
    ld{"Eastern Daylight Time"}
    lg{"Eastern Time"}
    ls{"Eastern Standard Time"}
    sd{"EDT"}
    sg{"ET"}
    ss{"EST"}
}
The normal interpretation of "standard" and "daylight" keywords or equivalents is that:
standard_offset = 0
daylight_offset ≠ 0
Mark
On Wed, Jan 24, 2018 at 10:44 AM, Guy Harris <guy@alum.mit.edu> wrote:
On Jan 23, 2018, at 11:55 AM, Yoshito Umaoka <yoshito_umaoka@us.ibm.com> wrote:
CLDR does not determine offsets.
Stephen Colebourne claimed that CLDR determines whether to use the standard or daylight time strings by comparing the "raw offset" (presumably meaning "the offset during standard time") with the "actual offset" (presumably meaning "the offset during daylight savings time").
Therefore, it *must* know those offsets, otherwise it cannot compare them.
So let me rephrase the question:
How does CLDR obtain those offsets?
On Wed, Jan 24, 2018, at 17:03, Mark Davis ☕️ wrote:
Each metazones represent a set names that can be used across multiple TZ ids. There can be overrides by TZ id, such as the current:
<zone type="Europe/London"> <long> <daylight>British Summer Time</daylight> </long> </zone> <zone type="Europe/Dublin"> <long> <daylight>Irish Standard Time</daylight> </long> </zone>
Both of these inherit
<standard>Greenwich Mean Time</standard>
So a current implementation, with no changes, will get for Europe/Dublin:
<standard>Greenwich Mean Time</standard> <daylight>Irish Standard Time</daylight>
So just to be clear, CLDR *could* support a Europe/Dublin that had an independent metazone with <standard>Irish Standard Time</standard><daylight>Greenwich Mean Time</daylight>, it's just that synchronizing such a change with the tzdb change is intractable, right? (that version number field is looking more and more important...)
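The inheritance Mark Davis describes (a per-zone override that falls back to the metazone's names) can be sketched as follows. The dictionaries are a hypothetical in-memory copy of the quoted fragments; the metazone key "GMT" and the function name are assumptions made for illustration, not CLDR or ICU identifiers taken from this thread.

```python
# Hypothetical in-memory copy of the CLDR fragments quoted above (long names only).
METAZONE_NAMES = {
    "GMT": {"standard": "Greenwich Mean Time"},  # metazone key is an assumption
}
ZONE_TO_METAZONE = {
    "Europe/London": "GMT",
    "Europe/Dublin": "GMT",
}
ZONE_OVERRIDES = {
    "Europe/London": {"daylight": "British Summer Time"},
    "Europe/Dublin": {"daylight": "Irish Standard Time"},
}


def long_name(tzid, kind):
    """Return the zone-level override if present, else the metazone's name."""
    override = ZONE_OVERRIDES.get(tzid, {})
    if kind in override:
        return override[kind]
    return METAZONE_NAMES[ZONE_TO_METAZONE[tzid]].get(kind)


print(long_name("Europe/Dublin", "standard"))  # Greenwich Mean Time
print(long_name("Europe/Dublin", "daylight"))  # Irish Standard Time
```

Note that the lookup is keyed only by "standard" or "daylight"; deciding which of the two applies at a given instant is left to the consumer.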
So just to be clear, CLDR *could* support a Europe/Dublin that had an independent metazone with <standard>Irish Standard Time</ standard><daylight>Greenwich Mean Time</daylight>, it's just that synchronizing such a change with the tzdb change is intractable, right?
(that version number field is looking more and more important...)
If we need a single TZ DB zone to have names that change from time to time, we create what we call a "meta zone", and define a set of short/long - standard/daylight/generic names for the meta zone. For example,
----
Zone America/North_Dakota/Center -6:45:12 - LMT 1883 Nov 18 12:14:48
            -7:00 US M%sT 1992 Oct 25 2:00
            -6:00 US C%sT
----
CLDR defines meta zones
<metazone type="America_Central">
  <long>
    <generic>Central Time</generic>
    <standard>Central Standard Time</standard>
    <daylight>Central Daylight Time</daylight>
  </long>
  <short>
    <generic>CT</generic>
    <standard>CST</standard>
    <daylight>CDT</daylight>
  </short>
</metazone>
<metazone type="America_Mountain">
  <long>
    <generic>Mountain Time</generic>
    <standard>Mountain Standard Time</standard>
    <daylight>Mountain Daylight Time</daylight>
  </long>
  <short>
    <generic>MT</generic>
    <standard>MST</standard>
    <daylight>MDT</daylight>
  </short>
</metazone>
These meta zone display names are translated for various locales. Then, we have historic zone -> meta zone mapping data as below:
<timezone type="America/North_Dakota/Center">
  <usesMetazone to="1992-10-25 08:00" mzone="America_Mountain"/>
  <usesMetazone from="1992-10-25 08:00" mzone="America_Central"/>
</timezone>
So, when CLDR consumer code such as ICU wants to show a time zone display name at a given time, it looks up which meta zone is used at that time, then checks standard or daylight, then retrieves the appropriate display name data type. With the above example, if the input time is 2018-01-01T00:00:00Z, then meta zone "America_Central" is resolved with the meta zone mapping data above. ICU uses data generated from the tz database and detects that the date falls into standard time. Then the zone name for standard in meta zone "America_Central" is used.
At this moment, "Irish Standard Time" is set on the zone directly, but if we need more historic names, we can create a new meta zone, define a set of names there, and add it to the zone-to-meta-zone mapping data.
-Yoshito
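The lookup steps Yoshito describes (resolve the meta zone for the instant, then pick standard or daylight) might look roughly like the sketch below. The tables are hand-copied from the quoted data; treating the "1992-10-25 08:00" boundary as a UTC instant is an assumption, and `metazone_at` and `display_name` are hypothetical names. The standard/daylight flag is passed in by the caller, since it comes from tz data rather than from CLDR.

```python
from datetime import datetime, timezone

UTC = timezone.utc
SWITCH = datetime(1992, 10, 25, 8, 0, tzinfo=UTC)  # assumed to be UTC

METAZONE_NAMES = {
    "America_Central": {
        "long": {"generic": "Central Time", "standard": "Central Standard Time",
                 "daylight": "Central Daylight Time"},
        "short": {"generic": "CT", "standard": "CST", "daylight": "CDT"},
    },
    "America_Mountain": {
        "long": {"generic": "Mountain Time", "standard": "Mountain Standard Time",
                 "daylight": "Mountain Daylight Time"},
        "short": {"generic": "MT", "standard": "MST", "daylight": "MDT"},
    },
}

# (from, to, metazone); None means open-ended.
USES_METAZONE = {
    "America/North_Dakota/Center": [
        (None, SWITCH, "America_Mountain"),
        (SWITCH, None, "America_Central"),
    ],
}


def metazone_at(tzid, instant):
    """Find the meta zone in use for tzid at the given UTC instant."""
    for start, end, mzone in USES_METAZONE[tzid]:
        if (start is None or instant >= start) and (end is None or instant < end):
            return mzone
    return None


def display_name(tzid, instant, is_daylight, width="long"):
    """Resolve the meta zone first, then pick the standard or daylight name."""
    kind = "daylight" if is_daylight else "standard"
    return METAZONE_NAMES[metazone_at(tzid, instant)][width][kind]


print(display_name("America/North_Dakota/Center",
                   datetime(2018, 1, 1, tzinfo=UTC), is_daylight=False))
# Central Standard Time
```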
On Thu, Jan 25, 2018, at 14:52, Yoshito Umaoka wrote:
So just to be clear, CLDR *could* support a Europe/Dublin that had an independent metazone with <standard>Irish Standard Time</ standard><daylight>Greenwich Mean Time</daylight>, it's just that synchronizing such a change with the tzdb change is intractable, right?
(that version number field is looking more and more important...)
If we need a single TZ DB zone has changing names time to time, we create what we call "meta zone", and define a set of short/long - standard/daylight /generic name for the meta zone.
That's interesting but wasn't really my question. What I was asking, regardless of the semantics of meta vs direct, was whether it would be theoretically workable (in a vacuum presuming everyone can be upgraded to a new tzdb [with winter negative "saving" and summer zero offset standard] concurrently with a version of CLDR data with such a change) for the CLDR to regard a "GMT" with a negative offset as <daylight> and IST as <standard>, even if that would mean it couldn't inherit the translations of "Greenwich Mean Time" from the British zone?
On 01/25/2018 11:52 AM, Random832 wrote:
(that version number field is looking more and more important...)
Is this version information file intended for actual distribution and/or distributed by anyone? I see that zic creates a `tzdata.zi` file, but it doesn't seem like Arch Linux or Debian are distributing it (Ubuntu does not seem great about updating their tzdata package, so I'm not sure if it's not there just because no one has updated tzdata since it's been around). It seems kinda important to have a consistent way of getting the version if you want people to be able to cope with possibly backwards incompatible-ish changes in the database.
On 01/25/2018 12:43 PM, Paul G wrote:
Is this version information file intended for actual distribution and/or distributed by anyone?
Although the version file is intended to be stable and useful for distributors who want to install it, it wasn't added until 2016h, the tzdb Makefile does not install it, and I don't know of any distributions that install it. The file tzdata.zi is installed by default and it contains a version number, but this wasn't added until 2018a and hardly anybody uses it in production now.
Is there a reason the Makefile doesn't install it? It would be really useful to have a stable installed-by-default version file of some sort.
On 01/25/2018 04:20 PM, Paul Eggert wrote:
On 01/25/2018 12:43 PM, Paul G wrote:
Is this version information file intended for actual distribution and/or distributed by anyone? Although the version file is intended to be stable and useful for distributors who want to install it, it wasn't added until 2016h, tzdb Makefile does not install it, and I don't know of any distributions that install it. The file tzdata.zi is installed by default and it contains a version number, but this wasn't added until 2018a and hardly anybody uses it in production now.
On 01/25/2018 01:22 PM, Paul G wrote:
Is there a reason the Makefile doesn't install it?
The main reason for me was a reluctance to separate version information from the data. Separate version files have a bad habit of containing wrong values. In contrast, I expect there's less of a problem putting a version number into tzdata.zi since that file contains all the tzdb data of interest to applications (except for leap seconds, which are to some extent versioned separately anyway). It's hard to tell for sure, since tzdata.zi is so new.
We could extend the format of zic binary output files to contain a version number, and then put the version number there as well. Although that might be better for some applications (which ones?), changing formats would be more work for all concerned and it's not clear that it'd be worth the effort.
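For readers who want to use the version number carried in tzdata.zi, a minimal sketch follows. It assumes the file begins with a comment line of the form "# version 2018a"; the function name and the sample text are illustrative, and a real installation path such as /usr/share/zoneinfo/tzdata.zi is also an assumption that varies by system.

```python
import io
import re


def read_tzdata_zi_version(stream):
    """Return the version from a leading '# version XXXX' line, or None."""
    first_line = stream.readline()
    match = re.match(r"#\s*version\s+(\S+)", first_line)
    return match.group(1) if match else None


# Illustrative stand-in for the start of a tzdata.zi file.
sample = io.StringIO("# version 2018a\n# (rest of file omitted)\n")
print(read_tzdata_zi_version(sample))  # 2018a
```

With a real file, the same function can be called on an opened text stream, and a `None` result signals that the file predates the version line or is not installed in the expected form.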
In order to bridge the gap between the offset information in TZDB and the simple generic/standard/daylight scheme in CLDR, would it not be a good enhancement to the TZDB source format to add an extra column per rule line to flag whether the dst offset (whether zero, negative or positive) has to be considered as associated with standard time labelling? Then the consumer would not have to reason about (dstOffset == 0) <=> isStandardTimeFormat() as is currently done in Java software. Such a column could even be enhanced by an extra state for the Ramadan situation in some Arabic countries when the clock only temporarily switches back (enabling a better name). The current column containing an abbreviation is not really clear about this name-offset association IMHO and can therefore not be evaluated by source-code-based tz-compilers.
On 25.01.2018 at 22:20, Paul Eggert wrote:
On 01/25/2018 12:43 PM, Paul G wrote:
Is this version information file intended for actual distribution and/or distributed by anyone? Although the version file is intended to be stable and useful for distributors who want to install it, it wasn't added until 2016h, tzdb Makefile does not install it, and I don't know of any distributions that install it. The file tzdata.zi is installed by default and it contains a version number, but this wasn't added until 2018a and hardly anybody uses it in production now.
I'm afraid that I'm not understanding the proposal; could you write up Irish time as a specific example to help explain it? Remember, the problem is not whether Java can be changed to support negative DST offsets (or, for that matter, whether it can be changed to support the new feature you're proposing). The problem is how to transition from the old to the proposed system without breaking anything significant, even when some components are old and some are new.
An illustrating example: If the rule line contains an optional meta-column at the end whose content is a key-value structure (comma-separated in case of several entries for any other purpose), then the Eire rules might look like this:
# Rule NAME FROM TO   TYPE IN  ON      AT    SAVE  LETTER/S META
Rule   Eire 1971 only -    Oct 31      2:00u -1:00 GMT      category=winter
Rule   Eire 1972 1980 -    Mar Sun>=16 2:00u 0     IST      category=summer
If the new META column is missing altogether, or if its content does not contain an entry with key "category" (you are free to use a better key), then other tzdb source file consumers like Java or ICU are free to continue assuming that winter time is to be determined by evaluating SAVE = 0 (current practice). Note that these consumers cannot evaluate the column LETTER/S as that column has no clear stable keys ("IST" would be specific to Ireland only and does not enable a general evaluation of whether it is connected to the summer period).
Then: After such a change, a source-file tzdb compiler can use this new information to determine the right entry in CLDR data. If the value of key "category" is "winter" then use the CLDR entry <standard>Greenwich Mean Time</standard> for getting the right time zone name (we leave out here the fact that CLDR does not support historized tz names, but that is another problem not really related to this topic). If the value of key "category" is "summer" then use the alternate CLDR entry <daylight>Irish Standard Time</daylight>. The new source format enhancement of tzdb only needs to be documented and should use standardized values like "winter", "summer" (or even "ramadan"). This way, CLDR data don't even need to be changed at all, so it is completely backwards compatible. The ICU and Java compilers would need a grace period to adjust their compilers and their own binary tz format to process the newly available information (which should not be so difficult).
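How a source-file consumer might read the proposed META column can be sketched as below. The column itself is only a proposal in this message and is not part of the tzdb format; the function names, the field positions assumed for a Rule line, and the simplified SAVE parsing (h:mm only) are all assumptions for illustration.

```python
def parse_save_minutes(text):
    """Parse a SAVE field such as '0', '1:00' or '-1:00' into minutes (simplified)."""
    sign = -1 if text.startswith("-") else 1
    parts = text.lstrip("-").split(":")
    hours = int(parts[0])
    minutes = int(parts[1]) if len(parts) > 1 else 0
    return sign * (hours * 60 + minutes)


def season_of(rule_line):
    """Return 'winter' or 'summer' (or another category) for a Rule line.

    Fields: Rule NAME FROM TO TYPE IN ON AT SAVE LETTER/S [META]
    META is the hypothetical optional column: comma-separated key=value pairs.
    """
    fields = rule_line.split()
    save = parse_save_minutes(fields[8])
    meta = {}
    if len(fields) > 10:
        meta = dict(item.split("=", 1) for item in fields[10].split(","))
    if "category" in meta:
        return meta["category"]
    # Fallback described in the proposal: current practice, SAVE == 0 means winter.
    return "winter" if save == 0 else "summer"


with_meta = "Rule Eire 1971 only - Oct 31 2:00u -1:00 GMT category=winter"
without_meta = "Rule Eire 1971 only - Oct 31 2:00u -1:00 GMT"
print(season_of(with_meta))     # winter
print(season_of(without_meta))  # summer (the legacy inference, wrong for Ireland)
```

The second result shows why the extra key matters: without it, the SAVE = 0 convention labels the Irish winter rule as summer.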
If ICU or OpenJDK (Java) are still unwilling to change their current implementation then introducing negative dst offsets will inevitably break the way they determine the right tz label (resulting in "Irish Standard Time" in winter). But it would be their problem and responsibility. IMHO downstream consumers should be flexible, and are expected to be flexible, if they get the chance to fix their products after having received new information in the tzdb source file format.
I have adjusted my tzdb compiler (Time4J) such that it checks whether the rule lines referencing the same name (here: Eire) contain any negative dst offsets. If yes, then let's assume summer time for SAVE = 0 and winter time for SAVE < 0. So I can still work with old unchanged CLDR entries for getting "Irish Standard Time" in summer. Although I am not quite sure if this way of fixing is stable enough for the future, it demonstrates that it is possible for downstream consumers to cope with negative dst offsets already now. My proposal serves to make the handling of negative dst offsets easier and more stable. If any downstream consumers are still unwilling to handle negative dst offsets then it is just laziness to me.
With best regards
Meno
On 25.01.2018 at 22:53, Paul Eggert wrote:
I'm afraid that I'm not understanding the proposal; could you write up Irish time as a specific example to help explain it? Remember, the problem is not whether Java can be changed to support negative DST offsets (or, for that matter, whether it can be changed to support the new feature you're proposing). The problem is how to transition from the old to the proposed system without breaking anything significant, even when some components are old and some are new.
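The Time4J workaround Meno describes in this message (look at all rule lines sharing a name, and flip the interpretation if any SAVE is negative) can be approximated by the sketch below. This is not Time4J code; the function name is hypothetical, SAVE values are given in minutes, and the handling of a positive SAVE inside a rule set that also contains negative values is an assumption, since the message does not cover that case.

```python
def classify_rule_set(saves_in_minutes):
    """Label each SAVE value of one named rule set as 'summer' or 'winter'.

    If any SAVE in the set is negative: SAVE < 0 is winter, everything else summer.
    Otherwise (conventional encoding): SAVE == 0 is winter, SAVE != 0 is summer.
    """
    has_negative = any(save < 0 for save in saves_in_minutes)
    labels = []
    for save in saves_in_minutes:
        if has_negative:
            # Assumption: a positive SAVE in such a set is still treated as summer.
            labels.append("winter" if save < 0 else "summer")
        else:
            labels.append("winter" if save == 0 else "summer")
    return labels


print(classify_rule_set([-60, 0]))  # ['winter', 'summer']  negative-SAVE encoding
print(classify_rule_set([0, 60]))   # ['winter', 'summer']  conventional encoding
```

Both encodings yield the same seasonal labels, so the unchanged CLDR entries can still be matched to the right half of the year.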
From: Meno Hochschild <mhochschild@gmx.de>
Date: Fri, 26 Jan 2018 05:49:44 +0100
Subject: Re: [tz] [english 100%] Re: [english 100%] Re: OpenJDK/CLDR/ICU/Joda issues with Ireland change
| An illustrating example: If the rule line contains an optional
| meta-column at the end whose content is a key-value-structure
| (comma-separated in case of several entries for any other purpose) then
| the Eire-rules might look like this:
Before we spend too much time on this, and considering that we have (well CLDR has) meta-zones, and you're proposing a meta-column, I have a meta-question ... Why do we need all this? That is, what end-user real applications actually use any of this data, what do they use it for, and what do they really need (or want)?
Now I know why (I suspect) all of us here need it - we need it because we have to comply with the POSIX APIs, and for historic reasons coming from 1970's vintage (US centric) unix, those APIs have time zone abbreviations, and tm_isdst, and stuff like that. So, the tzdb project has to provide the data that POSIX demands that applications be able to use. Similarly, tzdb is all English (to which the "english 100%" which appears multiple times in the Subject of these messages refers) and that's not acceptable to international users, so CLDR provides the information in languages, and character sets, that are more suitable to people in places where "English 100%" is more like "English 1%"...
All that is understandable, there is a need, and we happen to be the groups that fill that need. Given that, it can sometimes be hard to even contemplate, let alone accept, that all of this hard work, which we must perform, might simply be wasted, and of no real interest to anyone or anything important.
So before we waste a lot more time designing fixes to this problem (which would mean enhancing CLDR to allow more than 2 non-generic names for a timezone, and all their users to fix their code to handle that) can we find out whether there is any point to all of this. There are two answers that are not interesting... 1) our test suite tests it, so we need to make it work before we can release new versions... (the test suite can be altered.) 2) we display this in our UI. Why? Because it is available. If you stopped displaying it, would it matter? Users would notice the difference and complain. Would it actually affect how your application works, or how the users use it? Not really, no. If (2) there ends with a "yes" answer, then exactly how the data is used, what its needed for, and what would break without it (this is assuming it is all working perfectly, ignore what happens if things change and cause errors, for now anyway) is what would be useful to know. Maybe it will turn out that all of this really is important (to at least some class of end users and the apps they use) in which case we need to go ahead and find solutions to the problems that are known to exist now. I don't think I have ever seen one. Ever. But of course, I don't have experience with *everything* that exists (nor even most of it) so I might just have missed something, somewhere. But if not, a better solution would be to get posix to simply (even more) deprecate all the old trash, so we don't have to provide it, and all of us can go back to doing work that actually provides a useful service rather than wasting lots of time looking for solutions to problems that no-one really cares about. kre
On 26/01/18 07:15, Robert Elz wrote:
But if not, a better solution would be to get posix to simply (even more) deprecate all the old trash, so we don't have to provide it, and all of us can go back to doing work that actually provides a useful service rather than wasting lots of time looking for solutions to problems that no-one really cares about.
The reality today is that timezones have never been properly handled and much of the 'old trash' was a kludge at the time that never addressed the whole problem. People keep asking why we need to worry about historic times that 'no-one really cares about' but while the accuracy of much historic material may be questionable, the fact that events prior to 1970 (or 1990) happened at certain times relative to other events IS a simple fact. If one can't rely on the calendar to CORRECTLY compare events then there needs to be a big red flag that says so. That an historic event happened at xxx in Irish Standard Time for example means it happened at the offset being used then and not some current setting, so while many programmers probably have no interest in even last year's history, a LOT of people are now investigating and comparing material world wide that NEEDS a reliable clock which quite simply is not currently available in ANY operating system?
--
Lester Caine - G8HFL
-----------------------------
Contact - http://lsces.co.uk/wiki/?page=contact
L.S.Caine Electronic Services - http://lsces.co.uk
EnquirySolve - http://enquirysolve.com/
Model Engineers Digital Workshop - http://medw.co.uk
Rainbow Digital Media - http://rainbowdigitalmedia.co.uk
On 26 January 2018 at 07:15, Robert Elz <kre@munnari.oz.au> wrote:
Why do we need all this?
Because applications have APIs that they want to continue to support in a backwards compatible way. Even if everything were deprecated today, those APIs would need to be supported for at least 10 years and probably more. And anyway, I disagree that these APIs should be deprecated. Just because zic/POSIX doesn't expose data doesn't mean it's not useful.
all of us can go back to doing work that actually provides a useful service rather than wasting lots of time looking for solutions to problems that no-one really cares about.
TZDB has known about the Ireland issue since 2005. No change was needed at all. No benefit has accrued, only cost. The change was and is inadvisable. Numerous people have criticised the change. And over 150 emails have been sent. All for something where the zic binary output doesn't even care which way the source is defined!!!! Stephen
Stephen Colebourne wrote:
Because applications have APIs that they want to continue to support in a backwards compatible way. Even if everything were deprecated today, those APIs would need to be supported for at least 10 years and probably more.
When and how forcefully to deprecate something really is one of the key questions here. But I think a big part of the disconnect is that, in many people's minds, the isdst-related portions of the TZ API have already been pretty severely deprecated for at least 10 years (maybe even 20). So, quite aside from all the discussions over the specific Ireland case, it's worth asking: If the TZ project isn't already deprecating timezone and tm_isdst, should it be, and how strongly, and using what language in which documents? And what about POSIX? Is there any way of getting them to deprecate those inadequate old interfaces, and perhaps standardize tm_gmtoff while they're at it? Do we know how many programs are using timezone and tm_isdst (and the variable that says "this zone observes DST sometimes", and the other difficult API bits we've been discussing here)? Do we know why they think they have to use them? Is there guidance we can give them to help wean them off them?
Steve Summit wrote:
If the TZ project isn't already deprecating timezone and tm_isdst, should it be, and how strongly, and using what language in which documents?
Good question. Proposed patch attached.
And what about POSIX? Is there any way of getting them to deprecate those inadequate old interfaces, and perhaps standardize tm_gmtoff while they're at it?
Sure, one can file bug reports and join the committee and discuss. It's the usual committee thing. In my experience the POSIX folks are quite reasonable. It does take time, though.
Do we know how many programs are using timezone and tm_isdst (and the variable that says "this zone observes DST sometimes", and the other difficult API bits we've been discussing here)? Do we know why they think they have to use them? Is there guidance we can give them to help wean them off them?
They're used mostly because of inertia and/or inexperience, I think. The proposed patch attempts to give guidance.
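As one illustration of the kind of guidance being asked for (not the content of the proposed patch, which is not reproduced here), the sketch below contrasts reading the UTC offset directly from tm_gmtoff with the older habit of combining tm_isdst with process-wide variables. It uses Python's standard `time` module, whose `struct_time` exposes `tm_gmtoff` and `tm_isdst`; the two function names are hypothetical.

```python
import calendar
import time


def utc_offset_seconds(timestamp):
    """UTC offset in effect at `timestamp`, read directly from tm_gmtoff."""
    return time.localtime(timestamp).tm_gmtoff


def utc_offset_seconds_legacy(timestamp):
    """Older style: choose between the global timezone/altzone via tm_isdst.

    This only knows two offsets for the whole zone, so it can be wrong for
    timestamps where the zone's rules differed from the current ones.
    """
    local = time.localtime(timestamp)
    return -(time.altzone if local.tm_isdst > 0 else time.timezone)


ts = 1516924800  # 2018-01-26 00:00:00 UTC
print(utc_offset_seconds(ts), utc_offset_seconds_legacy(ts))
```

The first function needs no notion of "standard" versus "daylight" at all, which is why an offset field sidesteps the question of what tm_isdst should mean for Ireland.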
On 2018-01-26 06:54, Steve Summit wrote:
Stephen Colebourne wrote:
Because applications have APIs that they want to continue to support in a backwards compatible way. Even if everything were deprecated today, those APIs would need to be supported for at least 10 years and probably more.
When and how forcefully to deprecate something really is one of the key questions here. But I think a big part of the disconnect is that, in many people's minds, the isdst-related portions of the TZ API have already been pretty severely deprecated for at least 10 years (maybe even 20).
Who has been deprecating which parts of the ABI/API where?
So, quite aside from all the discussions over the specific Ireland case, it's worth asking: If the TZ project isn't already deprecating timezone and tm_isdst, should it be, and how strongly, and using what language in which documents?
And what about POSIX? Is there any way of getting them to deprecate those inadequate old interfaces, and perhaps standardize tm_gmtoff while they're at it?
Do we know how many programs are using timezone and tm_isdst (and the variable that says "this zone observes DST sometimes", and the other difficult API bits we've been discussing here)?
GitHub search result counts, for some indication ("-" means no count was shown):

Query                  Repositories  Code  Commits  Issues  Topics  Wikis  Users
[tm_isdst fork:true]   -             12K   38K      679     -       22     -
[tm_gmtoff fork:true]  -             1K    11K      224     -       -      -
[tm_zone fork:true]    3             846   13K      132     -       2      -
[timezone fork:true]   6K            103K  1M       114K    35      13K    27

Languages: 2,046 JavaScript, 654 Ruby, 648 Python, 520 PHP, 302 Java, 268 Swift, 144 HTML, 124 C++, 111 CSS, 102 Shell
Do we know why they think they have to use them? Is there guidance we can give them to help wean them off them?
Provide internationalized date and time interfaces that don't require changing and reading struct tm fields to do complex date arithmetic, as in other languages that support different calendars, concepts, units, and formats: Julian to properly handle O.S. info, a few Islamic to handle Ramadan easily, Chinese and Indian variations for those billions, Hebrew to handle those global communities.

Internationalize tzdb so it can deal with other calendars mentioned and their variations as easily as it can Gregorian, or switch the support base to a language which can, and provide POSIX interfaces to a subset of the function, non-POSIX interfaces for the rest.

Provide interfaces to handle different time scales easily: TAI, TT, UT1, JD/MJD, NTP, GPS, Loran (aka right/), with or without leap seconds decided at run time; and support time stamps down to at least the attosecond: in the 20th century, folks used to get by with just ns, now clock_getres(3) is getting lower in the ps; industries could be traded down to pennies in ns now; some Asian networks are providing multi-GB/s home links; femtoseconds are used in production engineering specs, but long run times may be required for accurate characterizations; should the subsecond part use long long or long double?

Or don't and leave the existing interfaces alone? Someone said time isn't easy: no kidding! ;^>

--
Take care. Thanks, Brian Inglis, Calgary, Alberta, Canada
Brian Inglis wrote:
On 2018-01-26 06:54, Steve Summit wrote:
...in many people's minds, the isdst-related portions of the TZ API have already been pretty severely deprecated for at least 10 years (maybe even 20).
Who has been deprecating which parts of the ABI/API where?
I said "In many people's minds", so I concede that to some extent I'm talking here about subjective reality, not objective fact. But it has seemed pretty obvious (and for a very long time) that several pieces of the traditional, SysV, Posix time interface were either useless, or broken, or both. I'll explain what I mean, but I'm reasonably sure this is not just one man's opinion: it is very much along the lines of what (for example) Robert Elz and Guy Harris have been arguing in this thread; it's what Eric Raymond argues at http://www.catb.org/esr/time-programming/, etc.

If you're trying to convert back and forth between machine-readable timestamps (i.e. time_t) and human-readable times, in C, these functions are useful:

    localtime, gmtime, mktime, strftime

These functions and global variables are either not useful, or downright broken:

    ctime, asctime, timezone, tzname

This function, though not standard, is so necessary that it ought to be standard, and one shouldn't feel too bad about using it even though it's not:

    timegm

And then there are the struct tm members. Most of them are useful. But tm_isdst is not remotely useful, for reasons which I think have been adequately exposed in this thread (but I can elaborate if necessary). And, finally, the common struct tm extensions, tm_gmtoff and tm_zone, are so necessary that, again, I wouldn't fault anyone (no matter how zealous about writing portable, standards-compliant code) for using them if available.

The notions that timezone, tzname, and tm_isdst are not useful, and that functional invocations of localtime, gmtime, and strftime are much more useful, are coupled with a parallel dichotomy between two separate models for how to think about time and time zones at all. Should we represent a time zone simply as a base offset, and a "dst" offset, and a pair of names, and rules for deciding which (two) times during a year to switch to and from the "dst" offset? Or should we use a mapping -- an arbitrarily complex one -- between UTC and local time, with the offset and the name both being multivariate functions of the time being converted? (Again, the evident superiority of one model should be reasonably obvious based on this thread, but there may be more to say in separate messages.)
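As an illustration of the second model (offset as a function of the time being converted), here is a minimal Java sketch. Java is used only because it is the language of the other examples in this thread; the class and method names are illustrative and do not come from any message. It asks java.time for the total UTC offset at a given instant and never looks at a "raw"/"dst" split or a boolean flag.

```java
import java.time.Instant;
import java.time.ZoneId;
import java.time.ZoneOffset;
import java.time.zone.ZoneRules;

final class OffsetAsFunctionOfTime {
    /** Returns the total UTC offset in effect in the given zone at the given instant. */
    static ZoneOffset offsetAt(String zoneId, Instant when) {
        ZoneRules rules = ZoneId.of(zoneId).getRules();
        // Only the total offset is used; whether the period counts as "standard"
        // or "daylight" is deliberately not consulted.
        return rules.getOffset(when);
    }

    public static void main(String[] args) {
        Instant winter = Instant.parse("2018-01-15T12:00:00Z");
        Instant summer = Instant.parse("2018-07-15T12:00:00Z");
        System.out.println(offsetAt("Europe/Dublin", winter)); // Z
        System.out.println(offsetAt("Europe/Dublin", summer)); // +01:00
    }
}
```

The total offset for Europe/Dublin is the same whichever way the Ireland rules are encoded; only the standard/daylight labelling differs between encodings.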
Brian Inglis wrote:
On 2018-01-26 06:54, Steve Summit wrote:
Do we know how many programs are using timezone and tm_isdst (and the variable that says "this zone observes DST sometimes", and the other difficult API bits we've been discussing here)?
Github search result counts for some indications: [details omitted]
Interesting. Thanks.
Do we know why they think they have to use them? Is there guidance we can give them to help wean them off them?
Provide internationalized date and time interfaces that don't require changing and reading struct tm fields to do complex date arithmetic... Internationalize tzdb so it can deal with other calendars... Provide interfaces to handle different time scales easily... support time stamps down to at least the attosecond...
All worthy goals, but I was more simply asking: For those programs still using tm_isdst (and similar/related interfaces): what are they using them for, and what can we offer them instead?
From: Stephen Colebourne <scolebourne@joda.org>
Date: Fri, 26 Jan 2018 13:03:59 +0000
Subject: Re: [tz] [english 100%] Re: [english 100%] Re: OpenJDK/CLDR/ICU/Joda issues with Ireland change

| On 26 January 2018 at 07:15, Robert Elz <kre@munnari.oz.au> wrote:
| > Why do we need all this?
| Because applications have APIs that they want to continue to support
| in a backwards compatible way.

That's not useful - who uses those APIs, and for what purpose?

| Just because zic/posix doesn't expose data doesn't mean
| it's not useful.

Nor does exposing it make it useful. The question was whether it has any real uses or not, not whether there possibly could be.

| TZDB has known about the Ireland issue since 2005.

To me personally, the Ireland issue is almost irrelevant - it simply exposed limitations in users of the tzdata that we didn't know about. A much bigger one is the "just two names" (which is enforced by the API apparently exposing a boolean as the selector, or so it seems from what we have been told). That is completely broken.

Personally I think the whole notion of "standard" time that applies for less than half the year, and "other time" which applies for the rest is just plain silly.

If CLDR was using names to index rather than "is the offset 0", it would both be able to access more than 2 names, and it would not care whether transitions that put the clocks forward come logically before or after transitions that put the clocks back. They're all just transitions that are altering the offset from UTC. Everything beyond that is just someone's imagination.

kre
Stephen Colebourne wrote:
All for something where the zic binary output doesn't even care which way the source is defined!!!!
That's not correct. The proposed change alters the zic binary output for Europe/Dublin, and this affects (for example) the output of 'zdump -i Europe/Dublin' so that Irish Standard Time is marked as standard time not DST. If the zic binary output didn't care, the change would not be needed.
On 2018-01-26 00:15, Robert Elz wrote:
From: Meno Hochschild <mhochschild@gmx.de>
Date: Fri, 26 Jan 2018 05:49:44 +0100
Subject: Re: [tz] [english 100%] Re: [english 100%] Re: OpenJDK/CLDR/ICU/Joda issues with Ireland change

Why do we need all this? That is, what end-user real applications actually use any of this data, what do they use it for, and what do they really need (or want)?
2) we display this in our UI. Why? Because it is available. If you stopped displaying it, would it matter? Users would notice the difference and complain. Would it actually affect how your application works, or how the users use it? Not really, no.
If (2) there ends with a "yes" answer, then exactly how the data is used, what its needed for, and what would break without it (this is assuming it is all working perfectly, ignore what happens if things change and cause errors, for now anyway) is what would be useful to know.
Maybe it will turn out that all of this really is important (to at least some class of end users and the apps they use) in which case we need to go ahead and find solutions to the problems that are known to exist now. I don't think I have ever seen one. Ever. But of course, I don't have experience with *everything* that exists (nor even most of it) so I might just have missed something, somewhere.
Embedded devices sold outside the US or native English speaking world, especially the EU, CN; like: mobiles, tablets, e-readers, non-MS servers and user desktops, control system front ends; international ecommerce and other applications and websites like social media, app stores, banking, and travel: rental vehicle booking systems, hotel reservation systems, airlines, cruise lines, travel agencies; nothing important to anyone ;^>

It's probably easier (and cheaper!) to (un-)patch the tzdb to conform to downstream compatibility requirements, than it would be to change all downstreams to conform to tzdb theoretical requirements, until those capabilities are actually required somewhere in the real world.

Let's face it, politicians and bureaucrats are just not as creative as nature's production of better fools for testing our fool-proof systems, towards improvement of which our main effort, time, and money should be directed. ;^>

We should also bear in mind the IETF's belief in rough consensus and running code, and hum while reviewing RFC7282 and RFC6557/BCP175 ;^>

--
Take care. Thanks, Brian Inglis, Calgary, Alberta, Canada
From: Brian Inglis <Brian.Inglis@SystematicSw.ab.ca>
Date: Fri, 26 Jan 2018 10:10:07 -0700
Subject: Re: [tz] [english 100%] OpenJDK/CLDR/ICU/Joda issues with Ireland change

| Embedded devices sold outside the US or native English speaking world,
| especially the EU, CN; like: mobiles, tablets, e-readers, non-MS servers [...]

That's just a list of things that would use time and internationalised interfaces. That's not at all useful, or even relevant.

What I was asking for is an example of something which uses these time zone name strings for something that matters. And what that use actually is. That is: a real demonstrated example, not a "well there must be something" answer.

I live (and have for many years now) in exactly that part of the world, and I've never seen anything that uses the time zone labels, other than stuff like the unix date command, which displays the abbreviation, just because it always has ... not for any particularly good reason.

Certainly (and particularly since it gets shoved in before the year) scripts that parse the output from date expect it to be there - the year that follows is clearly useful (even though these days it would be more common to use date's strftime interface and get values that way) - so we cannot simply delete it, but we could make it always be XXX or something, and aside from looking a bit ugly, I suspect no-one would really care. (Or we could always replace it with the numeric offset instead, for all zones.)

Once again, someone, point me at something that would actually break if this was to be done? We have numeric (+NN etc) values as the abbreviation in quite a few zones now, and no-one has been moaning about their applications failing because of it, or not that I have seen.

| It's probably easier (and cheaper!) to (un-)patch the tzdb to conform to
| downstream compatibility requirements, than it would be to change all
| downstreams to conform to tzdb theoretical requirements, until those
| capabilities are actually required somewhere in the real world.

That is definitely the wrong approach. That's the "let's bury our heads in the sand and hope it never happens" solution - which is guaranteed to lead to havoc, when (which it seems is really in the past anyway) it does happen.

Much better to commence the updates now, while there is less time pressure, and so more time to plan a suitable conversion strategy, than to just do nothing (for now) and hope.

kre
Meno Hochschild wrote:
I have adjusted my tzdb-compiler (Time4J) such that it looks if all rule lines referencing the same name (here: Eire) contain any negative dst offsets. If yes then let's assume summer time for SAVE=0 and winter time for SAVE < 0. So I can still work with old unchanged CLDR-entries for getting "Irish Standard Time" in summer.
This sounds like a good idea regardless of whether we make changes to zic input. Couldn't OpenJDK do the same? I'm somewhat leery of changing zic input format to address this problem, particularly if a simpler workaround is available.
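The workaround described above can be sketched in Java roughly as follows. This is not Time4J's code: the class, enum, and method names are hypothetical, and the SAVE values are assumed to have already been parsed into seconds for all rule lines sharing one rule name. A rule set that mixes negative and positive SAVE values is not covered by the description, so this sketch simply reports it as ambiguous.

```java
import java.util.Arrays;
import java.util.List;

final class SaveSignHeuristic {
    /** What SAVE=0 is taken to mean for one named rule set. */
    enum ZeroSaveMeaning { WINTER, SUMMER, AMBIGUOUS }

    /** savesInSeconds: SAVE values of all rule lines that share one rule name. */
    static ZeroSaveMeaning classify(List<Integer> savesInSeconds) {
        boolean hasNegative = false;
        boolean hasPositive = false;
        for (int save : savesInSeconds) {
            if (save < 0) hasNegative = true;
            if (save > 0) hasPositive = true;
        }
        if (hasNegative && hasPositive) {
            return ZeroSaveMeaning.AMBIGUOUS; // assumption: the heuristic gives no answer here
        }
        // Any negative SAVE: treat SAVE=0 as summer time and SAVE<0 as winter time.
        return hasNegative ? ZeroSaveMeaning.SUMMER : ZeroSaveMeaning.WINTER;
    }

    public static void main(String[] args) {
        System.out.println(classify(Arrays.asList(0, -3600))); // SUMMER (Eire-style rules)
        System.out.println(classify(Arrays.asList(0, 3600)));  // WINTER (conventional rules)
    }
}
```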
On 26 January 2018 at 07:46, Paul Eggert <eggert@cs.ucla.edu> wrote:
Meno Hochschild wrote:
I have adjusted my tzdb-compiler (Time4J) such that it looks if all rule lines referencing the same name (here: Eire) contain any negative dst offsets. If yes then let's assume summer time for SAVE=0 and winter time for SAVE < 0. So I can still work with old unchanged CLDR-entries for getting "Irish Standard Time" in summer.
This sounds like a good idea regardless of whether we make changes to zic input. Couldn't OpenJDK do the same?
Such an approach is merely adding an even more subtle API to TZDB, one where a mixture of positive and negative SAVE values would cause chaos.
I'm somewhat leery of changing zic input format to address this problem, particularly if a simpler workaround is available.
Meno's approach with an extra column actually tackles the heart of the problem by providing a stable key that can be used to link the two projects. This decouples TZDB from CLDR. TZDB can make its change, while CLDR can still refer to GMT as standard and IST as daylight (as required by CLDR compatibility). A downstream parser would have to note the negative SAVE and META information, and re-calculate the raw offset. A pain, but doable. And META is the only way to handle Ramadan-like changes.

Stephen
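One possible reading of "re-calculate the raw offset" is sketched below in Java. This is an assumption about what such a downstream parser might do, not code from OpenJDK, Joda-Time, or any message in this thread; the class and method names are hypothetical. It shifts the standard offset down by the most negative SAVE in the rule set, so that the resulting "daylight" amount is never negative while the total offset stays unchanged.

```java
final class NegativeSaveNormalizer {
    /**
     * Returns {raw, dst} in seconds such that dst >= 0 and
     * raw + dst == stdoffSeconds + saveSeconds.
     * mostNegativeSaveSeconds is the smallest SAVE in the rule set (0 if none is negative).
     */
    static int[] normalize(int stdoffSeconds, int saveSeconds, int mostNegativeSaveSeconds) {
        int shift = Math.min(mostNegativeSaveSeconds, 0);
        int raw = stdoffSeconds + shift; // "raw" offset as older APIs expect it
        int dst = saveSeconds - shift;   // non-negative "daylight" amount
        return new int[] { raw, dst };
    }

    public static void main(String[] args) {
        // Ireland with negative SAVE: STDOFF 1:00, SAVE 0 in summer, -1:00 in winter.
        int[] summer = normalize(3600, 0, -3600);
        int[] winter = normalize(3600, -3600, -3600);
        System.out.println(summer[0] + " " + summer[1]); // 0 3600
        System.out.println(winter[0] + " " + winter[1]); // 0 0
    }
}
```

The shift alone does not say which name belongs to which period; that is the part the META information would have to supply.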
Stephen Colebourne wrote:
On 26 January 2018 at 07:46, Paul Eggert <eggert@cs.ucla.edu> wrote:
Meno Hochschild wrote:
I have adjusted my tzdb-compiler (Time4J) such that it looks if all rule lines referencing the same name (here: Eire) contain any negative dst offsets. If yes then let's assume summer time for SAVE=0 and winter time for SAVE < 0. So I can still work with old unchanged CLDR-entries for getting "Irish Standard Time" in summer.
This sounds like a good idea regardless of whether we make changes to zic input. Couldn't OpenJDK do the same?
Such an approach is merely adding an even more subtle API to TZDB, one where a mixture of positive and negative SAVE values would cause chaos.
What sort of chaos, exactly? Meno Hochschild is not reporting any chaos.
Meno's approach with an extra column actually tackles the heart of the problem by providing a stable key that can be used to link the two projects.
Such a key could be given in a separate data file, which could be used by zi parsers that do not support negative DST offsets, or that have other specialized requirements. That would be less disruptive than changing zic input format for purposes unrelated to zic. For this particular issue I'm hoping that we can use an even less-disruptive approach, such as the one proposed here: https://mm.icann.org/pipermail/tz/2018-January/026002.html which should avoid the need for an extra column or an extra file.
@Paul Eggert

When I elaborated my workaround for v2018b (which works fine now), I made at least the assumption that a rule set consisting of rule lines having the same name will not contain a mixture of both negative and positive dst offsets. Otherwise my approach will probably be broken. Let's imagine that Ireland will one day start to consider the winter time as standard and rename both winter and summer time accordingly. Would the tzdb maintainers then "reuse" the "Eire"-rules for the new positive dst offset? I hope not and ask if a new ruleset with a different new name can be taken into consideration. Can I rely on that?

By the way: I discovered that the current practice in Java is broken for Ireland in the years 1968-71 where OpenJDK just prints "Greenwich Mean Time" although it should be read as "Irish Standard Time". My adjusted tz-compiler has finally coped with the right naming using the version v2018b but would be broken again with new version v2018c (and Java remains broken here for the years 1968-71 in Ireland). So the reverted change in v2018c is not really an improvement (and for me even worse). The new version v2018c is only good for OpenJDK when handling Ireland now in year 2018.

Meno
On 01/26/2018 02:15 PM, Meno Hochschild wrote:
Let's imagine that Ireland will one day start to consider the winter time as standard and rename both winter and summer time accordingly. Would the tzdb maintainers then "reuse" the "Eire"-rules for the new positive dst offset? I hope not and ask if a new ruleset with a different new name can be taken into consideration. Can I rely on that?
I'm afraid not, as that restriction is not in the code or the documentation. Also, I don't see why the restriction would help; it seems a bit arbitrary.
By the way: I discovered that the current practice in Java is broken for Ireland in the years 1968-71 where OpenJDK just prints "Greenwich Mean Time" although it should be read as "Irish Standard Time". My adjusted tz-compiler has finally coped with the right naming using the version v2018b but would be broken again with new version v2018c (and Java remains broken here for the years 1968-71 in Ireland). So the reverted change in v2018c is not really an improvement (and for me even worse).
Yes, sorry about that. I'm hoping to come up with a scheme that will support both old-style (2017c and earlier) and new-style (2018a and 2018b) approaches soon. As usual I'll publish proposed patches before distributing a new release, and I hope you'll try them out.
The new version v2018c is only good for OpenJDK when handling Ireland now in year 2018.
Yes, this is a known issue with CLDR, discussed here: https://mm.icann.org/pipermail/tz/2018-January/025974.html which says that CLDR doesn't worry about timestamps before 1990.
The new version v2018c is only good for OpenJDK when handling Ireland now in year 2018.
Yes, this is a known issue with CLDR, discussed here:
https://mm.icann.org/pipermail/tz/2018-January/025974.html
which says that CLDR doesn't worry about timestamps before 1990.
Well, the above statement is not accurate. CLDR does not provide any zone names that were only used before 1990. If a name is not available, the CLDR specification (LDML) suggests that CLDR data consumers use the UTC offset format as the fallback. So a program utilizing CLDR data should still produce an accurate timestamp, just without a zone name.

The OpenJDK situation is slightly different. Basically, OpenJDK retrofits CLDR data partially and uses the set of current names only. As far as I know, the JDK does not support multiple sets of names for a single tz database zone. For example, America/Indiana/Knox:

====
# Zone  NAME                  GMTOFF    RULES   FORMAT  [UNTIL]
Zone America/Indiana/Knox     -5:46:30  -       LMT     1883 Nov 18 12:13:30
                              -6:00     US      C%sT    1947
                              -6:00     Starke  C%sT    1962 Apr 29 2:00
                              -5:00     -       EST     1963 Oct 27 2:00
                              -6:00     US      C%sT    1991 Oct 27 2:00
                              -5:00     -       EST     2006 Apr  2 2:00
                              -6:00     US      C%sT
====

With Java, when formatting the dates Jan 1 2000 and Jan 1 2010, the former date should be in EST, while the latter date is in CST. However, my understanding is that Java only uses the current name set (in this case, US Central Time), so the Java date formatter prints out "Central Standard Time" for both dates.

Example 1 (TimeZone and DateFormat):

    // imports: java.text.DateFormat, java.util.Calendar,
    //          java.util.GregorianCalendar, java.util.Locale, java.util.TimeZone
    TimeZone tzKnox = TimeZone.getTimeZone("America/Indiana/Knox");
    GregorianCalendar cal = new GregorianCalendar();
    cal.setTimeZone(tzKnox);
    DateFormat fmt = DateFormat.getDateTimeInstance(DateFormat.FULL, DateFormat.FULL, Locale.US);
    fmt.setTimeZone(tzKnox);
    cal.clear();
    cal.set(2000, Calendar.JANUARY, 1);
    String dstr2000 = fmt.format(cal.getTime());
    cal.set(2010, Calendar.JANUARY, 1);
    String dstr2010 = fmt.format(cal.getTime());
    System.out.println(dstr2000);
    System.out.println(dstr2010);

Example 2 (java.time):

    // imports: java.time.ZoneId, java.time.ZonedDateTime,
    //          java.time.format.DateTimeFormatter
    ZoneId tzKnox = ZoneId.of("America/Indiana/Knox");
    DateTimeFormatter formatter =
        DateTimeFormatter.ofPattern("EEEE, MMMM d, y 'at' h:mm:ss a zzzz").withZone(tzKnox);
    ZonedDateTime d2000 = ZonedDateTime.of(2000, 1, 1, 0, 0, 0, 0, tzKnox);
    String dstr2000 = d2000.format(formatter);
    ZonedDateTime d2010 = ZonedDateTime.of(2010, 1, 1, 0, 0, 0, 0, tzKnox);
    String dstr2010 = d2010.format(formatter);
    System.out.println(dstr2000);
    System.out.println(dstr2010);

Both examples print out:

    Saturday, January 1, 2000 at 12:00:00 AM Central Standard Time
    Friday, January 1, 2010 at 12:00:00 AM Central Standard Time

although the date on Jan 1, 2000 should be

    Saturday, January 1, 2000 at 12:00:00 AM Eastern Standard Time

-Yoshito
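The fallback mentioned in this message (use a UTC offset format when no zone name is available) could look roughly like the following Java sketch. The class and method names are hypothetical, the "GMT-05:00" style string is only an approximation of the localized offset format LDML defines, and the name lookup itself is left to the caller (a null name stands for "CLDR has no name for this period").

```java
import java.time.Instant;
import java.time.ZoneId;
import java.time.ZoneOffset;

final class ZoneNameFallback {
    /**
     * Returns the given display name, or an offset-based label such as
     * "GMT-05:00" when no name is available (name == null).
     */
    static String nameOrOffset(String name, ZoneId zone, Instant when) {
        if (name != null) {
            return name;
        }
        ZoneOffset offset = zone.getRules().getOffset(when);
        // ZoneOffset.getId() is "Z" for UTC and "+hh:mm" / "-hh:mm" otherwise.
        return offset.equals(ZoneOffset.UTC) ? "GMT" : "GMT" + offset.getId();
    }

    public static void main(String[] args) {
        ZoneId knox = ZoneId.of("America/Indiana/Knox");
        Instant jan2000 = Instant.parse("2000-01-01T06:00:00Z");
        // No historical name available: fall back to the offset in effect then.
        System.out.println(nameOrOffset(null, knox, jan2000)); // GMT-05:00
    }
}
```

With such a fallback the formatted timestamp stays accurate for Jan 1, 2000 even though no period-specific name is known.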
On 01/26/2018 04:10 PM, Yoshito Umaoka wrote:
The new version v2018c is only good for OpenJDK when handling Ireland now in year 2018.
Yes, this is a known issue with CLDR, discussed here:
https://mm.icann.org/pipermail/tz/2018-January/025974.html
which says that CLDR doesn't worry about timestamps before 1990.
Well, above statement is not accurate. CLDR does not provide any zone names only used before 1990.
Thanks for correcting that. I should have written "time zone names before 1990" instead of "timestamps before 1990". This correction doesn't solve Meno Hochschild's problem, unfortunately.
About the wish for the restriction that a set of rule lines with the same name should not have mixed signs of dst offsets, I will try to explain.

The whole thing is about labelling what the zero dst offset stands for. If such a set of rule lines only contains either positive or zero SAVE then the zero dst offset corresponds to winter time (default). If the set of rules only contains negative or zero SAVE then the zero dst offset corresponds to summer time (Eire). But if we have both negative and positive and zero offsets then how to label the zero dst offset? A human being might guess it by looking at the year context, but I am not sure if a machine tool can do it, too. It would be a hard nut to crack.

By the way, I don't mind if my suggestion of introducing an optional META column at the end of a rule line is modified such that we get a new file with similar content (with the advantage of not changing the zic input format). And even then I would prefer not to have mixed signs within the same set of rules equally named. It makes programming much easier. And I also believe that such a new file would not increase the maintenance burden much, because a case like Ireland is rather rare (okay, and maybe also Egypt or Morocco with Ramadan time, or historic double dst offset in some countries shortly after the Second World War).

With best regards
Meno
Meno Hochschild wrote:
I would prefer not to have mixed signs within the same set of rules equally named.
Wouldn't that in some cases force rulesets to be split into multiply-named Rules merely to satisfy the restriction, thus complicating Zones that refer to the affected rulesets? That doesn't sound like a good restriction to add, at least from the tzcode/tzdata point of view.
Well, if an extra file contains information per rule line on how to interpret SAVE=0 for correct localized labelling (in the case of Ireland: grasping the daylight-entry of CLDR, which points to the label "Irish Standard Time"; default: grasping the standard-entry of CLDR), then it will probably be fine to live without any restrictions, so multiply-named rules can be avoided.

What about a new file which actually just contains a copy of the Eire rules with an additional META column containing key-value pairs in the format "season=W" (winter) or "season=S" (summer)? Here stable documented keys and values are important, of course. The new file can be safely ignored by zic but help other tz-compilers to easily determine localized labels of timezones in unchanged CLDR-entries.

I just try to find a compromise with minimalistic impact on all sides.
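The key-value META format suggested above ("season=W" / "season=S") could be read by a non-zic tz-compiler roughly as in this Java sketch. No such file or column exists in tzdata; the class name, the whitespace separator between pairs, and the handling of malformed tokens are all assumptions made for illustration.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class MetaColumnParser {
    /** Parses whitespace-separated "key=value" tokens; malformed tokens are skipped. */
    static Map<String, String> parse(String meta) {
        Map<String, String> result = new LinkedHashMap<>();
        for (String token : meta.trim().split("\\s+")) {
            int eq = token.indexOf('=');
            if (eq <= 0) {
                continue; // empty input, missing '=', or empty key
            }
            result.put(token.substring(0, eq), token.substring(eq + 1));
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, String> meta = parse("season=S");
        // A consumer would map "S" to the summer (daylight) CLDR entry
        // and "W" to the winter (standard) entry.
        System.out.println(meta.get("season")); // S
    }
}
```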
From: Meno Hochschild <mhochschild@gmx.de>
Date: Sun, 28 Jan 2018 05:17:37 +0100
Subject: Re: [tz] [english 100%] Re: OpenJDK/CLDR/ICU/Joda issues with Ireland change

| Here stable documented keys and values are important, of course.

The key only exists to allow for some other (currently unknown) data to be added in a similar way sometime in the future, doing it that way might be a good idea, or if we cannot think of anything else we're likely to want to add, it could just be omitted. If it exists, then yes, for this purpose its value would be set in stone, and never change.

The values, however, not so much - what would be needed are agreed values with CLDR for CLDR to use - but from time to time we'd need to be able to add new ones (and like everything else time related, sometimes very quickly - so long grace periods are not always possible, however desirable they might be).

I would assume that you could handle that by simply advising those who use the data to use the numeric offset (converted to a string) if one of these values is not found (if new tzdata has added a new one, and updated CLDR has not been released with the appropriate new data added yet).

I would suggest not using 1 char values however (at least, not generally) or someone will start assuming they always must be, and complain when a longer one appears! Just allow for arbitrary non-whitespace strings (perhaps alphanumeric (and _) only to avoid someone deciding that "X>" would be nice to use).

| The new file can be safely ignored
| by zic but help other tz-compilers to easily determine localized labels
| of timezones in unchanged CLDR-entries.
| I just try to find a compromise with minimalistic impact on all sides.

I would not worry about any of that yet - first order of business should be working out what is needed - what data would be useful for tzdata to make available that would allow CLDR, and then its clients, to operate better, and without assumptions.

Once we know that, we can work out how best to implement it, and how to make the results visible. Those are much less important right now.

There is no huge urgency here, we have time to do it properly.

kre
From: Meno Hochschild <mhochschild@gmx.de> Message-ID: <15c91fab-ca48-d6bd-f961-2b7ce00e7b15@gmx.de> Date: Sat, 27 Jan 2018 21:49:27 +0100 Subject: Re: [tz] [english 100%] Re: OpenJDK/CLDR/ICU/Joda issues with Ireland change | About the wish for the restriction that a set of rule lines with same | name should not have mixed signs of dst offsets, I will try to explain. | The whole thing is about labelling what the zero dst offset stands for. The issue is that it does not (by itself) "stand for" anything. This is one of the (invalid) assumptions that people tend to make, probably because it has mostly seemed to be universally true in (most of) the US, and in Europe, and yes, in Australia, and of course in those places that have only ever had one zone offset since standardised time was adopted. But jurisdictions are free to (without implementing any kind of annual time shift, ala summer time or daylight savings) alter their timezone (that is, shift their clocks relative to UTC) any time they like, as often as they like, and to anything they like, and can alter the name they call their time (assuming that it is called anything other than "the time") any time they like as well (either at the same time as a change of offset occurs, or just at some random time for whatever reason appeals.). There is simply no one name, or UTC offset, for "standard" or "alternate" time in any timezone - attempting to force one is simply wrong. | If such a set of rule lines only contains either positive or zero SAVE | then the zero dst offset corresponds to winter time (default). If the | set of rules only contains negative or zero SAVE then the zero dst | offset corresponds to summer time (Eire). It one could assume that the world was nice and simple as that suggests, then that might work - but it isn't. 
What will you do when some country (other than Eire which is already sane, it seems) decides that it is really crazy for the "standard" time to only apply 5 months (approx) of the year, and for the other time (daylight savings or summer time, or whatever it is called locally) to operate for the other 7 months, and redefines standard time to be the longer period, and the other time (probably with a different name, perhaps winter time) to the shorter period?
Then we have from 19xx (whenever summer time was introduced first, or 1970 or 1990 if there is an epoch we do not consider before) until 2020 we have standard time offset = NN00 and summer time MM00 (where MM is NN+1 probably) and after 2020 we have standard time is MM00 and winter time is NN00. That is 1 hour positive DST until 2020 and 1 hour negative after that.
And what if winter time after 2020 is offset PP00 where PP == NN - 1, or perhaps that change only happens in 2023, or perhaps differently, that becomes "mid-winter" time and only applies for 1 month in the middle of winter with normal winter time applying for the other 4 cooler/cold months (2 either side of mid winter time).
Everyone needs to plan to cope with things like this - before some legislature decides to specify it.
| By the way, I don't mind if my suggestion of introducing an optional
| META column at the end of a rule line might be modified ...
I have no problem with that - though a "meta" column that can contain almost any random extra data might be a bit much. I think something that provided a better linkage to CLDR than what we now have would be a good idea. But CLDR (and its clients) need to accept that doing that will mean altering the way they work - no more assumptions that there are just two "kinds" of time in one zone, no assumptions about relative offsets, or that standard time has some constant offset from UTC. All of that has to go.
How the implementation happens is really better discussed after a suitable API is agreed.
That is whether the new data goes in the zoneinfo files, or elsewhere - the former would make more sense (for most zones there will not be a lot of it, all it really needs is a short int index in each ttinfo, and yet another list of strings). These could probably even be included with the list of zone name abbreviation strings. Similarly what the tzdb src file format would look like.
| And even then I would prefer not to have mixed signs
| within the same set of rules equally named. It makes programming much
| easier.
If you are doing any programming at all where it matters, you are programming the wrong thing (except possibly for uses like dumping a timezone like zdump does, but with the internationalised names included). Other than semi-diagnostic (and just "explain how wacky things are to people") uses, code should never care. If it does, it is almost certainly assuming something that is not correct.
kre
On 28 Jan 2018, at 05:34, Robert Elz <kre@munnari.OZ.AU> wrote:
From: Meno Hochschild <mhochschild@gmx.de> Message-ID: <15c91fab-ca48-d6bd-f961-2b7ce00e7b15@gmx.de> Date: Sat, 27 Jan 2018 21:49:27 +0100 Subject: Re: [tz] [english 100%] Re: OpenJDK/CLDR/ICU/Joda issues with Ireland change
| About the wish for the restriction that a set of rule lines with same | name should not have mixed signs of dst offsets, I will try to explain. | The whole thing is about labelling what the zero dst offset stands for.
The issue is that it does not (by itself) "stand for" anything.
This is one of the (invalid) assumptions that people tend to make, probably because it has mostly seemed to be universally true in (most of) the US, and in Europe, and yes, in Australia, and of course in those places that have only ever had one zone offset since standardised time was adopted.
But jurisdictions are free to (without implementing any kind of annual time shift, ala summer time or daylight savings) alter their timezone (that is, shift their clocks relative to UTC) any time they like, as often as they like, and to anything they like, and can alter the name they call their time (assuming that it is called anything other than "the time") any time they like as well (either at the same time as a change of offset occurs, or just at some random time for whatever reason appeals.).
There is simply no one name, or UTC offset, for "standard" or "alternate" time in any timezone - attempting to force one is simply wrong.
This isn’t just some theoretical possibility either. In Europe, with its reasonably stable rules, Portugal has changed timezone twice within my memory: in 1992 and 1996. Looking at the tz data, it has changed twice more in my lifetime: in 1966 and 1976. For the earlier two changes the clocks changed and the isdst flag didn’t (Portugal didn’t do DST then). For the later two the clocks didn’t change but the isdst flag did. As Robert Elz says, almost anything is possible. Peter Ilieve
As Robert Elz says, almost anything is possible.
True, but you can't design software and data to handle everything. Some government could decide that every other day the clock is to move forward by the number of minutes in that day of each Hindi month. Doesn't mean that we should design for that now.
The TZDB group has done a masterful job over the years, and we all owe a debt of gratitude to all of the people who have spent so much time on it. But like many, many systems, its very success puts constraints on it. There are so many changes I wish we could make in Unicode, for example, but even if "formally speaking" we could make those changes, practically speaking they would be massively disruptive.
Clearly sometimes you have to make disruptive changes, but any such change should have very, very clear ROI that strongly outweighs the disruption. The change to negative offsets is massively disruptive, and has no significant functional difference I can see; functionally it is purely the TZDB abbreviations. The TZDB should not concern itself with names, since they always have to be localized for mere humans.
So frankly, I see no reasons for ever having SAVE<0. But let's suppose that there are. In that case, I see no option other than issuing both data files that are strictly SAVE≥0 and data file(s) that have no such restriction (or the logical equivalent to dual files) until such time as we have confidence that the number of clients broken by this change becomes very small.
===
In the meantime, let me try to set out some of the things that CLDR does and doesn't do, since people have questions about that. Some is a recap of what others have said.
The architecture is built to supply a variety of different kinds of names for timezones, in all of our languages, to try to meet local expectations. It does not do calculations of time offsets; it depends on clients using the tzdb for that. The linkage is supplied by the tzid (eg America/Los_Angeles).
(As has been seen in these threads, the clients do the calculations in very different ways, and are updated on very different schedules. It is quite customary in some clients for the tzdb to be updated regularly, but use an old version of code and an old version of CLDR data.)
Some of the following is a simplified description, in cases where I think the complications are not particularly relevant.
- CLDR is targeted at names in locales: by language, sometimes also different by script (where multiple scripts are common for the language), sometimes also different by country (where there are local differences).
- There are multiple kinds of names based on input parameters, with a fallback mechanism if a name is missing. Worst case, you get an offset from GMT/UTC (localized, eg "ጂ ኤም ቲ-0300")
- For a given locale, the names must be unambiguous. For example, the term X can't be used *in the same locale, in the same timespan* as a name for both a tzid with UTC+3 and a tzid with UTC-6.
- *metazones:* these provide a mechanism for sharing names across tzids.
  - Logically the lookup is roughly: <tzid> → name; if name = ∅, then <datetime, tzid> → metazone → name
  - The same tzid can be in different metazones over time.
  - For some more details, see below.
- *abbreviated:* we support abbreviations, but only supply abbreviations for a locale if they would be both customary and unambiguous in the target locale (often different by country).
- *generic:*
  - id="generic" is most used for recurring events. "Every Thursday in 2018 at 14:00 Pacific Time", referring to wall time in the metazone America_Pacific.
- *summer vs winter*
  - id="standard" is the ID used for SAVE=0.
  - id="daylight" is the ID used for SAVE>0. Only present where the metazone/tzid has such a contrast.
  - These are just IDs, the names will be in different languages, and can be /winter/ and /summer/, or something else: whatever makes the most sense to users of that locale. For example, for English, we have daylight → "Irish Standard Time"
  - There is a current restriction to 2 fields. It would be relatively easy for us to add an additional field, which we would do if some modern locale had something like "double-daylight" time. It might take a long time for clients to support that, however.
*Metazones*
These provide a level of indirection. "America_Pacific" is a metazone that is currently shared by "America/Tijuana", "America/Vancouver", ... The mapping depends on a date-time span + tzid. For example:
<timezone type="America/Bahia_Banderas">
  <usesMetazone to="1970-01-01 08:00" mzone="America_Pacific"/>
  <usesMetazone to="2010-04-04 09:00" from="1970-01-01 08:00" mzone="America_Mountain"/>
  <usesMetazone from="2010-04-04 09:00" mzone="America_Central"/>
</timezone>
When a name is looked up, we first lookup by tzid, then by metazone. So for a given locale, a particular tzid's names can be overridden to be different from the shared metazone's names.
Mark
On Sun, Jan 28, 2018 at 4:06 AM, Peter Ilieve via tz <tz@iana.org> wrote:
On 28 Jan 2018, at 05:34, Robert Elz <kre@munnari.OZ.AU> wrote:
From: Meno Hochschild <mhochschild@gmx.de> Message-ID: <15c91fab-ca48-d6bd-f961-2b7ce00e7b15@gmx.de> Date: Sat, 27 Jan 2018 21:49:27 +0100 Subject: Re: [tz] [english 100%] Re: OpenJDK/CLDR/ICU/Joda issues with Ireland change
| About the wish for the restriction that a set of rule lines with same | name should not have mixed signs of dst offsets, I will try to explain. | The whole thing is about labelling what the zero dst offset stands for.
The issue is that it does not (by itself) "stand for" anything.
This is one of the (invalid) assumptions that people tend to make, probably because it has mostly seemed to be universally true in (most of) the US, and in Europe, and yes, in Australia, and of course in those places that have only ever had one zone offset since standardised time was adopted.
But jurisdictions are free to (without implementing any kind of annual time shift, ala summer time or daylight savings) alter their timezone (that is, shift their clocks relative to UTC) any time they like, as often as they like, and to anything they like, and can alter the name they call their time (assuming that it is called anything other than "the time") any time they like as well (either at the same time as a change of offset occurs, or just at some random time for whatever reason appeals.).
There is simply no one name, or UTC offset, for "standard" or "alternate" time in any timezone - attempting to force one is simply wrong.
This isn’t just some theoretical possibility either. In Europe, with its reasonably stable rules, Portugal has changed timezone twice within my memory: in 1992 and 1996. Looking at the tz data, it has changed twice more in my lifetime: in 1966 and 1976. For the earlier two changes the clocks changed and the isdst flag didn’t (Portugal didn’t do DST then). For the later two the clocks didn’t change but the isdst flag did.
As Robert Elz says, almost anything is possible.
Peter Ilieve
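[Illustrative note] The two-step lookup described in Mark Davis's message (tzid first, then the metazone that applies at the given date-time) can be sketched roughly as follows. This is only a minimal sketch, not CLDR's or ICU's actual code: the function names, the dictionary layout, the English names in the tables, the "GMT" metazone entry for Europe/Dublin, and the choice to treat "from" as inclusive and "to" as exclusive are all assumptions made for illustration. Only the America/Bahia_Banderas spans are taken from the example in the message.

```python
from datetime import datetime

# (from, to, metazone) spans; None means open-ended. The Bahia_Banderas rows
# mirror the usesMetazone example; the Dublin row is an assumed placeholder.
METAZONE_SPANS = {
    "America/Bahia_Banderas": [
        (None, datetime(1970, 1, 1, 8, 0), "America_Pacific"),
        (datetime(1970, 1, 1, 8, 0), datetime(2010, 4, 4, 9, 0), "America_Mountain"),
        (datetime(2010, 4, 4, 9, 0), None, "America_Central"),
    ],
    "Europe/Dublin": [(None, None, "GMT")],
}

# Hypothetical name tables for one locale (English), for illustration only.
METAZONE_NAMES = {
    "America_Pacific": {"generic": "Pacific Time"},
    "America_Mountain": {"generic": "Mountain Time"},
    "America_Central": {"generic": "Central Time"},
    "GMT": {"standard": "Greenwich Mean Time"},
}
TZID_NAMES = {
    "Europe/Dublin": {"daylight": "Irish Standard Time"},
}


def metazone_for(tzid, when):
    """Return the metazone covering `when` for `tzid`, or None if unknown."""
    for start, end, mzone in METAZONE_SPANS.get(tzid, []):
        if (start is None or start <= when) and (end is None or when < end):
            return mzone
    return None


def lookup_name(tzid, when, kind):
    """<tzid> -> name; if missing, <datetime, tzid> -> metazone -> name."""
    name = TZID_NAMES.get(tzid, {}).get(kind)
    if name is not None:
        return name  # a tzid-specific name overrides the shared metazone name
    mzone = metazone_for(tzid, when)
    return METAZONE_NAMES.get(mzone, {}).get(kind)
```

A result of None here stands for the "worst case" in the message, where a client would fall back to a localized GMT/UTC offset string.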
Mark Davis ☕️ wrote:
The change to negative offsets ... has no significant functional difference I can see; functionally it is purely the TZDB abbreviations.
That would be true were it not for tm_isdst. Although tm_isdst is a poorly-designed interface, it is standardized by ISO C and by POSIX and it is not likely to go away soon.
I see no option other than issuing both data files that are strictly SAVE≥0 and data file(s) that no such restriction (or the logical equivalent to dual files) until such time as we have confidence that the number of clients broken by this change becomes very small.
Yes, this very much seems to be the only way forward, if we want to address the problem. I have something in mind and will try to publish details soon.
Meno Hochschild said:
@Paul Eggert When I elaborated my workaround for v2018b (which works fine now) I have made at least the assumption that a rule set consisting of rule lines having the same name will not contain a mixture of both negative and positive dst offsets. Otherwise my approach will probably be broken.
In that case your approach is broken. While I'm not aware of anywhere that has shifts both backwards and forwards from "standard time", I have no faith that politicians won't do it. For example, imagine an Islamic country decides to move forwards an hour to be in line with a major trading neighbour. So their standard time is UTC+3 and their summer time is UTC+4. But for Ramadan they still need to be close to local solar time, so that's UTC+2. Yes, this is a hypothetical, but it's not completely implausible. -- Clive D.W. Feather | If you lie to the compiler, Email: clive@davros.org | it will get its revenge. Web: http://www.davros.org | - Henry Spencer Mobile: +44 7973 377646
Meno Hochschild wrote:
I have adjusted my tzdb-compiler (Time4J) such that it looks if all rule lines referencing the same name (here: Eire) contain any negative dst offsets. If yes then let's assume summer time for SAVE=0 and winter time for SAVE < 0. So I can still work with old unchanged CLDR-entries for getting "Irish Standard Time" in summer.
This sounds like a good idea regardless of whether we make changes to zic input. Couldn't OpenJDK do the same? I'm somewhat leery of changing zic input format to address this problem, particularly if a simpler workaround is available.
I'm also not sure any changes in zic input format are necessary. BTW, what Meno explained above is exactly what we're worrying about. If CLDR project decides to swap standard/daylight names at some point, Meno's code will be broken. His code is not affected even saving amount is negative, but it's still sensitive to how CLDR identifies Irish Standard Time. -Yoshito
Sure, my workaround will be broken if CLDR project decides just to swap standard/daylight names but this will also break the main consumers of CLDR data, namely ICU and OpenJDK (at least current and past versions of actual software distributions). I cannot imagine that this is a serious way to go since (as far as I have understood until now) your team values backwards compatibility a lot.
With best regards and the hope that CLDR data will not simply swap data in reverse
Meno
On 26.01.2018 at 18:18, Yoshito Umaoka wrote:
Meno Hochschild wrote:
I have adjusted my tzdb-compiler (Time4J) such that it looks if all rule lines referencing the same name (here: Eire) contain any negative dst offsets. If yes then let's assume summer time for SAVE=0 and winter time for SAVE < 0. So I can still work with old unchanged CLDR-entries for getting "Irish Standard Time" in summer.
This sounds like a good idea regardless of whether we make changes to zic input. Couldn't OpenJDK do the same? I'm somewhat leery of changing zic input format to address this problem, particularly if a simpler workaround is available.
I'm also not sure any changes in zic input format are necessary.
BTW, what Meno explained above is exactly what we're worrying about. If CLDR project decides to swap standard/daylight names at some point, Meno's code will be broken. His code is not affected even saving amount is negative, but it's still sensitive to how CLDR identifies Irish Standard Time.
-Yoshito
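[Illustrative note] The workaround Meno describes (inspect all rule lines sharing one name; if any SAVE is negative, treat SAVE=0 as summer time and SAVE<0 as winter time) might look roughly like the sketch below. It is not Time4J's code: the function name, the use of seconds for SAVE, and the decision to raise an error on mixed signs are assumptions. The error case is the situation Clive Feather and Robert Elz warn about, where the heuristic has no answer.

```python
def season_label(save, rule_saves):
    """Label one SAVE value as "summer" or "winter" time.

    save       -- SAVE of the rule line in effect, in seconds
    rule_saves -- SAVE values of all rule lines sharing the same rule name
    """
    has_negative = any(s < 0 for s in rule_saves)
    has_positive = any(s > 0 for s in rule_saves)
    if has_negative and has_positive:
        # Mixed signs: the zero offset cannot be labelled by this heuristic.
        raise ValueError("rule set mixes positive and negative SAVE values")
    if has_negative:
        # Eire-style rules: SAVE=0 is summer time, SAVE<0 is winter time.
        return "summer" if save == 0 else "winter"
    # Conventional rules: SAVE=0 is winter time, SAVE>0 is summer time.
    return "winter" if save == 0 else "summer"
```

With the negative-SAVE Eire rules, `season_label(0, [0, -3600])` gives "summer", so an unchanged CLDR "daylight" entry can still be used to obtain "Irish Standard Time" in summer.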
On Jan 25, 2018, at 1:40 PM, Meno Hochschild <mhochschild@gmx.de> wrote:
In order to bridge the gap between the offset information in TZDB and the simple scheme generic/standard/daylight in CLDR, would it not be a good enhancement to TZDB source format to add an extra column per rule line to flag if the dst offset (equal if zero, negative or positive) has to be considered as associated with standard time labelling? Then the consumer would not have to reason about (dstOffset == 0) <=> isStandardTimeFormat() as actually done in Java software. Such a column could even be enhanced by an extra state for the ramadan situation in some Arabic countries when the clock only temporarily switches back (enabling a better name). The actual column containing an abbreviation is not really clear about this name-offset-association IMHO and can therefore not be evaluated by source code based tz-compilers.
...combined with a tool that translates the new tzdb files into "old-style" tzdb files for use by programs other than zic, to stave off "butbutbutbutbut that'll break XXX!" complaints about changes to the .tz file format. (And, were we to extend the .tz file format, perhaps replacing the abbreviations in the "FORMAT" column with an identifier that refers to a new type of line in the file; the new type of line could give abbreviations and English-language long names for all N of the different time types in that time zone? We could call those, oh, say, "metazones" or something such as that. :-))
On 24 January 2018 at 18:44, Guy Harris <guy@alum.mit.edu> wrote:
On Jan 23, 2018, at 11:55 AM, Yoshito Umaoka <yoshito_umaoka@us.ibm.com> wrote:
CLDR does not determine offsets.
Stephen Colebourne claimed that CLDR determines whether to use the standard or daylight time strings by comparing the "raw offset" (presumably meaning "the offset during standard time") with the "actual offset" (presumably meaning "the offset during daylight savings time").
raw-offset = the base/standard offset from TZDB (GMTOFF)
actual-offset = the actual offset a person sees at a given instant (GMTOFF + SAVE)
Saying (raw-offset == actual-offset) is the same as saying (SAVE == 0).
Therefore, it *must* know those offsets, otherwise it cannot compare them.
So let me rephrase the question:
How does CLDR obtain those offsets?
Mostly answered by Yoshito/Mark.
- TZDB provides data on how offsets change over time, indicating a base/raw/standard offset and an adjustment/SAVE when DST applies.
- CLDR provides data on zone names, keyed by "generic", "standard", "daylight".
- For Ireland, CLDR states that "standard" = winter = "Greenwich Mean Time"
- For Ireland, CLDR states that "daylight" = summer = "Irish Standard Time"
- Some piece of code has to decide whether to pick the "standard" or "daylight" CLDR key based on the TZDB data.
- ICU & OpenJDK both parse the source TZDB files (as data is lost in the conversion to binary)
- ICU & OpenJDK use the mechanism to pick the key, using (raw-offset = actual-offset) to indicate "standard".
- For Ireland, TZDB currently indicates (raw-offset = actual-offset) in winter
- For Ireland, TZDB is proposing to indicate (raw-offset = actual-offset) in summer
To adjust to the Ireland proposal, ICU & OpenJDK code (and all similar code) would need to handle negative SAVE values. The evidence so far is that task is not complex, and already works in many cases.
To adjust to the Ireland proposal, CLDR would have to change the text associated with the keys "standard" and "daylight" to the opposite of what they are today.
Therefore, there are 8 possible combinations to consider:
- new code, new TZDB, new CLDR - works fine
- new code, new TZDB, old CLDR - wrong names
- new code, old TZDB, new CLDR - wrong names
- new code, old TZDB, old CLDR - works fine
- old code, new TZDB, new CLDR - code may fail
- old code, new TZDB, old CLDR - code may fail & wrong names
- old code, old TZDB, new CLDR - wrong names
- old code, old TZDB, old CLDR - works fine
All of these combinations are possible to create in the wild. It is not possible to ensure that only a working combination exists (especially considering the old code cases). Of the four cases where TZDB changes, 3 result in failure. And note that this only discusses one piece of new code.
In reality, ICU, OpenJDK, Joda-Time, ThreeTen-Backport, Android and other libraries all exist. Each of these can be in new-code vs old-code form, so instead of this being 8 combinations, it could easily be 16, 32, 64 or more. Perhaps now, readers can see why I say this is not just a code bug that can be fixed. It is the interplay between old and new versions of code and data that makes the change impossible. (It simply isn't possible to update everything in lock-step). Finally, the Ireland situation has been known about in TZDB since 2005: https://github.com/eggert/tz/blob/master/europe#L316 Common sense prevailed back then, with the SAVE value remaining positive. (the zic binary output doesn't care whether SAVE is positive or negative other than the tm_isdst flag which everyone here seems to think is an anachronism in zic). Stephen
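[Illustrative note] The "new code" half of the combinations listed in Stephen's message can be reproduced with a small model. This is a sketch under stated assumptions, not ICU or OpenJDK code: the tables, the function names and the use of whole hours are invented for illustration, and "new code" is taken to mean code that simply tolerates a negative SAVE. The key selection itself follows the message: raw-offset == actual-offset, i.e. SAVE == 0, selects "standard".

```python
from itertools import product

# SAVE (in hours) in effect in Ireland under each TZDB encoding.
SAVE = {
    "old": {"winter": 0, "summer": 1},   # base offset UTC+0, positive SAVE in summer
    "new": {"winter": -1, "summer": 0},  # base offset UTC+1, negative SAVE in winter
}

# Text attached to the CLDR keys today ("old") and after a swap ("new").
CLDR = {
    "old": {"standard": "Greenwich Mean Time", "daylight": "Irish Standard Time"},
    "new": {"standard": "Irish Standard Time", "daylight": "Greenwich Mean Time"},
}

# What a user in Ireland expects to see in each season.
EXPECTED = {"winter": "Greenwich Mean Time", "summer": "Irish Standard Time"}


def pick_key(save_hours):
    """Choose the CLDR key: SAVE == 0 means raw-offset == actual-offset."""
    return "standard" if save_hours == 0 else "daylight"


def names_ok(tzdb, cldr):
    """True if both seasons get the expected name for this data combination."""
    return all(
        CLDR[cldr][pick_key(SAVE[tzdb][season])] == EXPECTED[season]
        for season in EXPECTED
    )


for tzdb, cldr in product(("new", "old"), repeat=2):
    verdict = "works fine" if names_ok(tzdb, cldr) else "wrong names"
    print(f"new code, {tzdb} TZDB, {cldr} CLDR - {verdict}")
```

Only the matched pairs (old TZDB with old CLDR, new TZDB with new CLDR) produce the expected names, which is the point about data versions that cannot be updated in lock-step.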
On Jan 25, 2018, at 3:25 AM, Stephen Colebourne <scolebourne@joda.org> wrote:
- CLDR provides data on zone names, keyed by "generic", "standard", "daylight".
- For Ireland, CLDR states that "standard" = winter = "Greenwich Mean Time" - For Ireland, CLDR states that "daylight" = summer = "Irish Standard Time"
OK, so: 1. The ISO C API (localtime()) has a notion of "Daylight Saving Time" being in effect or not in effect, which controls whether tm_isdst is set or not. "Daylight Saving Time" presumably refers to turning clocks forward. Neither C90, C99, nor C11 use the term "standard time" anywhere that I can see, so there does not seem to be any problem with "Daylight Saving Time" being "standard time". I don't see a problem with ISO C setting tm_isdst to 1 for Irish Standard Time. 2. The Single UNIX Specification inherits the above from the ISO C standard. It also defines a variable named timezone, which "shall be set to the difference, in seconds, between Coordinated Universal Time (UTC) and local standard time." That *does* raise the question of what "local standard time" is for Ireland - is it the time that prevails when "summer time" isn't in effect, or is it the time that Irish law refers to as standard time, as the two are different? Its description of the TZ environmental variable says that the POSIX syntax is: The expanded format (for all TZ s whose value does not have a <colon> as the first character) is as follows: stdoffset[dst[offset][,start[/time],end[/time]]] Where: std and dst Indicate no less than three, nor more than {TZNAME_MAX}, bytes that are the designation for the standard (std) or the alternative (dst -such as Daylight Savings Time) timezone. Only std is required; if dst is missing, then the alternative time does not apply in this locale. which suggests that Irish Standard Time wouldn't be local standard time in Ireland, GMT/UTC would be local standard time in the POSIX sense. If so, that means that there wouldn't be a problem with timezone being set to 0 for Europe/Dublin. Should the POSIX folk be asked for an interpretation here? (And, while we're at it, should we ask them what "local standard time" refers to, given that a given location may choose to switch from one time zone to another?) 3. 
The CLDR says that "standard" means "winter" and "daylight" means "summer": http://cldr.unicode.org/translation/timezones so, for the CLDR, Irish Standard Time isn't standard time. 4. The ICU C++ API's TimeZone class: http://icu-project.org/apiref/icu4c/classicu_1_1TimeZone.html speaks of "Daylight Saving Time", which presumably refers to turning clocks forward. 5. The Java SE 7 TimeZone class: https://docs.oracle.com/javase/7/docs/api/java/util/TimeZone.html seems to say the same thing. 6. The ICU Java API's TimeZone class: http://icu-project.org/apiref/icu4j/com/ibm/icu/util/TimeZone.html also speaks of "standard" and "daylight", so Irish Standard Time is presumably daylight time rather than standard time. So all those appear to be assuming that, if there's any (semi-)regular time adjustment during the year, it involves turning the clocks forward some time in spring or summer, switching to "daylight" or "Daylight Saving" or "Daylight Savings" time, and turning the clocks backward some time in fall/autumn or winter, switching to what some but not all call "standard" time. For now, that suggests Europe/Dublin should have an offset of 0 from UTC and with the current winter time abbreviation being "GMT" or whatever's appropriate and the current summer time abbreviation being "IST". I.e., leave it alone for now. Perhaps they all need to make a definitive statement about what "daylight" time means, and, if they offer some notion of a "standard" offset from UTC, what the "standard" offset should be for Ireland. (Then they need to be asked what to do about tzdb regions that move from one time zone to another, changing what would presumably be deemed the standard offset.) 
If they don't choose to say "sorry, Irish Standard Time isn't the standard time in Ireland", they may need to introduce new APIs to get information about "the *real* standard time" as opposed to APIs that get information about "the time that's what you have when you're not in daylight/Daylight Saving/Daylight Savings/summer time". Providing *that* information in the tzdb would probably involve changes to the source and binary formats.
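[Illustrative note] To make the POSIX TZ syntax quoted in item 2 of Guy's message concrete, here is a rough parser for the `stdoffset[dst[offset][,start[/time],end[/time]]]` form. It is a sketch only: the helper names and the returned dictionary are assumptions, the start/end rule is kept as an unparsed string, and the two sample strings in the usage note are assumed examples of the two ways Ireland could be expressed, not values quoted from any tzdb release. Two points come from POSIX itself: offsets are counted positive west of Greenwich, and a missing dst offset defaults to one hour ahead of std.

```python
import re

# Designations: three or more letters, or the quoted <...> form.
_NAME = r"(?:[A-Za-z]{3,}|<[A-Za-z0-9+-]{3,}>)"
_OFFSET = r"[+-]?\d{1,2}(?::\d{2}(?::\d{2})?)?"
_TZ_RE = re.compile(
    rf"^(?P<std>{_NAME})(?P<stdoff>{_OFFSET})"
    rf"(?:(?P<dst>{_NAME})(?P<dstoff>{_OFFSET})?(?:,(?P<rule>.+))?)?$"
)


def _seconds_east(offset):
    """Convert a POSIX offset string to seconds east of UTC."""
    sign = -1 if offset.startswith("-") else 1
    fields = [int(f) for f in offset.lstrip("+-").split(":")]
    hours, minutes, seconds = fields + [0] * (3 - len(fields))
    # POSIX counts offsets as positive WEST of Greenwich, so flip the sign.
    return -sign * (hours * 3600 + minutes * 60 + seconds)


def parse_posix_tz(tz):
    """Split a POSIX-style TZ value into std/dst designations and offsets."""
    m = _TZ_RE.match(tz)
    if m is None:
        raise ValueError(f"not a POSIX-style TZ string: {tz!r}")
    std_east = _seconds_east(m["stdoff"])
    result = {
        "std": m["std"],
        "std_utcoff": std_east,
        "dst": m["dst"],
        "dst_utcoff": None,
        "rule": m["rule"],  # start/end rule left unparsed in this sketch
    }
    if m["dst"] is not None:
        # With no explicit offset, the alternative time is one hour ahead of std.
        result["dst_utcoff"] = (
            _seconds_east(m["dstoff"]) if m["dstoff"] else std_east + 3600
        )
    return result
```

For example, `parse_posix_tz("GMT0IST,M3.5.0/1,M10.5.0")` yields std "GMT" at UTC+0 with dst "IST" one hour ahead, while `parse_posix_tz("IST-1GMT0,M10.5.0,M3.5.0/1")` yields std "IST" at UTC+1 with dst "GMT" one hour behind. The second form is where "local standard time" in the POSIX sense and "standard time" in Irish law would coincide.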
Guy Harris said:
1. The ISO C API (localtime()) has a notion of "Daylight Saving Time" being in effect or not in effect, which controls whether tm_isdst is set or not.
Right.
"Daylight Saving Time" presumably refers to turning clocks forward.
No. That's not stated anywhere in ISO C - it's an assumption you've made. Given that, the rest of your argument falls apart. -- Clive D.W. Feather | If you lie to the compiler, Email: clive@davros.org | it will get its revenge. Web: http://www.davros.org | - Henry Spencer Mobile: +44 7973 377646
On Jan 26, 2018, at 9:34 PM, Clive D.W. Feather <clive@davros.org> wrote:
Guy Harris said:
1. The ISO C API (localtime()) has a notion of "Daylight Saving Time" being in effect or not in effect, which controls whether tm_isdst is set or not.
Right.
"Daylight Saving Time" presumably refers to turning clocks forward.
No. That's not stated anywhere in ISO C - it's an assumption you've made.
OK, so what C90 says is ...Daylight Saving Time, which is a temporary change in the algorithm for determining local time. The local time zone and Daylight Saving Time are implementation-defined. So, in the Republic of Ireland this still allows "Irish Standard Time" to be considered "Daylight Saving Time", as well as allowing "Greenwich Mean Time" to be considered "Daylight Saving Time", given that the time changes twice a year, rendering both changes "temporary". (Perhaps the next C standard should just call it "Some Other Time".) Thus, I *still* don't see a problem with ISO C if tm_isdst is set to 1 for Irish Standard Time, and the conclusion in my item 1 in the message to which you're replying is unchanged.
Guy Harris said:
Guy Harris said:
1. The ISO C API (localtime()) has a notion of "Daylight Saving Time" being in effect or not in effect, which controls whether tm_isdst is set or not.
Right.
"Daylight Saving Time" presumably refers to turning clocks forward.
No. That's not stated anywhere in ISO C - it's an assumption you've made.
OK, so what C90 says is
...Daylight Saving Time, which is a temporary change in the algorithm for determining local time. The local time zone and Daylight Saving Time are implementation-defined.
So, in the Republic of Ireland this still allows "Irish Standard Time" to be considered "Daylight Saving Time", as well as allowing "Greenwich Mean Time" to be considered "Daylight Saving Time", given that the time changes twice a year, rendering both changes "temporary".
Indeed. I agree with all that.
Thus, I *still* don't see a problem with ISO C if tm_isdst is set to 1 for Irish Standard Time,
There isn't. But there also isn't a problem if it is set to 0 for Irish Standard Time.
and the conclusion in my item 1 in the message to which you're replying is unchanged.
The first sentence (beginning "The ISO C API") is fine. The second sentence (including "presumably") is entirely your assumption and is *NOT* supported anywhere in ISO C. That's my point. Since most of the rest of your argument was predicated on it, that all falls.
[I'm not taking part in this discussion for a bit - I'm going to be in four TZDB zones in the next 36 hours, one currently on "standard" time, two with no bi-annual changes, and one currently on "summer" time, not necessarily in that order.]
-- Clive D.W. Feather | If you lie to the compiler, Email: clive@davros.org | it will get its revenge. Web: http://www.davros.org | - Henry Spencer Mobile: +44 7973 377646
On Jan 27, 2018, at 10:33 PM, Clive D.W. Feather <clive@davros.org> wrote:
There isn't. But there also isn't a problem if it is set to 0 for Irish Standard Time.
Fine. So, in the real world, how much software would fail if Irish Standard Time weren't considered "daylight saving time" in the sense of "tm_isdst is set to 1"? Perhaps they *shouldn't* have made any assumptions about what "Daylight Saving Time" meant, but, in practice, I suspect that rather a lot of people think "Daylight Saving Time", or "Daylight Savings Time", or "summer time", means "the clocks get turned forwards". If the clocks don't get turned forwards, I suspect calling that "Daylight Saving Time" would be a Bad Idea, no matter *what* the ISO C standard allows.
On 2018-01-27 05:34, Clive D.W. Feather wrote:
Guy Harris said:
"Daylight Saving Time" presumably refers to turning clocks forward. No. That's not stated anywhere in ISO C - it's an assumption you've made.
What about a dictionary? Random House: daylight saving the practice of advancing standard time by one hour in the spring of each year and of setting it back by one hour in the fall in order to gain an extra period of daylight during the early evening. Collins: daylight-saving time time set usually one hour ahead of the local standard time, widely adopted in the summer to provide extra daylight in the evening. Also called (in the US) daylight time See also British Summer Time The current documentation of the tzdb interfaces agrees with these definitions (except for the restriction of the amount to 1 h in Random House). From (at least) 1993-01 until 2018c, inclusive, it says that the dst bit of tzdb indicates "summer time"; the claim "but we always meant dst to indicate non-standard time rather than summer time" is not tenable. Of course, tzdb may use the term "daylight-saving time" in their own specific meaning -- but this should be clearly stated in the documentation. And whether a change in the meaning of one common term (as opposed to, for instance, a new term and a clear-cut interface change) is a helpful upgrade path for the tzdb customers is not clear to me. Michael Deckers.
Michael H Deckers via tz wrote:
And whether a change in the meaning of one common term
In the past, tzdb commentary has been contradictory: it has attempted to support both the idea that standard time is the standard civil time used in a locale, and the idea that daylight saving time is advanced compared to standard time. Unfortunately, taken together these two ideas are contradicted by the Irish data, and so one of the two must give. No matter which idea gives, there will be "a change in the meaning of one common term" used in the commentary; this is unfortunate but cannot be avoided. We'll just have to define what tzdb means by its use of these terms from here on out, and move on.
The Random House dictionary you quote does not support your hypothesis, indeed it seems to contradict it, since it allows for both forward and backward movement of the clock to count as daylight saving: daylight saving the practice of advancing ...: and of setting it back ... by one hour ... in order to gain an extra period of daylight during the early evening. Regards, Malcolm -----Original Message----- From: tz [mailto:tz-bounces@iana.org] On Behalf Of Michael H Deckers via tz Sent: Sunday, 28 January, 2018 17:28 To: Clive D.W. Feather <clive@davros.org>; Guy Harris <guy@alum.mit.edu> Cc: Stephen Colebourne <scolebourne@joda.org>; Time Zone Mailing List <tz@iana.org> Subject: [External] Re: [tz] OpenJDK/CLDR/ICU/Joda issues with Ireland change On 2018-01-27 05:34, Clive D.W. Feather wrote:
Guy Harris said:
"Daylight Saving Time" presumably refers to turning clocks forward. No. That's not stated anywhere in ISO C - it's an assumption you've made.
What about a dictionary? Random House: daylight saving the practice of advancing standard time by one hour in the spring of each year and of setting it back by one hour in the fall in order to gain an extra period of daylight during the early evening. Collins: daylight-saving time time set usually one hour ahead of the local standard time, widely adopted in the summer to provide extra daylight in the evening. Also called (in the US) daylight time See also British Summer Time The current documentation of the tzdb interfaces agrees with these definitions (except for the restriction of the amount to 1 h in Random House). From (at least) 1993-01 until 2018c, inclusive, it says that the dst bit of tzdb indicates "summer time"; the claim "but we always meant dst to indicate non-standard time rather than summer time" is not tenable. Of course, tzdb may use the term "daylight-saving time" in their own specific meaning -- but this should be clearly stated in the documentation. And whether a change in the meaning of one common term (as opposed to, for instance, a new term and a clear-cut interface change) is a helpful upgrade path for the tzdb customers is not clear to me. Michael Deckers.
On Jan 28, 2018, at 1:58 PM, Wallace, Malcolm via tz <tz@iana.org> wrote:
The Random House dictionary you quote does not support your hypothesis, indeed it seems to contradict it, since it allows for both forward and backward movement of the clock to count as daylight saving:
daylight saving the practice of advancing ...: and of setting it back ... by one hour ... in order to gain an extra period of daylight during the early evening.
Then they need to clarify the definition: the practice of advancing standard time by one hour in the spring of each year and of setting it back by one hour in the fall in order to gain an extra period of daylight during the early evening *of the spring and summer*. Note also that this is a definition of "daylight saving", not of "daylight saving time"; the practice of "daylight saving" involves turning the clock forward when the extra period of daylight during the early evening is desired and turning it backward again when it's not desired - as opposed, for example to just saying "what the hell" and turning it forward once and leaving it there. "Daylight saving time" is the time that's in effect when the clock has been turned forward. And the online OED says (this is the US flavor of the OED, although they claim the first use was in Adelaide, which I'm guessing is the Adelaide in Australia): https://en.oxforddictionaries.com/definition/us/daylight_saving "A method of securing longer evening daylight during the summer by setting the clocks ahead of standard time, typically by one hour; the period during which this is in force." and https://en.oxforddictionaries.com/definition/us/daylight_saving_time "Time as adjusted to achieve longer evening daylight in summer by setting the clocks an hour ahead of the standard time." (That is explicitly noted as a North American usage; perhaps the Australians were the first to refer to "daylight saving", and the Yanks were the first to refer to the time when "daylight saving" was being done as "daylight saving{s} time").
Guy Harris <guy@alum.mit.edu> wrote on Sun, 28 Jan 2018 at 14:15:11 -0800 in <0B1729A4-454C-43A6-BC7F-9B99C0677122@alum.mit.edu>:
And the online OED says (this is the US flavor of the OED, although they claim
No. Oxford University Press publishes many dictionaries. The OED is one of many, and you have not cited it. ("oxforddictionaries.com" is confusing. It is apparently an amalgam of the Oxford Dictionary of English, New Oxford American Dictionary, and Oxford Thesaurus of English; see https://en.oxforddictionaries.com/help). These other dictionaries lack the persuasive power and stature of the OED. But be forewarned -- trying to use the OED to win an argument is generally a bad approach. If anyone ever used a word in a particular way (and chances are they have), you can often find support for it in the OED. That says very little about the legitimacy of such a usage. Citing the OED is a good way to fool people into thinking that some particular usage is more canon than it really is. Also, the OED is a good dictionary for usage prior to the current pair of decades. But if you want modern or recent usage, there are many better choices (I personally prefer the American Heritage). With that pedantry out of the way, it turns out the OED's definition of "daylight saving" (http://www.oed.com/view/Entry/401483?rskey=aH523Y&result=1#eid, subscription required) exactly matches oxforddictionaries.com (this sometimes happens). Not true for daylight saving time (compound C2), where it is: "daylight saving time n. (also daylight savings time) time as adjusted during the summer to achieve longer evening daylight, by setting the clocks ahead of standard time, typically by one hour; the period during which this is in force; cf. summertime n. 2." The OED, as its forte, also has six quotations for daylight saving from 1908-2004, and four for daylight saving time from 1908-2009. One hyphenates daylight-saving and another capitalizes Daylight Saving Time. There is also compound C1, for "daylight saving" generally, as used in "daylight saving bill" and "daylight saving legislation" with another four quotations.
In any case, I don't think these are particularly good sources for this kind of question. p.s.: the Adelaide 1908 quotation is: 1908 Register (Adelaide) 27 June 11/6 (heading) Daylight saving seriously discussed. --jhawk@mit.edu John Hawkinson
the first use was in Adelaide, which I'm guessing is the Adelaide in Australia):
https://en.oxforddictionaries.com/definition/us/daylight_saving
"A method of securing longer evening daylight during the summer by setting the clocks ahead of standard time, typically by one hour; the period during which this is in force."
and
https://en.oxforddictionaries.com/definition/us/daylight_saving_time
"Time as adjusted to achieve longer evening daylight in summer by setting the clocks an hour ahead of the standard time." (That is explicitly noted as a North American usage; perhaps the Australians were the first to refer to "daylight saving", and the Yanks were the first to refer to the time when "daylight saving" was being done as "daylight saving{s} time").
John Hawkinson wrote:
In any case, I don't think these are particularly good sources for this kind of question.
I tend to agree.
p.s.: the Adelaide 1908 quotation is:
1908 Register (Adelaide) 27 June 11/6 (heading) Daylight saving seriously discussed.
Although that quote is from an Australian newspaper, it's in an article headed "NOTES FROM LONDON / [From our Special Correspondent]" and datelined London, 1908-05-22. As such, I expect it's using English terminology rather than Australian. There's a 2015-08-08 comment about this in the "europe" file. The article also cites opinions of two ex-Astronomers Royal for Ireland. We cannot seem to escape the Irish question.... You can read the article here: https://trove.nla.gov.au/newspaper/page/4436792 Winston Churchill puts in a cameo appearance, in a story about him almost falling off his horse as he ogles a lady wearing a directoire dress <https://rbkclibraries.wordpress.com/2013/07/12/margaine-lacroix-and-the-dres...>.
On 01/25/2018 03:25 AM, Stephen Colebourne wrote:
Therefore, there are 8 possible combinations to consider:
- new code, new TZDB, new CLDR - works fine - new code, new TZDB, old CLDR - wrong names - new code, old TZDB, new CLDR - wrong names - new code, old TZDB, old CLDR - works fine
- old code, new TZDB, new CLDR - code may fail - old code, new TZDB, old CLDR - code may fail & wrong names - old code, old TZDB, new CLDR - wrong names - old code, old TZDB, old CLDR - works fine
This list of combinations assumes a particular transition method. If we use a different method we can avoid the failures that it mentions. One approach is to have an intermediate CLDR that is compatible with both the old and the new TZDB. I just now suggested something along those lines here: https://mm.icann.org/pipermail/tz/2018-January/026002.html Perhaps that idea wouldn't work as-is, but I'm sure that something along those lines would work.
Yoshito Umaoka said:
CLDR does not determine offsets. CLDR just maintains an array of names by category. In CLDR, we define several different type of names for a zone (and localized names in various locales) -
1. Long standard (e.g. Pacific Standard Time) 2. Long daylight (e.g. Pacific Daylight Time) 3. Long generic (e.g. Pacific Time) 4. Short standard (e.g. PST) 5. Short daylight (e.g. PDT) 6. Short generic (e.g. PT)
What about the name for the third offset each year? The UK used to use three offsets during the year. I'm sure it was not alone. I'm certainly not sure that it won't happen again. What if the same offset has different names in different contexts? A majority-Muslim country that puts its clocks back for Ramadan (I believe such exist) might use the names XXX Winter Time, XXX Summer Time, and XXX Ramadan Time, the last to make it clear that it's not because of winter. If your answer is "we'll deal with that when it happens" then, well, it's happened.
And the set of name may change time to time for a single location.
But then you say:
CLDR sets an assumption that name of zones are very stable. For example, "Pacific Standard Time" represents standard time used on US Pacific coast and the name itself does not change time to time.
Within my lifetime "BST" has been both Short Daylight and Short Standard (in your terminology) for my timezone. -- Clive D.W. Feather | If you lie to the compiler, Email: clive@davros.org | it will get its revenge. Web: http://www.davros.org | - Henry Spencer Mobile: +44 7973 377646
On 2018-01-25 15:57:11 (+0000), Clive D.W. Feather wrote:
Yoshito Umaoka said:
CLDR does not determine offsets. CLDR just maintains an array of names by category. In CLDR, we define several different type of names for a zone (and localized names in various locales) -
1. Long standard (e.g. Pacific Standard Time) 2. Long daylight (e.g. Pacific Daylight Time) 3. Long generic (e.g. Pacific Time) 4. Short standard (e.g. PST) 5. Short daylight (e.g. PDT) 6. Short generic (e.g. PT)
What about the name for the third offset each year? The UK used to use three offsets during the year. I'm sure it was not alone. I'm certainly not sure that it won't happen again.
What if the same offset has different names in different contexts? A majority-Muslim country that puts its clocks back for Ramadan (I believe such exist) might use the names XXX Winter Time, XXX Summer Time, and XXX Ramadan Time, the last to make it clear that it's not because of winter.
If your answer is "we'll deal with that when it happens" then, well, it's happened.
At least Egypt and Morocco have done this since 1990 (the arbitrary cutoff before which CLDR, by policy, considers data irrelevant). Trivial to find sources: https://en.wikipedia.org/wiki/Daylight_saving_time_in_Egypt https://en.wikipedia.org/wiki/Daylight_saving_time_in_Morocco Winter is hardly an appropriate description for these times of the year in those countries. (In fact, the whole rationale for messing with time like this is because summer overlapping with Ramadan presents certain challenges). Philip -- Philip Paeps Senior Reality Engineer Ministry of Information
On Jan 25, 2018, at 7:57 AM, Clive D.W. Feather <clive@davros.org> wrote:
What about the name for the third offset each year? The UK used to use three offsets during the year. I'm sure it was not alone. I'm certainly not sure that it won't happen again.
So if we have three flavors of time per year, with different names/abbreviations/offsets, that would require a POSIX API change, and changes to other time-related APIs that have an assumption that, within a year, there is, at most, winter time and summer time.
On 01/25/2018 01:34 PM, Guy Harris wrote:
if we have three flavors of time per year, with different names/abbreviations/offsets, that would require a POSIX API change
It doesn't require an API change, as the GNU C library, FreeBSD, and other libraries conform to POSIX in this area while still supporting applications that successfully use and deal with three or more flavors of time per year. This works because the parts of the POSIX API that you're referring to are vestigial: although they suffice for many usages, they're inadequate for older timestamps and in some cases they're even inadequate for this year's planned timestamps. General-purpose POSIX applications cannot rely on the vestigial parts of the POSIX API: instead, when querying TZ settings they must stick to a subset of the POSIX API (the "non-vestigial" part) that works even when the implementation is configured to use Morocco time, or Los Angeles time including the 2007 DST rule changes, or whatever. This non-vestigial API subset includes calls to the localtime and strftime functions and invocations of the 'date' command; it excludes vestigial interfaces such as the 'timezone', 'daylight', and 'tzname' variables, all of which are supported by glibc etc. as per POSIX and all of which a general-purpose application should not use.
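To illustrate the non-vestigial subset described above, here is a minimal C sketch; it is not taken from tzcode or glibc. It asks localtime and strftime for the abbreviation and the numeric offset in effect at one particular timestamp, and never touches 'timezone', 'daylight', or 'tzname'. The helper name describe_local_time and its parameters are made up for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <time.h>

/*
 * Hypothetical helper (the name is an assumption for this sketch).
 * Fills abbr with the time zone abbreviation ("%Z") and offset with the
 * numeric UTC offset ("%z", e.g. "+0100") in effect at instant t.
 * Returns 0 on success, -1 on failure.
 */
int describe_local_time(time_t t, char *abbr, size_t abbr_size,
                        char *offset, size_t offset_size)
{
    struct tm *p;
    struct tm tm;

    assert(abbr != NULL && offset != NULL);

    /* localtime is not thread-safe; copy its result right away. */
    p = localtime(&t);
    if (p == NULL)
        return -1;
    tm = *p;

    /* strftime returns 0 if the result does not fit or is empty. */
    if (strftime(offset, offset_size, "%z", &tm) == 0)
        return -1;
    if (strftime(abbr, abbr_size, "%Z", &tm) == 0)
        return -1;
    return 0;
}
```

A caller would pass the timestamp of interest each time, since both strings can differ from one timestamp to the next within the same TZ setting.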
On Jan 25, 2018, at 4:03 PM, Paul Eggert <eggert@cs.ucla.edu> wrote:
On 01/25/2018 01:34 PM, Guy Harris wrote:
if we have three flavors of time per year, with different names/abbreviations/offsets, that would require a POSIX API change
It doesn't require an API change, as the GNU C library, FreeBSD, and other libraries conform to POSIX in this area while still supporting applications that successfully use and deal with three or more flavors of time per year.
This works because the parts of the POSIX API that you're referring to are vestigial: although they suffice for many usages, they're inadequate for older timestamps and in some cases they're even inadequate for this year's planned timestamps. General-purpose POSIX applications cannot rely on the vestigial parts of the POSIX API: instead, when querying TZ settings they must stick to a subset of the POSIX API (the "non-vestigial" part) that works even when the implementation is configured to use Morocco time, or Los Angeles time including the 2007 DST rule changes, or whatever. This non-vestigial API subset includes calls to the localtime and strftime functions and invocations of the 'date' command; it excludes vestigial interfaces such as the 'timezone', 'daylight', and 'tzname' variables, all of which are supported by glibc etc. as per POSIX and all of which a general-purpose application should not use.
So presumably this means that:

1) applications should not require a knowledge of "the" offset from UTC (given that there may not be any such thing as "the" offset from UTC), e.g. don't use timezone;

2) applications that need to know offset from UTC at any given instance should do so by doing strftime(buf, bufsize, "%z", tm) and parsing the result;

3) applications that need to know the time zone abbreviation at any given instance should do so by doing strftime(buf, bufsize, "%Z", tm);

4) applications should not require a knowledge whether daylight savings time (or whatever it's called) has ever been, or will ever be, in effect in the current locale/tzdb region.

What, in real-world terms, can they assume about tm_isdst? That it indicates some either implementation-specified or unspecified notion of "Daylight Saving{s} Time", which might or might not be set e.g. during a Ramadan time shift? Or that it's true iff the time is shifted from some implementation-specified or unspecified notion of "standard time" (which might not, in the Republic of Ireland, correspond to Irish Standard Time)?
On 01/25/2018 04:52 PM, Guy Harris wrote:
1) applications should not require a knowledge of "the" offset from UTC (given that there may not be any such thing as "the" offset from UTC), e.g. don't use timezone;
Yes.
2) applications that need to know offset from UTC at any given instance should do so by doing strftime(buf, bufsize, "%z", tm) and parsing the result;
Sort of. That method works only if the UTC offset is a multiple of 1 minute. This is a safe assumption for modern civil timestamps, but if you want to go back before 1972 in tzdb, or if you want to support odd-but-valid POSIX TZ strings, a more-reliable approach is to call localtime and gmtime on the same timestamp, and subtract the results by hand with the usual Gregorian rules. (And yes, that's what some applications do - of course it's much easier and faster to use tm_gmtoff when available.)
3) applications that need to know the time zone abbreviation at any given instance should do so by doing strftime(buf, bufsize, "%Z", tm);
Yes.
4) applications should not require a knowledge whether daylight savings time (or whatever it's called) has ever been, or will ever be, in effect in the current locale/tzdb region.
Yes.
What, in real-world terms, can they assume about tm_isdst? That it indicates some either implementation-specified or unspecified notion of "Daylight Saving{s} Time", which might or might not be set e.g. during a Ramadan time shift? Or that it's true iff the time is shifted from some implementation-specified or unspecified notion of "standard time" (which might not, in the Republic of Ireland, correspond to Irish Standard Time)?
Portable code shouldn't use tm_isdst; it's a vestigial interface. Alas, too many programs do use it, I think mostly because programmers not-unnaturally think it must exist for a reason.
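For reference, the "call localtime and gmtime on the same timestamp, and subtract the results by hand" approach from the answer to point 2 can be sketched roughly as follows. This is an illustrative sketch only, not the code used in glibc, gnulib, or util-linux; the names days_from_civil, tm_to_seconds, and utc_offset_by_diff are assumptions made for this example, and the day count uses a well-known proleptic Gregorian formula.

```c
#include <assert.h>
#include <time.h>

/* Days from 1970-01-01 to the proleptic Gregorian date y-m-d (m in 1..12). */
static long days_from_civil(long y, int m, int d)
{
    long era, yoe, doy, doe;

    y -= (m <= 2);                        /* treat Jan and Feb as end of previous year */
    era = (y >= 0 ? y : y - 399) / 400;   /* 400-year cycle number */
    yoe = y - era * 400;                  /* year of era, 0..399 */
    doy = (153 * (m + (m > 2 ? -3 : 9)) + 2) / 5 + d - 1;  /* day of year, from 1 March */
    doe = yoe * 365 + yoe / 4 - yoe / 100 + doy;           /* day of era */
    return era * 146097 + doe - 719468;
}

/* Seconds represented by the broken-down fields, ignoring any time zone. */
static long long tm_to_seconds(const struct tm *tm)
{
    long long days = days_from_civil(tm->tm_year + 1900L,
                                     tm->tm_mon + 1, tm->tm_mday);
    return ((days * 24 + tm->tm_hour) * 60 + tm->tm_min) * 60 + tm->tm_sec;
}

/*
 * Stores in *offset the number of seconds that local time is east of UTC
 * at instant t. Returns 0 on success, -1 if a conversion fails.
 */
int utc_offset_by_diff(time_t t, long *offset)
{
    struct tm local, utc;
    struct tm *p;

    assert(offset != NULL);

    p = localtime(&t);          /* may share storage with gmtime, so copy */
    if (p == NULL)
        return -1;
    local = *p;

    p = gmtime(&t);
    if (p == NULL)
        return -1;
    utc = *p;

    *offset = (long)(tm_to_seconds(&local) - tm_to_seconds(&utc));
    return 0;
}
```

Unlike parsing strftime "%z" output, the result here keeps any seconds component of the offset, which matters for the pre-1972 and odd-but-valid TZ cases mentioned above.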
On 01/25/2018 08:05 PM, Paul Eggert wrote:
On 01/25/2018 04:52 PM, Guy Harris wrote:
2) applications that need to know offset from UTC at any given instance should do so by doing strftime(buf, bufsize, "%z", tm) and parsing the result;
Sort of. That method works only if the UTC offset is a multiple of 1 minute. This is a safe assumption for modern civil timestamps, but if you want to go back before 1972 in tzdb, or if you want to support odd-but-valid POSIX TZ strings, a more-reliable approach is to call localtime and gmtime on the same timestamp, and subtract the results by hand with the usual Gregorian rules. (And yes, that's what some applications do - of course it's much easier and faster to use tm_gmtoff when available.)
I apologize for going further off topic, but I've long had a question about this. Rather than requiring applications to "call localtime and gmtime ... and subtract the results by hand", why couldn't the function(s) used by tzset for the tm_gmtoff value be made available for applications? That is, offer an API to get the offset directly from TZ/tzfile, instead of relying on complex and imprecise math (e.g., the glibc comment: it's OK to assume that A and B are close to each other). I just recently had to add this tm_diff() math to util-linux, it seems unnecessary to require this duplication. BTW, I think the information in the message that I'm replying to is very useful. Perhaps it should be documented in the glibc manual and elsewhere?
J William Piggott wrote:
Rather than requiring applications to "call localtime and gmtime ... and subtract the results by hand", why couldn't the function(s) used by tzset for the tm_gmtoff value be made available for applications?
You don't need functions. All you need is tm_gmtoff; it's simple and easy to explain, and is a widely-used extension to POSIX.
Perhaps it should be documented in the glibc manual and elsewhere?
The glibc manual already documents tm_gmtoff, as do manuals for other operating systems that have it. Unfortunately tm_gmtoff is not standardized by C or POSIX, perhaps because standardizers mistakenly thought that strftime %z was enough.
From: Paul Eggert <eggert@cs.ucla.edu>
Date: Fri, 26 Jan 2018 07:39:30 -0800
Subject: Re: [tz] OpenJDK/CLDR/ICU/Joda issues with Ireland change

| Unfortunately tm_gmtoff is not standardized by C or POSIX, perhaps
| because standardizers mistakenly thought that strftime %z was enough.

No, it would have been because tm_gmtoff isn't available everywhere (and most particularly, wasn't available on the main reference system from which most of POSIX was copied.) Standards groups should (and the posix people mostly do) specify what users can expect to have work, and how to make that happen. When things only work, or can only be used, sometimes, that's not suitable to be standardised.

It is possible that after all of this time, enough of the world that matters has tm_gmtoff (and tm_zone) that POSIX may be persuaded to add them - but at best that could not happen until the next major revision, which is years away if I read the signs correctly. They could indicate that it will happen though (which also would give those implementations which still don't support those fields time to catch up.)

I participate (a bit - mostly wrt the shell definition) in the group that does the work; I will (in a week or two, when I have systems back working that allow me to do it rationally) file a bug report with them, and see what happens (it is likely to garner some immediate reaction, but will take more than a year to reach the head of the queue and actually get some possible action, one way or the other).

kre

ps: I know that posix has not been immune from the need to invent, but they are better than many others in that regard. It is also possible that this may be regarded as more a C issue, and best deferred to them, in which case someone else would need to learn how to see if any action can be started there.
Robert Elz wrote:
From: Paul Eggert<eggert@cs.ucla.edu> Date: Fri, 26 Jan 2018 07:39:30 -0800 Subject: Re: [tz] OpenJDK/CLDR/ICU/Joda issues with Ireland change
| Unfortunately tm_gmtoff is not standardized by C or POSIX, perhaps | because standardizers mistakenly thought that strftime %z was enough.
No, it would have been because tm_gmtoff isn't available everywhere (and most particularly, wasn't available on the main reference system from which most of POSIX was copied.)
strftime %z wasn't available on the main reference system either, nor was it universally supported, and yet it was added to the C standard. Since tm_gmtoff and tm_zone are supported by GNU/Linux and by the BSDs, they're pretty much everywhere but in traditional APIs such as AIX and Solaris. The main objection to adding them there, as I understand it, is that it requires changing the size of 'struct tm' and that this is more of a hassle than adding a conversion spec to strftime. Unfortunately strftime "%z" doesn't suffice to determine the full UT offset. I used to be a regular contributor to POSIX standardization but dropped out due to lack of time. If there is a way I could contribute in the timestamp area (but not get deluged by other topics) I could resume.
Paul Eggert said:
| Unfortunately tm_gmtoff is not standardized by C or POSIX, perhaps | because standardizers mistakenly thought that strftime %z was enough.
No, it would have been because tm_gmtoff isn't available everywhere (and most particularly, wasn't available on the main reference system from which most of POSIX was copied.) strftime %z wasn't available on the main reference system either, nor was it universally supported, and yet it was added to the C standard.
I was involved in the C99 standardization work. We knew that there were lots of issues in this area and asked time experts (from memory, including people on this list) for proposals. Nothing arrived in the relevant timescales. I stopped being involved a few years later, so don't know what has happened since. But the WG14 I was involved in would welcome a proposal that would sort things out once and for all based on current knowledge. -- Clive D.W. Feather | If you lie to the compiler, Email: clive@davros.org | it will get its revenge. Web: http://www.davros.org | - Henry Spencer Mobile: +44 7973 377646
On 01/26/2018 10:39 AM, Paul Eggert wrote:
J William Piggott wrote:
Rather than requiring applications to "call localtime and gmtime ... and subtract the results by hand", why couldn't the function(s) used by tzset for the tm_gmtoff value be made available for applications?
You don't need functions. All you need is tm_gmtoff; it's simple and easy to explain, and is a widely-used extension to POSIX.
You did not answer my question. Perhaps I didn't express it well. You just endorsed the diff(LOCALTIME, GMTIME) concept again in your update to the theory.html file. It is still being used in code that you maintain and code that I maintain. You advocate using tm_gmtoff as an alternate to diff(LOCALTIME, GMTIME), which implies that the algorithms that set tm_gmtoff are viable alternatives to diff(LOCALTIME, GMTIME). So my question is, why not get the offset directly from TZ/tzfile the same way tzset does for tm_gmtoff; instead of using the complex and imprecise math in diff(LOCALTIME, GMTIME)? Please don't be dismissive and say just use tm_gmtoff. The diff(LOCALTIME, GMTIME) concept is currently being used in parse_datetime.y, strftime_l.c, and so on. Plus you have endorsed it twice in as many days. I've already concluded that it could have been done using the tzset algorithms; I only wonder why it wasn't, and isn't now. If you are too busy to answer this question, that is fine, please don't.
Perhaps it should be documented in the glibc manual and elsewhere?
The glibc manual already documents tm_gmtoff, as do manuals for other operating systems that have it. Unfortunately tm_gmtoff is not standardized by C or POSIX, perhaps because standardizers mistakenly thought that strftime %z was enough.
I didn't suggest documenting tm_gmtoff. I suggested deprecating the vestigial API's and explaining their replacements. As you just did in the theory.html file. I find this very useful information, so thank you for taking time to document it. I think it would be wonderful if the same language could find its way into other documentation such as the glibc manual.
J William Piggott wrote:
why not get the offset directly from TZ/tzfile the same way tzset does for tm_gmtoff
Let me try to be clear about the question since I evidently didn't understand your previous email. When you write "get the offset directly from TZ/tzfile", I assume you don't mean that users are expected to write complex code that behaves like tzcode's localtime.c and that goes off and parses the TZ environment variable and/or reads files in tzfile format and get the offset directly from that string or data. Instead, I assume you are asking for an API that lets users easily determine the UTC offset of a timestamp. For tzcode, GNU/Linux, FreeBSD, etc., that API is already there: it's the tm_gmtoff member of struct tm. For POSIX there is no such API, so one must play the diff(LOCALTIME,GMTIME) trick. One could design a different API to do what tm_gmtoff does, an API that uses a function rather than a structure member, and implement this function by doing what tzset does and a little bit more. However, as far as I know, nobody has done that for C in any widely-used distribution, and there's been no need to do it because tm_gmtoff already handles the problems that the function would address.
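As a rough sketch of what such a function-style interface could look like on a platform that already has tm_gmtoff, the wrapper below simply hides the structure member behind a call. It is hypothetical: the name utc_offset_at is invented for this example and is not part of tzcode, glibc, or POSIX, and the _DEFAULT_SOURCE define is an assumption about what glibc needs to expose the member.

```c
#define _DEFAULT_SOURCE   /* assumption: lets glibc expose tm_gmtoff */
#include <assert.h>
#include <time.h>

/*
 * Hypothetical function-style wrapper around the tm_gmtoff extension.
 * Stores in *offset the seconds east of UTC in effect at instant t.
 * Returns 0 on success, -1 if localtime cannot convert t.
 */
int utc_offset_at(time_t t, long *offset)
{
    struct tm *tm;

    assert(offset != NULL);

    tm = localtime(&t);     /* the library consults TZ or the tzfile data here */
    if (tm == NULL)
        return -1;

    *offset = tm->tm_gmtoff;
    return 0;
}
```

Because the offset is looked up per timestamp, a wrapper of this shape does not assume that a zone has only one or two offsets.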
On 2018-01-27 11:13, Paul Eggert wrote:
J William Piggott wrote:
why not get the offset directly from TZ/tzfile the same way tzset does for tm_gmtoff
Let me try to be clear about the question since I evidently didn't understand your previous email. When you write "get the offset directly from TZ/tzfile", I assume you don't mean that users are expected to write complex code that behaves like tzcode's localtime.c and that goes off and parses the TZ environment variable and/or reads files in tzfile format and get the offset directly from that string or data. Instead, I assume you are asking for an API that lets users easily determine the UTC offset of a timestamp.
For tzcode, GNU/Linux, FreeBSD, etc., that API is already there: it's the tm_gmtoff member of struct tm. For POSIX there is no such API, so one must play the diff(LOCALTIME,GMTIME) trick.
One could design a different API to do what tm_gmtoff does, an API that uses a function rather than a structure member, and implement this function by doing what tzset does and a little bit more. However, as far as I know, nobody has done that for C in any widely-used distribution, and there's been no need to do it because tm_gmtoff already handles the problems that the function would address.
NetBSD supports tzgetgmtoff(), tzgetname(), with tz and isdst parameters, since ~4.{2,3} according to ESR's doc. -- Take care. Thanks, Brian Inglis, Calgary, Alberta, Canada
Brian Inglis wrote:
NetBSD supports tzgetgmtoff(), tzgetname(), with tz and isdst parameters, since ~4.{2,3} according to ESR's doc.
Thanks, I'd forgotten about those NetBSD functions. However, they do not suffice for tzdata, since they assume at most two types of time can exist in a timezone_t object, something that is true for the simple POSIX model but is false for tzdb. So code should not use these functions if it wants to work on arbitrary timestamps. This is why I didn't add these functions to tzcode when I added support for tzalloc, tzfree, etc. Also, even if one assumes POSIX, tzgetgmtoff and tzgetname are not much use on platforms like NetBSD that have tm_gmtoff and tm_zone.
From: Paul Eggert <eggert@cs.ucla.edu>
Date: Sat, 27 Jan 2018 18:22:08 -0800
Subject: Re: [tz] OpenJDK/CLDR/ICU/Joda issues with Ireland change

| However, they do not suffice for tzdata, since they assume at
| most two types of time can exist in a timezone_t object,

Actually, they don't: if zic set the isdst field in the ttinfo to 0, 1, 2, 3, ... (allowed by POSIX, which just requires "positive" "negative" and "zero") then the functions would fetch the corresponding value (they don't really access tzname[] and timezone!)

| So code should not use these functions if it wants to work on
| arbitrary timestamps.

But I agree with that, as there's no reason we cannot have isdst value 1 with various different gmt offsets at different periods (like when a zone has altered its base offset), and there's no way with those functions to specify which one is wanted, nor to discover which was used for the result.

| Also, even if one assumes POSIX, tzgetgmtoff and tzgetname
| are not much use on platforms like NetBSD that have
| tm_gmtoff and tm_zone.

They're less than "not much use", they are useless.

kre
On 2018-01-27 19:22, Paul Eggert wrote:
Brian Inglis wrote:
NetBSD supports tzgetgmtoff(), tzgetname(), with tz and isdst parameters, since ~4.{2,3} according to ESR's doc.
Thanks, I'd forgotten about those NetBSD functions. However, they do not suffice for tzdata, since they assume at most two types of time can exist in a timezone_t object, something that is true for the simple POSIX model but is false for tzdb. So code should not use these functions if it wants to work on arbitrary timestamps. This is why I didn't add these functions to tzcode when I added support for tzalloc, tzfree, etc.
Also, even if one assumes POSIX, tzgetgmtoff and tzgetname are not much use on platforms like NetBSD that have tm_gmtoff and tm_zone.
They are probably of advantage mainly to insulate developers porting apps supporting tz, whether or not their library or OS has struct tm or globals extensions.
I appreciate the orthogonal architectural simplicity of BSD extensions supporting time from the machine to the application layers - [get][s][bin,nano,micro][up]time() etc.: that would be a nice API to add to standards, as would extensions to that API for DJB's libtai with attosecond precision, and similar type extensions to the rest of the standard API, perhaps using some mechanisms for type generic math to support type generic time.
I am cautiously positive about future improvements in C, with its C11 addition of TIME_UTC and timespec_get(), implying other time bases, epochs, scales, or coordinates could be added by implementations, and to a future standard.
-- Take care. Thanks, Brian Inglis, Calgary, Alberta, Canada
From: Brian Inglis <Brian.Inglis@SystematicSw.ab.ca> Date: Sat, 27 Jan 2018 19:12:11 -0700 Subject: Re: [tz] OpenJDK/CLDR/ICU/Joda issues with Ireland change
| NetBSD supports tzgetgmtoff(), tzgetname(), with tz and isdst parameters, since ~4.{2,3} according to ESR's doc.
It does (though I'm not sure what 4.2/3 you mean, unless that refers to CSRG BSD 4.2 (as in released in 1982 or whatever it was)). But that predates the ado timezone code, and particularly, anything which has a tz parameter (which is even more recent) so that's unlikely.
NetBSD itself had no 4.2 or 4.3 releases, there was just 4.0 and 4.0.1 before NetBSD 5. There was however a 1.4.2 and 1.4.3 back before the numbering scheme was altered after 1.6 - used to be 1.1 1.2 1.3 ... we switched to 2.0 3.0 ... (and minor releases for small updates and fixes).
They should be removed (and at the very least, never used) as the information they provide is unreliable (essentially useless) - eg: for CLDR use they would be incorrect, as they are very likely to return "the" GMT offset (or name) for a zone for sometime in the period before 1990, probably even before 1970. And of course with no indication available of where (as in which era) they came from.
Like much other code related to time, they are making assumptions about how things work that do not meet real world practice. Everything that makes assumptions like that needs to vanish.
kre
From: Stephen Colebourne <scolebourne@joda.org> Date: Tue, 23 Jan 2018 18:42:29 +0000 Subject: Re: [tz] OpenJDK/CLDR/ICU/Joda issues with Ireland change
| Java time-zone data is updated using the tzupdater tool | [URL omitted here] | This will update the tzdb data, but not the CLDR-driven data that | drives the text.
That is most probably a mistake - the two should be linked, it is entirely possible that a zone might change its names (regardless of issues of when transitions occur, or what, if anything, is regarded as the "standard" time).
| Were the change to proceed, anyone running tzupdater | with the Ireland change would invert the meaning of inDaylightTime() | and access the wrong array element in the CLDR-driven data - a bug.
Yes, it would be, and CLDR or Java (whichever has the issue, or both) should fix it. And fix it soon.
| And code changes don't help, as we'll see below.
Of course code changes help - there's a bug, fixing the bug will fix that.
And also of course, for people who don't update, the bug will continue to appear - as for any other bug, or security vulnerability that is found and fixed. Nothing that we can do about that. People who won't, or can't, update get screwed by all kinds of things.
| There is no possible fix to Java, as this is primarily an issue | between CLDR and TZDB. The two have a subtle API linkage which has | perhaps never been clearly spelled out here.
Yes, they do, that ought to be obvious - the linkage is not (or should not be) subtle - it should be obvious.
| CLDR provides textual names for time-zones, as an array [winter, | summer].
That itself is a bug. It assumes there are just two (not including for the "generic" name, mentioned in a later message from Yoshito Umaoka, which is probably the more useful one of the three anyway) - and there is no guarantee that will (or even always has) remain true.
There is nothing to stop some locality (probably one at a high latitude) from deciding that they should advance the clocks in early spring, and then advance them further in early-mid summer, returning to the intermediate (or some other) value in late summer, and then to the original in late autumn (or fall if autumn happens to be called that in the relevant location). What's more, they could give 4 different names to the 3 (or 4) different offsets, perhaps "winter time" "spring time" "summer time" and "autumn time" with 4 different abbreviations.
There could even be a mid winter fallback of even more, just as there could be a mid summer skip forward of more.
Calling any of those offsets "standard" and the others as something different is really nonsense, though the jurisdiction (and people) might pick that label - but when they do, we should all remember that it is just another name. One offset is not more blessed than any other because it happens to be labelled as the "standard" time. It might be different if we defined "standard time" to be the nearest "natural" offset based on lines of longitude - but with what resolution? And how would you apply that to China or India? So we don't do that. No-one does.
CLDR (and its clients) needs to be able to represent all this. Tzdb can. CLDR must also handle places which (given the durations of the two periods that are common these days) decide that "standard" time be the one that applies for longer each year, and so should be the time in summer, and in winter the clocks should be set backwards some number of minutes for a few months, so it does not remain dark quite so late in the mornings ("darkness saving time" - aka DST).
| As a much larger project with considerable history the order | of that array is not going to change.
More than that needs to change, the order is not, or should not be, material.
Just accept it - the design is broken, and must be fixed.
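The two-slot lookup under discussion, and what happens to it when "standard" time is the summer offset, can be sketched in Python as below. This is an illustration under assumptions, not CLDR or OpenJDK code: pick_name is a hypothetical helper, the array is reduced to the two long names, and the strings are illustrative rather than copied from CLDR.

```python
# Display names in the fixed order the thread describes: [standard, daylight].
# The strings are illustrative, not copied from CLDR.
NAMES = ["Greenwich Mean Time", "Irish Standard Time"]

HOUR = 3600

def pick_name(names, raw_offset, actual_offset):
    """Legacy rule: any difference from the raw offset means 'daylight'."""
    in_daylight = raw_offset != actual_offset
    return names[1] if in_daylight else names[0]

# Positive SAVE: raw offset 0, summer is raw + 1h.
positive_winter = pick_name(NAMES, 0, 0)
positive_summer = pick_name(NAMES, 0, HOUR)

# Negative SAVE: raw offset +1h, winter is raw - 1h.
negative_winter = pick_name(NAMES, HOUR, 0)
negative_summer = pick_name(NAMES, HOUR, HOUR)

print(positive_winter, "/", positive_summer)
print(negative_winter, "/", negative_summer)
```

The clock readings are identical in both encodings; only the raw offset differs, yet the boolean flips and the names swap seasons. That is the coupling between the array order and the sign of SAVE that both sides of this thread are arguing about.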
| (I'm using winter and summer for CLDR for this email to aid clarity, | they refer to them as standard and daylight).
Either way exposes the broken assumption that there are just two.
| TZDB provides the offsets, SAVE values and a short text string. This | text string - GMT/IST or IST/GMT - is not directly linkable to the | data CLDR provides.
It probably should be, probably when accompanied by the offset and the relevant time (perhaps the offset is less needed, or useful), those should be the key to the translated strings. But not as indexes into an array, that's just plain stupid. As database keys (for "database" in the general, not implying anything SQL based or similar).
Alternatively, perhaps localized zoneinfo files should be used instead, built from a modified zic, which embeds the localized names (for some particular locality) with the raw data (probably in a similar way to, or perhaps instead of, how the abbreviations are handled now).
That would mean one set of zoneinfo files for each locality an installation wants to support, but zoneinfo files are not really all that big (and adding a few extra strings to them would not make much difference) so this should not be seen as too much of a drawback - then CLDR users would simply use those files instead of the normal ones (if those even continue to exist on the system) for all purposes.
This would obviously handle the problem of the two being updated independently fairly easily.
It does mean that if the "normal" files continue to exist, as both CLDR and older applications both exist on the system, then those would need to be updated together. This should not be a problem, the update of one is simply not made available until both are ready.
| Although it may seem that you can use the text | from TZDB as a key to lookup the correct value in CLDR, I know from | painful experience that approach fails (as the TZDB text varies over | time,
Yes, and when it does the CLDR strings ("translations" into local formats) [ translations in quotes as I know that is not exactly what they are ] may need to change as well. There are multiple reasons why the TZDB names might change, some are, frankly, silly, but others represent real changes in what the local users call their times. In some cases the CLDR strings may have already matched local expectations, and nothing needs to alter, but in others the local name might have changed (in their language, as well as in English) and the CLDR strings need to be updated (augmented).
This is why the CLDR data should really be updated (if required) and (always) transmitted whenever the tzdb (zoneinfo) data changes.
| has the same text in winter and summer, or isn't even text).
I have no idea what the latter means - they are all text (we do not define zone abbreviations as random binary), unless you mean the +04 types, which are text, just text containing digits and +/- signs, rather than only letters.
But you're right the "sometimes the same" (which is actually a very sane choice) means that you cannot use the abbreviation alone to map. However, the name, and the time to which it is being applied, is enough (and perhaps to avoid running that time through localtime() or its equivalent again just to get the offset, probably that as a param as well. We know localtime() must have been run already, or the data currently used would not be available.)
| Thus, the only reliable way to pick which piece of CLDR data is needed | is from the offsets.
Not even that alone, as the same offset can have different names during different periods. That (unlike some of my potential scenarios) has actually been observed in the past, and CLDR needs to handle that as well.
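A keyed lookup of the kind suggested above, using the abbreviation together with the time it applies to, might look like the following Python sketch. Everything in it is hypothetical: the zone name, the abbreviations, the instants, the NAME_RECORDS layout and the display_name helper are invented for illustration and do not describe any existing CLDR structure.

```python
# Hypothetical records keyed by (zone, abbreviation); each entry is
# (first_instant, end_instant_exclusive, display_name), instants in epoch seconds.
NAME_RECORDS = {
    ("Example/Zone", "ABC"): [
        (0, 1_000, "Old Example Time"),
        (1_000, 2_000, "New Example Time"),
    ],
    ("Example/Zone", "DEF"): [
        (0, 2_000, "Example Summer Time"),
    ],
}

def display_name(zone, abbr, instant):
    """Find the name for an abbreviation as it was used at a given instant."""
    for start, end, name in NAME_RECORDS.get((zone, abbr), []):
        if start <= instant < end:
            return name
    return None  # no record: the caller decides on a fallback

print(display_name("Example/Zone", "ABC", 500))
print(display_name("Example/Zone", "ABC", 1_500))
```

Because the key includes the instant, the same abbreviation can carry different long names in different eras, and no "standard versus daylight" boolean is consulted at all.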
It is simply untrue, and incorrect, to assume that if (in locality X) times at offset N are called ABC and times at offset M are called DEF today, then that was true last year. The old and the new names need to be available and applied to the appropriate times. This is true just as it is true that CLDR data is needed for more than calendaring applications - the only thing that matters is not just when the next meeting is scheduled (with the day and month, and timezone names converted to the local correct forms.)
| For 20 years, this has been done in a simple and straightforward way - | if (raw-offset != actual-offset) then CLDR uses summer text and array | element 1.
So, for 20 years there has been a latent bug. If for 20 years there has been a latent bug that allows a security breach, are you going to simply say "it has been there too long, we can't fix it now" ?
Really?
It makes no difference how old it is, a bug is a bug, and needs to be fixed.
| This provides the necessary glue to link the two projects:
It is the wrong glue.
| TZDB has always had the raw and actual offsets
What on earth is the "raw offset"?
I somehow suspect that you (and perhaps CLDR in general) are reading too much into the tzdb source files.
99.9999% of people (not being zic) should really be ignoring those files, and everything they contain (the remaining percentage are the people who maintain the data - all 10 or 20 or so of them in the world).
Everything else should be based upon the zoneinfo output files from zic - and that has no notion of a "raw" offset at all, all that exists, and all that you can ever assume, is that for some period of time (of indefinite length, starting at arbitrary and often unpredictable instants) a particular timezone will be at some offset from UTC.
It might also be associated with some name (in reality, many are not, as Paul keeps pointing out, many of the abbreviated names that tzdb contains were purely invented for tzdb, because the (US centric) UNIX API/ABI required them - some of those are the ones being turned into numeric offsets represented as text strings - it makes no difference in the zone concerned, as there the time is just "the time" it has no other name (we really should have no abbreviation at all, and CLDR should have no translation of it)).
| the same in winter and different in summer,
Once upon a time, the world was always flat, everyone knew that, the pope even proclaimed it...
| so this has always worked.
The latent bug was not exposed. That is not "worked" it is rather "managed to survive".
| The Ireland proposal breaks this, with (raw-offset != actual-offset) | meaning winter, instead of summer. It is fair for TZDB to complain | that CLDR is inflexible with its definitions, but the reality is that | this was and is the only way to connect two separately developed | projects (where API stability is vital).
Nonsense. It was just someone's idea of something they thought would work, and which seemed to - but it was based upon unfounded (and incorrect) assumptions about the nature of civil time, and how it can be expected to work.
| In order for TZDB and CLDR to co-exist, it is *required* that the raw | offset equals the actual offset in winter,
No, it is *required* for CLDR to be fixed. What is happening now is obviously incorrect.
| This isn't a change that can be delayed for a year.
Oh good, so we can make it now?
| This interpretation of inSummerTime() relies on positive SAVE values,
So, fix it. It is broken.
| is part of the public API of TZDB just as much as the source code file | format is.
If that's all, then we have no problem, as the source file format should not be regarded as part of anything except the method by which we happen to represent the data before zic converts it to zoneinfo.
The source format has changed, and will change again - that is guaranteed.
The zoneinfo format (in binary form, or converted to text) is designed to be immune to all of the shenanigans that go on, and really is what everyone should be using. If anyone believes that they need the source files for anything other than feeding to zic (or some equivalent program for systems that cannot run it, if there are any) then that almost guarantees that they are making some unsustainable assumptions, which will, one day, be proven false.
We (of course) attempt to remain backward compatible, but as legislatures (and the people under their governance) do weirder and weirder things, we are likely to find that the current language is incapable of expressing what needs to be expressed, and it will be extended.
I know there are others that read it, but this should be treated in a similar way to the way that compilers treat programming language specifications - when the language is extended (as happens to all that are not dead) the compilers all need to be updated to deal with it. Similarly, when tax legislation is amended (about the only thing that changes even more frequently, and for less rational reasons than timezones) the accountants, and the software they use, need to be updated to deal with that.
Updates/changes are simply a fact of life, there is nothing that is guaranteed (not really even death or taxes) that we can promise will never change. Hopefully zoneinfo files will not need much - though the format already has changed when 64 bit time support was added, and might need more, if people dealing with 2038 issues find some innovative way to allow 32 bit timestamps to keep working, in some fashion, beyond 2038 in order to retain compat with old databases that cannot be updated easily.
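The view of zic's output described in this message (periods that each carry an offset and perhaps a name, with no "raw" offset anywhere) can be modelled with a short Python sketch. The Period record, the sample data and the period_at helper are assumptions made for illustration; this is not a TZif parser.

```python
from bisect import bisect_right
from typing import NamedTuple

class Period(NamedTuple):
    start: float  # first instant of the period, seconds since the epoch (UT)
    utoff: int    # seconds east of UT during the period
    abbr: str     # abbreviation in use during the period

# Hypothetical zone; every value is made up for illustration.
PERIODS = [
    Period(float("-inf"), 3600, "AAA"),
    Period(1_000, 7200, "BBB"),
    Period(2_000, 3600, "AAA"),
    Period(3_000, 5400, "CCC"),
]

def period_at(periods, instant):
    """Return the period in effect at an instant (periods sorted by start)."""
    starts = [p.start for p in periods]
    return periods[bisect_right(starts, instant) - 1]

for instant in (999, 1_000, 2_500, 3_500):
    print(instant, period_at(PERIODS, instant))
```

Nothing in this model marks one offset as the base and the others as deviations from it; a consumer that wants a raw offset or a SAVE value has to derive it from something outside these records.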
Everyone needs to remain aware of this. Sticking our heads in the sand and proclaiming "it always worked in the past, it must be made to continue working in the future" is, frankly, absurd.
kre
ps: I am sure apologies will be needed, I have tried to find and correct all my typos, but right now, my e-mail environment is horribly challenged, and I have no way to rationally do spell or grammar checks I normally would (well sometimes) attempt. So, consider that for any unfound mistakes, apologies are tendered.
On 24 January 2018 at 11:19, Robert Elz <kre@munnari.oz.au> wrote:
| CLDR provides textual names for time-zones, as an array [winter, | summer].
That itself is a bug. It assumes there are just two (not including for the "generic" name, mentioned in a later message from Yoshito Umaoka, which is probably the more useful one of the three anyway) - and there is no guarantee that will (or even always has) remain true.
There is nothing to stop some locality (probably one at a high latitude) from deciding that they should advance the clocks in early spring, and then advance them further in early-mid summer, returning to the intermediate (or some other) value in late summer, and then to the original in late autumn (or fall if autumn happens to be called that in the relevant location). What's more, they could give 4 different names to the 3 (or 4) different offsets, perhaps "winter time" "spring time" "summer time" and "autumn time" with 4 different abbreviations.
There could even be a mid winter fallback of even more, just as there could be a mid summer skip forward of more.
In fact, this has apparently been the case for Antarctica/Troll for quite some time. Although the data is a little rough, and it is currently commented out to avoid compatibility issues, the clear intent is to eventually model it as correctly as possible - a warning that has been present in this project for nearly four years already.
https://github.com/eggert/tz/blob/975e499378112668e2e8b495badce684828aeec0/a...
Indeed, CLDR having an array that only allows for two descriptive names *is* a bug and an incompatibility with tz. If proper attention had been paid to the implicit dependencies of that project, this would be well-known already.
While CLDR addresses this issue, some additional thought should go into how linkages with tz data are handled, as Robert Elz mentioned, so that if the translated/descriptive strings for a time zone need to depend on the version of the tz data that generated them, they can… at which point the problem of the projects independently patching will be about as solved as it can be.
-- Tim Parenti
99.9999% of people (not being zic) should really be ignoring those files, and everything they contain (the remaining percentage are the people who maintain the data - all 10 or 20 or so of them in the world). Everything else should be based upon the zoneinfo output files from zic
This hasn't been true for many many years. The source files are parsed by every downstream program I know. It's been discussed before as to why this is. I'd strongly suggest accepting that the source files are a primary interface, which is why negative SAVE values matter to downstream users.
As for the rest, well I'm not going to reply to each line. With no acceptance of the concept of backwards compatibility, discussion is pretty pointless. If there was a simple bug fix that solves all the problems, I'd gladly do my part. There isn't such a fix - every avenue other than insisting on positive SAVE values will make things worse.
Want to make things truly better? Agree to move TZDB under the auspices of CLDR, so it can be managed by a paid team who actually understand stability and compatibility, and the trade off of those against some abstract notion of purity. As a combined dataset, there would be the ability to solve the text problem in a realistic and pragmatic way.
TZDB is not the centre of the universe. It is a small cog in a much bigger machine. It's time to accept that.
Stephen
On Jan 24, 2018, at 1:36 PM, Stephen Colebourne <scolebourne@joda.org> wrote:
... Want to make things truly better? Agree to move TZDB under the auspices of CLDR, so it can be managed by a paid team who actually understand stability and compatibility, and the trade off of those against some abstract notion of purity.
I think that is a totally unwarranted slam against Paul Eggert and Arthur Olson who have done most of the work for many years. paul
From: Stephen Colebourne <scolebourne@joda.org> Date: Wed, 24 Jan 2018 18:36:42 +0000 Subject: Re: [tz] OpenJDK/CLDR/ICU/Joda issues with Ireland change
[quoting me]:
| > 99.9999% of people (not being zic) should really be ignoring those | > files,
| This hasn't been true for many many years. The source files are parsed | by every downstream program I know.
You mean that Firefox parses the source tzdb files? That's interesting, I never knew that. I assume it is a "downstream program you know [of]" and it certainly processes times (and java, or at least, javascript, and yes, I know they are different things). Are you sure, or were you just grandstanding?
| Its been discussed before as to why this is.
Being discussed does not make it agreed. Or rational.
| I'd strongly suggest accepting that the source files are | a primary interface, which is why negative SAVE values matter to | downstream users.
And that makes no sense at all - if you were really parsing the tzdata source files, and doing it rationally, you'd be able to handle negative save values with no problems at all. The tzcode does. Other parsers (as much as I disagree with some of them even existing) have reported handling them as well. It is not the source format that is (or seems to be) the problem with negative SAVE, but just the sloppy, lazy, way that the CLDR project chose to map from the zone abbreviations from the tzdb sources to the localized versions. That problem would exist if you were (more sanely) using the binary files (which is what I suspect that the applications which use the buggy code actually do). The method used is simply wrong.
| There isn't such a fix - every avenue | other than insisting on positive SAVE values will make things worse.
Nonsense. But before we can discuss ways it can (or could) be fixed you need to admit that you understand the magnitude of the problem that you are actually facing. Negative SAVE is just the tip of the iceberg.
But believe me, I understand the issue with backwards compat. But this is one time when you might just need to deprecate the old interfaces and implement new ones - allow the old interfaces to remain, subject to all of their limitations, and occasional erroneous results, and when a problem is reported caused by an application that is doing things the old way, show them how to use the (assumed non-broken) new interface which actually works. Whatever ends up happening with negative SAVE, you're eventually going to have to do this anyway; at least as best I can follow the current interface (based upon what has been reported here) it is eventually going to fail in cases that you cannot just paper over by altering the definition of reality in a small (seemingly harmless) way. Sticking your head in the sand now is just going to make for a bigger problem later.
| Want to make things truly better? Agree to move TZDB under the | auspices of CLDR, so it can be managed by a paid team who actually | understand stability and compatibility,
You mean the team that somehow came to believe that the tzdata source format was something they should rely upon? That would be insane. You do realise that this (tzdata) is just a collection of data, it is all public domain, there is nothing stopping this great CLDR team of professionals from collecting the same data (you can take as much as you like from the tzdb files, now and in the future) and creating your new stable wonderful format, that you can rely on forever. That is, you do not need permission from anyone here to simply do what you claim that you want to do. That might (if you design a new source format that better meets your needs than the tzdb source file format does) make life easier for you, but by itself it won't fix the problems that need to be solved before the CLDR localized zone name (abbreviations, and long forms) problems can be fixed.
| TZDB is not the centre of the universe. It is a small cog in a much | bigger machine. Its time to accept that.
Huh? Of course that's correct, no-one ever claimed otherwise. Where did that red herring come from, and why?
kre
On 2018-01-24 18:36:42 (+0000), Stephen Colebourne wrote:
99.9999% of people (not being zic) should really be ignoring those files, and everything they contain (the remaining percentage are the people who maintain the data - all 10 or 20 or so of them in the world). Everything else should be based upon the zoneinfo output files from zic
This hasn't been true for many many years. The source files are parsed by every downstream program I know. Its been discussed before as to why this is. I'd strongly suggest accepting that the source files are a primary interface, which is why negative SAVE values matter to downstream users.
As for the rest, well I'm not going to reply to each line. With no acceptance of the concept of backwards compatibility, discussion is pretty pointless. If there was a simple bug fix that solves all the problems, I'd gladly do my part. There isn't such a fix - every avenue other than insisting on positive SAVE values will make things worse.
Want to make things truly better? Agree to move TZDB under the auspices of CLDR, so it can be managed by a paid team who actually understand stability and compatibility, and the trade off of those against some abstract notion of purity. As a combined dataset, there would be the ability to solve the text problem in a realistic and pragmatic way.
Wow. That's harsh.
TZDB is not the centre of the universe. It is a small cog in a much bigger machine. Its time to accept that.
Neither is Java the centre of the universe. Java is just one terribly broken downstream consumer with flawed expectations. But I don't think pointing fingers (at anyone) is going to make anything actually better. The tzdb could (and should) improve "expectation management" but downstream consumers could (and should) also improve their expectations from the data. As usual "be strict in what you send and liberal in what you accept" should be a guiding principle. The tzdb should be as correct as possible and downstream consumers should be (become) as flexible as possible to accept the crazy reality we happen to live in. Currently, historically and in the future. People in control of timezones *will* continue to mess with them in ways we don't expect. Negative savings have happened once. There is nothing to stop anyone from having a negative savings of twelve minutes relative to their "standard" timezone which happens to be called summer time... And software will just have to cope or people will get cranky. Philip -- Philip Paeps Senior Reality Engineer Ministry of Information
On 24 January 2018 at 22:44, Philip Paeps <philip@trouble.is> wrote:
As usual "be strict in what you send and liberal in what you accept" should be a guiding principle. The tzdb should be as correct as possible and downstream consumers should be (become) as flexible as possible to accept the crazy reality we happen to live in. Currently, historically and in the future.
It is not possible to transform a source file containing negative SAVE values to one containing positive SAVE values without additional information not present in the file. This is because of edge cases where the time-zone started or finished using DST. Given this unfortunate fact, the only choice left to downstream consumers wanting to retain standard=winter (for compatibility reasons) is to alter the input source file. Which is effectively a fork of the TZDB project. Stephen
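To show which part of that transformation is mechanical, here is a minimal sketch in Python. It covers only the regular alternation between two offsets; the function `to_positive_save` and the (stdoff, save) pairs in minutes are hypothetical, and the sketch makes no attempt at the edge cases mentioned above, where a zone starts or stops using DST and the file alone does not say which offset should count as "standard".

```python
def to_positive_save(periods):
    """Shift a list of (stdoff_minutes, save_minutes) pairs so no save is negative.

    The total offset (stdoff + save) of every period is left unchanged.
    This naive shift assumes the whole list belongs to one regular era.
    """
    lowest = min(save for _, save in periods)
    if lowest >= 0:
        return list(periods)  # already positive-SAVE, nothing to do
    return [(stdoff + lowest, save - lowest) for stdoff, save in periods]


# Illustrative Dublin-style rules: standard time at UTC+1, winter saves -1 hour.
dublin_negative = [(60, -60), (60, 0)]
dublin_positive = to_positive_save(dublin_negative)

# Total offsets before and after the shift, for comparison.
totals_before = [stdoff + save for stdoff, save in dublin_negative]
totals_after = [stdoff + save for stdoff, save in dublin_positive]
```

The wall-clock offsets are identical before and after; only the split between "standard" and "save" moves, which is exactly the part that downstream name lookup depends on.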
On 2018-01-24 18:36, Stephen Colebourne wrote:
Want to make things truly better? Agree to move TZDB under the auspices of CLDR, so it can be managed by a paid team who actually understand stability and compatibility, and the trade off of those against some abstract notion of purity. As a combined dataset, there would be the ability to solve the text problem in a realistic and pragmatic way.
I disagree. I have observed tzdb for more than 20 years, and it has always been a treasure trove of carefully researched information on historical time scales. You may call "abstract purity" what I perceive as historical accuracy -- but I believe it is useful to clearly present such historical information to the (many) diverse computer interfaces that need such data, without adding noise that lacks such corroboration. Many recent changes effected under the lead of Paul Eggert have clarified the scope and the limits of the tz data.
As an example for the advantage of such separation consider the issue of naming time zone timescales. tzdb has decided not to deal with this issue; its careful research would involve local and political issues beyond the scope of the tzdb project. On the other hand, the CLDR project is well suited to tackle this issue, but only for the presently used time scales.
TZDB is not the centre of the universe. It is a small cog in a much bigger machine. It's time to accept that.
The issue at hand is the recently (2018a) proposed change to Europe/Dublin which changes the dst bit when UTC >= 1971-10-31 + 02 h. This change in fact contradicts the assertion which has been made continually by tzdb in newctime.3 since at least 1993-01-08 (for 25 years), that "Tm_isdst is non-zero if summer time is in effect.", and several other assertions to the same effect throughout tzdb. There is also the claim that tm_isdst is a "vestigial" interface for which a change hindering customers in any way should obviously be avoided. I therefore think it is only fair that tzdb has suspended the proposed change. After all, it is tzdb that proposes to change its documented interface, not the users of tzdb. Before this change proceeds, it is, in my opinion, necessary to obtain agreement on what the new significant (non-vestigial) role of the dst bit, if any, should be in the future (beyond the single case of Ireland). Of course, the change, if any, must be documented clearly and in good time before it is implemented, so that an upgrade path can be designed by all users. I doubt that all this would ever happen if tzdb was a part of CLDR! Michael Deckers.
On 01/26/2018 01:39 PM, Michael H Deckers via tz wrote:
Before this change proceeds, it is, in my opinion, necessary to obtain agreement on what the new significant (non-vestigial) role of the dst bit, if any, should be in the future (beyond the single case of Ireland).
I don't think there's much disagreement over what the bit means in the context of the POSIX API. For example, POSIX specifies that this environment setting: TZ='IST-1GMT0,M10.5.0,M3.5.0/1' means that standard time is abbreviated "IST" and is 1 hour ahead of UTC, that daylight saving time is abbreviated "GMT" and is at UTC, and that the European Union's current rules are in place for when to change UTC offsets. tzcode and every other POSIX-compatible system support this, and set tm_isdst=1 in winter with a negative DST offset when operating in this locale. As far as I know there's no serious dispute about this, nor is anybody seriously proposing to change this behavior. That is, the tm_isdst API may be vestigial, but it's too late to change what it means. The disagreement is not over the role of the tm_isdst bit; it is whether tzdb and/or OpenJDK should model Irish Standard Time as a standard time (tm_isdst=0) or as a daylight saving time (tm_isdst=1), and similarly for any other locales in a similar boat where standard time is ahead of the alternative (or "daylight saving") time.
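The offsets in that TZ string can be read off mechanically. The following is a minimal sketch in Python, assuming a simplified grammar: only alphabetic designations are handled and the transition rules after the comma are ignored. The function `parse_tz_prefix` is a hypothetical helper, not part of tzcode or any POSIX library.

```python
import re

# Simplified pattern for the "std offset [dst [offset]]" prefix of a POSIX
# TZ string. Quoted <...> designations and the rule part are not handled.
_OFFSET = r"[+-]?\d{1,2}(?::\d{2}){0,2}"
_TZ_PREFIX = re.compile(
    rf"^(?P<std>[A-Za-z]{{3,}})(?P<stdoff>{_OFFSET})"
    rf"(?:(?P<dst>[A-Za-z]{{3,}})(?P<dstoff>{_OFFSET})?)?"
)


def _to_utc_seconds(text):
    """Convert a POSIX offset (positive = west of Greenwich) to seconds east of UTC."""
    sign = -1 if text.startswith("-") else 1
    parts = [int(p) for p in text.lstrip("+-").split(":")]
    parts += [0] * (3 - len(parts))
    hours, minutes, seconds = parts
    return -sign * (hours * 3600 + minutes * 60 + seconds)


def parse_tz_prefix(tz):
    """Return {'std': (name, utc_offset_s), 'dst': (name, utc_offset_s)}."""
    m = _TZ_PREFIX.match(tz)
    if m is None:
        raise ValueError(f"unsupported TZ string: {tz!r}")
    std_utc = _to_utc_seconds(m["stdoff"])
    result = {"std": (m["std"], std_utc)}
    if m["dst"]:
        # If the dst offset is omitted, it defaults to one hour ahead of std.
        dst_utc = _to_utc_seconds(m["dstoff"]) if m["dstoff"] else std_utc + 3600
        result["dst"] = (m["dst"], dst_utc)
    return result


dublin = parse_tz_prefix("IST-1GMT0,M10.5.0,M3.5.0/1")
# Difference between the alternative time and standard time, in seconds.
dublin_save = dublin["dst"][1] - dublin["std"][1]
```

For the Irish string this gives IST at one hour east of UTC as "std" and GMT at UTC as "dst", so the difference is minus one hour, which is the negative offset the message describes.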
On 2018-01-26 22:16, Paul Eggert wrote about the tzdb dst bit:
On 01/26/2018 01:39 PM, Michael H Deckers via tz wrote:
Before this change proceeds, it is, in my opinion, necessary to obtain agreement on what the new significant (non-vestigial) role of the dst bit, if any, should be in the future (beyond the single case of Ireland).
I don't think there's much disagreement over what the bit means in the context of the POSIX API. For example, POSIX specifies that this environment setting:
TZ='IST-1GMT0,M10.5.0,M3.5.0/1'
means that standard time is abbreviated "IST" and is 1 hour ahead of UTC, that daylight saving time is abbreviated "GMT" and is at UTC, and that the European Union's current rules are in place for when to change UTC offsets. tzcode and every other POSIX-compatible system support this, and set tm_isdst=1 in winter with a negative DST offset when operating in this locale. As far as I know there's no serious dispute about this, nor is anybody seriously proposing to change this behavior. That is, the tm_isdst API may be vestigial, but it's too late to change what it means.
No, no: POSIX does not say anything about the setting of the tm_isdst member for a given TZ string; nor does it talk which part of the TZ string describes daylight saving time. I am reluctant to accept any assertion of the type it is clear that POSIX means XXX without a specific reference implying it; if I have to write reliable code, I have to know exactly what I can rely on. Michael Deckers.
Michael H Deckers wrote:
POSIX does not say anything about the setting of the tm_isdst member for a given TZ string
Yes it does. If what you're saying is that a POSIX implementation can set tm_isdst=0 during Irish Standard Time when TZ='IST-1GMT0,M10.5.0,M3.5.0/1', then taking that idea to its logical conclusion a POSIX implementation can set tm_isdst=0 during Pacific Daylight Time when TZ='PST8PDT,M3.2.0,M11.1.0'. Or the implementation could go further and set tm_isdst=1 during odd-numbered seconds and tm_isdst=0 during even-numbered seconds, because "POSIX does not say anything". That's not how real systems work and that's not how POSIX is intended to work. POSIX's description of TZ makes it clear that the 'dst' part of the TZ string is intended for use when daylight saving time (a.k.a. "the alternative timezone") is in effect, and its description of tm_isdst under time.h makes it clear that tm_isdst should be positive when daylight saving time is in effect and zero when it is not in effect.
On 2018-01-28 19:43, Paul Eggert wrote:
Michael H Deckers wrote:
POSIX does not say anything about the setting of the tm_isdst member for a given TZ string
Yes it does. ....
No. Everybody can convince themselves about what POSIX is saying by looking at [http://pubs.opengroup.org/onlinepubs/9699919799/basedefs/V1_chap08.html#tag_...]: std and dst Indicate no less than three, nor more than {TZNAME_MAX}, bytes that are the designation for the standard (std) or the alternative (dst -such as Daylight Savings Time) timezone. Only std is required; if dst is missing, then the alternative time does not apply in this locale. If POSIX had meant to say that the "alternative time" is "Daylight Savings" in every case, then they could have said so (and could have avoided the term "alternative time" altogether). They did not. Michael Deckers.
On 01/29/2018 04:23 AM, Michael H Deckers wrote:
Everybody can convince themselves about what POSIX is saying by looking at
All I can say is that you're misinterpreting POSIX here. You can ask the standardization committee for an official interpretation if you like. There are lots of places where POSIX deliberately does not specify behavior, but this isn't one of them.
On 2018-01-29 10:49, Paul Eggert wrote:
On 01/29/2018 04:23 AM, Michael H Deckers wrote:
Everybody can convince themselves about what POSIX is saying by looking at
All I can say is that you're misinterpreting POSIX here. You can ask the standardization committee for an official interpretation if you like. There are lots of places where POSIX deliberately does not specify behavior, but this isn't one of them.
I think POSIX defers to C and much here is implementation defined, except where specified by POSIX, so please notice that tm_isdst may be any positive, zero, or negative value, not just -1, 0, or 1; any number of names could be supported; if a "generic" CLDR time zone name or abbreviation is available, it /could possibly/ be provided as a default in case tm_isdst is returned as -1, for some reason other than determining the time zone:
"
ISO/IEC 9899-2011[2012] ... Information technology -- Programming languages -- C
...
7.27.1 Components of time
...
int tm_isdst; // Daylight Saving Time flag
The value of tm_isdst is positive if Daylight Saving Time is in effect, zero if Daylight Saving Time is not in effect, and negative if the information is not available.
...
7.27.2.3 The mktime function
...
320) Thus, a positive or zero value for tm_isdst causes the mktime function to presume initially that Daylight Saving Time, respectively, is or is not in effect for the specified time. A negative value causes it to attempt to determine whether Daylight Saving Time is in effect for the specified time.
...
7.27.3.5 The strftime function
...
%z is replaced by the offset from UTC in the ISO 8601 format "-0430" (meaning 4 hours 30 minutes behind UTC, west of Greenwich), or by no characters if no time zone is determinable. [tm_isdst]
%Z is replaced by the locale's time zone name or abbreviation, or by no characters if no time zone is determinable. [tm_isdst]
"
One of my systems is configured so that all date/time utilities and functions output or return:
$ date +%c%z
2018 Jan 29 Mon 16:45:30-0700
which is how I would prefer to see long date/times everywhere in /my/ locale.
-- Take care. Thanks, Brian Inglis, Calgary, Alberta, Canada
Robert Elz said:
Once upon a time, the world was always flat, everyone knew that, the pope even proclaimed it...
Cite? -- Clive D.W. Feather | If you lie to the compiler, Email: clive@davros.org | it will get its revenge. Web: http://www.davros.org | - Henry Spencer Mobile: +44 7973 377646
Robert Elz said:
Once upon a time, the world was always flat, everyone knew that, the pope even proclaimed it...
The _infallible_ pope? Which one was it? +------------------+--------------------------+----------------------------+ | Paul Goyette | PGP Key fingerprint: | E-mail addresses: | | (Retired) | FA29 0E3B 35AF E8AE 6651 | paul at whooppee dot com | | Kernel Developer | 0786 F758 55DE 53BA 7731 | pgoyette at netbsd dot org | +------------------+--------------------------+----------------------------+
Date: Thu, 25 Jan 2018 06:33:13 +0000 From: "Clive D.W. Feather" <clive@davros.org> Subject: Re: [tz] OpenJDK/CLDR/ICU/Joda issues with Ireland change | Robert Elz said: | > Once upon a time, the world was always flat, everyone knew that, | > the pope even proclaimed it... | Cite? Probably some urban legend, that is not important enough to be worth researching ... the point was that sometimes "facts" that are (almost) universally held to be true (and obvious) turn out not to be. kre
On 2018-01-24 23:33, Clive D.W. Feather wrote:
Robert Elz said:
Once upon a time, the world was always flat, everyone knew that, the pope even proclaimed it... Cite?
Some fundamentalists and liberal arts majors probably still believe that! Every blue water sailor (and passenger) from time immemorial knew it curved. -- Take care. Thanks, Brian Inglis, Calgary, Alberta, Canada
On 01/24/2018 10:33 PM, Clive D.W. Feather wrote:
Robert Elz said:
Once upon a time, the world was always flat, everyone knew that, the pope even proclaimed it... Cite?
I doubt whether any Roman Catholic pope proclaimed the Earth flat as a matter of doctrine. That being said, flat-Earthers are still alive and kicking. We have one in California named "Mad Mike" Hughes who has said that on February 3 he will ride a steam-powered rocket that will be launched 550 meters into the air to "prove" that the world is flat. The relevance to time zones? One argument for a flat Earth is that if the Earth really were round and orbiting the sun, we'd have to adjust our clocks twelve hours every six months to account for the fact that the Sun had gotten to the "wrong" side of the Earth, and obviously we're not doing that so the Earth cannot be round. PS. Mad Mike is running for governor of California; the rocket is intended to launch his campaign. McKay T. Rebuffed flat earth rocketeer says he will actually launch himself into the sky at 500 MPH this time. Gizmodo. 2018-01-25. https://gizmodo.com/rebuffed-flat-earth-rocketeer-says-he-will-actually-lau-... Riz A. Flat Earth Theory: Why don’t our clocks have to change by 12 hours in 6 months? Metabunk. 2016-01-11. https://www.metabunk.org/explained-flat-earth-theory-why-don%E2%80%99t-our-c...
On 26 Jan 2018, at 07:35, Paul Eggert <eggert@cs.ucla.edu> wrote:
On 01/24/2018 10:33 PM, Clive D.W. Feather wrote:
Robert Elz said:
Once upon a time, the world was always flat, everyone knew that, the pope even proclaimed it...
Cite?
A spherical earth pre-dates a Pope by quite a few centuries. One of the better known Greeks (whose name escapes me) even estimated the diameter of the earth. A flat earth is a relatively new fad.
I doubt whether any Roman Catholic pope proclaimed the Earth flat as a matter of doctrine. That being said, flat-Earthers are still alive and kicking. We have one in California named "Mad Mike" Hughes who has said that on February 3 he will ride a steam-powered rocket that will be launched 550 meters into the air to "prove" that the world is flat.
The relevance to time zones? One argument for a flat Earth is that if the Earth really were round and orbiting the sun, we'd have to adjust our clocks twelve hours every six months to account for the fact that the Sun had gotten to the "wrong" side of the Earth, and obviously we're not doing that so the Earth cannot be round.
Bizarre.
PS. Mad Mike is running for governor of California; the rocket is intended to launch his campaign. McKay T. Rebuffed flat earth rocketeer says he will actually launch himself into the sky at 500 MPH this time. Gizmodo. 2018-01-25. https://gizmodo.com/rebuffed-flat-earth-rocketeer-says-he-will-actually-lau-...
Riz A. Flat Earth Theory: Why don’t our clocks have to change by 12 hours in 6 months? Metabunk. 2016-01-11. https://www.metabunk.org/explained-flat-earth-theory-why-don%E2%80%99t-our-c...
John Haxby <john.haxby@oracle.com> wrote: |> On 26 Jan 2018, at 07:35, Paul Eggert <eggert@cs.ucla.edu> wrote: |> On 01/24/2018 10:33 PM, Clive D.W. Feather wrote: |>> Robert Elz said: |>> |>>> Once upon a time, the world was always flat, everyone knew that, |>>> the pope even proclaimed it... |>>> |>> Cite? | |A spherical earth pre-dates a Pope by quite a few centuries. One of \ |the better known Greeks (whose name escapes me) even estimated the \ |diameter of the earth. A flat earth is a relatively new fad. Now i am anything but a christian, yet i would support the philosophical reasoning. |> I doubt whether any Roman Catholic pope proclaimed the Earth flat \ |> as a matter of doctrine. That being said, flat-Earthers are still \ That is my view too. I also doubt that most of these most intellectual of their times ever meant that "flat" literally. At times they just did not know? The question always has been how to civilize the human being and how to make it reflect themselves. There are many many highly intellectual people out there which find things and know a lot, can even clone life itself (and getting better), but at the same time the same people want privileges and unfortunately not only those, but also many many others, and growing. So if all that fantastic knowledge leads to nothing else but the end of biodiversity, and the desire to turn Mars into the "paradise" that we currently destroy right here where we are at, isn't the world flat? You seem to fall off the end. By the way, at that time, here in Germany, with people dying from starvation in years with bad harvest, and thus any single square meter field meaning survival, people with Down-Syndrome were entitled to inherit, and "evil" humans had a legitimized force to fear (the inquisition). Uff. Well! 
Just imagine how hard life was, no matter whether poor or rich the most "ridiculous" illnesses could cause death, and you sit in an early mass on easter, the sunset will start rising through the coloured glasses in just five minutes from now, and the choir starts singing "Miserere mei, Deus" of Gregorio Allegri. Then it seems much better, even sacred, to stay in the middle of the world, than to fall off the end. I can recommend [1] which is unfortunately no longer available from Germany, the high C of the Westminster Abbey choir is the most beautiful that can be imagined, [2] is also somewhat legendary, but does not reach [1] by far. Happy hacking. [1] https://www.youtube.com/watch?v=Psf5Cqjpt7I [2] https://www.youtube.com/watch?v=YDOENZediM8 --steffen | |Der Kragenbaer, The moon bear, |der holt sich munter he cheerfully and one by one |einen nach dem anderen runter wa.ks himself off |(By Robert Gernhardt)
Eratosthenes measured the earth's diameter somewhere around 200 BC https://en.wikipedia.org/wiki/Eratosthenes#Measurement_of_the_Earth's_circumference A better question to ask is when a round earth became the dominant belief. For that, I am not aware of good documentation. That said, Columbus knew the earth to be round so it wasn't super obscure knowledge. NASA has a short history here: https://starchild.gsfc.nasa.gov/docs/StarChild/questions/question54.html
On 26/01/2018 11:33 AM, Steffen Nurpmeso wrote:
John Haxby <john.haxby@oracle.com> wrote: |> On 26 Jan 2018, at 07:35, Paul Eggert <eggert@cs.ucla.edu> wrote: |> On 01/24/2018 10:33 PM, Clive D.W. Feather wrote: |>> Robert Elz said: |>> |>>> Once upon a time, the world was always flat, everyone knew that, |>>> the pope even proclaimed it... |>>> |>> Cite? | |A spherical earth pre-dates a Pope by quite a few centuries. One of \ |the better known Greeks (whose name escapes me) even estimated the \ |diameter of the earth. A flat earth is a relatively new fad.
Now i am anything but a christian, yet i would support the philosophical reasoning.
|> I doubt whether any Roman Catholic pope proclaimed the Earth flat \ |> as a matter of doctrine. That being said, flat-Earthers are still \
-- Patrice Scattolin | Software Development Manager | 514.905.8744 Oracle Cloud 600 Blvd de Maisonneuve West Suite 1900 Montreal, Quebec
On Fri, Jan 26, 2018, at 11:44, Patrice Scattolin wrote:
Eratosthenes measured the earth's diameter somewhere around 200 BC https://en.wikipedia.org/wiki/Eratosthenes#Measurement_of_the_Earth's_circumference
A better question to ask is when circular earth became dominant belief. That, I am not aware of good documentation. That said, Columbus knew the earth to be round so it wasn't super obscure knowledge.
My understanding is: Everyone educated knew the world was round. Columbus believed, wrongly, that it was smaller than the actual size and therefore that it was practical to reach East Asia by sailing across what he believed to be an open ocean. The consensus view on Earth's diameter was more accurate, which is why he had trouble getting funding.
Patrice Scattolin wrote:
A better question to ask is when circular earth became dominant belief.
If by "dominant" you mean "known by educated people in the West", I would say that this occurred well before 200 B.C. And by "dominant" I mean really dominant: there was no real controversy about whether the earth was flat, as virtually every educated person on record wrote that it was round. (The idea that the ancients thought the world flat is a modern myth.) By the third century B.C. people in the West realized that clock settings varied by longitude. Perhaps some ancient Roman or Greek geographer even developed a precursor to tzdb, though I don't know of any such effort that survived. For more, please see: https://en.wikipedia.org/wiki/Myth_of_the_flat_Earth
Patrice Scattolin said:
Eratosthenes measured the earth's diameter somewhere around 200 BC https://en.wikipedia.org/wiki/Eratosthenes#Measurement_of_the_Earth's_circumference
A better question to ask is when circular earth became dominant belief. That, I am not aware of good documentation. That said, Columbus knew the earth to be round so it wasn't super obscure knowledge.
My understanding is that nobody seriously disputed that. Rather, while the majority view agreed with Eratosthenes' measurement, Columbus adhered to a minority that believed the diameter was rather smaller, meaning the distance from the west European coast to Japan was only about 4000 miles. As it turned out, Eratosthenes was correct to within about 1%. -- Clive D.W. Feather | If you lie to the compiler, Email: clive@davros.org | it will get its revenge. Web: http://www.davros.org | - Henry Spencer Mobile: +44 7973 377646
Paul Eggert said:
Once upon a time, the world was always flat, everyone knew that, the pope even proclaimed it... Cite? I doubt whether any Roman Catholic pope proclaimed the Earth flat as a matter of doctrine. That being said, flat-Earthers are still alive and kicking.
That, I'm well aware of. But my understanding has always been that educated people and sailors, whether educated or not, have known for something like 2500 years that the earth is round. So I wanted a cite that the pope proclaimed the opposite. -- Clive D.W. Feather | If you lie to the compiler, Email: clive@davros.org | it will get its revenge. Web: http://www.davros.org | - Henry Spencer Mobile: +44 7973 377646
participants (28): Brian Inglis, Clive D.W. Feather, Guy Harris, Howard Hinnant, J William Piggott, John Hawkinson, John Haxby, Lester Caine, Mark Davis ☕️, Meno Hochschild, Michael Douglass, Michael H Deckers, Patrice Scattolin, Paul Eggert, Paul G, Paul Goyette, Paul.Koning@dell.com, Peter Ilieve, Philip Paeps, Random832, Robert Elz, scs@eskimo.com, Steffen Nurpmeso, Stephen Colebourne, Steve Allen, Tim Parenti, Wallace, Malcolm, Yoshito Umaoka