Proposed patch - Theory notes for backward file
Theory notes for backward file

Note that the file exists for compatibility and canonicalization
Explicitly note IDs in that file are deprecated
Explicitly note that data set should make sense without backward

----------------

https://github.com/jodastephen/tz/commit/709e51dc7aeea8c2286b8a50c0d91d475b5...

The aim of this patch is simple - to document what the backward file is for in more detail.

Stephen

----------------

@@ -362,12 +362,19 @@ constraints. Although a 'zone.tab' location's longitude corresponds to its LMT offset with one hour for every 15 degrees east longitude, this relationship is not exact and is not true for 'time.tab'.
+The file 'backward' consists of Link entries mapping deprecated names
+to their replacements. Usage of a deprecated name is discouraged.
+As such, the file permits canonicalization, where the use of a
+deprecated name can be automatically replaced with the replacement.
+If the data set is used with the 'backward' file excluded, then
+it must remain logical and complete. For example, there must be no
+references to any deprecated name from any other file.
+
 Older versions of this package used a different naming scheme,
-and these older names are still supported.
-See the file 'backward' for most of these older names
+and these older names are still supported using 'backward'.
 (e.g. 'US/Eastern' instead of 'America/New_York').
-The other old-fashioned names still supported are
-'WET', 'CET', 'MET', and 'EET' (see the file 'europe').
+Some other old-fashioned names are not deprecated, such as
+'WET', 'CET', 'MET', and 'EET', see the 'europe' file.
Stephen Colebourne wrote:
+The file 'backward' consists of Link entries mapping deprecated names +to their replacements. Usage of a deprecated name is discouraged. +As such, the file permits canonicalization
We've never said the 'backward' names are deprecated before. Why start asking users to change names now? We have no plans to remove them.

The business with 'canonical' is problematic, given that we're thinking of letting installers or users filter out names that are no different pre-1970 (or pre-whatever), and this may well omit names that would be considered 'canonical' by some. If we head in this direction, I expect that there will be no canonicalization that is good for all applications. Users who care only about post-1970 timestamps might consider Europe/Rome to be the canonical name for Europe/Vatican; users who care only about current and future timestamps might say Europe/Berlin; and users who care about political issues (or who assume the Vatican's secret archives will eventually uncover a local time differing from Rome's!) might consider Europe/Vatican to be canonical all by itself.
On 5 September 2013 02:48, Paul Eggert <eggert@cs.ucla.edu> wrote:
Stephen Colebourne wrote:
+The file 'backward' consists of Link entries mapping deprecated names +to their replacements. Usage of a deprecated name is discouraged. +As such, the file permits canonicalization
We've never said the 'backward' names are deprecated before. Why start asking users to change names now? We have no plans to remove them.
We've just had a long discussion to achieve the separation described in the proposal. Specifically, that files like zone.tab should not reference IDs in backward. I really want to see that captured so we don't have to repeat ourselves.

If you want to weaken the wording, remove the word deprecated for example, I might be open to that. But I do believe that being clearer about why the file exists and what it can be used for by data consumers is helpful.

Right now we have three things:
- Zone entries
- active Link entries, for locations that happen to have the entire local time history of another zone
- inactive Link entries, that used to have meaning but are no longer favoured (the backward file)

I'm arguing that the Theory file needs to be explicit about the difference between the last two. Not least because that distinction is important to some consumers of the data (I expect to use it to provide some form of canonicalization in JSR-310 at some point for example).

BTW, I don't expect one form of canonicalization to be suitable for all, but I do think canonicalizing inactive IDs to active ones is the most sensible form of canonicalization (to avoid political offence for example).

Stephen
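The backward-only canonicalization described in this message can be sketched roughly as follows. This is a minimal Python illustration, not code from the tz distribution or from JSR-310: the function names `parse_links` and `canonicalize_inactive` are hypothetical, the two Link lines are an illustrative excerpt rather than the full 'backward' file, and the parser assumes the simple `Link TARGET LINK-NAME` layout quoted elsewhere in this thread.

```python
def parse_links(source_text):
    """Return {link_name: target} for each 'Link TARGET LINK-NAME' line."""
    links = {}
    for raw_line in source_text.splitlines():
        # Drop any trailing '#' comment, then split on whitespace.
        fields = raw_line.split("#", 1)[0].split()
        if len(fields) >= 3 and fields[0] == "Link":
            target, link_name = fields[1], fields[2]
            links[link_name] = target
    return links


def canonicalize_inactive(name, backward_links):
    """Map a name defined in 'backward' to its target; leave other names alone."""
    return backward_links.get(name, name)


# Illustrative excerpt only (assumption), not the full 'backward' file.
BACKWARD_EXCERPT = """\
Link Asia/Kolkata Asia/Calcutta
Link America/New_York US/Eastern
"""

backward_links = parse_links(BACKWARD_EXCERPT)
print(canonicalize_inactive("Asia/Calcutta", backward_links))   # Asia/Kolkata
print(canonicalize_inactive("Europe/Vatican", backward_links))  # Europe/Vatican
```

Under this scheme a name such as Europe/Vatican is left untouched, because only names defined in 'backward' are rewritten.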
Stephen Colebourne wrote:
Right now we have three things:
- Zone entries
- active Link entries, for locations that happen to have the entire local time history of another zone
- inactive Link entries, that used to have meaning but are no longer favoured (the backward file)
It's not that simple. We also have aliases in the non-backward files. Two examples are 'Europe/Nicosia' and 'GMT'. I don't see why these names should be considered canonical; although they are active link entries, they are not for differing locations.
On 5 September 2013 13:15, Paul Eggert <eggert@cs.ucla.edu> wrote:
Stephen Colebourne wrote:
Right now we have three things:
- Zone entries
- active Link entries, for locations that happen to have the entire local time history of another zone
- inactive Link entries, that used to have meaning but are no longer favoured (the backward file)
It's not that simple. We also have aliases in the non-backward files. Two examples are 'Europe/Nicosia' and 'GMT'. I don't see why these names should be considered canonical; although they are active link entries, they are not for differing locations.
My primary concern is ensuring that the backward file is removable - i.e. no other files contain the "inactive" backward Link entries. It's this that I want to see documented more than anything else.

Beyond that, I think consumers will (and already do) treat the backward file as a source of canonicalization, simply as a result of the above. I think that the ability to use it as such should be documented, but I can live with the minimal change proposed in the paragraph above, for example:

"The file 'backward' consists of Link entries mapping two names. These are typically interpreted as a link to a modern name from an older name. If the data set is used with the 'backward' file excluded, then it must remain logical and complete."

Stephen
Stephen Colebourne wrote:
My primary concern is ensuring that the backward file is removable - i.e. no other files contain the "inactive" backward Link entries.
These statements don't match. For the "backward" file to be removable what you need is that the "backward" file does not contain any "active" entries. No other files containing "inactive" entries is what's needed to have an easy way of feeding only "active" entries to zic. Maybe you want both of these things?

I'd also like to see the active and inactive links distinguished somehow, for internal clarity. I'm not so concerned about the mechanism by which they're distinguished, but segregating them fully by file certainly has its attractions. FWIW, my winnowing branch effectively distinguishes between active and inactive links for geographical zones by whether it includes population data. (It doesn't thus distinguish active/inactive non-geographical zones.)

-zefram
On 5 September 2013 15:15, Zefram <zefram@fysh.org> wrote:
Stephen Colebourne wrote:
My primary concern is ensuring that the backward file is removable - i.e. no other files contain the "inactive" backward Link entries.
These statements don't match. For the "backward" file to be removable what you need is that the "backward" file does not contain any "active" entries. No other files containing "inactive" entries is what's needed to have an easy way of feeding only "active" entries to zic. Maybe you want both of these things?
I'd also like to see the active and inactive links distinguished somehow, for internal clarity. I'm not so concerned about the mechanism by which they're distinguished, but segregating them fully by file certainly has its attractions. FWIW, my winnowing branch effectively distinguishes between active and inactive links for geographical zones by whether it includes population data. (It doesn't thus distinguish active/inactive non-geographical zones.)
The backward file contains this line for example:

Link Asia/Kolkata Asia/Calcutta

I'm simply arguing that the Asia/Calcutta ID should only appear in backward, with Asia/Kolkata being used everywhere else. That way, excluding the backward file has no effect other than removing "no longer desirable" IDs.

Stephen
Stephen Colebourne wrote:
I'm simply arguing that the Asia/Calcutta ID should only appear in backward, with Asia/Kolkata being used everywhere else. That way, excluding the backward file has no effect other than removing "no longer desirable" IDs.
Once again, these two statements don't match. The first argues that there should not be inactive links in non-backward files; the second that there should not be active links in the backward file. They're different, but related, objectives. I think you're really seeking both.

-zefram
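One way to see how the two objectives relate is a mechanical check that reports every name defined in 'backward' that also appears in some other file. The sketch below is a hedged Python illustration only: `backward_names` and `leaked_names` are hypothetical helpers, and the 'zone.tab' and 'asia' snippets are made-up, simplified sample data rather than real file contents. The check can only flag an overlap; whether a hit means an active entry is misplaced in 'backward' or an inactive name has leaked into another file remains a human judgment.

```python
def backward_names(backward_text):
    """Collect the link names defined by 'Link TARGET LINK-NAME' lines."""
    names = set()
    for raw_line in backward_text.splitlines():
        fields = raw_line.split("#", 1)[0].split()
        if len(fields) >= 3 and fields[0] == "Link":
            names.add(fields[2])
    return names


def leaked_names(backward_text, other_files):
    """Return {filename: sorted names} for 'backward' names found in other files.

    other_files maps a file name to that file's text.
    """
    inactive = backward_names(backward_text)
    report = {}
    for filename, text in other_files.items():
        tokens = set()
        for raw_line in text.splitlines():
            # Ignore comments, then treat every whitespace-separated field as a token.
            tokens.update(raw_line.split("#", 1)[0].split())
        hits = sorted(inactive & tokens)
        if hits:
            report[filename] = hits
    return report


# Made-up, simplified sample data (assumption), for illustration only.
SAMPLE_BACKWARD = "Link Asia/Kolkata Asia/Calcutta\n"
SAMPLE_OTHERS = {
    "zone.tab": "IN\t+2232+08822\tAsia/Calcutta\n",
    "asia": "Zone Asia/Kolkata 5:30 - IST\n",
}

print(leaked_names(SAMPLE_BACKWARD, SAMPLE_OTHERS))
```

An empty report would mean that excluding 'backward' removes only the names it defines and leaves no dangling reference in the files checked.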
Zefram wrote:
For the "backward" file to be removable what you need is that the "backward" file does not contain any "active" entries. No other files containing "inactive" entries is what's needed to have an easy way of feeding only "active" entries to zic. Maybe you want both of these things?
I think Stephen wants both: he wants canonicalization, and he wants 'backward' to be removable. But 'backward' really has only the second role; it doesn't have the first one. It's reasonable to note the second role in 'Theory'. I've pushed the following proposed patch to do that; it says essentially what Stephen proposed, albeit more briefly. Canonicalization we can leave for another day.

* Theory: Say that excluding 'backward' should not affect other data.
Suggested by Stephen Colebourne in
<http://mm.icann.org/pipermail/tz/2013-September/019939.html>.

diff --git a/Theory b/Theory
index 39ef7ac..13b5565 100644
--- a/Theory
+++ b/Theory
@@ -368,7 +368,8 @@ this relationship is not exact and is not true for 'time.tab'.
 Older versions of this package used a different naming scheme,
 and these older names are still supported.
 See the file 'backward' for most of these older names
-(e.g. 'US/Eastern' instead of 'America/New_York').
+(e.g. 'US/Eastern' instead of 'America/New_York');
+excluding 'backward' should not affect the other data.
 The other old-fashioned names still supported are
 'WET', 'CET', 'MET', and 'EET' (see the file 'europe').
On 5 September 2013 15:47, Paul Eggert <eggert@cs.ucla.edu> wrote:
* Theory: Say that excluding 'backward' should not affect other data.
Suggested by Stephen Colebourne in
<http://mm.icann.org/pipermail/tz/2013-September/019939.html>.

diff --git a/Theory b/Theory
index 39ef7ac..13b5565 100644
--- a/Theory
+++ b/Theory
@@ -368,7 +368,8 @@ this relationship is not exact and is not true for 'time.tab'.
 Older versions of this package used a different naming scheme,
 and these older names are still supported.
 See the file 'backward' for most of these older names
-(e.g. 'US/Eastern' instead of 'America/New_York').
+(e.g. 'US/Eastern' instead of 'America/New_York');
+excluding 'backward' should not affect the other data.
 The other old-fashioned names still supported are
 'WET', 'CET', 'MET', and 'EET' (see the file 'europe').
For the sake of moving forward I'll accept this as a compromise, although I don't think it goes far enough.

Stephen
Stephen Colebourne wrote:
"The file 'backward' consists of Link entries mapping two names. These are typically interpretted as a link to a modern name from an older name. If the data set is used with the 'backward' file file excluded, then it must remain logical and complete."
'backward' is just another cross reference. Adding its data to the 'extended' file I'm outlining, together with the from and to dates during which each name was used in published data, puts back missing history ... This is starting to jell ...

--
Lester Caine - G8HFL
-----------------------------
Contact - http://lsces.co.uk/wiki/?page=contact
L.S.Caine Electronic Services - http://lsces.co.uk
EnquirySolve - http://enquirysolve.com/
Model Engineers Digital Workshop - http://medw.co.uk
Rainbow Digital Media - http://rainbowdigitalmedia.co.uk
On Sep 4, 2013, at 10:01 AM, Stephen Colebourne <scolebourne@joda.org> wrote:
+The file 'backward' consists of Link entries mapping deprecated names +to their replacements. Usage of a deprecated name is discouraged. +As such, the file permits canonicalization, where the use of a +deprecated name can be automatically replaced with the replacement.
Note that, as Norbert Lindenberg indicates, this is *NOT* the canonicalization being proposed for JavaScript; the canonicalization being proposed for JavaScript is to replace a name that is defined as a Link, *whether it's defined in the "backward" file or not*, with its replacement.

It might be a useful form of canonicalization, but it's not the JavaScript form, for what that's worth.
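The difference between the two forms can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the JavaScript proposal's actual algorithm: `resolve_link` is a hypothetical helper, and the mapping of Europe/Nicosia to Asia/Nicosia is sample data assumed here to stand for a Link defined outside 'backward' (the thread names Europe/Nicosia as such an alias but does not give its target).

```python
def resolve_link(name, links):
    """Follow Link definitions until reaching a name that is not itself a Link."""
    seen = set()
    while name in links and name not in seen:
        seen.add(name)  # guard against accidental cycles in the sample data
        name = links[name]
    return name


# Sample data (assumption): link name -> target.
BACKWARD_ONLY = {"Asia/Calcutta": "Asia/Kolkata"}
ALL_LINKS = {**BACKWARD_ONLY, "Europe/Nicosia": "Asia/Nicosia"}

# Link-based form: every Link is replaced, whichever file defines it.
print(resolve_link("Europe/Nicosia", ALL_LINKS))      # Asia/Nicosia
# Backward-only form: Links defined outside 'backward' are left as they are.
print(resolve_link("Europe/Nicosia", BACKWARD_ONLY))  # Europe/Nicosia
```

Both forms agree on names defined in 'backward'; they differ only for Links defined in the other files.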
participants (5)
- Guy Harris
- Lester Caine
- Paul Eggert
- Stephen Colebourne
- Zefram