[PATCH] tzselect: new options -c COORD and -n LIMIT
These let the user select TZ based on geographical coordinates.
Combining '-c COORD' with '-t time' better insulates the user from
issues of nationality.
* tzselect.8 (SYNOPSIS, OPTIONS): Document the new options.
* tzselect.ksh: Implement them, using the great-circle special case
of the Vincenty formula for distances on ellipsoids.
(LC_ALL): Set to C, since tzselect is English only.  That way, we
treat decimal points in -c option operands the same in all environments.
(usage): Document new options.  Document existing ones better.
(output_distances): New variable.
---
 tzselect.8   |  26 ++++++++++
 tzselect.ksh | 162 +++++++++++++++++++++++++++++++++++++++++++++++++++++++----
 2 files changed, 177 insertions(+), 11 deletions(-)

diff --git a/tzselect.8 b/tzselect.8
index b5e2c88..884af36 100644
--- a/tzselect.8
+++ b/tzselect.8
@@ -4,6 +4,12 @@ tzselect \- select a time zone
 .SH SYNOPSIS
 .B tzselect
 [
+.B \-c
+.I coord
+] [
+.B \-n
+.I limit
+] [
 .B \-t
 .I zonetabtype
 ] [
@@ -21,6 +27,26 @@ The output is suitable as a value for the TZ environment variable.
 All interaction with the user is done via standard input and standard error.
 .SH OPTIONS
 .TP
+.BI "\-c " coord
+Instead of asking for continent and then country and then city,
+ask for selection from time zones whose largest cities
+are closest to the location with geographical coordinates
+.IR coord .
+Use ISO 6709 notation for
+.IR coord ,
+for example,
+.B "\-c\ \+42.391415\-071.570419"
+for 42.391415\(de\|N, 71.570419\(de\|W, and
+.B "\-c\ +404226\-0740319"
+for 48\(de\|42\(fm\|26\(sd\|N, 74\(de\|3\(fm\|19\(sd\|W.
+.TP
+.BI "\-n " limit
+When
+.B \-c
+is used, display the closest
+.I limit
+locations (default 10).
+.TP
 .BI "\-t " zonetabtype
 Make selections from the time zone table of type
 .IR zonetabtype .
diff --git a/tzselect.ksh b/tzselect.ksh
index dc9f256..da6adb2 100644
--- a/tzselect.ksh
+++ b/tzselect.ksh
@@ -1,5 +1,7 @@
 #!/bin/bash
 
+export LC_ALL=C
+
 PKGVERSION='(tzcode) '
 TZVERSION=see_Makefile
 REPORT_BUGS_TO=tz@iana.org
@@ -41,15 +43,43 @@ ZONETABTYPE=zone
 	exit 1
 }
 
-usage="Usage: tzselect [--version] [--help] [-t ZONETABTYPE]
+coord=
+location_limit=10
+
+usage="Usage: tzselect [--version] [--help] [-c COORD] [-n LIMIT] [-t ZONETABTYPE]
 Select a time zone interactively.
-ZONETABTYPE should be one of 'time' or 'zone'.
+
+Options:
+
+  -c COORD
+    Instead of asking for continent and then country and then city,
+    ask for selection from time zones whose largest cities
+    are closest to the location with geographical coordinates COORD.
+    COORD should use ISO 6709 notation, for example, '-c +4852+00220'
+    for Paris.
+
+  -n LIMIT
+    Display at most LIMIT locations when -c is used (default $location_limit).
+
+  -t ZONETABTYPE
+    Use time zone table ZONETABTYPE.  ZONETABTYPE should be one of
+    'time' or 'zone'.
+
+  --version
+    Output version information.
+
+  --help
+    Output this help.
 
 Report bugs to $REPORT_BUGS_TO."
 
-while getopts t:-: opt
+while getopts c:n:t:-: opt
 do
 	case $opt$OPTARG in
+	c*)
+		coord=$OPTARG ;;
+	n*)
+		location_limit=$OPTARG ;;
 	t*)
 		ZONETABTYPE=$OPTARG ;;
 	-help)
@@ -90,6 +120,66 @@ case $(echo 1 | (select x in x; do break; done) 2>/dev/null) in
 ?*) PS3=
 esac
 
+# Awk script to read a time zone table and output the same table,
+# with each column preceded by its distance from 'here'.
+output_distances='
+  BEGIN {
+    FS = "\t"
+    while (getline <TZ_COUNTRY_TABLE)
+      if ($0 ~ /^[^#]/)
+        country[$1] = $2
+    country["US"] = "US" # Otherwise the strings get too long.
+  }
+  function cvt1(coord, deg, min, ilen, sign, sec) {
+    ilen = length(coord)
+    sign = substr(coord, 1, 1)
+    if (coord ~ /\./) {
+      deg = coord + 0
+    } else {
+      if (ilen <= 6) {
+        sec = 0
+      } else {
+        sec = sign substr(coord, ilen - 1)
+        ilen -= 2
+      }
+      min = sign substr(coord, ilen - 1, 2)
+      deg = substr(coord, 1, ilen - 2)
+      deg = (deg * 3600.0 + min * 60.0 + sec) / 3600.0
+    }
+    return deg * 0.017453292519943295
+  }
+  function convert_latitude(coord) {
+    match(coord, /..*[-+]/)
+    return cvt1(substr(coord, 1, RLENGTH - 1))
+  }
+  function convert_longitude(coord) {
+    match(coord, /..*[-+]/)
+    return cvt1(substr(coord, RLENGTH))
+  }
+  # Great-circle distance between points with given latitude and longitude.
+  # Inputs and output are in radians.  This uses the great-circle special
+  # case of the Vincenty formula for distances on ellipsoids.
+  function dist(lat1, long1, lat2, long2, dlong, x, y, num, denom) {
+    dlong = long2 - long1
+    x = cos (lat2) * sin (dlong)
+    y = cos (lat1) * sin (lat2) - sin (lat1) * cos (lat2) * cos (dlong)
+    num = sqrt (x * x + y * y)
+    denom = sin (lat1) * sin (lat2) + cos (lat1) * cos (lat2) * cos (dlong)
+    return atan2(num, denom)
+  }
+  BEGIN {
+    coord_lat = convert_latitude(coord)
+    coord_long = convert_longitude(coord)
+  }
+  /^[^#]/ {
+    here_lat = convert_latitude($2)
+    here_long = convert_longitude($2)
+    line = $1 "\t" $2 "\t" $3 "\t" country[$1]
+    if (NF == 4)
+      line = line " - " $4
+    printf "%g\t%s\n", dist(coord_lat, coord_long, here_lat, here_long), line
+  }
+'
 
 # Begin the main loop.  We come back here if the user wants to retry.
 while
@@ -101,10 +191,14 @@ while
 	country=
 	region=
 
+	case $coord in
+	?*)
+		continent=coord;;
+	'')
 
 	# Ask the user for continent or ocean.
 
-	echo >&2 'Please select a continent or ocean.'
+	echo >&2 'Please select a continent, ocean, "coord", or "TZ".'
 
 	select continent in \
 	    Africa \
@@ -117,7 +211,8 @@ while
 	    Europe \
 	    'Indian Ocean' \
 	    'Pacific Ocean' \
-	    'none - I want to specify the time zone using the Posix TZ format.'
+	    'coord - I want to use geographical coordinates.' \
+	    'TZ - I want to specify the time zone using the Posix TZ format.'
 	do
 	    case $continent in
 	    '')
@@ -130,10 +225,12 @@ while
 		break
 	    esac
 	done
+	esac
+
 	case $continent in
 	'')
 		exit 1;;
-	none)
+	TZ)
 		# Ask the user for a Posix TZ string.  Check that it conforms.
 		while
 			echo >&2 'Please enter the desired value' \
@@ -158,6 +255,45 @@ while
 		done
 		TZ_for_date=$TZ;;
 	*)
+		case $continent in
+		coord)
+		    case $coord in
+		    '')
+			echo >&2 'Please enter coordinates' \
+				'in ISO 6709 notation.'
+			echo >&2 'For example, +4042-07403 stands for'
+			echo >&2 '40 degrees 42 minutes north,' \
+				'74 degrees 3 minutes west.'
+			read coord;;
+		    esac
+		    distance_table=$($AWK \
+			    -v coord="$coord" \
+			    -v TZ_COUNTRY_TABLE="$TZ_COUNTRY_TABLE" \
+			    "$output_distances" <$TZ_ZONE_TABLE |
+		      sort -n |
+		      sed "${location_limit}q"
+		    )
+		    regions=$(echo "$distance_table" | $AWK '
+		      BEGIN { FS = "\t" }
+		      { print $NF }
+		    ')
+		    echo >&2 'Please select one of the following' \
+			    'time zone regions,'
+		    echo >&2 'listed roughly in increasing order' \
+			    "of distance from $coord".
+		    select region in $regions
+		    do
+			case $region in
+			'') echo >&2 'Please enter a number in range.';;
+			?*) break;;
+			esac
+		    done
+		    TZ=$(echo "$distance_table" | $AWK -v region="$region" '
+		      BEGIN { FS="\t" }
+		      $NF == region { print $4 }
+		    ')
+		    ;;
+		*)
 		# Get list of names of countries in the continent or ocean.
 		countries=$($AWK -F'\t' \
 			-v continent="$continent" \
@@ -185,7 +321,8 @@ while
 		# If there's more than one country, ask the user which one.
 		case $countries in
 		*"$newline"*)
-			echo >&2 'Please select a country.'
+			echo >&2 'Please select a country' \
+				'whose clocks agree with yours.'
 			select country in $countries
 			do
 			    case $country in
@@ -256,6 +393,7 @@ while
 			}
 			$1 == cc && $4 == region { print $3 }
 		' <$TZ_ZONE_TABLE)
+		esac
 
 		# Make sure the corresponding zoneinfo file exists.
 		TZ_for_date=$TZDIR/$TZ
@@ -292,9 +430,11 @@ Universal Time is now:	$UTdate."
 	echo >&2 ""
 	echo >&2 "The following information has been given:"
 	echo >&2 ""
-	case $country+$region in
-	?*+?*)	echo >&2 "	$country$newline	$region";;
-	?*+)	echo >&2 "	$country";;
+	case $country%$region%$coord in
+	?*%?*%)	echo >&2 "	$country$newline	$region";;
+	?*%%)	echo >&2 "	$country";;
+	%?*%?*) echo >&2 "	coord $coord$newline	$region";;
+	%%?*)	echo >&2 "	coord $coord";;
 	+)	echo >&2 "	TZ='$TZ'"
 	esac
 	echo >&2 ""
@@ -313,7 +453,7 @@ Universal Time is now:	$UTdate."
 	'') exit 1;;
 	Yes) break
 	esac
-do :
+do coord=
 done
 
 case $SHELL in
-- 
1.8.1.2
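For readers who want to try the distance calculation outside tzselect, here is a minimal sketch. It is not part of the patch: the wrapper name great_circle and its decimal-degree arguments are assumptions made for illustration, while the awk body mirrors the dist function above and returns the angle in radians.

```shell
#!/bin/sh
# Hypothetical wrapper, not part of the patch: print the great-circle angle
# (in radians) between two points given as decimal degrees.
# Usage: great_circle LAT1 LONG1 LAT2 LONG2
great_circle() {
  awk -v a1="$1" -v o1="$2" -v a2="$3" -v o2="$4" '
    # Same arithmetic as dist() in the patch; inputs and output in radians.
    function dist(lat1, long1, lat2, long2,    dlong, x, y, num, denom) {
      dlong = long2 - long1
      x = cos(lat2) * sin(dlong)
      y = cos(lat1) * sin(lat2) - sin(lat1) * cos(lat2) * cos(dlong)
      num = sqrt(x * x + y * y)
      denom = sin(lat1) * sin(lat2) + cos(lat1) * cos(lat2) * cos(dlong)
      return atan2(num, denom)
    }
    BEGIN {
      rad = 0.017453292519943295   # radians per degree, as in the patch
      printf "%.4f\n", dist(a1 * rad, o1 * rad, a2 * rad, o2 * rad)
    }'
}

great_circle 0 0 0 90     # a quarter of the way around the equator
great_circle 0 0 0 180    # antipodal points on the equator
```

tzselect only sorts by this angle, so no radius is needed there; multiplying the result by a mean Earth radius of about 6371 km would give an approximate distance in kilometres.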
On Tue, 20 Aug 2013, Paul Eggert wrote:
> +  -c COORD
> +    Instead of asking for continent and then country and then city,
> +    ask for selection from time zones whose largest cities
> +    are closest to the location with geographical coordinates COORD.
> +    COORD should use ISO 6709 notation, for example, '-c +4852+00220'
> +    for Paris.
Thank you, this looks useful.  However, I believe that ISO 6709
allows several variations, including:

    degrees only (e.g. +49+002);
    ... with decimal degrees (e.g. +48.85+002.35);
    degrees and minutes (e.g. +4851+00221);
    ... with decimal minutes (e.g. +4851.40+00221.05);
    degrees, minutes, and seconds (e.g. +485124+0022103);
    ... with decimal seconds (e.g. +485124.1+0022103.2);

The awk code can't handle the integer degrees form, but it can handle
degrees with decimal degrees.  When I attempt to use integer degrees,
it seems to treat the value as minutes instead of degrees (try "+49+002"
and it will suggest some time zones in Africa instead of in Europe, but
try "+49.0+002.0" and it works).

I think the code should count the digits to figure out whether the
integer part is in degrees (DD or DDD), degrees and minutes (DDMM or
DDDMM), or degrees, minutes and seconds (DDMMSS or DDDMMSS).
Alternatively, it could accept only a subset of the possible
variations, provided that is clearly documented.

I believe that ISO 6709 requires exactly two digits for the degrees
part of the latitude, exactly three digits for the degrees part of the
longitude, and exactly two digits for any non-fractional minutes or
seconds.  The awk code seems to relax this, especially where there is a
decimal point, and I think that's useful for the integer degrees case
and the degrees with decimal degrees case (e.g. to allow "+49+2" or
"+48.9+2.3" instead of "+49+002" or "+48.9+002.3"), but dangerous for
cases that involve minutes or seconds, because counting the digits is
necessary to disambiguate those cases.

Whatever you do, I think it needs a few more examples of coordinates in
different formats.

--apb (Alan Barrett)
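The digit-counting idea can be sketched as follows. This is only an illustration under stated assumptions, not code from the patch or from any follow-up: the helper name to_degrees and its second argument (the number of degree digits, 2 for latitude and 3 for longitude) are hypothetical, and it takes the strict reading in which the degree field has exactly that many digits.

```shell
#!/bin/sh
# Hypothetical helper, not part of the patch: convert one signed ISO 6709
# component to decimal degrees by counting the digits before the decimal
# point.  $1 is the component (for example +4851.40); $2 is the number of
# degree digits (2 for latitude, 3 for longitude).
to_degrees() {
  awk -v c="$1" -v dd="$2" '
    BEGIN {
      sign = (substr(c, 1, 1) == "-") ? -1 : 1
      body = c
      sub(/^[-+]/, "", body)        # digits and optional fraction
      ipart = body
      sub(/\..*/, "", ipart)        # integer digits only
      n = length(ipart)
      deg = substr(body, 1, dd)
      rest = substr(body, dd + 1)   # minutes and seconds, with any fraction
      if (n == dd)                  # DD or DDD, optional decimal degrees
        val = body + 0
      else if (n == dd + 2)         # DDMM or DDDMM, optional decimal minutes
        val = deg + rest / 60
      else if (n == dd + 4)         # DDMMSS or DDDMMSS, optional decimal seconds
        val = deg + substr(rest, 1, 2) / 60 + substr(rest, 3) / 3600
      else
        exit 1                      # digit count matches no supported form
      printf "%.4f\n", sign * val
    }'
}

to_degrees +49 2          # degrees only
to_degrees +4851 2        # degrees and minutes
to_degrees -0740240 3     # degrees, minutes, and seconds
```

A relaxed variant that also accepts "+49+2" would need a different rule for the degrees-only case, since the digit count alone no longer identifies the form.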
Alan Barrett wrote:
> On Tue, 20 Aug 2013, Paul Eggert wrote:
> > +  -c COORD
Nifty. I wonder what input methods OS folks will stick on the front of that. Probably not many of the general public can reel off their geographical coordinates, numerically, to within a degree. (But then this code doesn't require the input to be accurate even as coarsely as a degree.)
> Thank you, this looks useful.  However, I believe that ISO 6709
> allows several variations, including:
I think tzselect only needs to handle the forms actually seen in
zone.tab.  The format notes at the top of zone.tab explicitly limit it
to two of the formats that ISO 6709 permits.  (Still would be *nice* to
handle all ISO 6709 cases, and, separately, even nicer to detect
unsupported/invalid formats.)

-zefram
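Detecting unsupported input could look something like the sketch below. It is an assumption-laden illustration rather than anything proposed in the thread: the function name is_zone_tab_coord is hypothetical, and the pattern covers only the two zone.tab forms (sign-DDMM sign-DDDMM and sign-DDMMSS sign-DDDMMSS) without range-checking the values.

```shell
#!/bin/sh
# Hypothetical check, not part of the patch: succeed only if $1 has one of
# the two coordinate forms used in zone.tab.  The digits are not
# range-checked, so +9999+99999 would still be accepted.
is_zone_tab_coord() {
  printf '%s\n' "$1" |
    grep -Eq '^[-+][0-9]{4}[-+][0-9]{5}$|^[-+][0-9]{6}[-+][0-9]{7}$'
}

for c in +4852+00220 +404226-0740319 +49+002
do
  if is_zone_tab_coord "$c"
  then echo "$c: zone.tab form"
  else echo "$c: some other form"
  fi
done
```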
Thanks for the comments. I'll submit a patch shortly to implement most of the suggestions.
so that 'tzselect -t time' doesn't create a blind alley for the Arctic.
---
 tzselect.ksh | 54 +++++++++++++++++++++++++++++++-----------------------
 1 file changed, 31 insertions(+), 23 deletions(-)

diff --git a/tzselect.ksh b/tzselect.ksh
index da6adb2..b8f2bc6 100644
--- a/tzselect.ksh
+++ b/tzselect.ksh
@@ -200,31 +200,39 @@ while
 
 	echo >&2 'Please select a continent, ocean, "coord", or "TZ".'
 
-	select continent in \
-	    Africa \
-	    Americas \
-	    Antarctica \
-	    'Arctic Ocean' \
-	    Asia \
-	    'Atlantic Ocean' \
-	    Australia \
-	    Europe \
-	    'Indian Ocean' \
-	    'Pacific Ocean' \
-	    'coord - I want to use geographical coordinates.' \
-	    'TZ - I want to specify the time zone using the Posix TZ format.'
-	do
-	    case $continent in
-	    '')
-		echo >&2 'Please enter a number in range.';;
-	    ?*)
+	quoted_continents=$(
+	  $AWK -F'\t' '
+	    /^[^#]/ {
+	      entry = substr($3, 1, index($3, "/") - 1)
+	      if (entry == "America")
+		entry = entry "s"
+	      if (entry ~ /^(Arctic|Atlantic|Indian|Pacific)$/)
+		entry = entry " Ocean"
+	      printf "'\''%s'\''\n", entry
+	    }
+	  ' $TZ_ZONE_TABLE |
+	  sort -u |
+	  tr '\n' ' '
+	  echo ''
+	)
+
+	eval '
+	    select continent in '"$quoted_continents"' \
+		"coord - I want to use geographical coordinates." \
+		"TZ - I want to specify the time zone using the Posix TZ format."
+	    do
 		case $continent in
-		Americas) continent=America;;
-		*' '*) continent=$(expr "$continent" : '\([^ ]*\)')
+		"")
+		    echo >&2 "Please enter a number in range.";;
+		?*)
+		    case $continent in
+		    Americas) continent=America;;
+		    *" "*) continent=$(expr "$continent" : '\''\([^ ]*\)'\'')
+		    esac
+		    break
 		esac
-		break
-	    esac
-	done
+	    done
+	'
 	esac
 
 	case $continent in
-- 
1.8.1.2
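To see what the new menu would contain for a given table, the awk filter from this patch can be run on a few rows by itself. The sketch below is illustrative only: the function name list_continents and the sample rows are assumptions, and the quoting step that the patch needs for eval is left out so that the output is one plain name per line.

```shell
#!/bin/sh
# Hypothetical helper, not part of the patch: read rows in zone.tab layout
# (country code, coordinates, TZ name, tab-separated) on standard input and
# print the distinct continent and ocean names that the menu would offer.
list_continents() {
  awk -F'\t' '
    /^[^#]/ {
      entry = substr($3, 1, index($3, "/") - 1)   # text before the first slash
      if (entry == "America")
        entry = entry "s"
      if (entry ~ /^(Arctic|Atlantic|Indian|Pacific)$/)
        entry = entry " Ocean"
      print entry
    }' | sort -u
}

# Sample rows chosen for illustration.
printf '%s\t%s\t%s\n' \
  US +404251-0740023 America/New_York \
  FR +4852+00220 Europe/Paris \
  SJ +7800+01600 Arctic/Longyearbyen \
  CA +4339-07923 America/Toronto |
  list_continents
```

With a table that has no Arctic rows, "Arctic Ocean" simply does not appear, which seems to be the point of deriving the list from the table instead of hard-coding it.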
* tzselect.ksh (LC_ALL): Don't set this, so that the user can use the
locale's decimal point in coordinates.
(convert_coord): Rename from cvt1.  All callers changed.
Support more ISO 6709 forms.
* tzselect.8: Document -c better, including the new forms.
---
 tzselect.8   | 43 +++++++++++++++++++++++++++++++++++--------
 tzselect.ksh | 39 ++++++++++++++++++---------------------
 2 files changed, 53 insertions(+), 29 deletions(-)

diff --git a/tzselect.8 b/tzselect.8
index 884af36..39436ae 100644
--- a/tzselect.8
+++ b/tzselect.8
@@ -31,14 +31,41 @@ All interaction with the user is done via standard input and standard error.
 Instead of asking for continent and then country and then city,
 ask for selection from time zones whose largest cities
 are closest to the location with geographical coordinates
-.IR coord .
+.I coord.
 Use ISO 6709 notation for
-.IR coord ,
-for example,
-.B "\-c\ \+42.391415\-071.570419"
-for 42.391415\(de\|N, 71.570419\(de\|W, and
-.B "\-c\ +404226\-0740319"
-for 48\(de\|42\(fm\|26\(sd\|N, 74\(de\|3\(fm\|19\(sd\|W.
+.I coord,
+that is, a latitude immediately followed by a longitude.  The latitude
+and longitude should be signed integers followed by an optional
+decimal point and fraction: positive numbers represent north and east,
+negative south and west.  Latitudes with two and longitudes with three
+integer digits are treated as degrees; latitudes with four or six and
+longitudes with five or seven integer digits are treated as
+.I "DDMM, DDDMM, DDMMSS,"
+or
+.I DDDMMSS
+representing
+.I DD
+or
+.I DDD
+degrees,
+.I MM
+minutes,
+and zero or
+.I SS
+seconds, with any trailing fractions represent fractional minutes or
+(if
+.I SS
+is present) seconds.  The decimal point is that of the current locale.
+For example, in the (default) C locale,
+.B "\-c\ +40.689\-074.045"
+specifies 40.689\(de\|N, 74.045\(de\|W,
+.B "\-c\ +4041.4\-07402.7"
+specifies 40\(de\|41.4\(fm\|N, 74\(de\|2.7\(fm\|W, and
+.B "\-c\ +404121\-0740240"
+specifies 40\(de\|41\(fm\|21\(sd\|N, 74\(de\|2\(fm\|40\(sd\|W.
+If
+.I coord
+is not one of the documented forms, the resulting behavior is unspecified.
 .TP
 .BI "\-n " limit
 When
@@ -49,7 +76,7 @@ locations (default 10).
 .TP
 .BI "\-t " zonetabtype
 Make selections from the time zone table of type
-.IR zonetabtype .
+.I zonetabtype.
 Possible
 .I zonetabtype
 values include:
diff --git a/tzselect.ksh b/tzselect.ksh
index b8f2bc6..798c704 100644
--- a/tzselect.ksh
+++ b/tzselect.ksh
@@ -1,7 +1,5 @@
 #!/bin/bash
 
-export LC_ALL=C
-
 PKGVERSION='(tzcode) '
 TZVERSION=see_Makefile
 REPORT_BUGS_TO=tz@iana.org
@@ -130,31 +128,30 @@ output_distances='
         country[$1] = $2
     country["US"] = "US" # Otherwise the strings get too long.
   }
-  function cvt1(coord, deg, min, ilen, sign, sec) {
-    ilen = length(coord)
-    sign = substr(coord, 1, 1)
-    if (coord ~ /\./) {
-      deg = coord + 0
-    } else {
-      if (ilen <= 6) {
-        sec = 0
-      } else {
-        sec = sign substr(coord, ilen - 1)
-        ilen -= 2
-      }
-      min = sign substr(coord, ilen - 1, 2)
-      deg = substr(coord, 1, ilen - 2)
-      deg = (deg * 3600.0 + min * 60.0 + sec) / 3600.0
-    }
-    return deg * 0.017453292519943295
+  function convert_coord(coord, deg, min, ilen, sign, sec) {
+    if (coord ~ /^[-+]?[0-9]?[0-9][0-9][0-9][0-9][0-9][0-9]([^0-9]|$)/) {
+      degminsec = coord
+      intdeg = degminsec < 0 ? -int(-degminsec / 10000) : int(degminsec / 10000)
+      minsec = degminsec - intdeg * 10000
+      intmin = minsec < 0 ? -int(-minsec / 100) : int(minsec / 100)
+      sec = minsec - intmin * 100
+      deg = (intdeg * 3600 + intmin * 60 + sec) / 3600
+    } else if (coord ~ /^[-+]?[0-9]?[0-9][0-9][0-9][0-9]([^0-9]|$)/) {
+      degmin = coord
+      intdeg = degmin < 0 ? -int(-degmin / 100) : int(degmin / 100)
+      min = degmin - intdeg * 100
+      deg = (intdeg * 60 + min) / 60
+    } else
+      deg = coord
+    return deg * 0.017453292519943296
   }
   function convert_latitude(coord) {
     match(coord, /..*[-+]/)
-    return cvt1(substr(coord, 1, RLENGTH - 1))
+    return convert_coord(substr(coord, 1, RLENGTH - 1))
   }
   function convert_longitude(coord) {
     match(coord, /..*[-+]/)
-    return cvt1(substr(coord, RLENGTH))
+    return convert_coord(substr(coord, RLENGTH))
   }
   # Great-circle distance between points with given latitude and longitude.
   # Inputs and output are in radians.  This uses the great-circle special
-- 
1.8.1.2
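The way convert_latitude and convert_longitude split COORD may not be obvious at first glance, so here is a small sketch of just that step. The wrapper name split_coord is hypothetical; the match call is the one used in the patch. Because awk's match returns the longest match starting at the first character, the pattern /..*[-+]/ runs up to the last sign in the string, and RLENGTH is therefore the position of the longitude's sign.

```shell
#!/bin/sh
# Hypothetical wrapper, not part of the patch: print the latitude and
# longitude parts of an ISO 6709 string, separated by a space.
split_coord() {
  awk -v coord="$1" '
    BEGIN {
      # Longest prefix that ends in a sign; RLENGTH is where the
      # longitude sign sits.
      match(coord, /..*[-+]/)
      print substr(coord, 1, RLENGTH - 1), substr(coord, RLENGTH)
    }'
}

split_coord +4852+00220    # degrees and minutes
split_coord -35-058        # degrees only
```

The leading "." in the pattern matters: it consumes the latitude's own sign, so an input such as "-35-058" is not split at its first character.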
On Tue, 20 Aug 2013, Paul Eggert wrote:
> * tzselect.ksh (LC_ALL): Don't set this, so that the user can use the
> locale's decimal point in coordinates.
> (convert_coord): Rename from cvt1.  All callers changed.
> Support more ISO 6709 forms.
> * tzselect.8: Document -c better, including the new forms.
Thank you.  I think that covers all the cases I know about, and the
documentation in tzselect.8 is good.

In the usage message in tzselect.ksh, I think it would be useful to add
another example, showing the use of degrees without minutes, and the use
of negative signs, something like this:

    COORD should use ISO 6709 notation, for example, '-c +4852+00220'
    for Paris (in degrees and minutes, North and East), or
    '-c -35-058' for Buenos Aires (in degrees, South and West).

--apb (Alan Barrett)
Thanks, I pushed that into github.
participants (3)
- Alan Barrett
- Paul Eggert
- Zefram