[PROPOSED 1/4] Allow “§” etc. in commentary
* Makefile (UNUSUAL_OK_LATIN_1): Allow all non-alphabetic, non-ASCII printable characters that are Latin-1. This is primarily for “§” and we might as well allow them all since even XEmacs 21 supports them all.
---
 Makefile | 7 ++++---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/Makefile b/Makefile
index 721a7452..17e3bb57 100644
--- a/Makefile
+++ b/Makefile
@@ -459,9 +459,10 @@ SAFE_CHARSET3= 'abcdefghijklmnopqrstuvwxyz{|}~'
 SAFE_CHARSET= $(SAFE_CHARSET1)$(SAFE_CHARSET2)$(SAFE_CHARSET3)
 SAFE_CHAR= '[]'$(SAFE_CHARSET)'-]'
 
-# These characters are Latin-1, and so are likely to be displayable
-# even in editors with limited character sets.
-UNUSUAL_OK_LATIN_1 = «°±»½¾×
+# These non-alphabetic, non-ASCII printable characters are Latin-1,
+# and so are likely displayable even in editors like XEmacs 21
+# that have limited character sets.
+UNUSUAL_OK_LATIN_1 = ¡¢£¤¥¦§¨©ª«¬®¯°±²³´µ¶·¸¹º»¼½¾¿×÷
 # This IPA symbol is represented in Unicode as the composition of
 # U+0075 and U+032F, and U+032F is not considered alphabetic by some
 # grep implementations that do not grok composition.
--
2.37.2
* australasia, northamerica: Fix capitalization in commentary about US law, as the law says “Chamorro standard time” not the popular “Chamorro Standard Time”.
---
 australasia  | 4 ++--
 northamerica | 5 +++--
 2 files changed, 5 insertions(+), 4 deletions(-)

diff --git a/australasia b/australasia
index af0410ab..749b72c8 100644
--- a/australasia
+++ b/australasia
@@ -1808,7 +1808,7 @@ Zone Pacific/Efate 11:13:16 - LMT 1912 Jan 13 # Vila
 # period. It would probably be reasonable to assume Guam use GMT+9 during
 # that period of time like the surrounding area.
 
-# From Paul Eggert (2018-11-18):
+# From Paul Eggert (2023-01-23):
 # Howse writes (p 153) "The Spaniards, on the other hand, reached the
 # Philippines and the Ladrones from America," and implies that the Ladrones
 # (now called the Marianas) kept American date for quite some time.
@@ -1821,7 +1821,7 @@ Zone Pacific/Efate 11:13:16 - LMT 1912 Jan 13 # Vila
 # they did as that avoids the need for a separate zone due to our 1970 cutoff.
 #
 # US Public Law 106-564 (2000-12-23) made UT +10 the official standard time,
-# under the name "Chamorro Standard Time". There is no official abbreviation,
+# under the name "Chamorro standard time". There is no official abbreviation,
 # but Congressman Robert A. Underwood, author of the bill that became law,
 # wrote in a press release (2000-12-27) that he will seek the use of "ChST".
 
diff --git a/northamerica b/northamerica
index 24b68e72..be5825b8 100644
--- a/northamerica
+++ b/northamerica
@@ -276,9 +276,10 @@ Zone PST8PDT -8:00 US P%sT
 # -10 Standard Alaska Time (AST) Alaska-Hawaii standard time (AHST)
 # -11 (unofficial) Nome (NST) Bering standard time (BST)
 #
-# From Paul Eggert (2000-01-08), following a heads-up from Rives McDow:
-# Public law 106-564 (2000-12-23) introduced ... "Chamorro Standard Time"
+# From Paul Eggert (2023-01-23), from a 2001-01-08 heads-up from Rives McDow:
+# Public law 106-564 (2000-12-23) introduced "Chamorro standard time"
 # for time in Guam and the Northern Marianas. See the file "australasia".
+# Also see 15 U.S.C. §263 <https://www.law.cornell.edu/uscode/text/15/263>.
 #
 # From Paul Eggert (2015-04-17):
 # HST and HDT are standardized abbreviations for Hawaii-Aleutian
--
2.37.2
* tz-how-to.html: Don’t imply that “HST” is incorrect, as it’s used in US regulations.
---
 tz-how-to.html | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/tz-how-to.html b/tz-how-to.html
index e1e28f2e..05013b45 100644
--- a/tz-how-to.html
+++ b/tz-how-to.html
@@ -548,8 +548,8 @@ a <code>SAVE</code> of zero.
 
 <ul>
 <li>The <a href="https://en.wikipedia.org/wiki/Tz_database">tz
-database</a> gives abbreviations for time zones in <i>popular
-usage</i>, which is not necessarily “correct” by law. For
+database</a> gives abbreviations for time zones
+in popular English-language usage. For
 example, the last line in
 <code>Zone</code> <code>Pacific/Honolulu</code> (shown below) gives
 “HST” for “Hawaii standard time” even though the
--
2.37.2
* zone.tab, zone1970.tab (America/Adak): Label it “Alaska - western Aleutians”, not “Aleutian Islands”, as it excludes the Aleutians that are east of 169° 30′ W. or are not part of Alaska.
---
 northamerica | 4 ++++
 zone.tab     | 2 +-
 zone1970.tab | 2 +-
 3 files changed, 6 insertions(+), 2 deletions(-)

diff --git a/northamerica b/northamerica
index be5825b8..13ff594a 100644
--- a/northamerica
+++ b/northamerica
@@ -668,6 +668,10 @@ Zone America/Los_Angeles -7:52:58 - LMT 1883 Nov 18 20:00u
 # So they won't be waiting for Alaska to join them on 2019-03-10, but will
 # rather change their clocks twice in seven weeks.
 
+# From Paul Eggert (2023-01-23):
+# America/Adak is for the Aleutian Islands that are part of Alaska
+# and are west of 169.5° W.
+
 # Zone NAME STDOFF RULES FORMAT [UNTIL]
 Zone America/Juneau 15:02:19 - LMT 1867 Oct 19 15:33:32
     -8:57:41 - LMT 1900 Aug 20 12:00
diff --git a/zone.tab b/zone.tab
index f968a342..dbcb6179 100644
--- a/zone.tab
+++ b/zone.tab
@@ -427,7 +427,7 @@ US +571035-1351807 America/Sitka Alaska - Sitka area
 US +550737-1313435 America/Metlakatla Alaska - Annette Island
 US +593249-1394338 America/Yakutat Alaska - Yakutat
 US +643004-1652423 America/Nome Alaska (west)
-US +515248-1763929 America/Adak Aleutian Islands
+US +515248-1763929 America/Adak Alaska - western Aleutians
 US +211825-1575130 Pacific/Honolulu Hawaii
 UY -345433-0561245 America/Montevideo
 UZ +3940+06648 Asia/Samarkand Uzbekistan (west)
diff --git a/zone1970.tab b/zone1970.tab
index d6932b80..1f1cecb8 100644
--- a/zone1970.tab
+++ b/zone1970.tab
@@ -338,7 +338,7 @@ US +571035-1351807 America/Sitka Alaska - Sitka area
 US +550737-1313435 America/Metlakatla Alaska - Annette Island
 US +593249-1394338 America/Yakutat Alaska - Yakutat
 US +643004-1652423 America/Nome Alaska (west)
-US +515248-1763929 America/Adak Aleutian Islands
+US +515248-1763929 America/Adak Alaska - western Aleutians
 US +211825-1575130 Pacific/Honolulu Hawaii
 UY -345433-0561245 America/Montevideo
 UZ +3940+06648 Asia/Samarkand Uzbekistan (west)
--
2.37.2
On 1/23/23 13:48:02, Paul Eggert via tz wrote:
* Makefile (UNUSUAL_OK_LATIN_1): Allow all non-alphabetic, non-ASCII printable characters that are Latin-1. This is primarily for “§” and we might as well allow them all since even XEmacs 21 supports them all.
Ouch! UTF-8 is too pervasive on desktops and WWW for that to be comfortable. And on a UTF-8 desktop, GNU sed strangles on non-UTF-8 strings:

    $ printf 'a\xa7b\n' | sed -E 's/(.)(.)(.)/1 \1 2 \2 3 \3/'
    sed: RE error: illegal byte sequence

-- gil
On Mon, 2023-01-23 at 15:28 -0700, Paul Gilmartin via tz wrote:
On 1/23/23 13:48:02, Paul Eggert via tz wrote:
* Makefile (UNUSUAL_OK_LATIN_1): Allow all non-alphabetic, non-ASCII printable characters that are Latin-1. This is primarily for “§” and we might as well allow them all since even XEmacs 21 supports them all.
Ouch! UTF-8 is too pervasive on desktops and WWW for that to be comfortable.
And on a UTF-8 desktop, GNU sed strangles on non-UTF-8 strings:

    $ printf 'a\xa7b\n' | sed -E 's/(.)(.)(.)/1 \1 2 \2 3 \3/'
    sed: RE error: illegal byte sequence
I think the intent is to allow non-ASCII characters that are in Latin-1, even though the file is coded in UTF-8. That is, not all Unicode characters are allowed, only those that appear in Latin-1.

John Sauter (John_Sauter@systemeyescomputerstore.com)
--
get my PGP public key with gpg --locate-external-keys John_Sauter@systemeyescomputerstore.com
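The distinction drawn above, between a character that belongs to the Latin-1 repertoire and a byte stream that is encoded as Latin-1, can be illustrated with a small Python sketch. The helper name `is_valid_utf8` is hypothetical and not part of tzcode; the sketch only assumes that the tz files are stored as UTF-8, as stated in this message.

```python
# "§" is U+00A7: it is in the Latin-1 repertoire, but a UTF-8 file
# stores it as two bytes, not as the single Latin-1 byte 0xA7.
section = "\u00a7"

utf8_bytes = section.encode("utf-8")      # b'\xc2\xa7'
latin1_bytes = section.encode("latin-1")  # b'\xa7'


def is_valid_utf8(data: bytes) -> bool:
    """Hypothetical helper: report whether data decodes cleanly as UTF-8."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


# The printf example in the quoted message emits a bare 0xA7 byte,
# which is not valid UTF-8 on its own.
raw_latin1_line = b"a\xa7b\n"
# The same text as it would appear in a UTF-8 encoded file.
utf8_line = "a\u00a7b\n".encode("utf-8")

print(utf8_bytes, latin1_bytes)
print(is_valid_utf8(raw_latin1_line), is_valid_utf8(utf8_line))
```

Under this reading, the patch would only widen the set of code points permitted in commentary, while the bytes on disk would remain UTF-8.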
On 2023-01-23 15:32, John Sauter via tz wrote:
On Mon, 2023-01-23 at 15:28 -0700, Paul Gilmartin via tz wrote:
On 1/23/23 13:48:02, Paul Eggert via tz wrote:
* Makefile (UNUSUAL_OK_LATIN_1): Allow all non-alphabetic, non-ASCII printable characters that are Latin-1. This is primarily for “§” and we might as well allow them all since even XEmacs 21 supports them all.
+UNUSUAL_OK_LATIN_1 = ¡¢£¤¥¦§¨©ª«¬®¯°±²³´µ¶·¸¹º»¼½¾¿×÷
Ouch! UTF-8 is too pervasive on desktops and WWW for that to be comfortable.
And on a UTF-8 desktop, GNU sed strangles on non-UTF-8 strings:

    $ printf 'a\xa7b\n' | sed -E 's/(.)(.)(.)/1 \1 2 \2 3 \3/'
    sed: RE error: illegal byte sequence
I think the intent is to allow non-ASCII characters that are in Latin-1, even though the file is coded in UTF-8. That is, not all Unicode characters are allowed, only those that appear in Latin-1.
Nitpick: the ordinal indicators are category Lo (Letter, other), like letters in non-Latin scripts, and the micro sign is category Ll (Letter, lowercase), like letters in Western scripts, so they match [[:alpha:]], not [[:punct:]]:

    $ man iso-8859-1 | grep '\s[[:alpha:]]\s' | head -3
    252   170   AA   ª   FEMININE ORDINAL INDICATOR
    265   181   B5   µ   MICRO SIGN
    272   186   BA   º   MASCULINE ORDINAL INDICATOR
    $ grep -ah 'ORDINAL\|MICRO SIGN' unicode-symbols.txt \
        unicode/15.0.0/ucd/UnicodeData.txt
    ª U+00AA FEMININE ORDINAL INDICATOR
    µ U+00B5 MICRO SIGN
    º U+00BA MASCULINE ORDINAL INDICATOR
    00AA;FEMININE ORDINAL INDICATOR;Lo;0;L;<super> 0061;;;;N;;;;;
    00B5;MICRO SIGN;Ll;0;L;<compat> 03BC;;;;N;;;039C;;039C
    00BA;MASCULINE ORDINAL INDICATOR;Lo;0;L;<super> 006F;;;;N;;;;;

--
Take care. Thanks, Brian Inglis
Calgary, Alberta, Canada

La perfection est atteinte                     Perfection is achieved
non pas lorsqu'il n'y a plus rien à ajouter    not when there is no more to add
mais lorsqu'il n'y a plus rien à retirer       but when there is no more to cut
                                -- Antoine de Saint-Exupéry
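The same check can be sketched in Python with the standard `unicodedata` module, applied to the `UNUSUAL_OK_LATIN_1` value proposed in the patch. This is only an illustration: the function name `letters_in` is hypothetical, and Unicode general categories are an assumption about what "alphabetic" means here; a given grep's `[[:alpha:]]` may classify characters differently depending on locale and implementation.

```python
import unicodedata

# The value proposed for UNUSUAL_OK_LATIN_1 in the patch under discussion.
PROPOSED = "¡¢£¤¥¦§¨©ª«¬®¯°±²³´µ¶·¸¹º»¼½¾¿×÷"


def letters_in(chars: str) -> list[str]:
    """Return the characters whose Unicode general category is a letter (L*)."""
    return [c for c in chars if unicodedata.category(c).startswith("L")]


# List each proposed character that Unicode classifies as a letter.
for c in letters_in(PROPOSED):
    print(f"U+{ord(c):04X} {unicodedata.category(c)} {unicodedata.name(c)}")
```

By this measure only ª, µ and º are letters; every other character in the proposed set falls into a punctuation, symbol, or number category.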
participants (4): Brian Inglis, John Sauter, Paul Eggert, Paul Gilmartin