Proposal to deprecate "." and "-" in zone names
CURRENT
ftp://ftp.iana.org/tz/code/Theory says "Within a file name component, use only ASCII letters, `.', `-' and `_'."
In ftp://ftp.iana.org/tz/data/zone.tab I found none having ".".
I found three having "-":
America/Blanc-Sablon
Africa/Porto-Novo
America/Port-au-Prince

PROPOSED
Change ftp://ftp.iana.org/tz/code/Theory
"Within a file name component, use only ASCII letters and `_'."

REASONING
@"."
"." is not used, document current usage. Reduce number of possible variations.
@"-"
Reduce number of possible variations. Avoid discussions about names as happened with the zone proposal for the Crozet Islands "Alfred-Faure" vs "Alfred Faure".
@both
Some environments don't allow "/". Since "/" is part of the zone name, a replacement may be needed. If "." and "-" are not allowed for zone names anymore, then they can be used for replacing "/".
--
Tobias Conradi
Rheinsberger Str. 18
10115 Berlin
Germany
http://tobiasconradi.com/
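The zone.tab check described above could be reproduced with a short script. This is only a sketch: the function name `names_with_chars` is made up for illustration, the sample rows use placeholder coordinates rather than real zone.tab data, and it assumes the usual tab-separated layout with the zone name in the third column.

```python
def names_with_chars(zone_tab_text, chars=".-"):
    """Return zone names (third column) that contain any character in `chars`."""
    found = []
    for line in zone_tab_text.splitlines():
        # Skip blank lines and comment lines.
        if not line or line.startswith("#"):
            continue
        fields = line.split("\t")
        if len(fields) < 3:
            continue
        name = fields[2]
        if any(c in name for c in chars):
            found.append(name)
    return found


# Illustrative rows only: the coordinates are placeholders, not real data.
SAMPLE = "\n".join([
    "# code\tcoordinates\tTZ\tcomments",
    "BJ\t+0000+00000\tAfrica/Porto-Novo",
    "CA\t+0000+00000\tAmerica/Blanc-Sablon\tplaceholder comment",
    "MX\t+0000+00000\tAmerica/Bahia_Banderas",
    "HT\t+0000+00000\tAmerica/Port-au-Prince",
])

if __name__ == "__main__":
    for name in names_with_chars(SAMPLE):
        print(name)
```

Run against a real copy of zone.tab, the same function would list every name affected by the proposal.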
Tobias Conradi said:
"." is not used, document current usage. Reduce number of possible variations.
What about places that have a dot in their official name?
Reduce number of possible variations.
What about places that have a hyphen in their official name?
Avoid discussions about names as happened with the zone proposal for the Crozet Islands "Alfred-Faure" vs "Alfred Faure".
That's a discussion about which is the official name. If a zone was ever needed for Leigh-on-Sea, then "Leigh on Sea" would be just plain wrong, as would "Leigh" (which is a different place). -- Clive D.W. Feather | If you lie to the compiler, Email: clive@davros.org | it will get its revenge. Web: http://www.davros.org | - Henry Spencer Mobile: +44 7973 377646
On Thu, May 24, 2012 at 1:50 PM, Clive D.W. Feather <clive@davros.org> wrote:
Tobias Conradi said:
"." is not used, document current usage. Reduce number of possible variations.
What about places that have a dot in their official name?
Dot would be removed or replaced with "_" and double "_" would be reduced to single "_".
Already now names are changed to fit into the naming restrictions: Bahia de Banderas got a whole word (de) removed, others lose the apostrophe.
Reduce number of possible variations.
What about places that have a hyphen in their official name?
Hyphen would be replaced with "_".
Avoid discussions about names as happened with the zone proposal for the Crozet Islands "Alfred-Faure" vs "Alfred Faure".
That's a discussion about which is the official name.
Exactly, and it would be avoided by following the proposal and deprecating "." and "-".
If a zone was ever needed for Leigh-on-Sea, then "Leigh on Sea" would be just plain wrong, as would "Leigh" (which is a different place).
It wouldn't be wrong. Bahia_Banderas is not wrong, despite the place's name being "Bahia de Banderas".
The IANA time zone database can define zone names and then by definition if they follow the rules they are not wrong. -- Tobias Conradi Rheinsberger Str. 18 10115 Berlin Germany http://tobiasconradi.com/
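The replacement rule sketched in this message (dot and hyphen become "_", repeated "_" collapse to one) might look like the following. It is an illustration only: the function names are hypothetical, and where the message leaves open whether a dot is removed or replaced, this sketch assumes replacement.

```python
import re


def normalize_component(component):
    """Replace '.' and '-' with '_' and collapse runs of '_' into one."""
    replaced = re.sub(r"[.-]", "_", component)
    return re.sub(r"_+", "_", replaced)


def normalize_zone_name(name):
    """Apply the rule to each component; '/' stays the component separator."""
    return "/".join(normalize_component(c) for c in name.split("/"))


if __name__ == "__main__":
    for name in ("America/Blanc-Sablon", "Africa/Porto-Novo", "America/Port-au-Prince"):
        print(name, "->", normalize_zone_name(name))
```

Note that the rule is lossy: "Leigh-on-Sea" and "Leigh_on_Sea" would normalize to the same string, which is the point of contention in the replies.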
Tobias Conradi wrote:
"." is not used, document current usage. Reduce number of possible variations.
Maybe. I'm neutral on this.
@"-" Reduce number of possible variations. Avoid discussions about names as
Seems better not to change the existing ones, and it's rare that we'd be unclear on whether dash or space is the correct spelling. So opposed to this.
Some environments don't allow "/". Since "/" is part of the zone name, a replacement may be needed.
"/" is part of the zone name regardless of platform. When mapping a zone name to a pathname, "/" translates to the directory separator, whatever that is on each platform. If you really want to use a complete zone name as a single filename (i.e., a name that can appear in a single-level directory), then you need to do some mapping, but it's not our job to cater to that usage. -zefram
On Thu, May 24, 2012 at 2:33 PM, Zefram <zefram@fysh.org> wrote:
Tobias Conradi wrote:
@"-" Reduce number of possible variations. Avoid discussions about names as
Seems better not to change the existing ones
It would be only three extra entries in the backward file.
Some environments don't allow "/". Since "/" is part of the zone name, a replacement may be needed.
If you really want to use a complete zone name as a single filename (i.e., a name that can appear in a single-level directory), then you need to do some mapping, but it's not our job to cater to that usage.
It doesn't matter whether it is anyone's job.
I am here to improve usability of the IANA time zone database.
Other people already invent new names because in their systems the IANA time zone names are not sufficient.
E.g. CLDR has created UN/LOCODE based codes: http://cldr.unicode.org/development/development-process/design-proposals/bcp...
Some of these insufficiencies can be easily removed.
http://www.w3.org/TR/2008/REC-CSS2-20080411/syndata.html#q4 says
"In CSS2, identifiers (including element names, classes, and IDs in selectors) can contain only the characters [A-Za-z0-9] and ISO 10646 characters 161 and higher, plus the hyphen (-); they cannot start with a hyphen or a digit. They can also contain escaped characters and any ISO 10646 character as a numeric code (see next item). For instance, the identifier "B&W?" may be written as "B\&W\?" or "B\26 W\3F"."
To have / or \ in zone names needs escaping in CSS2 according to the above, if one wants to use the names for classes or ids in CSS.
By using only [A-Za-z0-9] in name components and "/" as component separator, the IANA time zone database would allow easy transformation from IANA time zone names into class names or ids and backward.
--
Tobias Conradi
Rheinsberger Str. 18
10115 Berlin
Germany
http://tobiasconradi.com/
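For reference, the escaping that the quoted CSS2 passage calls for can be sketched in a few lines. This follows only the rule as quoted above (letters, digits, hyphen, and code points 161 and higher are allowed; no leading hyphen or digit; anything else becomes a hexadecimal escape). The function name `css2_escape` is hypothetical, and a terminating space is written after every escape that is not at the end of the string, which is more cautious than strictly necessary.

```python
def css2_escape(name):
    """Escape `name` as an identifier under the CSS2 rule quoted in the thread."""
    out = []
    last = len(name) - 1
    for i, ch in enumerate(name):
        allowed = (ch.isascii() and ch.isalnum()) or ch == "-" or ord(ch) >= 161
        # Identifiers cannot start with a hyphen or a digit.
        if i == 0 and (ch == "-" or ch in "0123456789"):
            allowed = False
        if allowed:
            out.append(ch)
        else:
            out.append("\\%X" % ord(ch))
            # A space ends the hex escape so a following hex digit is not absorbed.
            if i != last:
                out.append(" ")
    return "".join(out)


if __name__ == "__main__":
    print(css2_escape("B&W?"))
    print(css2_escape("America/Port-au-Prince"))
```

Under this rule the "/" is escaped and the hyphens are not, so the existing names already survive with one escape per separator.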
On 25/05/12 00:27, Tobias Conradi wrote:
I am here to improve usability of the IANA time zone database.
Other people already invent new names because in their systems the IANA time zone names are not sufficient.
E.g. CLDR has created UN/LOCODE based codes: http://cldr.unicode.org/development/development-process/design-proposals/bcp...
Some of these insufficiencies can be easily removed.
http://www.w3.org/TR/2008/REC-CSS2-20080411/syndata.html#q4 says "In CSS2, identifiers (including element names, classes, and IDs in selectors) can contain only the characters [A-Za-z0-9] and ISO 10646 characters 161 and higher, plus the hyphen (-); they cannot start with a hyphen or a digit. They can also contain escaped characters and any ISO 10646 character as a numeric code (see next item). For instance, the identifier "B&W?" may be written as "B\&W\?" or "B\26 W\3F"."
To have / or \ in zone names needs escaping in CSS2 according to the above, if one wants to use the names for classes or ids in CSS.
I don't know why you'd want to convert timezone IDs to CSS2 IDs, but surely you'd have to escape the underscores as well.
By using only [A-Za-z0-9] in name components and "/" as component separator, the IANA time zone database would allow easy transformation from IANA time zone names into class names or ids and backward.
A time-zone name would map to an instance of a time-zone class, not to a class itself, although you may want to map a time-zone name to a variable name or some other identifier such as an enumerated constant. Different systems have different restrictions, but you can always define an escape mechanism to perform a two-way mapping between the system's ID naming rules and tzdata time-zone names. Or you could use the time-zone name as a string-type look-up key into a database, which is effectively what tzcode does already. -- -=( Ian Abbott @ MEV Ltd. E-mail: <abbotti@mev.co.uk> )=- -=( Tel: +44 (0)161 477 1898 FAX: +44 (0)161 718 3587 )=-
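One possible shape for such an escape mechanism is shown below. It is a sketch, not anything tzcode provides: the names `encode_id` and `decode_id` and the choice of "_" plus two hex digits as the escape are assumptions, and only ASCII zone names are handled. Because a literal "_" is itself escaped, every "_" in the output starts an escape and the mapping is reversible.

```python
import re


def encode_id(zone_name):
    """Map a zone name to a string made only of [A-Za-z0-9_]."""
    out = []
    for ch in zone_name:
        if ch.isascii() and ch.isalnum():
            out.append(ch)
        elif ch.isascii():
            # Includes '/', '-', '.', and '_' itself.
            out.append("_%02X" % ord(ch))
        else:
            raise ValueError("non-ASCII character in zone name: %r" % ch)
    return "".join(out)


def decode_id(identifier):
    """Invert encode_id by expanding each '_XX' escape."""
    return re.sub(r"_([0-9A-F]{2})", lambda m: chr(int(m.group(1), 16)), identifier)


if __name__ == "__main__":
    for name in ("America/Port-au-Prince", "America/Bahia_Banderas"):
        encoded = encode_id(name)
        print(name, "->", encoded, "->", decode_id(encoded))
```

The same idea carries over to other target syntaxes by changing which characters pass through unescaped.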
On Fri, May 25, 2012 at 10:01 AM, Ian Abbott <abbotti@mev.co.uk> wrote:
On 25/05/12 00:27, Tobias Conradi wrote:
http://www.w3.org/TR/2008/REC-CSS2-20080411/syndata.html#q4 says "In CSS2, identifiers (including element names, classes, and IDs in selectors) can contain only the characters [A-Za-z0-9] and ISO 10646 characters 161 and higher, plus the hyphen (-); they cannot start with a hyphen or a digit. They can also contain escaped characters and any ISO 10646 character as a numeric code (see next item). For instance, the identifier "B&W?" may be written as "B\&W\?" or "B\26 W\3F"."
To have / or \ in zone names needs escaping in CSS2 according to the above, if one wants to use the names for classes or ids in CSS.
I don't know why you'd want to convert timezone IDs to CSS2 IDs, but surely you'd have to escape the underscores as well.
One could draw maps, and have the coloring be done by CSS. I don't know about restrictions for SVG ids. But at least a legend could be in HTML and use CSS.
If in SVG "/" is not allowed for ids, then the / is hindering map creation. There would be less trouble without the "/". If the "-" is not used in name components, at least one easy conversion to a less troublesome character would exist. -- Tobias Conradi Rheinsberger Str. 18 10115 Berlin Germany http://tobiasconradi.com/
Tobias Conradi said:
If in SVG "/" is not allowed for ids, then the / is hindering map creation.
Rubbish. Doing a conversion is trivial for any competent programmer. -- Clive D.W. Feather | If you lie to the compiler, Email: clive@davros.org | it will get its revenge. Web: http://www.davros.org | - Henry Spencer Mobile: +44 7973 377646
On Sat, May 26, 2012 at 1:27 PM, Clive D.W. Feather <clive@davros.org> wrote:
Tobias Conradi said:
If in SVG "/" is not allowed for ids, then the / is hindering map creation.
Rubbish. Doing a conversion is trivial for any competent programmer.
My claim didn't use "competent programmer" nor did it use "conversion". One could extend: If in SVG "/" is not allowed for ids, then the / is hindering map creation, because one would need conversion or mapping. -- Tobias Conradi Rheinsberger Str. 18 10115 Berlin Germany http://tobiasconradi.com/
Tobias Conradi said:
If in SVG "/" is not allowed for ids, then the / is hindering map creation, because one would need conversion or mapping.
Compared with the effort of creating the SVG in the first place, the mapping of / to something else is trivial. Please stop wasting our time. -- Clive D.W. Feather | If you lie to the compiler, Email: clive@davros.org | it will get its revenge. Web: http://www.davros.org | - Henry Spencer Mobile: +44 7973 377646
On Sat, May 26, 2012 at 10:06 PM, Clive D.W. Feather <clive@davros.org> wrote:
Tobias Conradi said:
If in SVG "/" is not allowed for ids, then the / is hindering map creation, because one would need conversion or mapping.
Compared with the effort of creating the SVG in the first place, the mapping of / to something else is trivial.
Even if so, that doesn't invalidate the claim made.
Please stop wasting our time.
By correcting your statements made to the list? Maybe then you shouldn't make such statements to the list in the first place.
-- Tobias Conradi Rheinsberger Str. 18 10115 Berlin Germany http://tobiasconradi.com/
Tobias Conradi wrote:
By using only [A-Za-z0-9] in name components and "/" as component separator, the IANA time zone database would allow easy transformation from IANA time zone names into class names or ids and backward.
Currently having increasing fun with an insane attempt to restrict names to some simple 'English' subsets of characters in a couple of other areas, I think the discussion should be on ALLOWING the full Unicode character set. This is the 21st century and many more users do not even speak English! And I am English and only speak English ... Personally I could make a case for using ISO 3166 country codes and perhaps augmenting that with an area code, so that in addition to English versions of zone names, local translation tables could be introduced in a standard way? But this is probably a separate project to the time zone data :(
Some environments don't allow "/". Since "/" is part of the zone name, a replacement may be needed.
The use of "/" makes perfect sense since it groups zones into natural folders and sub-folders. It's up to your code to decide what to do with it. -- Lester Caine - G8HFL ----------------------------- Contact - http://lsces.co.uk/wiki/?page=contact L.S.Caine Electronic Services - http://lsces.co.uk EnquirySolve - http://enquirysolve.com/ Model Engineers Digital Workshop - http://medw.co.uk// Firebird - http://www.firebirdsql.org/index.php
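The folder grouping can be illustrated by splitting a zone name on "/" and letting the platform's path type supply the separator. This is a sketch under assumptions: `zone_to_path` is a hypothetical helper, and both root directories in the usage lines are examples, not locations taken from this thread.

```python
from pathlib import PurePosixPath, PureWindowsPath


def zone_to_path(root, zone_name, path_cls=PurePosixPath):
    """Join a root directory and the '/'-separated components of a zone name."""
    return path_cls(root).joinpath(*zone_name.split("/"))


if __name__ == "__main__":
    # Example roots only; real installations may keep zone files elsewhere.
    print(zone_to_path("/usr/share/zoneinfo", "America/Blanc-Sablon"))
    print(zone_to_path("C:\\zoneinfo", "America/Blanc-Sablon", PureWindowsPath))
```

The zone name itself is unchanged in both cases; only the separator used when rendering the path differs.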
On Fri, May 25, 2012 at 11:09 AM, Lester Caine <lester@lsces.co.uk> wrote:
Tobias Conradi wrote:
By using only [A-Za-z0-9] in name components and "/" as component separator, the IANA time zone database would allow easy transformation from IANA time zone names into class names or ids and backward.
Currently having increasing fun with an insane attempt to restrict names to some simple 'English' subsets of characters in a couple of other areas, I think the discussion should be on ALLOWING the full Unicode character set. This is the 21st century and many more users do not even speak English!
And many more do every day. But it doesn't matter. What matters is whether people can type. Many people cannot type the full range of Unicode easily. But everyone writing HTML, C, Java, or Python can write [A-Za-z0-9._-].
And I am English and only speak English ...
I don't speak only English, but I speak almost no Indonesian or Afrikaans, still I can type it easily, since both only use the ISO basic Latin alphabet AFAIK.
More at: http://en.wikipedia.org/wiki/ISO_basic_Latin_alphabet#Equivalent_alphabets
Personally I could make a case for using ISO 3166 country codes and perhaps augmenting that with an area code, so that in addition to English versions of zone names, local translation tables could be introduced in a standard way? But this is probably a separate project to the time zone data :(
CLDR uses UN/LOCODEs, which have the ISO 3166-1 alpha-2 country code as the first two characters. But country codes can change, so this would lead either to divergence of the codes, as CLDR does, or to zone name changes caused by ISO 3166-1 alpha-2 changes.
Some environments don't allow "/". Since "/" is part of the zone name, a replacement may be needed.
The use of "/" makes perfect sense since it groups zones into natural folders and sub-folders. It's up to your code to decide what to do with it.
That folders are natural is something I have never heard before. -- Tobias Conradi Rheinsberger Str. 18 10115 Berlin Germany http://tobiasconradi.com/
Tobias Conradi said:
And I am English and only speak English ... I don't speak only English, but I speak almost no Indonesian or Afrikaans, still I can type it easily, since both only use the ISO basic Latin alphabet AFAIK.
Wrong. English certainly doesn't use only the basic Latin alphabet, though American might. I believe neither Indonesian nor Afrikaans are so limited either. -- Clive D.W. Feather | If you lie to the compiler, Email: clive@davros.org | it will get its revenge. Web: http://www.davros.org | - Henry Spencer Mobile: +44 7973 377646
On Sat, May 26, 2012 at 1:26 PM, Clive D.W. Feather <clive@davros.org> wrote:
Tobias Conradi said:
And I am English and only speak English ... I don't speak only English, but I speak almost no Indonesian or Afrikaans, still I can type it easily, since both only use the ISO basic Latin alphabet AFAIK.
Wrong. English certainly doesn't use only the basic Latin alphabet,
If so, then
Clive D.W. Feather, Thu May 24, 2012, http://mm.icann.org/pipermail/tz/2012-May/017938.html
"What about places that have a dot in their official name?"
"What about places that have a hyphen in their official name?"
didn't point to the fact that deprecating dot and hyphen would change the system from being able to represent all characters potentially needed for displaying official names to a system that cannot.
though American might. I believe neither Indonesian nor Afrikaans are so limited either.
AFAICS the words in http://kompas.com/ (Indonesian newspaper) only use [A-Za-z].
http://www.beeld.com/ (Afrikaans website) mostly use [A-Za-z], but not exclusively. Thanks for questioning my statement and letting me check. Also http://en.wikipedia.org/wiki/ISO_basic_Latin_alphabet says "Afrikaans alphabet: uses diacritics." - So completely my error. For http://www.bangkok.itgo.com/ (Thai website) as of 2012-05-26 I see 80% of the characters as question marks in my Chrome browser.
I change
"I don't speak only English, but I speak almost no Indonesian or Afrikaans, still I can type it easily, since both only use the ISO basic Latin alphabet AFAIK."
to
"I don't speak only English, but I speak almost no Indonesian, still I can type it easily, since it only uses the ISO basic Latin alphabet AFAIK."
So I would uphold the claim that one can type text in a language without speaking it, and I think using the full Unicode set for time zone names would prevent a lot of people from easily reproducing them. -- Tobias Conradi Rheinsberger Str. 18 10115 Berlin Germany http://tobiasconradi.com/
Tobias Conradi <tobias.conradi@gmail.com> wrote:
|http://www.w3.org/TR/2008/REC-CSS2-20080411/syndata.html#q4 says
|"In CSS2, identifiers (including element names, classes, and IDs in
|selectors) can contain only the characters [A-Za-z0-9] and ISO 10646
|characters 161 and higher, plus the hyphen (-); they cannot start with
|a hyphen or a digit. They can also contain escaped characters and any
|ISO 10646 character as a numeric code (see next item). For instance,
|the identifier "B&W?" may be written as "B\&W\?" or "B\26 W\3F"."
|
|To have / or \ in zone names needs escaping in CSS2 according to the
|above, if one wants to use the names for classes or ids in CSS.
|
|By using only [A-Za-z0-9] in name components and "/" as component
|separator, the IANA time zone database would allow easy transformation
|from IANA time zone names into class names or ids and backward.

I referred to the portable character set of The Open Group Base Specifications Issue 7 / IEEE Std 1003.1™-2008 (http://pubs.opengroup.org/onlinepubs/9699919799/):

Base Definitions
3. Definitions
3.170 Filename
A name consisting of 1 to {NAME_MAX} bytes used to name a file. The characters composing the name may be selected from the set of all character values excluding the <slash> character and the null byte. The filenames dot and dot-dot have special meaning. A filename is sometimes referred to as a "pathname component".
3.276 Portable Filename Character Set
A B C D E F G H I J K L M N O P Q R S T U V W X Y Z
a b c d e f g h i j k l m n o p q r s t u v w x y z
0 1 2 3 4 5 6 7 8 9 . _ -
The last three characters are the <period>, <underscore>, and <hyphen> characters, respectively.
4. General Concepts
4.6 Filenames
4.7 Filename Portability
4.12 Pathname Resolution
13. Headers
limits.h - implementation-defined constants
{NAME_MAX}
Maximum number of bytes in a filename (not including the terminating null).
Minimum Acceptable Value: {_POSIX_NAME_MAX}
{_POSIX_NAME_MAX}
Maximum number of bytes in a filename (not including the terminating null).
Value: 14

In this regard there was an interesting thread in the Austin group:
https://www.opengroup.org/sophocles/show_mail.tpl?CALLER=show_archive.tpl&so...
As has already been noted, any attempt to make <slash> and NULL valid characters in a filename will never be accepted as a change to the POSIX standards and the Single UNIX Specifications. That part of this bug is rejected. The other part of this bug is the handling of some characters in filenames.
--steffen
Forza Figa!
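The two constraints quoted from the specification (the portable filename character set and the 14-byte {_POSIX_NAME_MAX}) can be checked per component. A minimal sketch follows; `component_problems` is a hypothetical helper, and the two "Example/..." names are invented to trigger each check.

```python
import string

# Portable Filename Character Set as quoted above: letters, digits, '.', '_', '-'.
PORTABLE = set(string.ascii_letters + string.digits + "._-")
POSIX_NAME_MAX = 14  # value of {_POSIX_NAME_MAX} as quoted above


def component_problems(zone_name):
    """Return human-readable problems for each '/'-separated component."""
    problems = []
    for comp in zone_name.split("/"):
        bad = sorted(set(comp) - PORTABLE)
        if bad:
            problems.append("%s: non-portable characters %r" % (comp, bad))
        if len(comp.encode("utf-8")) > POSIX_NAME_MAX:
            problems.append("%s: longer than %d bytes" % (comp, POSIX_NAME_MAX))
        if comp in (".", ".."):
            problems.append("%s: reserved filename" % comp)
    return problems


if __name__ == "__main__":
    for name in ("America/Port-au-Prince",
                 "Example/A_Very_Long_Component",  # invented name
                 "Example/Bad Name"):              # invented name
        print(name, component_problems(name))
```

"Port-au-Prince" is exactly 14 bytes, so it passes both checks as the names stand today.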
On Fri, May 25, 2012 at 1:04 PM, Steffen Daode Nurpmeso <sdaoden@googlemail.com> wrote:
Tobias Conradi <tobias.conradi@gmail.com> wrote:
I referred to the portable character set of The Open Group Base Specifications Issue 7 / IEEE Std 1003.1™-2008 (http://pubs.opengroup.org/onlinepubs/9699919799/):
Base Definitions 3. Definitions 3.170 Filename A name consisting of 1 to {NAME_MAX} bytes used to name a file. The characters composing the name may be selected from the set of all character values excluding the <slash> character and the null byte. The filenames dot and dot-dot have special meaning. A filename is sometimes referred to as a "pathname component".
3.276 Portable Filename Character Set
A B C D E F G H I J K L M N O P Q R S T U V W X Y Z a b c d e f g h i j k l m n o p q r s t u v w x y z 0 1 2 3 4 5 6 7 8 9 . _ -
The last three characters are the <period>, <underscore>, and <hyphen> characters, respectively.
4. General Concepts 4.6 Filenames 4.7 Filename Portability 4.12 Pathname Resolution
13. Headers limits.h - implementation-defined constants {NAME_MAX} Maximum number of bytes in a filename (not including the terminating null). Minimum Acceptable Value: {_POSIX_NAME_MAX} {_POSIX_NAME_MAX} Maximum number of bytes in a filename (not including the terminating null). Value: 14
Thanks a lot! Deprecating "-" would allow an easy reversible conversion of "/" into "-" in environments that only can use characters from "3.276 Portable Filename Character Set". -- Tobias Conradi Rheinsberger Str. 18 10115 Berlin Germany http://tobiasconradi.com/
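The reversibility argument can be made concrete with a naive "/" to "-" substitution. This is a sketch only; `flatten`, `unflatten`, and `round_trips` are hypothetical names. A name round-trips exactly when its components contain no hyphen, which is the case the proposal would guarantee.

```python
def flatten(zone_name):
    """Replace the component separator '/' with '-'."""
    return zone_name.replace("/", "-")


def unflatten(flat_name):
    """Naive inverse: treat every '-' as a former '/'."""
    return flat_name.replace("-", "/")


def round_trips(zone_name):
    """True if flatten followed by unflatten returns the original name."""
    return unflatten(flatten(zone_name)) == zone_name


if __name__ == "__main__":
    for name in ("America/Bahia_Banderas", "America/Port-au-Prince"):
        print(name, "->", flatten(name), "| reversible:", round_trips(name))
```

With the three hyphenated names still in use, the naive inverse would need an escape mechanism or a lookup table instead.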
On 05/24/2012 04:27 PM, Tobias Conradi wrote:
It doesn't matter whether it is anyone's job.
I am here to improve usability of the IANA time zone database.
Other people already invent new names because in their systems the IANA time zone names are not sufficient.
E.g. CLDR has created UN/LOCODE based codes: http://cldr.unicode.org/development/development-process/design-proposals/bcp...
This is a draft design document proposing what is now the 'tz' field of IETF BCP 47 Extension U <https://tools.ietf.org/html/rfc6067>, referencing UTS35 and data <http://unicode.org/reports/tr35/#Unicode_Language_and_Locale_Identifiers>. However, it is not relevant to the discussion of "." and "-"; instead, the UN/LOCODE based codes (with some additions) were needed due to length restrictions. The characters used are well-specified, and I don't see any benefits to deprecating "." and "-".
Steven
ICU / CLDR Project
participants (7):
- Clive D.W. Feather
- Ian Abbott
- Lester Caine
- Steffen Daode Nurpmeso
- Steven R. Loomis
- Tobias Conradi
- Zefram