Re: [ietf-charsets] [art] US-ASCII and its various names
Hello John, others,
[Removing iana@iana.org, because this doesn't contain any requests for them.]
On 2023-12-18 12:57, John C Klensin wrote:
--On Monday, 18 December, 2023 09:21 +0900 "Martin J. Dürst" <duerst@it.aoyama.ac.jp> wrote:
Hello Stephen,
On 2023-12-16 04:06, Steffen Nurpmeso wrote:
To add that for backward compatibility the plain ASCII alias cannot go away,
I seem to remember too that ASCII was listed as an alias, and have confirmed this with https://web.archive.org/web/20051229042158/http://www.iana.org/assignments/character-sets
That article appears to be discussing MIBs, not charset parameters for what are now called Media Types, particularly text and its subtypes.
Sorry, wrong. Yes, it *also* talks about MIBs, but that's just because the MIB stuff is integrated in the registry.
The many contributions of historical background are very much appreciated. I definitely wasn't around charsets when RFC 20 was created, although my experience with charsets goes back quite a bit longer than my role as expert reviewer.
My summary of the history as relevant for the problem at hand (probably a clerical miss) is as follows:
- Around 1969 (RFC 20, "ASCII format for Network Interchange", Oct 16), the issue was more about ASCII vs. EBCDIC (IBM) vs. encodings from other vendors that nobody remembers these days. The ARPANET was just being started (the first ARPANET communication is dated 22:30 hours on October 29, 1969, California time). This was a purely US-only undertaking, and nobody in that undertaking was worrying too much (if at all) about encodings for languages other than English. [RFC 20 is STD 80 now, but that designation must have happened in or close to 2014, because STD 79/RFC 7296 dates from Oct. 2014.]
- For quite some time before and around 2000 (RFC 2978, "IANA Charset Registration Procedures"; RFC 2278, same name, Jan 1998 (*)), the name "ASCII" was used in various contexts with somewhat different meanings, to the extent that the people involved found it prudent to strongly insist on using the label "US-ASCII" for "the real thing", while still listing the label "ASCII" as an alias.
- Somewhere around 2013, the "ASCII" alias got dropped from the registry, probably as a result of a clerical error (and not related to the fact that "ASCII" had essentially been deprecated for a long time, although 'deprecated' isn't used in the registry).
- In this day and age, most of the 'ASCII-like' character encodings are virtually out of use. "ASCII" as a label is also used extremely rarely in contexts where the IANA charset registry is relevant (email and the Web definitely count, OSes and programming languages don't). UTF-8 is recommended and used widely, and elegantly subsumes (US-)ASCII.
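To make the "subsumes" point concrete, here is a minimal sketch in Python (the helper name `same_bytes_in_both` is hypothetical, chosen only for this illustration). It checks that all 128 US-ASCII code points produce the same bytes under both encodings, and that a non-ASCII character in UTF-8 uses only bytes with the high bit set.

```python
def same_bytes_in_both(text: str) -> bool:
    """Return True if text encodes to identical bytes as US-ASCII and as UTF-8."""
    return text.encode("ascii") == text.encode("utf-8")


# All 128 US-ASCII code points, U+0000 through U+007F.
all_ascii = "".join(chr(cp) for cp in range(128))

# Every US-ASCII string is also a valid UTF-8 string with the same bytes.
ascii_is_subset = same_bytes_in_both(all_ascii)

# Each byte of a non-ASCII character in UTF-8 has its high bit set,
# so it cannot be confused with a US-ASCII byte.
high_bit_only = all(b >= 0x80 for b in "ü".encode("utf-8"))
```

Under these assumptions, data labeled US-ASCII can be read by a UTF-8 decoder unchanged, while the reverse only holds for text that stays within the 128 ASCII code points.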
Stephen, maybe you can do a bisection to find out where this alias disappeared.
Oh, at the risk of repeating parts of the note I sent some days ago, let me tell you what I remember before I co-wrote the document that created the charset review role:
(*) Can you tell us about that document? RFC 2048 (MIME Registration Procedures, which you co-wrote) explicitly says "Registration of character sets for use in MIME is covered elsewhere and is no longer addressed by this document." I didn't find anything about registration in the RFC 2045~49 series nor in RFC 1522 or 1590, but maybe I didn't look hard enough.
While it seems odd to not have it among the collection of synonyms, it isn't there (and was probably dropped from various earlier specs) because of the potential for ambiguity.
If you are going to add it to the registry, that should be done, not as a synonym for "US-ASCII", but as a separate item with an explanation of the ambiguity problem.
It was there as a synonym, and I'm not going to ask IANA to add it separately.
<charset reviewer hat on> I don't remember ever having dealt with a request to remove this ALIAS, and I strongly doubt that Ned ever did that. </charset reviewer hat on>
I don't believe it was ever listed as a valid charset value, regardless of other uses. As I said, not having it there was a very explicit decision.
It was there up to around 2013, and that must have been an explicit decision, the same way that designating US-ASCII as the preferred MIME label was a very explicit decision.
Regards, Martin.