Fwd: [IANA #1297322] IANA characters-sets US-ASCII entry incorrect
To everybody interested in the recent discussion on the character set registry, in particular the (absence of the) entry "ASCII":

Many thanks to IANA, and in particular Sabrina Tanamal, for digging up the relevant correspondence from 10 and 20 years ago. Please accept my apology for not remembering this correspondence and thereby seriously confusing the discussion.

My summary based on this new information is as follows:

- Ned Freed sent a request to IANA in February 2003 concerning the entry for "US-ASCII" and related aliases in the Character Set registry, requesting (among other things) the removal of the alias "ASCII". The request for this removal was based on the fact that RFC 2046 says 'The character set name "ASCII" is reserved and must not be used for any purpose.'.

- When IANA was moving their registries from .txt to .xml, this request was rediscovered and acted upon. Both Ned and I agreed with the removal of "ASCII". We decided that there was no need to inform the ietf-charset mailing list, which in hindsight was probably a mistake (not least because it would have had the potential to shorten the current discussion by quite a bit).

Given the fact that RFC 2046 clearly says that 'The character set name "ASCII" is reserved and must not be used for any purpose.', I think that the only choice is to leave the registry as it is.

<charset reviewer hat on>
I'm of course ready to reevaluate this and to add this label back in if anybody is able to come up with really strong and convincing arguments to do so.
<charset reviewer hat off>

For data labeled with charset=ASCII, the correct interpretation is to ignore the charset parameter because its value is undefined. The implementation would then fall back to the default, which in the case of email and text/plain is "US-ASCII". The overall result is the same as with an "ASCII" alias in the registry.

Regards,   Martin.
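The fallback described in the summary above can be sketched roughly as follows. This is only an illustration under stated assumptions, not how any particular mail client does it: the function name `resolve_charset` and the two tables are hypothetical, the alias table contains just the corrected US-ASCII entry quoted later in this thread (a real implementation would load the whole registry), and the only default taken from the message is "US-ASCII" for text/plain.

```python
# Labels of the corrected US-ASCII registry entry quoted in this thread,
# lowercased because charset names are matched case-insensitively.
# "ascii" is deliberately absent, as in the registry.
# Assumption: a real implementation would hold the full registry here.
REGISTERED_LABELS = {
    label: "US-ASCII"
    for label in (
        "us-ascii", "iso-ir-6", "ansi_x3.4-1968", "ansi_x3.4-1986",
        "iso_646.irv:1991", "iso646-us", "us", "ibm367", "cp367", "csascii",
    )
}

# Hypothetical table of per-media-type defaults; only the text/plain
# entry comes from the message above.
DEFAULT_CHARSET = {"text/plain": "US-ASCII"}


def resolve_charset(media_type, charset_param):
    """Return the charset to assume for a body part, or None if unknown."""
    default = DEFAULT_CHARSET.get(media_type.strip().lower())
    if charset_param is None:
        # No charset parameter at all: use the media type's default.
        return default
    preferred = REGISTERED_LABELS.get(charset_param.strip().lower())
    if preferred is not None:
        return preferred
    # Undefined parameter value (for example "ASCII"): ignore the
    # parameter and fall back to the default.
    return default


print(resolve_charset("text/plain", "ASCII"))   # unregistered label
print(resolve_charset("text/plain", "cp367"))   # registered alias
```

In this sketch both calls end up at "US-ASCII", but by different routes: "cp367" is found in the table, while "ASCII" is dropped as an undefined value and the text/plain default applies.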
-------- Forwarded Message --------
Subject: [IANA #1297322] IANA characters-sets US-ASCII entry incorrect
Date: Tue, 19 Dec 2023 01:30:10 +0000
From: Sabrina Tanamal via RT <iana-issues-comment@iana.org>
Reply-To: iana-issues-comment@iana.org
CC: duerst@it.aoyama.ac.jp

Hi Martin (trimming the list),

It looks like this change was completed in 2013 (reported by Ned and approved by you). Please see the thread below. Let me know if you need us to forward anything to the list or if any changes are required.

Thanks,
Sabrina

=====
Fri Jan 04 08:10:25 2013 Martin Duerst <duerst@it.aoyama.ac.jp> - Correspondence added

CC: ned.freed@mrochek.com
Subject: Re: [IANA #111894] Possible update to character-sets (#2)
Date: Fri, 04 Jan 2013 16:48:30 +0900
To: iana-matrix@iana.org
From: "Martin J. Dürst" <duerst@it.aoyama.ac.jp>

Hello Amanda,

On 2013/01/04 10:37, Amanda Baber via RT wrote:
Hi Martin,
Are you OK with going ahead with this?
Yes, please go ahead with this. Ned is correct on each and every point. Sorry for the delay in answering this. Regards, Martin.
thanks, Amanda
On Thu Dec 20 23:11:34 2012, ned.freed@mrochek.com wrote:
Hi,
I haven't seen anything about this proposal on the charset list:
I don't see any need to post this, but since I'm the source of the change it should be Martin's call. I think he's OK with going ahead, but he should probably confirm.
Ned
I therefore suggest that this entry [ANSI_X3.4-1968] be changed to read:
Name: US-ASCII (preferred MIME name) [RFC2046]
MIBenum: 3
Source: ANSI X3.4-1986
Alias: iso-ir-6
Alias: ANSI_X3.4-1968
Alias: ANSI_X3.4-1986
Alias: ISO_646.irv:1991
Alias: ISO646-US
Alias: us
Alias: IBM367
Alias: cp367
Alias: csASCII
Since you both approved it, as Martin noted, should we go ahead and make this change? Or would one of you prefer to post it to the list first? We'd prefer to leave it up to you.
thanks, Amanda
On Fri Sep 28 06:42:32 2012, duerst@it.aoyama.ac.jp wrote:
Hello Amanda,
I have looked at the proposed change, and I think it makes *a lot* of sense. I think we have two choices:
a) Just go ahead and fix it. We have both reviewers agreeing with it.

b) Just to be sure, send a mail to the charset mailing list saying we plan to do this, so that anybody who may have a complaint can raise it (I don't expect anybody to, but anyway). Then after a few weeks, go ahead and do it.
I'm okay with either.
Regards, Martin.
On 2012/09/28 7:23, Amanda Baber via RT wrote: Martin and Ned,
We have this one last character-sets email from several years ago that needs to be addressed. Ned, I know you submitted this in the first place, but can you verify that this still needs to be done/won't break anything?
all apologies, and thanks, Amanda
-------- Original Message -------- Subject: Fix for one particularly serious charset registry problem Date: Sun, 09 Feb 2003 15:51:23 -0800 (PST) From: ned.freed@mrochek.com To: iana@iana.org CC: ned.freed@mrochek.com, paf@cisco.com, harald@alvestrand.no
The first entry in the charset registry reads as follows:
Name: ANSI_X3.4-1968 [RFC1345,KXS2]
MIBenum: 3
Source: ECMA registry
Alias: iso-ir-6
Alias: ANSI_X3.4-1986
Alias: ISO_646.irv:1991
Alias: ASCII
Alias: ISO646-US
Alias: US-ASCII (preferred MIME name)
Alias: us
Alias: IBM367
Alias: cp367
Alias: csASCII
There are, unfortunately, many problems with this entry:
(1) The primary name of the charset is US-ASCII, not ANSI_X3.4-1968. It has always been this way as far back as RFC 1341. And this actually matters, since the primary charset name is the one that's supposed to be used in encoded words, and the restrictions on encoded words don't allow use of ANSI_X3.4-1968!
(2) The source of the registration should be the ANSI standards document, not some ECMA registry.
(3) The alias "ASCII" is specifically prohibited by RFC 2046 section 4.1.2.
(4) The defining document for this charset is RFC 2046, not RFC 1345.
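Point (1) can be checked with a small sketch. It assumes that the restriction meant is the "token" grammar that RFC 2047 section 2 uses for the charset part of an encoded word: any ASCII character except SPACE, control characters, and the "especials" ( ) < > @ , ; : \ " / [ ] ? . and =. The message itself does not say which rule is meant, and the function name is hypothetical.

```python
# Characters that RFC 2047 section 2 excludes from a "token"
# (the "especials"), in addition to SPACE and control characters.
ESPECIALS = set('()<>@,;:\\"/[]?.=')


def is_valid_encoded_word_charset(name):
    """Check whether `name` fits the RFC 2047 token grammar."""
    return bool(name) and all(
        # 33..126 is printable ASCII without SPACE (32) and DEL (127).
        33 <= ord(ch) <= 126 and ch not in ESPECIALS
        for ch in name
    )


for label in ("US-ASCII", "ANSI_X3.4-1968", "ISO_646.irv:1991"):
    print(label, is_valid_encoded_word_charset(label))
```

Under this assumption "US-ASCII" passes, while "ANSI_X3.4-1968" is rejected because of the "." and "ISO_646.irv:1991" because of both the "." and the ":".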
I therefore suggest that this entry be changed to read:
Name: US-ASCII (preferred MIME name) [RFC2046]
MIBenum: 3
Source: ANSI X3.4-1986
Alias: iso-ir-6
Alias: ANSI_X3.4-1968
Alias: ANSI_X3.4-1986
Alias: ISO_646.irv:1991
Alias: ISO646-US
Alias: us
Alias: IBM367
Alias: cp367
Alias: csASCII
There is no formal procedure for fixing errors in the charset registry. However, I believe that identification of an actual problem along with signoff of the charset reviewer should be sufficient to make the change.
Ned
On Mon Dec 18 21:59:24 2023, steffen@sdaoden.eu wrote:
Martin J. Dürst wrote in
 <6fc74600-cff6-4751-9efa-1200a9ec5aa5@it.aoyama.ac.jp>:
|Hello Stephen,
Actually Miss Jaeger named me Steve in English, because another one got Stephen earlier.
|On 2023-12-16 04:06, Steffen Nurpmeso wrote:
|> To add that for backward compatibility the plain ASCII alias
|> cannot go away,
|
|I seem to remember too that ASCII was listed as an alias, and have
|confirmed this with
|https://web.archive.org/web/20051229042158/http://www.iana.org/assignments/character-sets
My local copy is from 2011.
|Stephen, maybe you can do a bisection to find out where this alias
|disappeared.

I have only a local copy from 2011. I have no access to an IANA revision control system, should such a thing exist. Sorry.

|<charset reviewer hat on>
|I don't remember ever having dealt with a request to remove this ALIAS,
|and I strongly doubt that Ned ever did that.
|</charset reviewer hat on>
Well i could imagine it was broken when taking over the beautiful text-only version to that really, really machine readable XML variant that is now used as the base. I hope the conversion was mechanical and the bugs are little jokes, otherwise the entire IANA data collection possibly needs an audit :-)
|> i do have emails with charset=ascii in my archive,
|
|It would be interesting to know how many (i.e. what percentage of
|overall mails, and what percentage in comparison to those labeled
|US-ASCII), and how old they are.
The latter is very large, the former is very low. Two, to be exact. But they do exist (and i only have a very sparse private email archive).
--steffen