Dear Anthony,
(cc Mr. Chen Zhuang and Ken)

I found your email address on the GB18030 page at IANA (https://www.iana.org/assignments/charset-reg/GB18030), which you compiled.

The newest version of GB 18030 has been published as GB 18030-2022; please see https://std.samr.gov.cn/gb/search/gbDetailed?id=E4A2A4C875726A5DE05397BE0A0A... and https://openstd.samr.gov.cn/bzgk/gb/newGbInfo?hcno=A1931A578FE14957104988029... . However, the IANA information still describes the original 2000 version, which is not ideal for users.

CESI has also kindly released the mapping table at http://www.nits.org.cn/getIndex.req?action=findAllNews&req=modulenvpromote&t... . If it is not convenient to download the file from the website, please see the attachment.

I think it would be better to update the information on the corresponding IANA page. Do you know what we need to do?

Eiso
Dear Eiso, others,

Many thanks for your request. I'm the expert reviewer for character set registrations (Ned Freed was the primary reviewer, but is unfortunately no longer with us). I'm sorry I missed your mail for over a month.

On 2022-10-14 12:02, 陈永聪 wrote:
> Dear Anthony, (cc Mr. Chen Zhuang and Ken)
>
> I found your email address on the GB18030 page at IANA (https://www.iana.org/assignments/charset-reg/GB18030), which you compiled.
>
> The newest version of GB 18030 has been published as GB 18030-2022; please see https://std.samr.gov.cn/gb/search/gbDetailed?id=E4A2A4C875726A5DE05397BE0A0A... and https://openstd.samr.gov.cn/bzgk/gb/newGbInfo?hcno=A1931A578FE14957104988029... . However, the IANA information still describes the original 2000 version, which is not ideal for users.
Many thanks for this information. What we need to decide is whether to update the registration for charset "GB18030" to this new standard, or whether we should define a new charset label (e.g. "GB18030-2022") to distinguish it from the old (2000) version of the standard.

This depends on how many changes there are in the new version, and on how various implementers are expected to deal with the new version.

Given that, as far as I understand, GB 18030 is an encoding of Unicode/ISO 10646, and we do not distinguish versions in charset labels for Unicode/ISO 10646, there is a good argument for not introducing a new label for this new version. On the other hand, if there are new structural features in the 2022 version that are not present in the 2000 version, that might indicate the need for a new charset label.

So any information on structural changes (or the absence thereof), as well as on expectations towards industry and on plans and needs from industry, would be greatly appreciated.

From reading https://en.wikipedia.org/wiki/GB_18030, my understanding is that there are no structural changes, and that the main change is that mappings to the PUA have been completely eliminated. That means that there are some mapping differences between the 2000/2005 version and the 2022 version, but we might characterize them as minor (80 or so codepoints) and decide to keep the same label.
> CESI has also kindly released the mapping table at http://www.nits.org.cn/getIndex.req?action=findAllNews&req=modulenvpromote&t... . If it is not convenient to download the file from the website, please see the attachment.
Ideally, I'd prefer if you didn't send such large files to the mailing list. But I have been unable to get any response from the above URI in a browser. What's interesting is that a ping to www.nits.org.cn works without problems, with a round-trip time of a bit over 100 ms.

If such data is available somewhere, it might be better to get a diff between the old and the new mapping tables. That should be much shorter.
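To illustrate the kind of diff meant here, the following is a minimal sketch in Python. It assumes each mapping table is a plain-text file with one "GB-bytes-in-hex, whitespace, Unicode-code-point-in-hex" pair per line and "#" comments; the actual format of the CESI file is not known here, so the parser would likely need adjusting. The function names (load_mapping, diff_mappings) and the sample entries are hypothetical and not taken from the published tables.

```python
def load_mapping(lines):
    """Parse lines of the ASSUMED form 'A6D9 E78D' into {gb_bytes: code_point}.

    Anything after '#' is treated as a comment; blank lines are skipped.
    """
    table = {}
    for line in lines:
        line = line.split("#", 1)[0].strip()
        if not line:
            continue
        gb_hex, uni_hex = line.split()[:2]
        table[bytes.fromhex(gb_hex)] = int(uni_hex, 16)
    return table


def diff_mappings(old, new):
    """Return (removed, added, changed) between two mapping dicts."""
    removed = {k: old[k] for k in old.keys() - new.keys()}
    added = {k: new[k] for k in new.keys() - old.keys()}
    changed = {
        k: (old[k], new[k])
        for k in old.keys() & new.keys()
        if old[k] != new[k]
    }
    return removed, added, changed


if __name__ == "__main__":
    # Illustrative sample entries only, not verified against either standard.
    old = load_mapping(["A1A1 3000", "A6D9 E78D  # PUA in the old table"])
    new = load_mapping(["A1A1 3000", "A6D9 FE10"])
    removed, added, changed = diff_mappings(old, new)
    for gb, (before, after) in sorted(changed.items()):
        print(f"{gb.hex().upper()}: U+{before:04X} -> U+{after:04X}")
```

With real files, one would pass open file objects to load_mapping instead of the sample lists; only the changed, added, and removed entries would then need to be posted to the list.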
> I think it would be better to update the information on the corresponding IANA page. Do you know what we need to do?
Once we know exactly what and how we want to update, I'll ask IANA to do so. My understanding is that this is not immediately urgent, because https://openstd.samr.gov.cn/bzgk/gb/newGbInfo?hcno=A1931A578FE14957104988029... says that the "implementation date" (实施日期, per Google Translate) is August 1st, 2023.

Looking forward to getting additional information from anybody who has some.

Regards, Martin.
> Eiso