Charset for Digest Emails

I have been subscribed to the “tz digest”, which comes out about once a day. Whenever a non-ASCII character is part of the message, it is rendered as multiple question marks. This seems to be because the message is sent with a header:

    Content-Type: text/plain; charset="us-ascii"

This is only a minor nuisance. However, as an experiment, I changed my subscription to receive the messages directly rather than in digest form. A message today used an emoji outside the ASCII range, and it was rendered exactly as intended. It came with the following header:

    Content-Type: text/plain; charset="UTF-8"; format=flowed

Is there a setting that could/should be changed so that the digest is sent with the same header?

This email and any files transmitted with it are confidential, proprietary and intended solely for the individual or entity to whom they are addressed. If you have received this email in error please delete it immediately.

On 2023-01-30 19:30, Owen Leibman via tz wrote:
Is there a setting that could/should be changed so that the digest is sent with the same header?
Were your digests in plain text or MIME? Does changing that selection make any difference?

Given that content comes from all around the world and often contains non-ASCII and non-English text, normally in UTF-8, and patches use UTF-8, it would make sense for both plain text and MIME content to default to UTF-8.

--
Take care. Thanks, Brian Inglis
Calgary, Alberta, Canada

La perfection est atteinte                   Perfection is achieved
non pas lorsqu'il n'y a plus rien à ajouter  not when there is no more to add
mais lorsqu'il n'y a plus rien à retirer     but when there is no more to cut
                                -- Antoine de Saint-Exupéry
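The difference between the two headers can be illustrated with a minimal Python sketch. It rests on an assumption the thread does not confirm: that the digest software replaces each non-ASCII byte with a question mark when it forces text into us-ascii. The helper names `force_ascii` and `utf8_message` are hypothetical; only the standard `email` package calls are real.

```python
from email.message import EmailMessage


def force_ascii(text: str) -> str:
    """Hypothetical model of a us-ascii digest: each non-ASCII byte of the
    UTF-8 encoding becomes '?'. This is an assumption, not Mailman's code."""
    return "".join(chr(b) if b < 0x80 else "?" for b in text.encode("utf-8"))


def utf8_message(body: str) -> EmailMessage:
    """Build a plain-text message whose Content-Type declares UTF-8."""
    msg = EmailMessage()
    msg.set_content(body, subtype="plain", charset="utf-8")
    return msg


sample = "rien \u00e0 ajouter"      # contains U+00E0, which is two bytes in UTF-8
degraded = force_ascii(sample)      # one '?' per non-ASCII byte
msg = utf8_message(sample)

print(degraded)
print(msg["Content-Type"])
```

Under that assumption a single accented letter turns into two question marks, and an emoji (four UTF-8 bytes) into four, which would match the "multiple question marks" described above.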

On 2023-01-30 18:30, Owen Leibman via tz wrote:
Is there a setting that could/should be changed so that the digest is sent with the same header?
I hope so, though I don't manage the mailing list so I don't know the details.

While we're on the subject, I see problems even with non-digests, as the old Pipermail 0.09 (Mailman edition) software used to create the TZDB mailing archive has problems with not just charsets, but also with format=flowed and with multipart emails. The Gzip'd Text downloads are munged too. I don't know how to file a trouble ticket for this sort of thing. Perhaps it's time for icann.org to upgrade to Mailman 3?

Here are examples of the problems:

* Wrong charset in archive:

https://mm.icann.org/pipermail/tz/attachments/20230124/95019648/0001-Remove-...

This was sent with the following header in the attachment:

    Content-Type: text/x-patch; charset=UTF-8; name="0001-Remove-UNUSUAL_OK_IPA.patch"
    Content-Disposition: attachment; filename="0001-Remove-UNUSUAL_OK_IPA.patch"
    Content-Transfer-Encoding: base64

Unfortunately, mm.icann.org loses the "x-patch" and "charset=UTF-8", as it replaces the Content-Type with "text/plain":

    $ wget -S https://mm.icann.org/pipermail/tz/attachments/20230124/95019648/0001-Remove-... 2>&1 | grep -i Content-Type:
    Content-Type: text/plain

This causes most browsers to misdisplay the text: for example, Firefox displays "UNUSUAL_OK_LATIN_1 = Â¡Â¢Â£Â¤Â¥Â¦Â§Â¨Â©Â«Â¬Â®Â¯Â°Â±Â²Â³Â´Â¶Â·Â¸Â¹Â»Â¼Â½Â¾Â¿Ã—Ã·" instead of the correct "UNUSUAL_OK_LATIN_1 = ¡¢£¤¥¦§¨©«¬®¯°±²³´¶·¸¹»¼½¾¿×÷".

This problem seems limited to attachments, as I do not see similar problems with <https://mm.icann.org/pipermail/tz/2019-June/028158.html> where the patch was sent directly not as an attachment, and where non-ASCII strings like "Phù Liễn" are correctly translated into HTML ASCII equivalents like "Ph&#249; Li&#7877;n".

* format=flowed misdisplay:

https://mm.icann.org/pipermail/tz/2023-January/032565.html

This was sent with "Content-Type: text/plain; charset=UTF-8; format=flowed" but the HTML web page is rendered with fixed columns which means the display looks ragged on cell phones.
On my cell phone it looks like the following, which is hard to read (it's not free verse!):
theory.html says, "If boundaries between regions are fluid, such as during a war or insurrection, do not bother to create a new timezone merely because of yet another boundary change." That seems to be the case here.
* Multipart email issues: https://mm.icann.org/pipermail/tz/2023-January/032575.html This shows up on my browser as:
Hello: Maybe it's wrong to calculate local time by using the latest version tzdada2022g when zoneid is "America/Ojinaga".
Thanks
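The charset loss described in the first bullet above can be reproduced in a few lines of Python. This is only a sketch under two assumptions: that a browser given "text/plain" with no charset falls back to Windows-1252, and that the archive's HTML pages use numeric character references for non-ASCII text. The function names are hypothetical.

```python
def misread_as_cp1252(text: str) -> str:
    """Encode text as UTF-8, then decode those bytes with the wrong charset
    (Windows-1252), as a browser might when no charset is declared."""
    return text.encode("utf-8").decode("cp1252")


def to_html_ascii(text: str) -> str:
    """Replace each non-ASCII character with a numeric character reference,
    so the result is pure ASCII and safe under any charset label."""
    return text.encode("ascii", errors="xmlcharrefreplace").decode("ascii")


latin1_sample = "\u00a1\u00a2\u00a3"        # inverted exclamation, cent, pound
place_name = "Ph\u00f9 Li\u1ec5n"           # the place name quoted in the message

garbled = misread_as_cp1252(latin1_sample)  # every character gains a stray prefix
escaped = to_html_ascii(place_name)

print(garbled)
print(escaped)
```

The second function shows why the inline patch survives: once the text is reduced to ASCII references, a missing or wrong charset label can no longer corrupt it.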

Quoting Paul Eggert via tz on Tuesday January 31, 2023:
I don't know how to file a trouble ticket for this sort of thing. Perhaps it's time for icann.org to upgrade to Mailman 3?
Perhaps coincidentally I met with the ICANN engineering team about this just last week, as they have an active project underway to upgrade to Mailman 3. They intend to have the work completed in the coming months.

kim

On 1/31/23 13:11, Kim Davies via tz wrote:
Perhaps coincidentally I met with the ICANN engineering team about this just last week, as they have an active project underway to upgrade to Mailman 3. They intend to have the work completed in the coming months.
Oh, good! If possible, I'd like to volunteer this list for beta testing that upgrade, as we're not super-urgent (so it's OK if the list is broken on occasion) and our issues could help the engineering team test.

On 2023-01-31 14:22, Paul Eggert via tz wrote:
Oh, good! If possible, I'd like to volunteer this list for beta testing that upgrade, as we're not super-urgent (so it's OK if the list is broken on occasion) and our issues could help the engineering team test.
Forwarded problem report to mailman@icann.org, who replied:

"I have updated the default format for digest messages to use MIME which hopefully toggles that as desired."

Care to try and see if that works and is fixed, and reply with cc to mailman@icann.org

--
Take care. Thanks, Brian Inglis
Calgary, Alberta, Canada

La perfection est atteinte                   Perfection is achieved
non pas lorsqu'il n'y a plus rien à ajouter  not when there is no more to add
mais lorsqu'il n'y a plus rien à retirer     but when there is no more to cut
                                -- Antoine de Saint-Exupéry

I tried that when requested, and then waited for a message with a non-ASCII character in the body to come along. I finally found one, not in the body of the message, but in the signature ("rien à ajouter"). That is rendered correctly in the digest, so the change you mentioned seems to work as desired. I will continue to monitor for a while (this message will, of course, have a non-ASCII character in the body). Thank you.

-----Original Message-----
From: Brian Inglis <Brian.Inglis@Shaw.ca>
Sent: Tuesday, January 31, 2023 5:26 PM
To: Time zone mailing list <tz@iana.org>
Cc: Owen Leibman <OwenLeibman@fico.com>
Subject: [EXTERNAL] Re: [tz] Charset for Digest Emails

On 2023-01-31 14:22, Paul Eggert via tz wrote:
Oh, good! If possible, I'd like to volunteer this list for beta testing that upgrade, as we're not super-urgent (so it's OK if the list is broken on occasion) and our issues could help the engineering team test.
Forwarded problem report to mailman@icann.org, who replied: "I have updated the default format for digest messages to use MIME which hopefully toggles that as desired." Care to try and see if that works and is fixed, and reply with cc to mailman@icann.org

On 2/24/23 08:25:07, Owen Leibman via tz wrote:
I tried that when requested, and then waited for a message with a non-ASCII character in the body to come along. I finally found one, not in the body of the message, but in the signature ("rien à ajouter"). That is rendered correctly in the digest, so the change you mentioned seems to work as desired. I will continue to monitor for a while (this message will, of course, have a non-ASCII character in the body). Thank you.
RFC 1521 defines the Multipart/digest subtype. <https://www.freesoft.org/CIE/RFC/1521/19.htm>

Why not use that for digests?

o It would allow various charsets for individual messages.
o Some (alas, not all) MUAs support replies to individual messages.

-- gil
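As a rough illustration of the multipart/digest idea, here is a minimal Python sketch using the standard `email` package. It is not how Mailman builds digests; the subjects, bodies, and the helper names `make_post` and `make_digest` are made up for the example. Each post is wrapped as a message/rfc822 part, so every post keeps its own charset label.

```python
from email.message import EmailMessage
from email.mime.message import MIMEMessage
from email.mime.multipart import MIMEMultipart
from typing import List


def make_post(subject: str, body: str, charset: str) -> EmailMessage:
    """Build one list post with its own declared charset (hypothetical data)."""
    post = EmailMessage()
    post["Subject"] = subject
    post.set_content(body, subtype="plain", charset=charset)
    return post


def make_digest(posts: List[EmailMessage]) -> MIMEMultipart:
    """Wrap each post as a message/rfc822 part inside a multipart/digest."""
    digest = MIMEMultipart("digest")
    digest["Subject"] = "Example digest"  # hypothetical subject line
    for post in posts:
        digest.attach(MIMEMessage(post))
    return digest


digest = make_digest([
    make_post("First post", "rien \u00e0 ajouter", "iso-8859-1"),
    make_post("Second post", "Ph\u00f9 Li\u1ec5n", "utf-8"),
])

# Each part holds exactly one embedded message; read back its charset.
charsets = [part.get_payload(0).get_content_charset()
            for part in digest.get_payload()]
print(digest.get_content_type(), charsets)
```

Because the embedded messages stay intact, a mail client that understands the format can in principle open or reply to one post on its own, which is the second benefit listed above.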
participants (5)
- Brian Inglis
- Kim Davies
- Owen Leibman
- Paul Eggert
- Paul Gilmartin