Mixing between RTL and LTR scripts
Dear all,

I hope you are all doing well. After going back to TF-AIDN "Task Force Of Arabic IDNs", I got the following guidance regarding mixing LTR and RTL text within the same label:

- Mixing different scripts is not allowed in domain names and email addresses.
- Numbers in the middle or at the end of an RTL domain name are allowed.

This is to stay away from the display issues we get if we mix RTL and LTR code points in the same label.

Thanks, and all the best,
Abdalmonem Tharwat Galila
Deputy Manager, Dot Masr Registry, Operation Sector
National Telecommunication Regulatory Authority (NTRA)
Thanks. I think this clarifies the point that we have no advice on displaying e-mail addresses, since mailboxes are not domain names and are not labels and are not subject to IDNA2008. Regards, John Levine, john.levine@standcore.com Standcore LLC
Also I don't know how you disallow script mixing for domain names. IDNA is label by label. The DNS is distributed, so there's no way to prevent mixing, is there?

A
--
Please excuse my clumbsy thums
As far as I am aware, the only places where script rules are enforced in the DNS are the top level, where ICANN has its process, and TLDs that have rules about what 2LDs or 3LDs they'll register. Other than that, DNS operators can publish anything they can punycode, which is any UTF-8 including punctuation and emojis.

In EAI e-mail addresses, the local part before the @ sign can be any printable UTF-8, so mixed-direction text is valid even though it's a bad idea.

R's,
John
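To make the "anything they can punycode" remark concrete, here is a minimal Python sketch. It uses only the standard library's `punycode` codec and adds the `xn--` prefix by hand; the helper names `to_ace` and `from_ace` are made up for this illustration. It deliberately performs no IDNA2008 validation and no Bidi check, which is the point: the raw encoding step accepts a mixed-direction label just as readily as a pure Arabic one.

```python
ACE_PREFIX = "xn--"

def to_ace(label: str) -> str:
    """Encode one label with raw Punycode (hypothetical helper, no IDNA2008 checks)."""
    if label.isascii():
        return label  # plain ASCII labels are left unchanged
    return ACE_PREFIX + label.encode("punycode").decode("ascii")

def from_ace(label: str) -> str:
    """Decode a label produced by to_ace back to Unicode."""
    if label.lower().startswith(ACE_PREFIX):
        return label[len(ACE_PREFIX):].encode("ascii").decode("punycode")
    return label

if __name__ == "__main__":
    # A pure RTL label and a mixed LTR/RTL label both encode without complaint.
    for label in ["مصر", "abdoعبدالمنعم"]:
        ace = to_ace(label)
        print(ace, from_ace(ace) == label)
```

A registry that wants to refuse such labels has to apply its own policy on top of the encoding; nothing in this step does it for them.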
Worse, DNS operators can also publish anything they _can't_ run through punycode. See RFC 6055.

--
Please excuse my clumbsy thums
Hi Andrew,

Thanks for your reply below. I spent a lot of time trying out examples that mix RTL and LTR within the same label, and what I got were strange and unclear labels, for example:

عبدالمنعم-Abdo@سجل.مصر
Abdoعبدالمنعم.مصر
… etc.

There are many issues you cannot imagine. Another thing: using a dot in an RTL context or in an LTR context will give you different labels, although they must be the same. We want to stay away from the display issues we get if we mix RTL and LTR code points in the same label.

Take a look here: https://tools.ietf.org/html/rfc5564
Hi,

I think that mixing RTL and LTR scripts is not a good idea; it would cause multiple confusions. How do we read the domain name or email address: from right to left, or from left to right? It is clear that for the Arabic language, domain names and email addresses are written and read RTL. On the other hand, mixing Arabic with punctuation and emojis will not cause confusion: domain names and email addresses are still read and written RTL.

Best regards
-- *cordially*, *مع تحياتي* *Zied BOUZIRI*، *زياد بوزيري* ISET Charguia, Tunisie www.bouziri.tn
Hi,

In the same label, it's mostly a bad idea (there's some discussion of this in the bidi document). But my point was about domain names, not individual labels.

A
--
Please excuse my clumbsy thums
So how could any application process the domain name? Will it be RTL or LTR?

Example: Abdo.عبدو.Ahmed

Where is the 1st label? Is it Abdo or Ahmed? Consider a domain name that starts with RTL text, or has RTL in the middle, or at the end!
I'm pretty sure ICANN disallows script mixing in labels via their contracts with registries. They did this to limit phishing using lookalike names that contain a mix of IDN and ASCII characters. But after a cursory look-see, I can't find where that policy/prohibition is located in the registry agreements. I'm not an expert in this area. It's probably here somewhere: https://www.icann.org/resources/pages/registry-agreement-amendment-templates...

Mixing is allowed across the entire domain name (TLD and SLD in different scripts).
On Mon, May 07, 2018 at 08:47:53AM -0700, Paul Stahura wrote:
> I’m pretty sure ICANN disallows script mixing in labels via their contracts with registries.

Of course, they don't have a contract with most ccTLDs. And they don't appear to have such a rule with (e.g.) Verisign for com. And finally, ICANN just can't make rules for the whole DNS.

> But I can’t find, after a cursory look-see, where that policy/prohibition is located in the registry agreements.

It's part of the IDN guidelines. They're included by reference, I think.

A
--
Andrew Sullivan
ajs@anvilwalrusden.com
On Sat, May 05, 2018 at 12:03:37PM +0000, Abdalmonem Tharwat Galila wrote:
> So how could any application process the domain name !! It will be RTL or LTR !! Ex Abdo.عبدو.Ahmed Where is the 1st label !!! Is it Abdo or Ahmed !!!

Yep. It's extremely difficult. "Don't do that," is the general advice, but there's no way in the protocol to prevent it. Remember, every one of those dots in that domain name represents a possible new locus of control (because there could be a delegation there). You can't make policies about the whole domain name system. That's a feature of the DNS, and not something that is likely to change.

> Consider if the domain name starts with RTL text !! Or RTL in the middle !!! Or at the end !!!

There actually is _some_ discussion of this in the BIDI RFC (https://tools.ietf.org/html/rfc5893). But IDNA is defined for labels, not domain names, because of the way the DNS works.

Best regards,

A
--
Andrew Sullivan
ajs@anvilwalrusden.com
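As an illustration of the label-by-label point in the message above, here is a rough Python sketch of the six conditions of the Bidi rule from RFC 5893, applied to one label at a time. It relies only on `unicodedata.bidirectional` from the standard library; the function names `satisfies_bidi_rule` and `check_name` are invented for this example. It is a simplification under stated assumptions: it does not perform the rest of IDNA2008 validation, and it applies the rule to every label even though the RFC only requires it for names that contain at least one RTL label.

```python
import unicodedata

# Bidi classes permitted inside RTL and LTR labels (RFC 5893, conditions 2 and 5).
RTL_ALLOWED = {"R", "AL", "AN", "EN", "ES", "CS", "ET", "ON", "BN", "NSM"}
LTR_ALLOWED = {"L", "EN", "ES", "CS", "ET", "ON", "BN", "NSM"}

def satisfies_bidi_rule(label: str) -> bool:
    """Check a single label against the RFC 5893 Bidi rule (sketch only)."""
    classes = [unicodedata.bidirectional(ch) for ch in label]
    # Condition 1: the first character must be L (LTR label) or R/AL (RTL label).
    if not classes or classes[0] not in ("L", "R", "AL"):
        return False
    # Conditions 3 and 6 look at the last character, ignoring trailing NSMs.
    end = len(classes)
    while end > 1 and classes[end - 1] == "NSM":
        end -= 1
    last = classes[end - 1]
    if classes[0] == "L":
        # Conditions 5 and 6 for LTR labels.
        return set(classes) <= LTR_ALLOWED and last in ("L", "EN")
    # Conditions 2, 3 and 4 for RTL labels.
    return (set(classes) <= RTL_ALLOWED
            and last in ("R", "AL", "EN", "AN")
            and not ("EN" in classes and "AN" in classes))

def check_name(name: str) -> dict:
    """Split a name on dots (logical order) and check each label on its own."""
    return {label: satisfies_bidi_rule(label) for label in name.split(".")}

if __name__ == "__main__":
    print(check_name("Abdo.عبدو.Ahmed"))      # every label passes individually
    print(satisfies_bidi_rule("Abdoعبدالمنعم"))  # mixed directions inside one label
```

In the stored (logical) order the first label of Abdo.عبدو.Ahmed is unambiguously Abdo, and each label passes on its own. The confusion described in the thread comes from how the whole name is displayed, which a per-label check cannot address.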
Time to revive a blog article which I wrote in March 2016 😀

My blog article is about the presentation of Arabic email addresses ➜ http://schappo.blogspot.co.uk/2016/03/arabic-email-addresses.html

Using this presentation method would make the components of an email address or domain name clearer, even when mixing LTR and RTL.

André Schappo
🌏 🌍 🌎
André Schappo
小山@电邮.在线?Subject=你好小山😜
schappo.blogspot.co.uk
twitter.com/andreschappo
weibo.com/andreschappo?is_all=1
groups.google.com/forum/#!forum/computer-science-curriculum-internationalization
That is about the HTML presentation of RTL domain names or email addresses. What we were talking about is somewhat different!
I am not an expert, so I might say a stupid thing, but to the best of my knowledge Arabic is written RTL but Arabic numbers are written LTR, like the numbers in ASCII. So if we have a string that mixes alphabetic and digit characters, we naturally have a mix of LTR and RTL. What am I missing?

Cheers,
Roberto
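One way to look at the question about digits is through the Unicode bidirectional classes, which Python exposes as `unicodedata.bidirectional`. The short sketch below (the helper name `bidi_classes` is made up for illustration) shows that digits are not strong left-to-right characters: ASCII digits are class EN (European Number) and Arabic-Indic digits are class AN (Arabic Number), while Latin letters are L and Arabic letters are AL. That distinction is presumably why digits can be allowed in an RTL label while Latin letters are not.

```python
import unicodedata

def bidi_classes(text: str) -> list:
    """Return the Unicode bidirectional class of each character in text."""
    return [unicodedata.bidirectional(ch) for ch in text]

if __name__ == "__main__":
    print(bidi_classes("a"))        # ['L']   Latin letter, strong left-to-right
    print(bidi_classes("\u0639"))   # ['AL']  Arabic letter ain, strong right-to-left
    print(bidi_classes("1"))        # ['EN']  ASCII digit, European Number
    print(bidi_classes("\u0661"))   # ['AN']  Arabic-Indic digit one, Arabic Number
```

A run of digits is still laid out left to right when displayed, but because the digits carry a number class rather than the strong L class, they do not change the overall direction of the surrounding Arabic text.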
A -- Please excuse my clumbsy thums ________________________________ On May 5, 2018 04:29:05 Abdalmonem Tharwat Galila <agalila@mcit.gov.eg<mailto:agalila@mcit.gov.eg>> wrote: Hi Andrew, Thanks for your below reply , I spend a lot of time try to do some mixing examples between RTL and LTR within the same label, what I got is strange and unclear Label as a result for ex. عبدالمنعم-Abdo@سجل.مصر<mailto:%D8%B9%D8%A8%D8%AF%D8%A7%D9%84%D9%85%D9%86%D8%B9%D9%85-Abdo@%D8%B3%D8%AC%D9%84.%D9%85%D8%B5%D8%B1> Abdoعبدالمنعم.مصر … etc Many issues you cannot imagine , also another thing using dot in RTL context or in LTR context will give you different labels although they must be the same. To be away from the display issues we get if we mix RTL and LTR code points in the same labels. Take a look here link<https://tools.ietf.org/html/rfc5564>. -----Original Message----- From: UA-discuss [mailto:ua-discuss-bounces@icann.org] On Behalf Of Andrew Sullivan Sent: Friday, May 04, 2018 2:57 PM To: John Levine <john.levine@standcore.com<mailto:john.levine@standcore.com>>; Abdalmonem Tharwat Galila <agalila@mcit.gov.eg<mailto:agalila@mcit.gov.eg>> Cc: Ahmed Bakhat Masood (ahmedbakhat@pta.gov.pk<mailto:ahmedbakhat@pta.gov.pk>) <ahmedbakhat@pta.gov.pk<mailto:ahmedbakhat@pta.gov.pk>>; ua-discuss@icann.org<mailto:ua-discuss@icann.org>; Ahmed Bakhat (ahmedbakhat@yahoo.com<mailto:ahmedbakhat@yahoo.com>) <ahmedbakhat@yahoo.com<mailto:ahmedbakhat@yahoo.com>> Subject: Re: [UA-discuss] Mixing between RTL and LTR scripts Also I don't know how you disallow script mixing for domain names. IDNA is label by label. The DNS is distributed, so there's no way to prevent mixing, is there? A -- Please excuse my clumbsy thums ---------- On May 4, 2018 05:36:02 "John Levine" <john.levine@standcore.com<mailto:john.levine@standcore.com>> wrote:
I hope you all doing well, after back to TF-AIDN "Task Force Of Arabic IDNs", I got the following regards to mixing LTR and RTL texts within the same label. - Mixing between different scripts is not allowed for domain names and email addresses - Numbers at the middle or at the end of the RTL domain name is allowed.
To be away from the display issues we get if we mix RTL and LTR code points in the same labels.
Thanks. I think this clarifies the point that we have no advice on displaying e-mail addresses, since mailboxes are not domain names and are not labels and are not subject to IDNA2008.
Regards, John Levine, john.levine@standcore.com<mailto:john.levine@standcore.com> Standcore LLC
🌏 🌍 🌎 André Schappo 小山@电邮.在线?Subject=你好小山😜<mailto:%E5%B0%8F%E5%B1%B1@%E7%94%B5%E9%82%AE.%E5%9C%A8%E7%BA%BF?Subject=%E4%BD%A0%E5%A5%BD%E5%B0%8F%E5%B1%B1%F0%9F%98%9C> schappo.blogspot.co.uk<https://schappo.blogspot.co.uk/> twitter.com/andreschappo<https://twitter.com/andreschappo> weibo.com/andreschappo?is_all=1<https://weibo.com/andreschappo?is_all=1> groups.google.com/forum/#!forum/computer-science-curriculum-internationalization<https://groups.google.com/forum/#!forum/computer-science-curriculum-internat...>
The answer is not far off if you answer this question: how do you read the numbers, whether in RTL or LTR text, from the left or from the right? For example, 123 123 or ١٢٣ are read the same way, so your assumption is not valid.
Sent from my iPhone
Dear Andre & All,
Please allow me to resend my comments on the blog article that was shared by Andre in his last email:
Dear Andre,
With great interest and appreciation, I have read your posts in the UA-discuss mailing list as well as your recent blog article <http://schappo.blogspot.co.uk/2016/03/arabic-email-addresses.html> on how to handle Arabic email addresses in RTL and LTR contexts.
Your suggestion sounds good at first glance. However, it is confusing and puzzling when you look at it from the point of view of an ordinary Arabic-speaking user. Hence, I have the following comments that I would like first to share with you before posting them to the mailing list. I will be using examples to illustrate my point of view. They are presented from a native Arabic speaker's point of view.
(Please note that I will be using pictures for the texts so that they will not be ruined when transferred by the email system)
Consider my email address:
rfayez@citc.gov.sa
As a (normal) user I can easily make the following (correct) assumptions regardless of the text direction:
1. The user part is always to the left-side of the sign (@): rfayez
2. The domain name is always to the right-side of the sign (@): citc.gov.sa
3. A domain name is arranged in a well-defined label hierarchy where a TLD is always the rightmost label of the domain name: .sa
So I will use my email address as is (rfayez@citc.gov.sa) without changing its direction or swapping its parts. Consider the following examples where my email address is used in different text writing directions:
[cid:image001.jpg@01D3E78C.AED775B0]
As you can see, regardless of the text direction (LTR or RTL) the email address maintains its form (i.e., user@domain.TLD). This allows the user to easily construct and deconstruct email addresses correctly without confusing or mixing up their parts. For example, the following set of examples:
care.sa@car.com
car.com@care.sa
will be straightforwardly interpreted as follows (no confusion whatsoever):
[cid:image002.jpg@01D3E78C.AED775B0]
Now let us repeat the examples using an Arabic email address:
[cid:image003.jpg@01D3E78C.AED775B0]
Here a native Arabic-speaking user would make the following assumptions as well regardless of the text direction:
1. The user part is always to the right-side of the sign (@): اندري
2. The domain name is always to the left-side of the sign (@): رسيل.السعودية
3. A domain name is arranged in a well-defined label hierarchy where an Arabic TLD is always the leftmost label of the domain name: .السعودية
Therefore, the given Arabic email address:
[cid:image004.jpg@01D3E78C.AED775B0]
should be used without changing its direction or swapping its parts, to maintain its form and hence remove any confusion or misinterpretation.
Consider the following examples where the previous email address is used in different text writing directions:
[cid:image005.jpg@01D3E78C.AED775B0]
As you can see, regardless of the text direction (LTR or RTL) the email address maintains its form. This allows the user to easily construct and deconstruct email addresses correctly without confusing or mixing up their parts. For example, the following set of examples:
[cid:image006.jpg@01D3E78C.AED775B0]
will be straightforwardly interpreted as follows (no confusion whatsoever):
[cid:image007.jpg@01D3E78C.AED775B0]
However, if your suggestion is followed then the above email addresses will be used as follows, depending on the text direction:
[cid:image008.jpg@01D3E78C.AED775B0]
and frankly this is absolutely confusing.
As Arabic speakers, we were dealing with LTR and RTL together long ago (well before computers were invented), because the Arabic alphabet is written RTL while Arabic numbers are written LTR.
So if I want to write the following sentence in Arabic, "My salary is 321 Pound", I will write it like this:
[cid:image002.jpg@01D18AA7.51F1AEE0]
Or
[cid:image004.jpg@01D18AA7.51F1AEE0]
And not like this:
[cid:image010.jpg@01D18AA7.51F1AEE0]
Nor
[cid:image012.jpg@01D18AA7.51F1AEE0]
Since to any Arabic user the last two images mean "My salary is 123 Pound"!
Later, when computers were introduced in our region (1980s), we used to write English names within Arabic text without changing their direction. In other words, if I want to write the following sentence in Arabic (without Arabizing the English names), "We welcome the interest of Mr. André Schappo in the Arabic language", then I will write it like this:
[cid:image015.jpg@01D18AA7.51F1AEE0]
not like this:
[cid:image017.jpg@01D18AA7.51F1AEE0]
And definitely not like this:
[cid:image021.jpg@01D18AA7.51F1AEE0]
Moreover, when the internet was introduced (1990s), we used to write domains and email addresses in a similar manner, as I explained to you in my previous email.
I hope that I have clarified the view of an Arabic speaker regarding your thoughts on how to handle RTL in LTR context and vice versa.
BTW, the following represent a sample of the tons of examples from famous newspapers within the Arabic world:
http://www.alriyadh.com/975687
[cid:image024.jpg@01D18AA7.51F1AEE0]
http://www.albayan.ae/economy/last-deal/2011-12-09-1.1551679
[cid:image025.jpg@01D18AA7.51F1AEE0]
http://aitmag.ahram.org.eg/News/38238.aspx
[cid:image026.jpg@01D18AA7.51F1AEE0]
With best regards,
Raed
Thank you, Raed, for this beautiful explanation.
2018-05-09 9:55 GMT+01:00 Raed AlFayez <rfayez@citc.gov.sa>:
Dear Andre & All,
Please allow me to resend my comments on the blog article that was shared by Andre in his last email:
Dear Andre,
With great interest and appreciation, I have read your posts in the UA-discuss mailing list as well as your recent blog article <http://schappo.blogspot.co.uk/2016/03/arabic-email-addresses.html> on how to handle Arabic email addresses in RTL and LTR contexts.
Your suggestion sounds good at first glance. However, it is confusing and puzzling when you look at it from the point of view of an ordinary Arabic-speaking user. Hence, I have the following comments that I would like first to share with you before posting them to the mailing list. I will be using examples to illustrate my point of view. They are presented from a native Arabic speaker's point of view.
(Please note that I will be using pictures for the texts so that they will not be ruined when transferred by the email system)
Consider my email address:
*rfayez@citc.gov.sa*
As a (normal) user I can easily make the following (correct) assumptions regardless of the text direction:
1. The user part is always to the left-side of the sign (@): *rfayez*
2. The domain name is always to the right-side of the sign (@): *citc.gov.sa*
3. A domain name is arranged in a well-defined label hierarchy where a TLD is always the rightmost label of the domain name: *.sa*
So I will use my email address as is (*rfayez@citc.gov.sa*) without changing its direction or swapping its parts. Consider the following examples where my email address is used in different text writing directions:
As you can see, regardless of the text direction (LTR or RTL) the email address maintains its form (i.e., user@domain.TLD). This allows the user to easily construct and deconstruct email addresses correctly without confusing or mixing up their parts. For example, the following set of examples:
care.sa@car.com
car.com@care.sa
will be straightforwardly interpreted as follows (no confusion whatsoever):
Now let us repeat the examples using an Arabic email address:
Here a native Arabic-speaking user would make the following assumptions as well regardless of the text direction:
1. The user part is always to the right-side of the sign (@): *اندري*
2. The domain name is always to the left-side of the sign (@): *رسيل.السعودية*
3. A domain name is arranged in a well-defined label hierarchy where an Arabic TLD is always the leftmost label of the domain name: *.السعودية*
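The three assumptions in each list describe where the parts appear on screen; in the stored (logical) character order the user part always comes first and the TLD is always the last label, for Latin and Arabic addresses alike. A minimal Python sketch of that idea follows. The helper name `split_address` is hypothetical, and the code assumes a simplified `local@domain` form rather than full email address syntax:

```python
def split_address(address: str) -> dict:
    """Split a simple local@domain address in logical (stored) order.

    Hypothetical helper for illustration only: it ignores quoted local
    parts and the other corner cases of real address syntax.
    """
    # Split on the last "@" so that everything after it is the domain.
    local, at, domain = address.rpartition("@")
    if not (local and at and domain):
        raise ValueError("expected an address of the form local@domain")
    labels = domain.split(".")
    # In logical order the TLD is the last label, whatever the script.
    return {"local": local, "domain": domain, "labels": labels, "tld": labels[-1]}


for addr in ("care.sa@car.com", "car.com@care.sa", "اندري@رسيل.السعودية"):
    parts = split_address(addr)
    print(parts["local"], "|", parts["domain"], "|", parts["tld"])
```

An RTL rendering of the last address shows the user part on the right and the TLD on the left, which matches the expectations listed above without any reordering of the stored string.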
Therefore, the given Arabic email address:
should be used without changing its direction or swapping between its parts to maintain its form and hence remove any confusion or misinterpretation.
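One way software might keep an address in its own form inside surrounding text of the other direction is to wrap it in Unicode directional isolates (LRI U+2066, RLI U+2067, PDI U+2069). This technique is not proposed anywhere in this thread; the sketch below is only an illustration, the helper names are hypothetical, and whether a given mail client or terminal honours isolates is a separate question:

```python
import unicodedata

LRI, RLI, PDI = "\u2066", "\u2067", "\u2069"  # directional isolate controls


def first_strong_is_rtl(text: str) -> bool:
    """Return True if the first strongly directional character is RTL."""
    for ch in text:
        bidi_class = unicodedata.bidirectional(ch)
        if bidi_class in ("R", "AL"):   # Hebrew-type or Arabic letters
            return True
        if bidi_class == "L":           # Latin and other LTR letters
            return False
    return False                        # no strong character: assume LTR


def isolate(address: str) -> str:
    """Wrap an address so surrounding text cannot reorder its parts."""
    opener = RLI if first_strong_is_rtl(address) else LRI
    return f"{opener}{address}{PDI}"


sentence = "Please write to " + isolate("اندري@رسيل.السعودية") + " today."
print(sentence)
```

The stored characters of the address are untouched; only two invisible control characters are added around it for display.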
Consider the following examples where the previous email address
is used in different text writing directions:
As you can see, regardless of the text direction (LTR or RTL) the email address maintains its form.
This allows the user to easily construct and deconstruct email addresses correctly without confusing or mixing up their parts. For example, the following set of examples:
will be straightforwardly interpreted as follows (no confusion whatsoever):
However, if your suggestion is followed then the above email addresses will be used as follows, depending on the text direction:
and frankly this is absolutely confusing.
As Arabic speakers, we were dealing with LTR and RTL together long ago (well before computers were invented), because the Arabic alphabet is written RTL while Arabic numbers are written LTR.
So if I want to write the following sentence in Arabic, "*My salary is 321 Pound*", I will write it like this:
[image: cid:image002.jpg@01D18AA7.51F1AEE0]
Or
[image: cid:image004.jpg@01D18AA7.51F1AEE0]
And *not* like this:
[image: cid:image010.jpg@01D18AA7.51F1AEE0]
Nor
[image: cid:image012.jpg@01D18AA7.51F1AEE0]
Since to any Arabic user the last two images mean "*My salary is 123 Pound*"!
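The salary example can be checked against the Unicode character data: letters carry a strong direction, while digits (European or Arabic-Indic) are stored most-significant digit first and have their own bidi classes. A small sketch using Python's standard `unicodedata` module; the helper name `bidi_classes` is hypothetical:

```python
import unicodedata


def bidi_classes(text: str) -> list:
    """Return the Unicode bidirectional class of each character."""
    return [unicodedata.bidirectional(ch) for ch in text]


# Latin letter, Arabic letter, European digit, Arabic-Indic digit
for ch in ("a", "\u0628", "1", "\u0661"):
    print(f"U+{ord(ch):04X}", unicodedata.bidirectional(ch))

# Both spellings of the salary are stored with the hundreds digit first,
# so they convert to the same number.
print(int("321"), int("\u0663\u0662\u0661"))
```

The loop reports `L` for the Latin letter, `AL` for the Arabic letter, `EN` for the European digit and `AN` for the Arabic-Indic digit, and both conversions yield 321.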
Later, when computers were introduced in our region (1980s), we used to write English names within Arabic text without changing their direction. In other words, if I want to write the following sentence in Arabic (without Arabizing the English names), "*We welcome the interest of Mr. André Schappo in the Arabic language*", then I will write it like this:
[image: cid:image015.jpg@01D18AA7.51F1AEE0]
*not* like this:
[image: cid:image017.jpg@01D18AA7.51F1AEE0]
And *definitely* *not* like this:
[image: cid:image021.jpg@01D18AA7.51F1AEE0]
Moreover, when the internet was introduced (1990s), we used to write domains and email addresses in a similar manner, as I explained to you in my previous email.
I hope that I have clarified the view of an Arabic speaker regarding your thoughts on how to handle RTL in LTR context and vice versa.
BTW, the following represent a sample of the tons of examples from famous newspapers within the Arabic world:
http://www.alriyadh.com/975687
[image: cid:image024.jpg@01D18AA7.51F1AEE0]
http://www.albayan.ae/economy/last-deal/2011-12-09-1.1551679
[image: cid:image025.jpg@01D18AA7.51F1AEE0]
http://aitmag.ahram.org.eg/News/38238.aspx
[image: cid:image026.jpg@01D18AA7.51F1AEE0]
With best regards,
Raed
*From:* UA-discuss [mailto:ua-discuss-bounces@icann.org] *On Behalf Of *Andre Schappo *Sent:* Tuesday, May 08, 2018 6:49 PM *To:* ua-discuss@icann.org
*Subject:* Re: [UA-discuss] Mixing between RTL and LTR scripts
Time to revive a blog article which I wrote in March 2016😀 My blog article is about presentation of Arabic Email addresses ➜ schappo.blogspot.co.uk/2016/03/arabic-email-addresses.html Using this presentation method would make components of an email address or domain name clearer even when mixing LTR and RTL
André Schappo
On 5 May 2018, at 13:03, Abdalmonem Tharwat Galila <agalila@mcit.gov.eg> wrote:
So how could any application process the domain name !! It will be RTL or LTR !!
Ex Abdo.عبدو.Ahmed
Where is the 1st label !!! Is it Abdo or Ahmed !!!
Consider if the domain name starts with RTL text !! Or RTL in the middle !!! Or at the end !!!
Sent from my iPhone
On May 5, 2018, at 11:50 AM, Andrew Sullivan <ajs@anvilwalrusden.com> wrote:
Hi,
In the same label, it's mostly a bad idea (there's some discussion of this in the bidi document). But my point was about domain names, not individual labels.
A
--
Please excuse my clumbsy thums
------------------------------
On May 5, 2018 04:29:05 Abdalmonem Tharwat Galila <agalila@mcit.gov.eg> wrote:
Hi Andrew, thanks for your reply below. I spent a lot of time trying out examples that mix RTL and LTR within the same label, and what I got as a result
were strange and unclear labels, for example:
عبدالمنعم-Abdo@سجل.مصر
Abdoعبدالمنعم.مصر
… etc
There are many issues you cannot imagine. Another thing: using a dot in an RTL context or in an LTR context will give you different labels, although they must be the same.
To be away from the display issues we get if we mix RTL and LTR code points
in the same labels.
Take a look here link <https://tools.ietf.org/html/rfc5564>.
-----Original Message-----
From: UA-discuss [mailto:ua-discuss-bounces@icann.org] On Behalf Of Andrew Sullivan
Sent: Friday, May 04, 2018 2:57 PM
To: John Levine <john.levine@standcore.com>; Abdalmonem Tharwat Galila <agalila@mcit.gov.eg>
Cc: Ahmed Bakhat Masood (ahmedbakhat@pta.gov.pk) <ahmedbakhat@pta.gov.pk>; ua-discuss@icann.org; Ahmed Bakhat (ahmedbakhat@yahoo.com) <ahmedbakhat@yahoo.com>
Subject: Re: [UA-discuss] Mixing between RTL and LTR scripts
Also I don't know how you disallow script mixing for domain names. IDNA is label by label. The DNS is distributed, so there's no way to prevent mixing, is there?
A
--
Please excuse my clumbsy thums
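The distinction drawn above between mixing inside one label and mixing across the labels of a domain name can be made concrete with a per-label check. The sketch below is a rough illustration only: it is not the full Bidi rule of RFC 5893, the function names are hypothetical, and it treats digits as direction-neutral so that numbers in the middle or at the end of an RTL label are not flagged:

```python
import unicodedata


def label_directions(label: str) -> set:
    """Return the strong directions ("LTR", "RTL") present in one label."""
    found = set()
    for ch in label:
        bidi_class = unicodedata.bidirectional(ch)
        if bidi_class == "L":
            found.add("LTR")
        elif bidi_class in ("R", "AL"):
            found.add("RTL")
        # Digits (EN, AN) and the hyphen are not strong, so they are skipped.
    return found


def mixed_labels(domain: str) -> list:
    """List the labels that contain both LTR and RTL letters."""
    return [lab for lab in domain.split(".") if len(label_directions(lab)) > 1]


# Every label is single-direction, yet the name as a whole mixes directions.
print(len(mixed_labels("Abdo.عبدو.Ahmed")))
# The first label mixes Latin and Arabic letters and is reported.
print(len(mixed_labels("Abdoعبدالمنعم.مصر")))
```

A registry can apply a check of this kind to the labels it registers, but, as noted above, nothing in the DNS applies it across the labels of a whole name.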
----------
On May 4, 2018 05:36:02 "John Levine" <john.levine@standcore.com> wrote:
I hope you all doing well, after back to TF-AIDN "Task
Force Of Arabic
IDNs", I got the following regards to mixing LTR and RTL
texts within the
same label.
- Mixing between different scripts is not allowed for domain names
and email addresses
- Numbers at the middle or at the end of the RTL domain name is
allowed.
To be away from the display issues we get if we mix RTL and LTR code points
in the same labels.
Thanks. I think this clarifies the point that we have no advice on
displaying e-mail addresses, since mailboxes are not domain names and are
not labels and are not subject to IDNA2008.
Regards,
John Levine, john.levine@standcore.com
Standcore LLC
🌏 🌍 🌎 André Schappo 小山@电邮.在线?Subject=你好小山😜 schappo.blogspot.co.uk twitter.com/andreschappo weibo.com/andreschappo?is_all=1 groups.google.com/forum/#!forum/computer-science-curriculum- internationalization
-- *cordially*, *مع تحياتي* *Zied BOUZIRI*، *زياد بوزيري* ISET Charguia, Tunisie www.bouziri.tn
On 9 May 2018 at 10:55, Raed AlFayez <rfayez@citc.gov.sa> wrote:
Dear Andre & All,
Please allow me to resend my comments on the blog article that was shared by Andre in his last email:
Thanks for this, it is very interesting and very useful to non-Arabic speakers.
I understand your point: the natural expectation of an Arabic speaker is that the email address will always be displayed LTR if it is in Latin characters, while it will always be displayed RTL if it is in Arabic characters, no matter whether the surrounding text is LTR or RTL.
But what would/should happen if the email address had a mix of Latin and Arabic characters?
Also, in your opinion, is this expectation shared by all the peoples in the (very big) Arabic-speaking world, and even by those that speak other languages written with Arabic characters? I am wondering whether these rules depend on the country, on the script or on the language.
Thanks
Regards
--
Vittorio Bertola | Head of Policy & Innovation, Open-Xchange
vittorio.bertola@open-xchange.com
Office @ Via Treviso 12, 10144 Torino, Italy
Hi,
I think yes, we share this expectation. And this is what we work on in the TF-AIDN: https://www.icann.org/news/blog/making-progress-with-the-arabic-script-in-do...
TF-AIDN has 35 members hailing from 18 countries across the globe, from the (very big) Arabic-speaking world, as you say! The members come from a variety of backgrounds and cover a wide range of Asian and African languages using the Arabic script.
Best regards
-- *cordially*, *مع تحياتي* *Zied BOUZIRI*، *زياد بوزيري* ISET Charguia, Tunisie www.bouziri.tn
Any updates to the UASG 007 document (and other relevant UASG documents) needed based on this discussion?
Edmon
Ah😀 Thank you for reminding me Raed André On 9 May 2018, at 09:55, Raed AlFayez <rfayez@citc.gov.sa<mailto:rfayez@citc.gov.sa>> wrote: Dear Andre & All, Please allow me to resend my comments on the blog article that was shared by Andre in his last email: Dear Andre, With great interest and appreciation, I have read your posts in the UA-discuss mailing list as well as your recent blog article<http://schappo.blogspot.co.uk/2016/03/arabic-email-addresses.html> on how to handle Arabic emails address in RTL and LTR context. Your suggestion sounds good at first glance. However, it is confusing and puzzling when you look at it from the point of view of an ordinary Arabic speaking user. Hence, I have the following comments that I would like first to share with you before posting them to the mailing list. I will be using examples to illustrate my point of view. They are demonstrated from native Arabic speaking point view. (Please note that I will be using pictures for the texts so that they will not be ruined when transferred by the email system) Consider my email address: rfayez@citc.gov.sa<mailto:rfayez@citc.gov.sa> As a (normal) user I can easily make out the following (correct) assumptions regardless of the text direction: 1. The user part is always to the left-side of the sign (@): rfayez 2. The domain name is always to the right-side of the sign (@): citc.gov.sa 3. A domain name is arranged in a well-defined label hierarchy where a TLD is always the rightmost label of the domain name: .sa So I will use my email address as is (rfayez@citc.gov.sa<mailto:rfayez@citc.gov.sa>) without changing its direction or swapping between its parts. Consider the following examples where my email address is used in different text writing directions: <image001.jpg> As you can see, regardless of the text direction (LTR or RTL) the email address maintain its form (i.e., user@domain.TLD<mailto:user@domain.TLD>). 
This allows the user to construct and deconstruct email addresses correctly without confusing or mixing up their parts. For example, the following addresses:

care.sa@car.com
car.com@care.sa

will be straightforwardly interpreted as follows (no confusion whatsoever):

<image002.jpg>

Now let us repeat the examples using an Arabic email address:

<image003.jpg>

Here a native Arabic-speaking user would likewise make the following assumptions, regardless of the text direction:

1. The user part is always to the right of the @ sign: اندري
2. The domain name is always to the left of the @ sign: رسيل.السعودية
3. A domain name is arranged in a well-defined label hierarchy where an Arabic TLD is always the leftmost label of the domain name: .السعودية

Therefore, the given Arabic email address:

<image004.jpg>

should be used without changing its direction or swapping its parts, so that it maintains its form and any confusion or misinterpretation is avoided. Consider the following examples, where the previous email address is used in different text writing directions:

<image005.jpg>

As you can see, regardless of the text direction (LTR or RTL), the email address maintains its form. This allows the user to construct and deconstruct email addresses correctly without confusing or mixing up their parts. For example, the following addresses:

<image006.jpg>

will be straightforwardly interpreted as follows (no confusion whatsoever):

<image007.jpg>

However, if your suggestion is followed, then the above email addresses will be written as follows, depending on the text direction:

<image008.jpg>

and frankly this is absolutely confusing.

As Arabic speakers, we have been dealing with LTR and RTL together for a long time (since long before computers were invented), because the Arabic alphabet is RTL while Arabic numbers are LTR.
So if I want to write the following sentence in Arabic, "My salary is 321 Pound", I will write it like this:

<image009.jpg>

Or

<image010.jpg>

And not like this:

<image011.jpg>

Nor

<image012.jpg>

since to any Arabic user the last two images mean "My salary is 123 Pound"!

Later, when computers were introduced in our region (1980s), we used to write English names within Arabic text without changing their direction. In other words, if I want to write the following sentence in Arabic (without Arabizing the English names), "We welcome the interest of Mr. André Schappo in the Arabic language", then I will write it like this:

<image013.jpg>

not like this:

<image014.jpg>

And definitely not like this:

<image015.jpg>

Moreover, when the internet was introduced (1990s), we wrote domain names and email addresses in a similar manner, as I explained to you in my previous email.

I hope that I have clarified the view of an Arabic speaker regarding your thoughts on how to handle RTL in an LTR context and vice versa.
BTW, the following is a sample of the many examples from well-known newspapers in the Arab world:

http://www.alriyadh.com/975687
<image016.jpg>

http://www.albayan.ae/economy/last-deal/2011-12-09-1.1551679
<image017.jpg>

http://aitmag.ahram.org.eg/News/38238.aspx
<image018.jpg>

With best regards,
Raed

From: UA-discuss [mailto:ua-discuss-bounces@icann.org] On Behalf Of Andre Schappo
Sent: Tuesday, May 08, 2018 6:49 PM
To: ua-discuss@icann.org
Subject: Re: [UA-discuss] Mixing between RTL and LTR scripts

Time to revive a blog article which I wrote in March 2016😀

My blog article is about the presentation of Arabic email addresses ➜ schappo.blogspot.co.uk/2016/03/arabic-email-addresses.html <http://schappo.blogspot.co.uk/2016/03/arabic-email-addresses.html>

Using this presentation method would make the components of an email address or domain name clearer even when mixing LTR and RTL.

André Schappo

On 5 May 2018, at 13:03, Abdalmonem Tharwat Galila <agalila@mcit.gov.eg> wrote:

So how could any application process the domain name!! Will it be RTL or LTR!! For example: Abdo.عبدو.Ahmed

Where is the 1st label!!! Is it Abdo or Ahmed!!! Consider if the domain name starts with RTL text!! Or has RTL in the middle!!! Or at the end!!!

Sent from my iPhone

On May 5, 2018, at 11:50 AM, Andrew Sullivan <ajs@anvilwalrusden.com> wrote:

Hi,

In the same label, it's mostly a bad idea (there's some discussion of this in the bidi document). But my point was about domain names, not individual labels.

A

--
Please excuse my clumbsy thums

On May 5, 2018 04:29:05 Abdalmonem Tharwat Galila <agalila@mcit.gov.eg> wrote:

Hi Andrew,

Thanks for your reply below. I spent a lot of time trying out examples that mix RTL and LTR within the same label, and what I got were strange and unclear labels, for example:
عبدالمنعم-Abdo@سجل.مصر
Abdoعبدالمنعم.مصر
… etc.

There are many issues you cannot imagine. Another thing: using the dot in an RTL context or in an LTR context will give you different labels, although they must be the same. The aim is to stay away from the display issues we get if we mix RTL and LTR code points in the same label.

Take a look here: <https://tools.ietf.org/html/rfc5564>.

-----Original Message-----
From: UA-discuss [mailto:ua-discuss-bounces@icann.org] On Behalf Of Andrew Sullivan
Sent: Friday, May 04, 2018 2:57 PM
To: John Levine <john.levine@standcore.com>; Abdalmonem Tharwat Galila <agalila@mcit.gov.eg>
Cc: Ahmed Bakhat Masood <ahmedbakhat@pta.gov.pk>; ua-discuss@icann.org; Ahmed Bakhat <ahmedbakhat@yahoo.com>
Subject: Re: [UA-discuss] Mixing between RTL and LTR scripts

Also I don't know how you disallow script mixing for domain names. IDNA is label by label. The DNS is distributed, so there's no way to prevent mixing, is there?

A

--
Please excuse my clumbsy thums

On May 4, 2018 05:36:02 "John Levine" <john.levine@standcore.com> wrote:
I hope you are all doing well. After going back to the TF-AIDN "Task Force Of Arabic IDNs", I got the following regarding mixing LTR and RTL text within the same label:
- Mixing different scripts is not allowed for domain names and email addresses.
- Numbers in the middle or at the end of an RTL domain name are allowed.
This is to stay away from the display issues we get if we mix RTL and LTR code points in the same label.
Thanks. I think this clarifies the point that we have no advice on displaying e-mail addresses, since mailboxes are not domain names and are not labels and are not subject to IDNA2008.
Regards, John Levine, john.levine@standcore.com<mailto:john.levine@standcore.com> Standcore LLC
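The question raised above ("Where is the 1st label?") is about display order; in the stored string the labels keep a fixed logical order, and each label can be examined on its own, in line with the remark that IDNA is label by label. What follows is only a minimal sketch, assuming Python's standard unicodedata module; the helper names label_direction and describe are hypothetical and do not come from any message in this thread. The Arabic label is written with escapes so that the stored order cannot be disturbed by how an editor renders it.

```python
import unicodedata

RTL_CLASSES = {"R", "AL"}  # strong right-to-left Bidi classes
LTR_CLASSES = {"L"}        # strong left-to-right Bidi class


def label_direction(label: str) -> str:
    """Classify a label as 'RTL', 'LTR', 'mixed' or 'neutral'.

    Only strong Bidi classes are considered, so digits, hyphens and
    other weak or neutral characters do not affect the result.
    """
    classes = {unicodedata.bidirectional(ch) for ch in label}
    has_rtl = bool(classes & RTL_CLASSES)
    has_ltr = bool(classes & LTR_CLASSES)
    if has_rtl and has_ltr:
        return "mixed"
    if has_rtl:
        return "RTL"
    if has_ltr:
        return "LTR"
    return "neutral"


def describe(domain: str):
    """Split on '.' in logical (stored) order; index 0 is the first label."""
    return [(i, label, label_direction(label))
            for i, label in enumerate(domain.split("."))]


# The example from the thread: Abdo.<Arabic label>.Ahmed
example = "Abdo.\u0639\u0628\u062f\u0648.Ahmed"
for index, label, direction in describe(example):
    print(index, repr(label), direction)
```

In this sketch the first stored label is always Abdo and the last is Ahmed, whatever order a bidi-aware display chooses to draw them in. That is the processing side of the question; it does not settle how the name should be shown to users.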
Thanks Raed. Just a further comment.

There are situations in which an email address could have mixed scripts. For instance, some ASCII TLDs allow IDN at the second level. This can bring email addresses like this one (sorry, but I do not have the skills for making a graphic representation):

<Arabic script user>@<Arabic script SLD>.<Latin script TLD>

or maybe even:

<Latin script user>@<Arabic script SLD>.<Latin script TLD>

Can you please describe what would happen in this case?

Thanks,
Roberto
🌏 🌍 🌎 André Schappo 小山@电邮.在线?Subject=你好小山😜<mailto:%E5%B0%8F%E5%B1%B1@%E7%94%B5%E9%82%AE.%E5%9C%A8%E7%BA%BF?Subject=%E4%BD%A0%E5%A5%BD%E5%B0%8F%E5%B1%B1%F0%9F%98%9C> schappo.blogspot.co.uk<https://schappo.blogspot.co.uk/> twitter.com/andreschappo<https://twitter.com/andreschappo> weibo.com/andreschappo?is_all=1<https://weibo.com/andreschappo?is_all=1> groups.google.com/forum/#!forum/computer-science-curriculum-internationalization<https://groups.google.com/forum/#!forum/computer-science-curriculum-internat...>
Dear Roberto & All,

We believe that mixing RTL with LTR labels/code points in the domain and/or email (mailbox) will be confusing, illogical, unacceptable and not easy to use for the Arabic user communities. It is also unsafe, since it will confuse users and may become a playground for domain/email phishing. The one exception is digits (LTR) in an Arabic label, provided the digits are in the middle or at the end of the RTL label.

We reached this conclusion after many studies of user expectations and understanding of labels that combine RTL and LTR in domain names and email addresses. See the results below.

Here is a summary and conclusion of the findings of our studies on mixing RTL and LTR in domain names and email addresses (mailbox):

1. Mixing RTL and LTR within a label of a domain name or across all the labels
   * The entire label(s) (as part of a domain name or across the whole domain) should be formed from a single script and a single direction (RTL or LTR), with the exception of digits (LTR), which can be in the middle or at the end of that label; i.e., no mixture of Arabic (RTL) and ASCII (LTR) code points within a domain name label or across all the domain labels. Thus, the following examples are not accepted:
     * givennameEMANRUS (Raedالفايز) = givenname+surname
     * EMANRUSgivenname (الفايزRaed) = surname+givenname
     * EMANRUS.givenname (Raed.الفايز) = givenname.surname
     * 123EMANRUS (123الفايز) = digits+surname
     * tld.EMANNIAMOD (sa.رسيل) = domainname.tld
     * DLT.domainname (raseel.السعودية) = domainname.tld

2. Mixing RTL and LTR within the user part of an email address (EAI)
   * This is the same as the previous point (mixing in domain labels): no mixing is allowed. Thus, the following examples are not accepted:
     * givenname.EMANRUS (Raed.الفايز) = givenname.surname
     * EMANNEVIG.surname (رائد.alfayez) = givenname.surname

3. Mixing RTL and LTR between domain and mailbox
   * The entire domain name part (i.e. all labels, e.g., domainname.tld) and the entire user part (the mailbox name, e.g. Givenname.Surname@) should be formed from a single script (with the conditional exception of digits, which are LTR); i.e., no mixture of Arabic (RTL) and ASCII (LTR) code points at all.
   * Thus, some of the following examples are clear and understandable to Arabic users while others are not:

     User direction | Domain direction | Display format                   | Real example (image) | Clear to Arabic users?
     LTR            | LTR              | givenname.surname@domainname.tld | [image001.jpg]       | Yes
     RTL            | RTL              | DLT.EMANNIAMOD@EMANRUS.EMANNEVIG | [image002.jpg]       | Yes
     RTL            | LTR              | EMANRUS.EMANNEVIG@domainname.tld | [image003.jpg]       | No
     LTR            | RTL              | DLT.EMANNIAMOD@givenname.surname | [image004.jpg]       | No

     Please note that the last two rows are not easy to deal with or to implement, and from the reader's point of view it is hard to tell the mailbox part from the domain part.
   * It is desirable that this rule be enforced at the protocol level (i.e., IDNA, EAI) or at other levels (e.g., OS, applications, etc.). The rationale is that the mixture will be confusing, illogical, unacceptable and not easy to use for the Arabic user communities.

4. Display issues when having an RTL domain or email in an LTR context (e.g. inserting an Arabic domain/email in an English article, or vice versa):
   * RTL text should remain intact all the time, regardless of the context.
     * the RTL mailbox part should always be: EMANRUS.EMANNEVIG (example: رائد.الفايز)
     * the RTL domain name part should always be: DLT.EMANNIAMOD (example: رسيل.السعودية)
   * LTR text should remain intact all the time, regardless of the context.
     * the LTR mailbox part should always be: givenname.surname (example: raed.alfayez)
     * the LTR domain name part should always be: domainname.tld (example: raseel.sa)

I hope I have provided some insight into Arabic user expectations when RTL and LTR are mixed in domain names and EAI.

Raed
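The rules in Raed's summary (one direction per label, digits only in the middle or at the end of an RTL label, and the same direction for mailbox and domain) can be turned into a rough check. The following is a hedged sketch only: it assumes Python's standard unicodedata module, the function names strong_dirs, check_label and check_address are hypothetical, and it is not a policy adopted by any registry or an implementation of IDNA or EAI. It looks only at direction, not at script membership.

```python
import unicodedata


def strong_dirs(text):
    """Return the set of strong directions ('RTL', 'LTR') present in text."""
    dirs = set()
    for ch in text:
        bidi = unicodedata.bidirectional(ch)
        if bidi in ("R", "AL"):
            dirs.add("RTL")
        elif bidi == "L":
            dirs.add("LTR")
    return dirs


def check_label(label):
    """Problems found in a single label (rule 1 of the summary)."""
    problems = []
    dirs = strong_dirs(label)
    if len(dirs) > 1:
        problems.append(f"{label!r}: RTL and LTR letters in one label")
    elif dirs == {"RTL"} and unicodedata.bidirectional(label[0]) in ("EN", "AN"):
        # Digits are tolerated only in the middle or at the end.
        problems.append(f"{label!r}: digits before the RTL letters")
    return problems


def check_address(address):
    """Problems found in 'user@domain' (or in a bare domain name)."""
    user, _, domain = address.rpartition("@")
    problems = []
    for label in domain.split("."):
        problems += check_label(label)
    user_dirs, domain_dirs = strong_dirs(user), strong_dirs(domain)
    if len(domain_dirs) > 1:
        problems.append("domain: labels do not share one direction")
    if len(user_dirs) > 1:
        problems.append("user part: RTL and LTR letters mixed")
    if user_dirs and domain_dirs and user_dirs != domain_dirs:
        problems.append("user part and domain differ in direction")
    return problems


# Names from the summary, written with escapes to keep the stored order explicit.
RAED = "\u0631\u0627\u0626\u062f"                            # given name
ALFAYEZ = "\u0627\u0644\u0641\u0627\u064a\u0632"             # surname
RASEEL = "\u0631\u0633\u064a\u0644"                          # domain name
SAUDI = "\u0627\u0644\u0633\u0639\u0648\u062f\u064a\u0629"   # Arabic TLD

print(check_address("raed.alfayez@raseel.sa"))
print(check_address(f"{RAED}.{ALFAYEZ}@{RASEEL}.{SAUDI}"))
print(check_address(f"raed.alfayez@{RASEEL}.{SAUDI}"))
print(check_label("123" + ALFAYEZ))
```

An empty list means the address passes these particular checks. The cases Roberto asks about (an Arabic second-level label under a Latin TLD) would be reported by this sketch as a domain whose labels do not share one direction, which matches the position taken in the summary but says nothing about whether such names exist or resolve in practice.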
Raed, et al… Thanks for the well documented discussion. You have identified good practice. Here are my thoughts: 1. Only ICANN can regulate this issue at the top level. 2. ICANN can only regulate this issue at the second level for new gTLDs 3. Individual registries can regulate this issue at the second level (or 3rd of the provide direct registration at those levels) 4. Just because something is confusing doesn’t mean someone won’t do it if there are no restrictions against it. And as we see with emojis, even RFCs don’t preclude activities that are contrary to published RFCs. I’m not sure where this should be documented. In the Unicode Consortium? I don’t think the IETF, but I may be wrong. Does it fit within the W3C? Perhaps the TF-AIDN? Would it be useful if the UASG published a Good Practice guide to BiDi in Domain Names and Email Addresses? (I think, Raed, that you’ve done the work) We could then update our existing documents to reference it. Or, if there’s someone else who’s got a good guide, we could reference that instead of building it afresh. The UASG documents touch on BiDi, but don’t go into any depth. Thoughts, please, from this group. Don From: UA-discuss <ua-discuss-bounces@icann.org> On Behalf Of Raed AlFayez Sent: Thursday, 10 May 2018 8:47 PM To: Roberto Gaetano <roberto_gaetano@hotmail.com> Cc: Universal Acceptance <ua-discuss@icann.org> Subject: Re: [UA-discuss] Mixing between RTL and LTR scripts Dear Robert & All, We believe mixing RTL with LTR labels/code-points in the domain and/or email (mailbox) will be confusable, not logical, not acceptable and not easeful to the Arabic user communities. Also it is not safe since it will confuse users and may be a playground for domain/email phishing. With an exceptional for digits (LTR) in Arabic label if the digits are in the middle or at the end of and RTL label. 
We reached this conclusion after many studies of user expectations and of how users understand a label that combines RTL and LTR in domain names and email addresses. See the results below.

Here is a summary of the findings and conclusions of our studies on mixing RTL and LTR in domain names and email addresses (mailboxes):

1. Mixing RTL and LTR within a label of a domain name or across all the labels

   * The entire label(s) (as part of a domain name or across the whole domain) should be formed from a single script and a single direction (RTL or LTR), with the exception of digits (LTR), which can be in the middle or at the end of that label. That is, no mixture of Arabic (RTL) and ASCII (LTR) code points within a domain name label or across all the domain labels. Thus, the following examples are not accepted:
     * givennameEMANRUS (Raedالفايز) = givenname+surname
     * EMANRUSgivenname (الفايزRaed) = surname+givenname
     * EMANRUS.givenname (Raed.الفايز) = givenname.surname
     * 123EMANRUS (123الفايز) = digits+surname
     * tld.EMANNIAMOD (sa.رسيل) = domainname.tld
     * DLT.domainname (raseel.السعودية) = domainname.tld

2. Mixing RTL and LTR within the user part of an email address (EAI)

   * The same as the previous point (mixing in domain labels): no mixing is allowed. Thus, the following examples are not accepted:
     * givenname.EMANRUS (Raed.الفايز) = givenname.surname
     * EMANNEVIG.surname (رائد.alfayez) = givenname.surname

3. Mixing RTL and LTR between domain and mailbox

   * The entire domain name part (i.e., all labels, e.g., domainname.tld) and the entire user part (the mailbox name, e.g., Givenname.Surname@) should be formed from a single script (with the conditional exception of digits, which are LTR). That is, no mixture of Arabic (RTL) and ASCII (LTR) code points at all.
   * Thus, some of the following examples are clear and understandable to Arabic users while others are not:

     User Direction | Domain Direction | Display format                   | Real example (image)                 | Clear to Arabic users?
     ---------------+------------------+----------------------------------+--------------------------------------+-----------------------
     LTR            | LTR              | givenname.surname@domainname.tld | [cid:image001.jpg@01D2BA3A.8746F690] | Yes
     RTL            | RTL              | DLT.EMANNIAMOD@EMANRUS.EMANNEVIG | [cid:image002.jpg@01D2BA3A.8746F690] | Yes
     RTL            | LTR              | EMANRUS.EMANNEVIG@domainname.tld | [cid:image003.jpg@01D2BA3A.8746F690] | No
     LTR            | RTL              | DLT.EMANNIAMOD@givenname.surname | [cid:image004.jpg@01D2BA3A.8746F690] | No

     Please note that the last two rows are hard to handle and to implement, and they make it hard for a reader to tell the mailbox part from the domain part.
   * It is desirable that this rule be enforced at the protocol level (i.e., IDNA, EAI) or at any other level (e.g., OS, applications, etc.). The rationale behind this rule is that the mixture will be confusing, illogical, unacceptable, and hard to use for the Arabic user communities.

4. Display issues when an RTL domain or email address appears in an LTR context (e.g., inserting an Arabic domain/email in an English article, or vice versa)

   * RTL text should remain intact all the time, regardless of the context.
     * The RTL mailbox part should always be: EMANRUS.EMANNEVIG (example: رائد.الفايز)
     * The RTL domain name part should always be: DLT.EMANNIAMOD (example: رسيل.السعودية)
   * LTR text should remain intact all the time, regardless of the context.
     * The LTR mailbox part should always be: givenname.surname (example: raed.alfayez)
     * The LTR domain name part should always be: domainname.tld (example: raseel.sa)

I hope I have provided some insight into the expectations of Arabic users when RTL and LTR are mixed in domain names and EAI.

Raed

From: Roberto Gaetano [mailto:roberto_gaetano@hotmail.com]
Sent: Wednesday, May 09, 2018 9:06 PM
To: Raed AlFayez
Cc: Andre Schappo; Universal Acceptance
Subject: Re: [UA-discuss] Mixing between RTL and LTR scripts

Thanks Raed. Just a further comment.
There are situations in which an email address could have mixed scripts. For instance, some ASCII TLDs allow IDNs at the second level. This can produce email addresses like this one (sorry, but I do not have the skills to make a graphic representation):

<Arabic script user>@<Arabic script SLD>.<Latin script TLD>

or maybe even:

<Latin script user>@<Arabic script SLD>.<Latin script TLD>

Can you please describe what would happen in this case?

Thanks,
Roberto

On 09.05.2018, at 10:55, Raed AlFayez <rfayez@citc.gov.sa> wrote:

Dear Andre & All,

Please allow me to resend my comments on the blog article that Andre shared in his last email:

Dear Andre,

With great interest and appreciation, I have read your posts on the UA-discuss mailing list as well as your recent blog article (http://schappo.blogspot.co.uk/2016/03/arabic-email-addresses.html) on how to handle Arabic email addresses in RTL and LTR contexts. Your suggestion sounds good at first glance. However, it is confusing and puzzling when you look at it from the point of view of an ordinary Arabic-speaking user. Hence, I have the following comments, which I would like to share with you first before posting them to the mailing list.

I will use examples to illustrate my point of view. They are presented from a native Arabic speaker's point of view. (Please note that I will use pictures for the text so that it is not ruined in transit by the email system.)

Consider my email address: rfayez@citc.gov.sa

As a (normal) user I can easily make the following (correct) assumptions regardless of the text direction:

1. The user part is always to the left of the sign (@): rfayez
2. The domain name is always to the right of the sign (@): citc.gov.sa
3. A domain name is arranged in a well-defined label hierarchy where the TLD is always the rightmost label of the domain name: .sa

So I will use my email address as is (rfayez@citc.gov.sa), without changing its direction or swapping its parts. Consider the following examples, where my email address is used in different text writing directions:

<image001.jpg>

As you can see, regardless of the text direction (LTR or RTL), the email address maintains its form (i.e., user@domain.TLD). This allows the user to easily construct and deconstruct email addresses correctly without confusing or mixing up their parts. For example, the following set of examples:

care.sa@car.com
car.com@care.sa

will be straightforwardly interpreted as follows (no confusion whatsoever):

<image002.jpg>

Now let us repeat the examples using an Arabic email address:

<image003.jpg>

Here a native Arabic-speaking user would likewise make the following assumptions regardless of the text direction:

1. The user part is always to the right of the sign (@): اندري
2. The domain name is always to the left of the sign (@): رسيل.السعودية
3. A domain name is arranged in a well-defined label hierarchy where an Arabic TLD is always the leftmost label of the domain name: .السعودية

Therefore, the given Arabic email address:

<image004.jpg>

should be used without changing its direction or swapping its parts, so that it maintains its form and any confusion or misinterpretation is avoided. Consider the following examples, where the previous email address is used in different text writing directions:

<image005.jpg>

As you can see, regardless of the text direction (LTR or RTL), the email address maintains its form. This allows the user to easily construct and deconstruct email addresses correctly without confusing or mixing up their parts.
For example, the following set of examples:

<image006.jpg>

will be straightforwardly interpreted as follows (no confusion whatsoever):

<image007.jpg>

However, if your suggestion is followed, then the above email addresses will be written as follows, depending on the text direction:

<image008.jpg>

and frankly this is absolutely confusing.

As Arabic speakers, we were dealing with LTR and RTL together long ago (long before computers were invented), because our Arabic alphabet is RTL while Arabic numbers are LTR. So if I want to write the following sentence in Arabic, "My salary is 321 Pound", I will write it like this:

<image009.jpg>

Or

<image010.jpg>

And not like this:

<image011.jpg>

Nor

<image012.jpg>

since to any Arabic user the last two images mean "My salary is 123 Pound"!

Later, when computers were introduced in our region (1980s), we used to write English names within Arabic text without changing their direction. In other words, if I want to write the following sentence in Arabic (without Arabizing the English names), "We welcome the interest of Mr. André Schappo in the Arabic language", then I will write it like this:

<image013.jpg>

not like this:

<image014.jpg>

And definitely not like this:

<image015.jpg>

Moreover, when the internet was introduced (1990s), we wrote domain names and email addresses in a similar manner, as I explained to you in my previous email.

I hope that I have clarified the view of an Arabic speaker regarding your thoughts on how to handle RTL in an LTR context and vice versa.
BTW, the following is a small sample of the tons of examples from well-known newspapers in the Arab world:

http://www.alriyadh.com/975687
<image016.jpg>

http://www.albayan.ae/economy/last-deal/2011-12-09-1.1551679
<image017.jpg>

http://aitmag.ahram.org.eg/News/38238.aspx
<image018.jpg>

With best regards,
Raed

From: UA-discuss [mailto:ua-discuss-bounces@icann.org] On Behalf Of Andre Schappo
Sent: Tuesday, May 08, 2018 6:49 PM
To: ua-discuss@icann.org
Subject: Re: [UA-discuss] Mixing between RTL and LTR scripts

Time to revive a blog article which I wrote in March 2016😀

My blog article is about presentation of Arabic Email addresses ➜ schappo.blogspot.co.uk/2016/03/arabic-email-addresses.html

Using this presentation method would make the components of an email address or domain name clearer even when mixing LTR and RTL.

André Schappo

On 5 May 2018, at 13:03, Abdalmonem Tharwat Galila <agalila@mcit.gov.eg> wrote:

So how could any application process the domain name? Will it be RTL or LTR?

Example: Abdo.عبدو.Ahmed

Where is the first label? Is it Abdo or Ahmed? Consider the cases where the domain name starts with RTL text, or has RTL in the middle, or at the end!

Sent from my iPhone

On May 5, 2018, at 11:50 AM, Andrew Sullivan <ajs@anvilwalrusden.com> wrote:

Hi,

In the same label, it's mostly a bad idea (there's some discussion of this in the bidi document). But my point was about domain names, not individual labels.

A

--
Please excuse my clumbsy thums

________________________________

On May 5, 2018 04:29:05 Abdalmonem Tharwat Galila <agalila@mcit.gov.eg> wrote:

Hi Andrew,

Thanks for your reply below. I spent a lot of time trying out examples that mix RTL and LTR within the same label, and what I got were strange and unclear labels, for example:

عبدالمنعم-Abdo@سجل.مصر
Abdoعبدالمنعم.مصر
… etc.

There are more issues than you can imagine. Another thing: using a dot in an RTL context or in an LTR context gives you different labels, although they must be the same.

The aim is to stay away from the display issues we get if we mix RTL and LTR code points in the same label. Take a look here: https://tools.ietf.org/html/rfc5564

-----Original Message-----
From: UA-discuss [mailto:ua-discuss-bounces@icann.org] On Behalf Of Andrew Sullivan
Sent: Friday, May 04, 2018 2:57 PM
To: John Levine <john.levine@standcore.com>; Abdalmonem Tharwat Galila <agalila@mcit.gov.eg>
Cc: Ahmed Bakhat Masood (ahmedbakhat@pta.gov.pk) <ahmedbakhat@pta.gov.pk>; ua-discuss@icann.org; Ahmed Bakhat (ahmedbakhat@yahoo.com) <ahmedbakhat@yahoo.com>
Subject: Re: [UA-discuss] Mixing between RTL and LTR scripts

Also I don't know how you disallow script mixing for domain names. IDNA is label by label. The DNS is distributed, so there's no way to prevent mixing, is there?

A

--
Please excuse my clumbsy thums

----------

On May 4, 2018 05:36:02 "John Levine" <john.levine@standcore.com> wrote:
> I hope you are all doing well. After going back to the TF-AIDN ("Task Force Of Arabic IDNs"), I got the following regarding mixing LTR and RTL text within the same label:
>
> - Mixing between different scripts is not allowed for domain names and email addresses.
> - Numbers in the middle or at the end of an RTL domain name are allowed.
>
> The aim is to stay away from the display issues we get if we mix RTL and LTR code points in the same label.

Thanks. I think this clarifies the point that we have no advice on displaying e-mail addresses, since mailboxes are not domain names and are not labels and are not subject to IDNA2008.

Regards,
John Levine, john.levine@standcore.com
Standcore LLC
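
As an illustration of the rules Raed summarizes earlier in this thread (one direction per label, digits allowed only in the middle or at the end of an RTL label, and mailbox and domain in the same direction), here is a minimal Python sketch. It is not an IDNA or EAI implementation and is not taken from any message in the thread. The function names `label_direction`, `label_ok`, and `address_ok` are hypothetical, and treating digit-only parts as neutral is an assumption of this sketch. It relies only on the standard `unicodedata.bidirectional` lookup.

```python
import unicodedata

RTL_CLASSES = {"R", "AL"}     # strong right-to-left bidi classes (e.g. Hebrew, Arabic letters)
DIGIT_CLASSES = {"EN", "AN"}  # European and Arabic-Indic digits


def label_direction(label):
    """Classify one label (or one dot-separated mailbox part) by its strong characters."""
    classes = {unicodedata.bidirectional(ch) for ch in label}
    has_rtl = bool(classes & RTL_CLASSES)
    has_ltr = "L" in classes
    if has_rtl and has_ltr:
        return "MIXED"
    if has_rtl:
        return "RTL"
    if has_ltr:
        return "LTR"
    return "NEUTRAL"  # digits or punctuation only (an assumption of this sketch)


def label_ok(label):
    """Rule 1: a single direction per label; digits must not lead an RTL label."""
    if not label:
        return False
    direction = label_direction(label)
    if direction == "MIXED":
        return False
    if direction == "RTL" and unicodedata.bidirectional(label[0]) in DIGIT_CLASSES:
        return False
    return True


def address_ok(mailbox, domain):
    """Rules 1 to 3: every part passes label_ok and all parts share one direction."""
    parts = mailbox.split(".") + domain.split(".")
    if not all(label_ok(part) for part in parts):
        return False
    directions = {label_direction(part) for part in parts} - {"NEUTRAL"}
    return len(directions) <= 1


# Example strings from the thread, built from separate pieces so that the
# logical (stored) character order is unambiguous in the source code.
RAED = "رائد"
ALFAYEZ = "الفايز"
RASEEL = "رسيل"
ALSAUDIA = "السعودية"

if __name__ == "__main__":
    print(label_ok("Raed" + ALFAYEZ))   # Latin letters followed by Arabic letters
    print(label_ok("123" + ALFAYEZ))    # digits leading an Arabic label
    print(label_ok(ALFAYEZ + "123"))    # digits trailing an Arabic label
    print(address_ok(RAED + "." + ALFAYEZ, RASEEL + "." + ALSAUDIA))
    print(address_ok(RAED + "." + ALFAYEZ, "raseel.sa"))
```

A check like this could only ever be advisory (for example in a registry policy or a user interface), since, as noted elsewhere in the thread, mailboxes are not subject to IDNA2008 and the DNS cannot enforce a rule across a whole name.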
I think it should be a good idea:

========================================
Would it be useful if the UASG published a Good Practice guide to BiDi in Domain Names and Email Addresses? (I think, Raed, that you’ve done the work) We could then update our existing documents to reference it. Or, if there’s someone else who’s got a good guide, we could reference that instead of building it afresh.

The UASG documents touch on BiDi, but don’t go into any depth.
========================================

And perhaps updating the existing UASG docs where appropriate.

Edmon
On Thu, May 10, 2018 at 10:44:18AM +0000, Don Hollander wrote:
> I’m not sure where this should be documented. In the Unicode Consortium? I
> don’t think the IETF, but I may be wrong. Does it fit within the W3C?

It doesn't fit in any of these. The IDNA RFCs already say that you need to be careful about this, but they note, quite correctly, that there is literally no way to make a rule about entire domain names (as we have noted in this thread too). It would be possible to publish an RFC, if people really wanted to, restating that advice more strongly. But it's still at best going to be advice. And unfortunately, some domain names (ones that happen probably not to be aimed at humans) are _guaranteed_ to have bidi problems across the whole name in U-label form, because SRV and suchlike labels are going to have ASCII components because of the identifiers of protocols (like _tcp).

> Would it be useful if the UASG published a Good Practice guide to BiDi in
> Domain Names and Email Addresses? (I think, Raed, that you’ve done the work)
> We could then update our existing documents to reference it. Or, if there’s
> someone else who’s got a good guide, we could reference that instead of
> building it afresh.

I think that would be a fine thing to do, but I think that you probably need someone with serious depth in the DNS and email protocols to help with it in order to make sure it doesn't run afoul of the details of the protocols.

Best regards,

A

--
Andrew Sullivan
ajs@anvilwalrusden.com
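
To make Andrew's point about whole names concrete, here is a small Python sketch, offered as an illustration rather than as anything the IDNA RFCs specify. It checks each label on its own and then looks at the name as a whole. The `_tcp` label comes from the message above. The `_sip` service label, the use of the Arabic name from Raed's examples as the target domain, and the function names are assumptions made for this example.

```python
import unicodedata


def strong_directions(label):
    """Collect the strong bidi directions that occur in one label."""
    found = set()
    for ch in label:
        bidi = unicodedata.bidirectional(ch)
        if bidi == "L":
            found.add("LTR")
        elif bidi in ("R", "AL"):
            found.add("RTL")
    return found


def mixes_only_across_labels(name):
    """True if every label is single-direction but the whole name still mixes directions."""
    per_label = [strong_directions(label) for label in name.split(".")]
    every_label_single = all(len(dirs) <= 1 for dirs in per_label)
    whole_name = set().union(*per_label)
    return every_label_single and len(whole_name) > 1


# Built from pieces so the logical order of the labels is unambiguous.
RASEEL = "رسيل"
ALSAUDIA = "السعودية"
SRV_NAME = "_sip._tcp." + RASEEL + "." + ALSAUDIA  # hypothetical SRV owner name

if __name__ == "__main__":
    print(mixes_only_across_labels(SRV_NAME))
    print(mixes_only_across_labels("raseel.sa"))
```

Each label of the hypothetical SRV name passes a per-label check, yet the name as a whole contains both directions, which is the situation a label-by-label protocol cannot rule out.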
I agree that there is a place for the UASG to put forth a Good Practice guide that is intended to be read by implementers of limited (or greater) experience. I suggest that an Informational RFC is a good idea that covers the depth of content that Andrew alluded to. Having the two complementary documents will keep us from creating a single document that fails to adequately address the variety of audiences that we target. --Rich Richard Merdinger VP, Domains rmerdinger@godaddy.com
Thanks Rich, Andrew. I quite like the idea of an 'informational RFC'. I think having it in the IETF's documentation space will provide greater access to those who could find it useful. Andrew, is this possible? If so, how do we go about getting such underway? And/or, should we also publish something ourselves, on an interim basis? Or perhaps the TF-AIDN could? Raed: Would you be willing to drive this? Don
On 5/10/18 12:19 PM, Don Hollander wrote:
> Thanks Rich, Andrew.
> I quite like the idea of an 'informational RFC'. I think having it in the IETF's documentation space will provide greater access to those who could find it useful.
> Andrew, is this possible?
If only you knew. :-)
> If so, how do we go about getting such underway?
Andrew and I have a lot of experience in this area and could discuss next steps. Peter
On Thu, May 10, 2018 at 06:19:42PM +0000, Don Hollander wrote:
> Andrew, is this possible? If so, how do we go about getting such underway?
Write an Internet-Draft and upload it to the Internet-Drafts repository and get some review. My bet is that this will either have to be AD sponsored or Independent Submission, but maybe it can go through one of the area working groups. The first trick is to get the I-D written, though. There are a lot of tools for this, none of them very friendly. Many people like https://github.com/miekg/pandoc2rfc/ A -- Andrew Sullivan ajs@anvilwalrusden.com
My experience of RFCs is that they are ASCII, which makes it rather difficult to give Arabic examples. The last thing I encountered is https://tools.ietf.org/pdf/rfc7997.pdf, which proposes the use of Unicode UTF-8 instead of ASCII. What is the current state of play for using Unicode UTF-8 in RFCs? André Schappo
We'd be an excellent pilot case. -- Please excuse my clumbsy thums
I have already published some RFCs containing non-ASCII characters (but not yet RTL text). See for instance RFC 8265. Peter Sent from mobile, might be terse
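On the question of giving Arabic examples in a document that may still be read in ASCII-only form, one common fallback is to spell each character out in U+XXXX notation alongside its Unicode name. The following is a minimal Python sketch of that idea, not something proposed in the thread. The helper names are hypothetical, and the example label is written as escapes to avoid display reordering.

```python
import unicodedata


def code_points(text: str) -> str:
    """Render text as space-separated U+XXXX code points."""
    return " ".join(f"U+{ord(ch):04X}" for ch in text)


def describe(text: str) -> list[str]:
    """Pair each code point with its Unicode character name."""
    return [
        f"U+{ord(ch):04X} {unicodedata.name(ch, '<unnamed>')}"
        for ch in text
    ]


if __name__ == "__main__":
    label = "\u0645\u0635\u0631"  # a three-letter Arabic label
    print(code_points(label))
    for line in describe(label):
        print(line)
```

Writing the code points in logical (typing) order sidesteps the display-order ambiguity that the rest of this thread is about, at the cost of readability for people who actually read the script.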
GoDaddy might be able to apply some resource to help here. Don, Do you want me to engage with our primary IETF rep on this? He shepherds our RFC work and should be able to help guide this. --Rich Richard Merdinger VP, Domains rmerdinger@godaddy.com
Hi Don, I like your idea. I can help with reviewing, and maybe with drafting parts of the document. Raed
participants (13)
- Abdalmonem Tharwat Galila
- Andre Schappo
- Andrew Sullivan
- Don Hollander
- Edmon
- John Levine
- Paul Stahura
- Peter Saint-Andre
- Raed AlFayez
- Richard Merdinger
- Roberto Gaetano
- Vittorio Bertola
- Zied BOUZIRI