RTL Script Domain Names & Email Addresses
I have given some thought to how to handle/display domain names that are in a right to left script such as Arabic. After some thought and experimentation I think that each label of a domain name should be bidi isolated.

Let's take a 2 level domain name: 2ndLevelD.TLD where each label is written in a Right to Left Script.

In a Left to Right paragraph/document I want the ordering to be 2ndLevelD.TLD
In a Right to Left paragraph/document I want the ordering to be TLD.2ndLevelD

Using bidi isolation this can be achieved. In the case of HTML5 one can use the bidi isolate element <bdi>. So, I now write my domain name in HTML5 as

<bdi>2ndLevelD</bdi>.<bdi>TLD</bdi>

I then experimented with the HTML dir attribute, flipping the whole document between dir="ltr" and dir="rtl", and it gave me the results I wanted as above.

I need to think a bit more about the possible permutations of the local part of an email address, but if it is a single word then the following works, e.g.

<bdi>email</bdi>@<bdi>2ndLevelD</bdi>.<bdi>TLD</bdi>

I believe that using bidi isolates leads to easier to comprehend domain names/email addresses for both Left to Right readers and Right to Left readers. So, for example, a Left to Right reader would see a domain name 2ndLevelD.TLD and can easily comprehend that the ordering is from minor to major, from Left to Right. The text within each label, though, is Right to Left.

André Schappo
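The markup described above could be generated rather than typed by hand. The following is only a minimal sketch of that idea, not anything from the original message: the function name `isolate_address_html` is hypothetical, and it assumes the local part is a single unit (the single-word case mentioned above) and that labels are separated by an ASCII full stop.

```python
from html import escape


def isolate_address_html(address: str) -> str:
    """Wrap the local part (if any) and each domain label in its own <bdi>."""
    # Split at the last "@"; for a bare domain name, local and at are empty.
    local, at, domain = address.rpartition("@")
    # Each label gets its own isolate; the "." separators stay outside.
    isolated = ".".join(f"<bdi>{escape(label)}</bdi>" for label in domain.split("."))
    if at:
        # Assumption: the local part is treated as one unit, not split further.
        isolated = f"<bdi>{escape(local)}</bdi>@{isolated}"
    return isolated


print(isolate_address_html("2ndLevelD.TLD"))
# <bdi>2ndLevelD</bdi>.<bdi>TLD</bdi>
print(isolate_address_html("email@2ndLevelD.TLD"))
# <bdi>email</bdi>@<bdi>2ndLevelD</bdi>.<bdi>TLD</bdi>
```

Keeping the "." and "@" separators outside the isolates means they take their direction from the surrounding paragraph, which is what lets the label order flip between dir="ltr" and dir="rtl".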
On Tue, Mar 08, 2016 at 06:02:12PM +0000, Andre Schappo wrote:
I have given some thought to how to handle/display domain names that are in a right to left script such as Arabic. After some thought and experimentation I think that each label of a domain name should be bidi isolated.
That's already the requirement for processing, according to IDNA2008 (that is, IDNA is label by label, not over the domain name). So so far, I agree. I don't have an intuition as to what would be clear to a user; perhaps you have one. But …
Let's take a 2 level domain name: 2ndLevelD.TLD where each label is written in a Right to Left Script.
… I'm slightly worried about this, because I suspect you're also going to need something that is unsurprising where the different labels are in different directions.
I believe that using bidi isolates leads to easier to comprehend domain names/email addresses for both Left to Right readers and Right to Left readers.
This is also probably true, but since an awful lot of cases don't include the markup to indicate the bidi isolates, there's still that problem too. This includes (just for instance) email, particularly the headers. So while the mark up helps I'm wondering what to do about other cases.
Best regards,
A
-- Andrew Sullivan ajs@anvilwalrusden.com
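The concern about labels running in different directions can be made concrete by checking each label separately. The sketch below is an illustration only and was not part of the thread: the function names are hypothetical, and it assumes a simplified "first strong character" rule based on Python's `unicodedata.bidirectional`, which is not the full Unicode Bidirectional Algorithm.

```python
import unicodedata


def label_direction(label: str) -> str:
    """Return "rtl" or "ltr" from the first strong character, else "none"."""
    for ch in label:
        bidi_class = unicodedata.bidirectional(ch)
        if bidi_class in ("R", "AL"):  # strong right-to-left classes
            return "rtl"
        if bidi_class == "L":  # strong left-to-right class
            return "ltr"
    return "none"  # no strong character, e.g. a label made only of digits


def domain_directions(domain: str) -> list[tuple[str, str]]:
    """Pair every label of a domain name with its detected direction."""
    return [(label, label_direction(label)) for label in domain.split(".")]


# A hypothetical mixed-direction name: an Arabic label under a Latin label.
for label, direction in domain_directions("رسيل.example"):
    print(direction, label)
```

A display routine could use a check like this to notice when a name mixes directions and needs special care, although what is least surprising to a reader in that case remains the open question raised above.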
Without bidi isolates or any other form of intervention the email address مارك@رسيل.السعودية wholly displays L⬅︎R

I have just created a temporary page to show how the Arabic address مارك@رسيل.السعودية appears in L➡︎R Context & L⬅︎R Context using bidi isolates. https://co-project.lboro.ac.uk/users/coas/arabic-email.html

With reference to my page: The L➡︎R Context gives an email address in a structure more familiar to a L➡︎R reader. The structure being, in this case, local-part@label.label. It does not matter which direction a label text is. Each label could have a different direction and again it would not matter. A L➡︎R reader would readily recognise this as an email address. Personally I find it much easier to recognise an email address when written like this. Similarly for the L⬅︎R Context.

I have used HTML to illustrate. In a word processing document one would use the Unicode isolates U+2066 to U+2069. It would be the responsibility of the software to automatically implement/insert/apply the bidi isolation for display of the email address to users.

There are indeed other cases but I think the same principles outlined above can be applied.

André Schappo
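For plain text, where no markup is available, the same isolation can be sketched with the Unicode isolate controls mentioned above. This is a hedged illustration, not the method used on the linked page: the function names are hypothetical, and the choice of U+2068 FIRST STRONG ISOLATE with U+2069 POP DIRECTIONAL ISOLATE is an assumption made here because it mirrors the default behaviour of <bdi>. Removing the controls again before the address is actually used is likewise an assumption of this sketch.

```python
FSI = "\u2068"  # FIRST STRONG ISOLATE
PDI = "\u2069"  # POP DIRECTIONAL ISOLATE


def isolate_address_text(address: str) -> str:
    """Wrap the local part (if any) and each domain label in FSI ... PDI."""
    local, at, domain = address.rpartition("@")
    isolated = ".".join(f"{FSI}{label}{PDI}" for label in domain.split("."))
    if at:
        # Assumption: the local part is isolated as a single unit.
        isolated = f"{FSI}{local}{PDI}@{isolated}"
    return isolated


def strip_isolates(text: str) -> str:
    """Remove U+2066 to U+2069 so the display form is not used as an address."""
    return text.translate({cp: None for cp in range(0x2066, 0x206A)})


address = "مارك@رسيل.السعودية"
for_display = isolate_address_text(address)
print(len(address), len(for_display))  # three isolated parts add six controls
print(strip_isolates(for_display) == address)
```

The isolate characters are invisible, so the display string looks the same length to a reader while laying out part by part in either paragraph direction.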
On 8 Mar 2016, at 18:51, Andrew Sullivan <ajs@anvilwalrusden.com> wrote:
This is also probably true, but since an awful lot of cases don't include the markup to indicate the bidi isolates, there's still that problem too. This includes (just for instance) email, particularly the headers. So while the mark up helps I'm wondering what to do about other cases.
猴猴猴猴猴猴猴猴猴猴猴猴猴猴 http://twitter.com/andreschappo http://schappo.blogspot.co.uk http://weibo.com/andreschappo
I decided to write a brief blog article about Arabic email addresses and here it is http://schappo.blogspot.co.uk/2016/03/arabic-email-addresses.html

André Schappo

On 9 Mar 2016, at 16:24, Andre Schappo <A.Schappo@lboro.ac.uk> wrote:
I have just created a temporary page to show how the Arabic address مارك@رسيل.السعودية appears in L➡︎R Context & L⬅︎R Context using bidi isolates. https://co-project.lboro.ac.uk/users/coas/arabic-email.html
Nicely done. It’s a good and informative read.
--Rich

From: ua-discuss-bounces@icann.org On Behalf Of Andre Schappo
Sent: March 13, 2016 11:38 AM
To: ua-discuss@icann.org
Subject: Re: [UA-discuss] RTL Script Domain Names & Email Addresses
I decided to write a brief blog article about Arabic email addresses and here it is http://schappo.blogspot.co.uk/2016/03/arabic-email-addresses.html
participants (3)
- Andre Schappo
- Andrew Sullivan
- Richard Merdinger