More from Ram Mohan on ICANN's further commitment to Universal Acceptance
https://uasg.tech/2018/12/icann-further-commits-to-universal-acceptance-of-d... Don Hollander Secretary General - UASG Skype: Don_Hollander
It is very good news.

"ICANN’s Board has made improving and promoting UA and Internationalized Domain Name (IDN) implementation one of five strategic priorities for FY21-FY25. Specifically, its strategic plan notes: “The rapid evolution of new technologies requires ICANN to be responsive to these changes and ensure that the unique identifiers system evolves and continues to serve the global Internet user base.” To help achieve this objective, ICANN is adding UA to the responsibilities of its Board IDN/UA Working Group."

Thanks Ram and Don for pushing and sharing it.

Jiankang Yao

From: Don Hollander
Date: 2018-12-20 17:09
To: ua-discuss@icann.org
Subject: [UA-discuss] More from Ram Mohan on ICANN's further commitment to Universal Acceptance

https://uasg.tech/2018/12/icann-further-commits-to-universal-acceptance-of-d...

Don Hollander
Secretary General - UASG
Skype: Don_Hollander
Will ICANNʼs further commitment to UA extend to ICANN having their own set of IDNs and EAI addresses? This is a topic myself and others have raised over the years but there always seems to be reluctance on the part of ICANN to having their own IDNs and EAI addresses. I have never understood this reluctance/resistance. André Schappo On 20 Dec 2018, at 09:09, don hollander <don.hollander@icann.org<mailto:don.hollander@icann.org>> wrote: https://uasg.tech/2018/12/icann-further-commits-to-universal-acceptance-of-d...
In article <6963BB0C-4172-4EBD-9C38-56DEDFECF45E@lboro.ac.uk> you write:
Will ICANNʼs further commitment to UA extend to ICANN having their own set of IDNs and EAI addresses? ...
I have never understood this reluctance/resistance.
I do.

Nobody understands what it means for two different domain names to be "the same", or for two web sites to be "the same", or even for the same web site addressed by different URLs to be "the same." Do all the names redirect to the original URL, or do you have a complete version of the web site at every IDN address? How do you try to keep bookmarks straight if there are four URLs for every page? For a site like ICANN's that has multiple language versions of many pages, do you try to make the language in the page match the language of the URL, or do you prefer the Accept: languages from the user's web browser, or what?

It is relatively straightforward to have a Chinese web site at a single Chinese IDN domain, but nobody has a clue how to run parallel versions of web sites at different names. I've done some informal surveys of names that are supposed to be "the same", even in the same language, with names in .cat when they were doing DNAMEs, and names in .ngo and .ong, and I found that hardly anyone even tries to make it work.

For EAI, as my TLD survey showed last month, only about 10% of mail systems for gTLDs can handle EAI mail, with a large fraction of those being hosted at Gmail and Outlook. From a UA point of view, if you want to communicate with people, it is a good idea to handle other people's EAI addresses, but a poor idea to assign EAI addresses and expect other mail systems to handle them correctly at this point.*

There is a further swamp with addresses used as identifiers, which is what approximately every web site in the world that has user accounts does. We have informal conventions for ASCII addresses: upper and lower case are equivalent, and addresses with exotic punctuation don't work very well. We have nothing like that for EAI, and in view of the vast number of different and incompatible conventions in different languages, it is a very hard problem. I offered a little informal advice on address assignment in the EAI guide I wrote, and the IETF may do some work in this area next year, but we are a long way from solving or even understanding it.

Finally, I am definitely NOT suggesting that UASG should try to solve any of these problems. We need to be aware of them and perhaps warn people away from situations that might cause more UA issues. We barely have the time and expertise to do what we're already trying to do. Let us avoid mission creep here.

R's,
John

* - This may be different among communities that all speak the same non-Latin language, perhaps in parts of India, but it's true in general.
Thanks for this write-up, John. I think you stated the complex issues quite succinctly and well. I agree also that it is not up to UASG to attempt to provide solutions, but that it is important for us to acknowledge the issues and help them gain visibility so the right groups can drive to some viable resolutions.

--Rich

-----Original Message-----
From: UA-discuss <ua-discuss-bounces@icann.org> On Behalf Of John Levine
Sent: December 24, 2018 12:01 PM
To: ua-discuss@icann.org
Subject: Re: [UA-discuss] More from Ram Mohan on ICANN's further commitment to Universal Acceptance

In article <6963BB0C-4172-4EBD-9C38-56DEDFECF45E@lboro.ac.uk> you write:
John,

Much of what you mention seems to already be an issue, independent of IDNs.

What about "www." being an optional subdomain? How are the techniques used to handle this different from having an IDN alias?

Yes, I did note the passage on language negotiation, but how is that different from sites that can be accessed via ccTLDs in addition to a domain name in a gTLD? That's a pattern typical for many global organizations.

In all of these examples, the FQDN to access the sites is not unique; if your organization had a local name, I don't know why you wouldn't be using it (perhaps in addition to the global name) when accessed via that ccTLD. A bigger obstacle seems to be that the ASCII name *is* the brand, and that it is deliberately kept untranslated (e.g. "SONY").

If you do split your site into local offerings, then you need to have some method for users to access the desired local platform, independent of where they happen to be located - that's something you'll find on many airline sites, whose users not unexpectedly need to access the site when not at home.

How are any of these issues materially different from offering your site with multiple localized names?

A./

PS: some non-European scripts can have variants that work similarly to case-equivalence. If you want to institute loose matching of e-mail usernames based on them, you'd have to roll your own - just as ASCII user names get compared with certain punctuation ignored. Source data that you can harvest for this purpose exists in many IANA IDN tables ("allocatable variants") and will also exist as part of the Root Zone LGR project. (Note, unlike case-equivalence for cased scripts, most scripts have variants for only a subset of characters, and some may have variants for only a few specific cases. But that doesn't make the process of loose matching any different).

On 12/24/2018 10:39 AM, Richard Merdinger wrote:
In article <2eb428e5-ed29-a914-23e3-7889b427b69d@ix.netcom.com> you write:
What about "www." being an optional subdomain?
How are the techniques used to handle this different from having an IDN alias?
I think it's pretty safe to assume that foo.com and www.foo.com are in the same language, and if one redirects to the other, nobody will be confused. Even so, getting it to work right is not totally trivial. The two names need their own SSL certificates, or if there's one cert it has to be validated for both names. If the site uses cookies as most do to manage site logins or user options, it has to be sure the cookies for the two names are kept in sync or all forced to one of the names. None of this is terribly hard, but it's not automatic either.
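To make "all forced to one of the names" a little more concrete, here is a minimal Python sketch of the redirect decision. It is only an illustration under stated assumptions: the host names and the helper `canonical_redirect` are hypothetical, and a real site would normally configure this in its web server rather than in application code.

```python
from typing import Optional
from urllib.parse import urlsplit, urlunsplit

# Hypothetical configuration: one canonical name plus the aliases that
# should redirect to it. None of these names come from a real deployment.
CANONICAL_HOST = "www.foo.com"
ALIAS_HOSTS = {"foo.com"}


def canonical_redirect(url: str) -> Optional[str]:
    """Return the URL to redirect to, or None if no redirect is needed."""
    parts = urlsplit(url)
    host = (parts.hostname or "").lower()
    if host in ALIAS_HOSTS:
        # Keep scheme, path and query so bookmarks to deep pages still work.
        # Any explicit port in the original URL is dropped in this sketch.
        return urlunsplit((parts.scheme, CANONICAL_HOST, parts.path,
                           parts.query, parts.fragment))
    return None
```

Because every alias ends up on one host, login and option cookies only ever need to be set for that host. The certificate still has to cover the alias, since the browser completes the TLS handshake before it sees the redirect.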
Yes, I did note the passage on language negotiation, but how is that different from sites that can be accessed via ccTLDs in addition to a domain name in a gTLD. That's a pattern typical for many global organizations.
Same answer, except that if one name isn't a subdomain of the other, the login and option cookie problems are a lot harder.
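A rough way to see why the subdomain case is easier: a cookie set with a Domain attribute is sent to that domain and its subdomains, and nowhere else. The sketch below is a simplified version of the domain-match test from RFC 6265, section 5.1.3. The function name and the example hosts are assumptions for illustration, and it skips the IP-address and public-suffix checks that real browsers apply.

```python
def cookie_domain_matches(request_host: str, cookie_domain: str) -> bool:
    """Simplified form of the domain-match test in RFC 6265, section 5.1.3."""
    host = request_host.lower().rstrip(".")
    domain = cookie_domain.lower().lstrip(".")
    # Either the same name, or the host is a subdomain of the cookie domain.
    return host == domain or host.endswith("." + domain)
```

With Domain=foo.com, both foo.com and www.foo.com see the cookie. A name such as foo.de never does, so a site spread over unrelated names needs some server-side scheme to carry logins and options across.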
How are any of these issues materially different from offering your site with multiple localized names?
The point, which I apparently wrongly thought was obvious, is that none of this multi-name stuff works automatically, and telling people "just add a bunch of IDN names and EAI addresses" is not going to end well. R's, John PS:
PS: some non-European scripts can have variants that work similar to case-equivalence. If you want to institute loose matching of e-mail usernames based on them, you'd have to roll your own -
Yes, people who are working on EAI are aware of the way that local part matching works. Since every mail system already has its own loose matching rules, it's not a new problem but it's not one that anyone has thought much about for EAI mail addresses. I can definitely tell you that without loose address matching that matches user expectations, whatever they are, your customers will hate you and decide that your system is unusable.
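For readers who have not seen loose matching written down, here is a minimal Python sketch of one possible comparison key for local parts. It is an assumption for illustration only, not what any particular mail system does: it applies Unicode NFKC normalization and case folding, and it knows nothing about the script-specific variants mentioned above, which would need data such as the IDN tables.

```python
import unicodedata


def loose_key(local_part: str) -> str:
    """Build a comparison key: NFKC-normalize, then case-fold."""
    return unicodedata.normalize("NFKC", local_part).casefold()


def same_mailbox(a: str, b: str) -> bool:
    """Treat two local parts as the same mailbox if their keys are equal."""
    return loose_key(a) == loose_key(b)
```

Under this rule "John.Levine" and "john.levine" compare equal, as do precomposed and decomposed spellings of "André". Whether that matches what users of a given language expect is exactly the open question.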
Although this thread addresses the structural difficulties of multiple domains serving one organization, the other issue is the difficulty of users verifying that these domains belong to the same owner. Yes, it is already an issue today with, for example, mcdonalds wanting to own mcdonalds in each TLD, but it becomes that much more difficult to have domains in different scripts, etc. and for users to know which ones are legitimate and which are spoofs.

It seems safer for an organization to have a singular domain than to add a few and open the door to imitators that may be hard to find and close down before they do damage to their market. Not to mention there often isn’t a single correct transliteration or translation....

So is this thread really about: Having IDNs make sense as singular domains, but when does it make sense or not make sense to have multiple domains (IDN or otherwise)?

Tex

-----Original Message-----
From: UA-discuss [mailto:ua-discuss-bounces@icann.org] On Behalf Of John Levine
Sent: Tuesday, December 25, 2018 1:58 PM
To: ua-discuss@icann.org
Subject: Re: [UA-discuss] More from Ram Mohan on ICANN's further commitment to Universal Acceptance

In article <2eb428e5-ed29-a914-23e3-7889b427b69d@ix.netcom.com> you write:
On 26 Dec 2018, at 06:22, Tex <textexin@xencraft.com> wrote:

Although this thread addresses the structural difficulties of multiple domains serving one organization, the other issue is the difficulty of users verifying that these domains belong to the same owner. Yes, it is already an issue today with, for example, mcdonalds wanting to own mcdonalds in each TLD, but it becomes that much more difficult to have domains in different scripts, etc. and for users to know which ones are legitimate and which are spoofs. It seems safer for an organization to have a singular domain than to add a few and open the door to imitators that may be hard to find and close down before they do damage to their market. Not to mention there often isn’t a single correct transliteration or translation.... So is this thread really about: Having IDNs make sense as singular domains, but when does it make sense or not make sense to have multiple domains (IDN or otherwise)?

Tex

Many years ago I envisaged that, where applicable, websites would have both an ASCII Domain Name and a localised IDN, e.g. a Chinese website could have 双好.中国 and double-happy.cn

At the time I regarded the ASCII name as essential because of the lack of support for IDNs. Nowadays, I am more inclined to regard the ASCII name as legacy and secondary to the primary Chinese domain name.

I consider the general principle of having an IDN and an ASCII name still holds for many websites.

Then there are the websites that are global, such as the Olympics website, which I consider should have a set of IDNs: a set that encompasses the official primary human language scripts of the countries competing in that yearʼs Olympic Games.

My primary reason for wanting ICANN to have a set of IDNs is so ICANN leads by example.
André Schappo

🌏 🌍 🌎
André Schappo
小山@电邮.在线?Subject=你好小山😜
schappo.blogspot.co.uk
twitter.com/andreschappo
weibo.com/andreschappo?is_all=1
groups.google.com/forum/#!forum/computer-science-curriculum-internationalization
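As background for the 双好.中国 and double-happy.cn pairing mentioned in this message, the following Python sketch shows how a Unicode domain name maps to the ASCII form that is actually looked up in the DNS. The helper names are hypothetical. Python's built-in "idna" codec implements the older IDNA 2003 rules, so treat this as an approximation of what an IDNA 2008 library would do.

```python
def to_a_labels(name: str) -> str:
    """Convert a Unicode domain name to its ASCII (A-label) form."""
    # The standard library codec follows IDNA 2003; production code would
    # more likely use an IDNA 2008 / UTS #46 implementation.
    return name.encode("idna").decode("ascii")


def to_u_labels(name: str) -> str:
    """Convert an ASCII (A-label) domain name back to Unicode."""
    return name.encode("ascii").decode("idna")
```

Each non-ASCII label comes back with an "xn--" prefix, while an all-ASCII name such as double-happy.cn passes through unchanged. Software that only accepts one of the two forms is one of the things UA testing looks for.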
On Tue, 25 Dec 2018, Tex wrote:
Although this thread addresses the structural difficulties of multiple domains serving one organization, the other issue is the difficulty of users verifying that these domains belong to the same owner.
This is another very hard problem, one that the IETF tried and failed to address in the DBOUND working group.
So is this thread really about: Having IDNs make sense as singular domains, but when does it make sense or not make sense to have multiple domains (IDN or otherwise)?
Channeling Asmus to some degree, identifying the same owner is the same problem regardless of what scripts the domain names might be in. The best one can do now is green bar TLS certificates. Again, this is something that we might note as an issue, not something we should or can try to fix.

Regards,
John Levine, john.levine@standcore.com
Standcore LLC
Thanks for the reference to DBOUND. I was unaware of that group.

I can understand that this group, being evangelists, is not the right team to solve these problems. However, noting this as an issue also seems to be less than adequate.

As we advocate for EAI and IDNA we need to point out the cons as well as the pros. If the cons are substantial obstacles, we should be advocating for solutions or workarounds, not just noting them.

When we promote an IDN and note that by the way this opens the door to multiple malicious IDNs that your users cannot determine belong to you nor whether your IDN is legitimate, any arguments as to the benefits fall away.

We should advocate for lowering barriers, not just promoting usage. We should also be advocating safe utilization, not just utilization. In addition to identifying systems that lack support and advocating remediation, we should perhaps have a document identifying other limitations and be seeking their remediation as well.

Tex

-----Original Message-----
From: John Levine [mailto:john.levine@standcore.com]
Sent: Wednesday, December 26, 2018 3:03 PM
To: Tex
Cc: ua-discuss@icann.org
Subject: RE: [UA-discuss] More from Ram Mohan on ICANN's further commitment to Universal Acceptance

On Tue, 25 Dec 2018, Tex wrote:
As we advocate for EAI and IDNA we need to point out the cons as well as the pros. If the cons are substantial obstacles, we should be advocating for solutions or workarounds, not just noting them.
I agree that we should point out the problems, but given that every attempt in the past decade to solve the problem of identifying domains and web sites as "the same" has failed, what else should this group with its limited time and technical resources do?

I'm reasonably sure that for nearly any approach anyone here could suggest, I can tell you who's tried it and why it hasn't worked. For the ones I'd miss, Andrew Sullivan can fill in the gaps.

We really need to concentrate on the small set of issues we understand, and firmly resist mission creep, or in some cases, mission gallop.

Regards,
John Levine, john.levine@standcore.com
Standcore LLC

PS:
When we promote an IDN and note that by the way this opens the door to multiple malicious IDNs that your users cannot determine belong to you nor whether your IDN is legitimate, any arguments as to the benefits fall away.
Yes indeed. Sure hope those green bar certs are good enough.
On 12/25/2018 1:58 PM, John Levine wrote:
In article <2eb428e5-ed29-a914-23e3-7889b427b69d@ix.netcom.com> you write:
What about "www." being an optional subdomain?
How are the techniques used to handle this different from having an IDN alias? I think it's pretty safe to assume that foo.com and www.foo.com are in the same language, and if one redirects to the other, nobody will be confused. Even so, getting it to work right is not totally trivial.
I agree on both counts. But nevertheless, it is commonly done. I think the "language" issue is a bit of a red herring. When traveling, things like google searches, weather forecasts and many other services are re-routed to the local service who then impose their local language (and metric) preferences on me by default. All without IDNs. So while "confusion" in the sense of not getting what you expect is an issue (I mean it is a bit silly to get a foreign weather forecast delivered in Fahrenheit as long as I access it from the US, but refreshing the same site after arriving at my destination will switch the display to Celsius) it's not one that's dependent on IDNs.
The two names need their own SSL certificates, or if there's one cert it has to be validated for both names. If the site uses cookies as most do to manage site logins or user options, it has to be sure the cookies for the two names are kept in sync or all forced to one of the names.
None of this is terribly hard, but it's not automatic either.
The point about it not being automatic is worth making.
Yes, I did note the passage on language negotiation, but how is that different from sites that can be accessed via ccTLDs in addition to a domain name in a gTLD. That's a pattern typical for many global organizations. Same answer, except that if one name isn't a subdomain of the other, the login and option cookie problems are a lot harder.
Airline sites that direct to local access (with domain in local ccTLD) would have that issue and appear to be able to handle it. Other services do as well - not always without some bumps.
How are any of these issues materially different from offering your site with multiple localized names? The point, which I apparently wrongly thought was obvious, is that none of this multi-name stuff works automatically, and telling people "just add a bunch of IDN names and EAI addresses" is not going to end well.
I think the fact that "aliasing" doesn't work automatically is a point worth making. IDNs add another level of opportunity for aliasing, but I don't think they really add anything new that existing forms of aliasing aren't exposing you to.

The main difference is that crossing script boundaries makes it impossible for users not native (or competent) in both scripts to relate your aliased names. (Within a script my suspicion is that you wouldn't normally translate domain names without leaving at least a recognizable part, like a language-neutral brand name or abbreviation).

That, however, is a cognitive issue, in the sense that if you present the wrong user with an "opaque" name you may or may not create an issue. It's similar to, but not the same as, presenting them with the "wrong" language version of your site.
R's, John
PS:
PS: some non-European scripts can have variants that work similar to case-equivalence. If you want to institute loose matching of e-mail usernames based on them, you'd have to roll your own - Yes, people who are working about EAI are aware of the way that local part matching works. Since every mail system already has its own loose matching rules, it's not a new problem but it's not one that anyone has thought much about for EAI mail addresses. I can definitely tell you that without loose address matching that matches user expectations, whatever they are, your customers will hate you and decide that your system is unusable.
Totally. My point was intended to be helpful in pointing out where you might find data to extend loose matching.
On Tue, Dec 25, 2018 at 10:22:55PM -0800, Asmus Freytag (c) wrote:
When traveling, things like google searches, weather forecasts and many other services are re-routed to the local service who then impose their local language (and metric) preferences on me by default. All without IDNs.
That is a _completely unrelated_ issue, and I really think we're going to get into trouble if even we cannot pay attention to these distinctions. Best regards, A -- Andrew Sullivan ajs@anvilwalrusden.com
I think the "language" issue is a bit of a red herring.
When traveling, things like google searches, weather forecasts and many other services are re-routed to the local service who then impose their local language (and metric) preferences on me by default. All without IDNs.
Indeed, but I would think that multiple IDN names for a web site would imply multiple languages. If they all just redirect to the English language site, that's technically easy, but it also seems like a cruel joke.
Same answer, except that if one name isn't a subdomain of the other, the login and option cookie problems are a lot harder.
Airline sites that direct to local access (with domain in local ccTLD) would have that issue and appear to be able to handle it. Other services do as well - not always without some bumps.
Having done this a few times, my experience is that they generally all redirect to the same site, and there's a button at the top to pick a language, which is usually remembered in a cookie, maybe initialized with a guess from the original name but usually not. Again, one could do that with multiple IDN names, but why bother?
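The pattern described here (a remembered choice in a cookie, optionally seeded by a guess from the name the visitor used) can be sketched as follows. Everything in it is an assumption for illustration: the `SUPPORTED` and `HOST_LANGUAGE` tables, the example host names, and the order of precedence (cookie, then host name, then the browser's Accept-Language header, then a default) are not taken from any site mentioned in the thread.

```python
# Hypothetical configuration: languages the site offers, and a guess
# for which language each entry host name implies.
SUPPORTED = ("en", "zh", "hi")
HOST_LANGUAGE = {"उदाहरण.example": "hi", "例子.example": "zh"}


def parse_accept_language(header: str) -> list[str]:
    """Return language tags from an Accept-Language value, best first."""
    prefs = []
    for part in header.split(","):
        piece = part.strip()
        if not piece:
            continue
        tag, _, params = piece.partition(";")
        quality = 1.0
        params = params.strip()
        if params.startswith("q="):
            try:
                quality = float(params[2:])
            except ValueError:
                quality = 0.0  # unparsable weight: treat as not acceptable
        prefs.append((tag.strip().lower(), quality))
    prefs.sort(key=lambda item: item[1], reverse=True)  # stable sort
    return [tag for tag, quality in prefs if quality > 0]


def pick_language(cookie_lang, host, accept_language, default="en"):
    """Choose a page language: cookie, then host guess, then browser header."""
    if cookie_lang in SUPPORTED:
        return cookie_lang
    guess = HOST_LANGUAGE.get(host.lower())
    if guess in SUPPORTED:
        return guess
    for tag in parse_accept_language(accept_language or ""):
        primary = tag.split("-")[0]
        if primary in SUPPORTED:
            return primary
    return default
```

Dropping the `HOST_LANGUAGE` step gives the more common behaviour described above, where every name leads to the same site and only the cookie or the language button decides.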
The main difference is that crossing script boundaries makes it impossible for users not native (or competent) in both scripts to relate your aliased names. (Within a script my suspicion is that you wouldn't normally translate domain names without leaving at least a recognizable part, like a language-neutral brand name or abbreviation).
Seems reasonable.
definitely tell you that without loose address matching that matches user expectations, whatever they are, your customers will hate you and decide that your system is unusable.
Totally.
My point was intended to be helpful in pointing out where you might find data to extend loose matching.
I've been wondering if it's worth spinning up an IRTF group to try to collect advice on loose matching and (sort of its inverse) a son-of-PRECIS for assigning user names that allow characters that users expect, but that won't collide with variants, or with the homographs or near homographs that non-speakers might enter by mistake.
Regards, John Levine, john.levine@standcore.com Standcore LLC
Hello Friends.
It looks like we have a wonderful discussion, with a bit of digression from UASG matters. I do not need to restate the UASG scope, as it is already on the uasg.tech website; however, I think we have simple-to-understand UASG boundaries, at least for domain names.
Boundary 1: All domain names must be treated equally, which includes all ASCII TLDs, IDN TLDs and ccTLDs.
Now the question is simple. When global companies like Amazon or American Express have 100+ domain names active for their respective countries (not necessarily serving local languages), what stops them from using IDNs too? The argument for having a single domain does not seem valid, and it is obviously NOT a UA issue. Having said that, neither of these companies is UA ready, and that is what we should really be concerned with. There are many ways to address this UA issue with them, but one could be to push them to adopt IDNs and show them the new customer base they are missing, which cannot reach their websites, as those who do not know English cannot type English domain names. One can argue whether it's a UA issue or not, but adopting IDNs in the company is certainly going to activate an internal UA initiative. We have seen similar examples in patrika.bharat (a media house) and rajasthan.bharat (government).
Boundary 2: We are supposed to create awareness about UA issues and help to solve them. We are spending millions of dollars to create documents and libraries. Here we are getting into "providing solutions" and creating a knowledge base. One can argue whether these are best-of-class libraries and solutions or not; however, they are available live on the uasg.tech website. We are providing users with multiple kinds of help to become UA ready. Great volunteer and paid professional work has gone into producing superb documents.
If there is no IDN adoption and no EAI visible to Amazon and American Express, then they are less likely to even touch their apps for UA, and the fact is that globally we have very little adoption of IDN and EAI. So the low-hanging fruit could be that we push large companies to adopt IDNs themselves, which pushes them internally to become UA ready. So of course adopting IDN or EAI is not directly a UA issue, but it surely helps to push the UA agenda within a company. This falls in line with the activities where we create awareness and push companies to be UA ready. That's the ultimate goal.
So I would support André's argument for ICANN to adopt IDNs, walk the talk and become UA ready.
Side note: Voice search is increasing globally, and people are likely to search in their local language, where IDNs will play a major role and ASCII domain names will be of little use.
Thanks.
AD
Dr. Ajay Data wrote: [...]
So I would support André's argument for ICANN to adopt IDNs, walk the talk and become UA ready.
This might be a good opportunity to remind people that we publish the status of our UA readiness in the Accountability Indicators. You can find them at: https://www.icann.org/accountability-indicators The Universal Acceptance Readiness chart is in Goal 3.2. We welcome your feedback on this and other charts in the Accountability Indicators. Kind regards, Leo Vegoda Operations Program Director, ICANN 12025 Waterfront Drive, Suite 300 | Los Angeles, CA 90094-2536 | USA
On 12/26/2018 5:30 PM, John Levine wrote:
definitely tell you that without loose address matching that matches user expectations, whatever they are, your customers will hate you and decide that your system is unusable.
Totally.
My point was intended to be helpful in pointing out where you might find data to extend loose matching.
I've been wondering if it's worth spinning up an IRTF
Typo for IETF, I take it.
group to try and collect advice on loose matching and (sort of its inverse) son-of-PRECIS assigning user names that allow characters that users expect, but that won't collide with variants or if non-speakers misenter them as homographs or near homographs.
Can you get enough bodies to join an IETF working group for 2-3 years? Without that, nothing will happen in the i18n arena. A./ PS: would be great if some issue like this could attract enough participants to deal with these kinds of issues.
On Wed, Dec 26, 2018 at 08:38:26PM -0800, Asmus Freytag (c) wrote:
On 12/26/2018 5:30 PM, John Levine wrote:
I've been wondering if it's worth spinning up an IRTF
Typo for IETF, I take it.
Nope. The IRTF is the Internet Research Task Force, which is a place that work that can't actually properly be considered for standardization can go. The real point (IYAM) of the IRTF is to get academic researchers to contribute vaguely at the IETF without having to geek out. A -- Andrew Sullivan ajs@anvilwalrusden.com
Nope. The IRTF is the Internet Research Task Force, which is a place that work that can't actually properly be considered for standardization can go. The real point (IYAM) of the IRTF is to get academic researchers to contribute vaguely at the IETF without having to geek out.
Right. It seems to me that finding out whether we understand any scripts well enough to offer advice on how to create good names and good fuzzy matches is considerably beyond what we currently know how to do, so the IRTF is where that sort of thing lives. IRTF groups are hit and miss. I ran one about spam which was a complete failure due to non-overlap of people who were willing to do work and people who understood the issues. Some RGs on exotic networking have done interesting work. Regards, John Levine, john.levine@standcore.com Standcore LLC
On 25 Dec 2018, at 21:58, John Levine <john.levine@standcore.com> wrote:
In article <2eb428e5-ed29-a914-23e3-7889b427b69d@ix.netcom.com> you write:
What about "www." being an optional subdomain? How are the techniques used to handle this different from having an IDN alias?
I think it's pretty safe to assume that foo.com and www.foo.com are in the same language, and if one redirects to the other, nobody will be confused. Even so, getting it to work right is not totally trivial. The two names need their own SSL certificates, or if there's one cert it has to be validated for both names. If the site uses cookies, as most do, to manage site logins or user options, it has to be sure the cookies for the two names are kept in sync or all forced to one of the names. None of this is terribly hard, but it's not automatic either.
Yes, I did note the passage on language negotiation, but how is that different from sites that can be accessed via ccTLDs in addition to a domain name in a gTLD? That's a pattern typical for many global organizations.
Same answer, except that if one name isn't a subdomain of the other, the login and option cookie problems are a lot harder.
How are any of these issues materially different from offering your site with multiple localized names?
The point, which I apparently wrongly thought was obvious, is that none of this multi-name stuff works automatically, and telling people "just add a bunch of IDN names and EAI addresses" is not going to end well.
There is one part that can happen automatically. If the web authors have used relative addressing throughout their website then the URL will automatically inherit the Domain Name being used on entry. So if icann.org is used on entry then icann.org will show and stick in the browser address bar.
If icann.संगठन is used on entry then icann.संगठन will show and stick in the browser address bar. Whether or not one then adapts website content according to which Domain Name is used is a different issue.
André Schappo
R's, John
PS: some non-European scripts can have variants that work similar to case-equivalence. If you want to institute loose matching of e-mail usernames based on them, you'd have to roll your own -
Yes, people who are working about EAI are aware of the way that local part matching works. Since every mail system already has its own loose matching rules, it's not a new problem but it's not one that anyone has thought much about for EAI mail addresses. I can definitely tell you that without loose address matching that matches user expectations, whatever they are, your customers will hate you and decide that your system is unusable.
🌏 🌍 🌎
André Schappo
小山@电邮.在线?Subject=你好小山😜
schappo.blogspot.co.uk
twitter.com/andreschappo
weibo.com/andreschappo?is_all=1
groups.google.com/forum/#!forum/computer-science-curriculum-internationalization
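The point about relative addressing can be checked with a small sketch. It uses Python's standard `urllib.parse.urljoin` to resolve links the way a browser would; the helper name `resolve_link`, the page paths, and the use of icann.संगठन as an entry host (taken from the message above purely as an example, not as a statement that the name is in service) are assumptions for illustration.

```python
from urllib.parse import urljoin, urlsplit


def resolve_link(entry_url: str, href: str) -> str:
    """Resolve a link found on a page against the URL the page was loaded from."""
    return urljoin(entry_url, href)


if __name__ == "__main__":
    # Hypothetical pages: the same relative link, entered via two names.
    for entry in ("https://icann.org/en/news/", "https://icann.संगठन/en/news/"):
        target = resolve_link(entry, "../about/")
        # A relative link keeps whichever host name was used on entry.
        print(urlsplit(entry).hostname, "->", target)

    # An absolute link pins the host, so the entry name is lost.
    print(resolve_link("https://icann.संगठन/en/news/", "https://icann.org/resources"))
```

This only covers link resolution. Certificates, cookies and content language still have to be handled per name, as discussed earlier in the thread.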
On Tue, Dec 25, 2018 at 04:58:20PM -0500, John Levine wrote:
I think it's pretty safe to assume that foo.com and www.foo.com are in the same language
I don't think it's safe to assume that at all. There is in general no way to know any of that, and the potential for exceptions is precisely where phishers will drop their lures.
None of this is terribly hard, but it's not automatic either.
Quite.
Same answer, except that if one name isn't a subdomain of the other, the login and option cookie problems are a lot harder.
In some cases, for "impossible" values of "harder". This is important, because the total lack of general cross-tree linkage support in the DNS is something the IETF determined it had failed to undertake some while ago, which means that we cannot expect the situation will get better (despite Yet Another Effort by Dave Crocker).
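The cookie side of this can be made concrete with a simplified version of the domain-matching rule from RFC 6265. The sketch is an illustration only: the function name `domain_match` is invented here, and it leaves out the IP-address and public-suffix checks that real browsers apply. The host names are the foo.com examples from earlier in the thread plus a made-up unrelated name.

```python
def domain_match(request_host: str, cookie_domain: str) -> bool:
    """Simplified RFC 6265 domain-match (no IP or public-suffix handling).

    A cookie set with Domain=cookie_domain is sent to request_host only
    when the host equals that domain or is a subdomain of it.
    """
    host = request_host.lower().rstrip(".")
    domain = cookie_domain.lower().lstrip(".")
    return host == domain or host.endswith("." + domain)


if __name__ == "__main__":
    # A subdomain can share a cookie with its parent name.
    print(domain_match("www.foo.com", "foo.com"))
    # An unrelated name can never receive that cookie, whatever the site does.
    print(domain_match("foo.example", "foo.com"))
```

With names in different trees, a login on one name has to be carried to the other by some application-level mechanism, which is where "harder" comes from.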
The point, which I apparently wrongly thought was obvious, is that none of this multi-name stuff works automatically, and telling people "just add a bunch of IDN names and EAI addresses" is not going to end well.
Even if it isn't obvious, I'd have thought that the arguments to that effect of the now-aged Variant Issues Report from ICANN (full disclosure: I wrote most of it) were pretty complete. Best regards, A -- Andrew Sullivan ajs@anvilwalrusden.com
Andre, et al
1. ICANN adopting a non-English domain name is not a UA issue.
2. ICANN is planning on providing EAI-ready email, but probably not until the second half of 2020.
3. They are working to make all their systems UA capable. They are nearly able to accept all ASCII TLDs in 100% of their applications now. They are working through accepting all IDNs. No ETA is available yet.
Don
From: UA-discuss <ua-discuss-bounces@icann.org> On Behalf Of Andre Schappo
Sent: Tuesday, 25 December 2018 2:35 AM
To: ua-discuss <UA-discuss@icann.org>
Subject: Re: [UA-discuss] More from Ram Mohan on ICANN's further commitment to Universal Acceptance
Will ICANNʼs further commitment to UA extend to ICANN having their own set of IDNs and EAI addresses? This is a topic myself and others have raised over the years but there always seems to be reluctance on the part of ICANN to having their own IDNs and EAI addresses. I have never understood this reluctance/resistance.
André Schappo
On 20 Dec 2018, at 09:09, don hollander <don.hollander@icann.org> wrote:
https://uasg.tech/2018/12/icann-further-commits-to-universal-acceptance-of-d...
participants (11)
- Andre Schappo
- Andrew Sullivan
- Asmus Freytag
- Asmus Freytag (c)
- Don Hollander
- Dr. Ajay Data
- Jiankang Yao
- John Levine
- Leo Vegoda
- Richard Merdinger
- Tex