SAC095 - SSAC Advisory on the Use of Emoji in Domain
On 5/28/2017 2:26 AM, Don Hollander wrote:
FYI https://www.icann.org/en/system/files/files/sac-095-en.pdf
Looks like figure 2 is not emoji, but mojibake. Does it render correctly on anybody's system?
mojibake for me also.
Mojibake indeed! Here’s how it renders on Microsoft Edge on the latest Windows 10. [image: image001.jpg]
From: ua-discuss-bounces@icann.org [mailto:ua-discuss-bounces@icann.org] On Behalf Of Asmus Freytag Sent: Sunday, May 28, 2017 7:52 AM To: ua-discuss@icann.org Subject: Re: [UA-discuss] SAC095 - SSAC Advisory on the Use of Emoji in Domain
Interesting reference. A wide variety of items that raise more questions for me than they answer. Sharing the list of those points in case it is of use.
Given the conversations we’ve had here around phishing, would we say any aspect of Unicode meets the requirements of Finding 2: …are not required by design, standard, or convention to be visually uniform (one code point displayed the same way in all circumstances) or visually distinguishable (different code points displayed in ways that permit them to be disambiguated regardless of context).
Beyond even that fundamental consideration, as pointed out on another branch, the use of the ZWJ seems completely independent of any discussion of Emoji. Yet it’s necessary for some writing systems. Two other related areas worth evaluating for similar risk would be precomposed versus combining diacritics for symbols such as é, and IVSes.
As to the point about accessibility, the Unicode CLDR data does indeed include information about how to reference Emoji properly. They have both a Unicode character name and pre-defined keywords. Furthermore, the comparison to other languages being more accessible is false; many modern living languages have poor accessibility implementations (non-space delimited languages and Chinese dialects come to mind).
The point raised about the skin tone implementation and color-blind individuals is (pardon the pun) a red herring. The emojis are designed to be distinguishable based on modern accessibility standards.
For what one opinion is worth, I disagree with the assertion that adding emoji will slow the move towards universal acceptance. Certainly within software products, we’re seeing emoji as one of the forces driving a more robust support of the full Unicode standard and rendering in ways that make emoji useful in content.
From: ua-discuss-bounces@icann.org [mailto:ua-discuss-bounces@icann.org] On Behalf Of Don Hollander Sent: Sunday, May 28, 2017 2:27 AM To: ua-discuss@icann.org Subject: [UA-discuss] SAC095 - SSAC Advisory on the Use of Emoji in Domain
FYI https://www.icann.org/en/system/files/files/sac-095-en.pdf
Hi, On Mon, May 29, 2017 at 03:39:36PM +0000, Stuart Stuple via UA-discuss wrote:
Given the conversations we’ve had here around phishing, would we say any aspect of Unicode meets the requirements of Finding 2: …are not required by design, standard, or convention to be visually uniform (one code point displayed the same way in all circumstances) or visually distinguishable (different code points displayed in ways that permit them to be disambiguated regardless of context).
I think you're missing the point of this remark in the document. The non-specification of visual forms in Unicode for letters is not the same as it is for emoji: whereas Unicode does not specify fonts, the implementation of emojis is quite a bit less constrained -- it's actually expected to diverge not just by font, but by OS and so on. (The basic problem here is that, whereas "font" is well-defined, emoji presentation is not yet. This is why even things like smiley face have quite large variations even in "the same" font.)
Beyond even that fundamental consideration, as pointed out on another branch, the use of the ZWJ seems completely independent of any discussion of Emoji. Yet it’s necessary for some writing systems. Two other related areas worth evaluating for similar risk would be precomposed versus combining diacritics for symbols such as é, and IVSes.
Not exactly: ZWJ and ZWNJ are not subject to any predictable rules with emojis because emojis are not letters. We can therefore write CONTEXTJ rules for ZW[N]J on letters in a way we can't on emoji. The combining diacritics remark above is similarly wide of the mark, because the problem with combining diacritics vs precomposed forms is normally sorted out by normalization (and remember, all U-labels are required to be in NFC). But there is no normalization for emojis, which is an important part of the reason that SSAC is pointing out they are poorly suited for identifiers.
Again, "smiley face" is instructive. The differences in presentation among fixed-width vs. variable-width font e-with-acute are solved by the code point: it's the same one all the time. The differences in e-plus-combining-acute vs e-with-acute are solved by NFC. But the differences among all the various smiley faces are actually because of using different code points, but you get one of them based on your rendering engine and so on, and there's no way to normalize them all to the same "smiley face" thing. That's not a problem for humans when communicating casually. It's a big deal when the same code points are used as network identifiers.
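The contrast between e-with-acute and the smiley faces can be checked with a few lines of Python. This is only a sketch using the standard `unicodedata` module; the two smiley code points are an illustrative pick and are not taken from the SSAC report.

```python
import unicodedata

def nfc(text):
    """Apply Unicode Normalization Form C."""
    return unicodedata.normalize("NFC", text)

precomposed = "\u00e9"   # LATIN SMALL LETTER E WITH ACUTE
combining = "e\u0301"    # LATIN SMALL LETTER E + COMBINING ACUTE ACCENT

# Different code point sequences, same string after NFC.
e_forms_match = nfc(precomposed) == nfc(combining)

# Two of the several "smiley face" code points (an illustrative pick).
white_smiling = "\u263a"         # WHITE SMILING FACE
slightly_smiling = "\U0001f642"  # SLIGHTLY SMILING FACE

# NFC leaves each one unchanged, so they never compare equal.
smileys_match = nfc(white_smiling) == nfc(slightly_smiling)

print(e_forms_match, smileys_match)
```

The two spellings of e-with-acute collapse to one string, while the two smileys remain distinct identifiers however similar a given font makes them look.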
The point raised about the skin tone implementation and color-blind individuals is (pardon the pun) a red herring. The emojis are designed to be distinguishable based on modern accessibility standards.
But we're not only talking to humans; we're talking to computers, and they need exact match. And none of what you say addresses the basic problem that emojis were excluded from IDNA because of their Unicode properties: this is _Unicode's_ advice we're following.
For what one opinion is worth, I disagree with the assertion that adding emoji will slow the move towards universal acceptance. Certainly within software products, we’re seeing emoji as one of the forces driving a more robust support of the full Unicode standard and rendering in ways that make emoji useful in content.
But in the above, you are failing to distinguish between "content" and "identifiers". Domain names are the latter, and if you want to argue that they aren't any more then you have a bigger problem than universal acceptance. You have a mismatch with the definition of the thing you want to be universally accepted. Best regards, A -- Andrew Sullivan ajs@anvilwalrusden.com
I don't mean to be arguing. I'm simply sharing my perspective as to why these guidelines are unworkable given typical customer goals and customer understanding of text.
I understand the perspective that emoji are more variable than text as the emotional impact of the variations of emoji. However, as someone involved in font development, I am confident that the guidelines for Emoji are as well developed as those for character glyphs. I think folks believe that character forms are standard but then encounter cases like the Macedonian Cyrillic italics that are valid variants.
Emojis do indeed have normalization as much as any other Unicode combination. While it's true that normalization can resolve differences for things like precomposed versus combining diacritics, that presumes that the same normalization routine is being used. The classic case that we encounter in our software is normalization of casing -- works great, right up until you have plain text that includes Turkish i or I.
I agree that the simplification of thinking of a "smiley face" is part of the problem. As I mentioned, each Unicode emoji has a Unicode character name -- they are not at all ambiguous.
My point is less that emojis are good (though I think customers clearly expect and want them as identifiers) but rather that the same problems outlined already exist for the encoding of many living languages.
-----Original Message----- From: ua-discuss-bounces@icann.org [mailto:ua-discuss-bounces@icann.org] On Behalf Of Andrew Sullivan Sent: Monday, May 29, 2017 8:56 AM To: ua-discuss@icann.org Subject: Re: [UA-discuss] SAC095 - SSAC Advisory on the Use of Emoji in Domain
Hi, On Mon, May 29, 2017 at 04:12:35PM +0000, Stuart Stuple wrote:
I don't mean to be arguing. I'm simply sharing my perspective as to why these guidelines are unworkable given typical customer goals and customer understanding of text.
I don't mean to be arguing either, in the sense of being disagreeable, but I think that those of us interested in this need to work out what we think the goal of acceptance of domain names is. Because if the goal is the same as "rendering of text the same as running text for readers", then I'm pretty sure we have some deep disagreements.
I understand the perspective that emoji are more variable than text as the emotional impact of the variations of emoji.
No, this has nothing to do with the emotional impact of the variations of emoji. It has to do with what is considered to be "the same."
However, as someone involved in font development, I am confident that the guidelines for Emoji are as well developed as those for character glyphs.
But it's not just _glyphs_ that count here. If you run U+0061, LATIN SMALL LETTER A, through NFC, _no matter what the font is_ you get U+0061. If you put a grave accent on it, _no matter how you get that on there_, when you're done with NFC you get the precomposed form, U+00E0 LATIN SMALL LETTER A WITH GRAVE. But U+1F466 BOY is Emoji_Modifier_Base, so it takes the skin tone modifiers. So if you add U+1F3FF to U+1F466, you get a new combining sequence. But U+1F466 is NFC, so it doesn't normalize with other modifiers of Emoji_Modifier_Base characters, which means that if someone just reads "BOY" when reading that sequence, information is lost. So either we have to train everyone in the world to see racialized emojis, which at the very least seems like a rather contentious idea, or else we need to create normalization rules, which will break Unicode's promises about normalization stability.
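The code points named above can be run through NFC directly. The following is a small sketch using Python's standard `unicodedata` module; the helper name `nfc_code_points` is made up for this illustration.

```python
import unicodedata

def nfc_code_points(text):
    """Return the NFC form of text as a list of 'U+XXXX' strings."""
    normalized = unicodedata.normalize("NFC", text)
    return [f"U+{ord(ch):04X}" for ch in normalized]

# LATIN SMALL LETTER A stays as it is.
plain_a = nfc_code_points("\u0061")

# a + COMBINING GRAVE ACCENT composes to the precomposed letter.
a_grave = nfc_code_points("\u0061\u0300")

# BOY on its own is already in NFC.
boy = nfc_code_points("\U0001F466")

# BOY + EMOJI MODIFIER FITZPATRICK TYPE-6 stays a two-code-point sequence.
boy_with_modifier = nfc_code_points("\U0001F466\U0001F3FF")

print(plain_a, a_grave, boy, boy_with_modifier)
```

NFC gives a single canonical answer for the accented letter, but it has nothing to say about whether the modified and unmodified BOY should count as "the same" label.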
Emojis do indeed have normalization as much as any other Unicode combination.
This is either trivially true (in that all code points have NF* properties) or it misses the point of the concern.
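While it's true that normalization can resolve differences for things like precomposed versus combining diacritics, that presumes that the same normalization routine is being used. The classic case that we encounter in our software is normalization of casing -- works great, right up until you have plain text that includes Turkish i or I.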
That's not "normalization" in Unicode terms, it's case folding or maybe downcasing. The _reason_ the IDNA rules are so complicated around this is precisely because of the kinds of rules you're highlighting: localized software actually can do more interesting things with case folding if you know the locale. Since domain names don't have locale information with them, you're in rather big trouble here, which is why IDNA2003's approach to this turned out not to work. So in IDNA2008 the specification suggests local case handling and requires stability under caseFold and NFKC and also requires strings to be in NFC.
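The distinction between normalization and case mapping can be seen with the Turkish letters. This sketch relies on Python's built-in string methods, which apply the locale-independent default Unicode case mappings; it does not implement the IDNA2008 rules.

```python
import unicodedata

DOTTED_CAPITAL_I = "\u0130"  # LATIN CAPITAL LETTER I WITH DOT ABOVE
DOTLESS_SMALL_I = "\u0131"   # LATIN SMALL LETTER DOTLESS I

def nfc(text):
    """Apply Unicode Normalization Form C."""
    return unicodedata.normalize("NFC", text)

# Normalization does not touch case: each of these is already in NFC.
letters = ("I", "i", DOTTED_CAPITAL_I, DOTLESS_SMALL_I)
nfc_unchanged = all(nfc(ch) == ch for ch in letters)

# Default (locale-free) mappings; a Turkish locale would expect otherwise.
default_lower_of_I = "I".lower()   # "i", where Turkish expects dotless i
default_upper_of_i = "i".upper()   # "I", where Turkish expects dotted I

# Lowercasing the dotted capital yields "i" plus U+0307 COMBINING DOT ABOVE.
lower_of_dotted_I = DOTTED_CAPITAL_I.lower()

print(nfc_unchanged, default_lower_of_I, default_upper_of_i, len(lower_of_dotted_I))
```

Without locale information the software can only apply the default mapping, which is the difficulty described above for domain names.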
I agree that the simplification of thinking of a "smiley face" is part of the problem. As I mention, each Unicode emoji has a Unicode character name -- they are not at all ambiguous.
And literally no human who is not familiar with Unicode uses those character names. For instance, did you know that U+1F623 is PERSEVERING FACE and U+1F616 is CONFOUNDED FACE? How would you know? I'd say maybe "squinty eyed" and "upset squinty eyed", but I have no idea -- it'd probably depend on context. And it is _context_ that is precisely the problematically missing thing in free-floating network identifiers, which is what domain names are intended to be. It is that ambiguity of meaning that has people using peaches and eggplants to represent things that are probably not what's for dinner. This is perfectly useful for casual communications, and disastrous for supposedly unique and minimally ambiguous identifiers. That's what the SSAC report is about.
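The formal names are easy to look up programmatically, which is part of the point: they live in the character database, not in users' heads. A short sketch with Python's `unicodedata` module (the peach and eggplant code points are added here for illustration):

```python
import unicodedata

code_points = (0x1F623, 0x1F616, 0x1F351, 0x1F346)

# Map each code point to its formal Unicode character name.
names = {f"U+{cp:04X}": unicodedata.name(chr(cp)) for cp in code_points}

for label, name in names.items():
    print(label, name)
```

The name is unambiguous to software, yet it says nothing about what a reader takes the picture to mean; the eggplant's formal name, for instance, is AUBERGINE.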
My point is less that emojis are good (though I think customers clearly expect and want them as identifiers) but rather that the same problems outlined already exist for the encoding of many living languages.
But living languages have not been encoded by Unicode as symbols. They've been encoded by Unicode as letters. That's the difference. People want lots of things. They might, for instance, want apostrophes or mixed scripts in domain names, too. But they're a bad idea because they break stuff. Best regards, A -- Andrew Sullivan ajs@anvilwalrusden.com
On 5/29/2017 9:12 AM, Stuart Stuple via UA-discuss wrote:
I understand the perspective that emoji are more variable than text as the emotional impact of the variations of emoji.
The "emo" in "emoji" has absolutely nothing to do with the word (or the concept of) "emotion". This was rather cleanly summarized in the SSAC paper, which immediately leads to the question of whether anyone in this discussion has read more than the headline on that paper.... A./
Thought I had. <grin>
I continue to believe - despite the excellent clarity of opinions presented - that emoji should be considered appropriate for identifiers. I don't see any of the points raised in the article as compelling. But I am completely willing to accept that is my ignorance.
-Stuart
The smaller keyboard and use of voice recognition may increase my incoherence. Apologies. Feel free to ask for clarification of my word choices.
On Mon, May 29, 2017 at 2:18 PM -0700, "Asmus Freytag" <asmusf@ix.netcom.com> wrote:
On 5/29/2017 9:12 AM, Stuart Stuple via UA-discuss wrote:
I understand the perspective that emoji are more variable than text as the emotional impact of the variations of emoji.
The "emo" in "emoji" has absolutely nothing to do with the word (or the concept of) "emotion". This was rather cleanly summarized in the SSAC paper, which immediately leads to the question of whether anyone in this discussion has read more than the headline on that paper....
A./
In case you haven't seen this response to the SSAC Advisory: https://medium.com/@Emoji_Domains/ssac-response-d8d2ad6e800c (also available at I❤️ICANN.ws <http://I%E2%9D%A4%EF%B8%8FICANN.ws>).
satish
If only it were actually a response, instead of a post saying, "Neener, neener, emojis are cool, there's no problem. Well, maybe we need a whitelist or something." A -- Andrew Sullivan Please excuse my clumbsy thums.
On May 29, 2017, at 21:07, Satish Babu <sb@inapp.com> wrote:
In case you haven't seen this response to the SSAC Advisory:
https://medium.com/@Emoji_Domains/ssac-response-d8d2ad6e800c (also available at I❤️ICANN.ws).
satish
For what one opinion is worth, I disagree with the assertion that adding emoji will slow the move towards universal acceptance. Certainly within software products, we’re seeing emoji as one of the forces driving a more robust support of the full Unicode standard and rendering in ways that make emoji useful in content.
EVERY opinion is valuable.
I had proposed at the Redmond meeting that we discuss using the attention Emoji are getting to draw attention to EAI or UA bugs in forms or software, and there seemed a loose consensus that it was best to focus on the matters at hand.
With respect to Emoji domains, I view it as 'cute' and 'clever' to have these function in the location bar. But there are so many sober and important issues, like making domains work in the natural languages of underserved cultures, that I suspect discussion of Emoji could heavily distract us from the massive debt of attention we owe the existing issues.
The most current IDNA specification does not include Emoji support, on the Unicode Consortium's advice. Now we have a recommendation from the SSAC.
I would like to propose that we follow Emoji standards as they might evolve, but that we focus on the non-Emoji work we have all been working on and not get distracted by Emoji for the time being.
Functional EAI and IDN offer a number of very practical benefits to a larger world of people, and there are websites that still reject valid domain names.
-Jothan
Yes, at the meeting the consensus was that UASG could postpone working on emoji issues, at least for now, to focus on EAI and IDN. That said, Stuart and I will meet with some internal stakeholders and decide what Microsoft’s viewpoint should be on emoji. We will report back when we are ready.
From: ua-discuss-bounces@icann.org [mailto:ua-discuss-bounces@icann.org] On Behalf Of Jothan Frakes Sent: Tuesday, May 30, 2017 2:44 PM To: Stuart Stuple <stuartst@microsoft.com> Cc: ua-discuss@icann.org Subject: Re: [UA-discuss] SAC095 - SSAC Advisory on the Use of Emoji in Domain
One of the recommendations I am not in agreement with is in their example of usage of HTML in footnote 9:
For example, clickable HyperText Markup Language (HTML) anchor text (e.g., the "I❤NY" in the HTML expression "<a href="https://www.iloveny.example">I❤NY</a>") would not be governed by IDNA2008, nor would a search term typed into a web search engine.
Firstly, let me revive an acronym I have not used for many years: WYSIWYG (What You See Is What You Get).
My standard practice is to make, whenever possible, my links WYSIWYG. I think it a good practice. Sometimes it is not possible because of overly long and complex URLs.
Ignoring the fact that Emoji are not currently allowed in IDNA2008, my version of the above anchor would be
<a href="https://I❤NY.example">I❤NY.example</a>
My other standard practice is to drop the www, as it is unnecessary (with a properly set up web server, and the majority are set up correctly) and is inconsistent with IDNs.
I consider one of the aims of this group is that IDN links should be WYSIWYG. So no redirecting to ASCII and no displaying of punycode to users.
André Schappo
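What a user sees as "I❤NY" can correspond to more than one code point sequence, and the ASCII forms that software exchanges differ accordingly. The sketch below compares a bare heart with the heart followed by VARIATION SELECTOR-16, the form behind the percent-encoded I❤️ICANN.ws link earlier in the thread. The `a_label` helper is hypothetical: it applies raw Punycode with Python's standard codec and performs none of the IDNA validity checks, which would reject these labels under IDNA2008.

```python
from urllib.parse import quote

HEART = "\u2764"  # HEAVY BLACK HEART
VS16 = "\ufe0f"   # VARIATION SELECTOR-16 (requests emoji presentation)

text_heart = "I" + HEART + "NY"
emoji_heart = "I" + HEART + VS16 + "NY"

def a_label(label):
    """Raw Punycode plus the ACE prefix; no IDNA validity checks (illustration only)."""
    return "xn--" + label.lower().encode("punycode").decode("ascii")

# Percent-encoded UTF-8, as a mail client might write the link target.
encoded_text = quote(text_heart)
encoded_emoji = quote(emoji_heart)

# The invisible variation selector yields a different ASCII label.
labels_differ = a_label(text_heart) != a_label(emoji_heart)

print(encoded_text)
print(encoded_emoji)
print(a_label(text_heart), a_label(emoji_heart), labels_differ)
```

Two links that render alike on screen can therefore name different hosts, which bears on how far WYSIWYG can be achieved for such labels.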
On Wed, May 31, 2017 at 03:49:21PM +0000, Andre Schappo wrote:
My standard practice is to make, whenever possible, my links WYSIWYG. I think it a good practice. Sometimes it is not possible because of overly long and complex URLs.
It's never actually been a recommendation from hypertext people, however. They've always suggested that you should put links liberally in running text that is in itself nicely readable. So,
<a href="target">In a previous post</a>, we discussed UA…
as opposed to
In a previous post, which you can find at <a href="target">target</a>, we discussed UA …
Why do you think it's a good practice? It makes for very stilted text.
A
--
Andrew Sullivan ajs@anvilwalrusden.com
User reassurance - knowing the exact address of the website they will visit if they click the link.
Transparency - stating clearly and exactly the address of the website they will visit if they click the link.
User feedback - Users can visually verify that the address of the website they land on after clicking the link is indeed what was stated.
I consider it makes for better security because the address is upfront for visual inspection/examination and not hidden behind some text string/image.
There is much discussion/argument on IDNs and phishing/spoofing because of, for instance, confusables.
I consider spoofing/phishing is more easily achieved with links hiding behind text/images, without going to the effort of employing and registering IDNs containing confusables, e.g.
<a href="http://WeWillStealYourMoney.com">the honest and genuine bank</a>
I too used to hide links behind text/images, but for about four or five years now I have been making links explicit, as I consider it better security and better practice.
One way in which I retain reading flow is to treat the link as a full stop, i.e. terminating a sentence. Also, one can use links in a similar manner to the way citations are used in academic papers.
André Schappo
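The difference between an explicit link and one hidden behind text can be checked mechanically. The following is a naive sketch using only Python's standard library; `LinkCollector` and `link_is_explicit` are hypothetical names, and the substring test is a deliberate simplification, not a phishing detector.

```python
from html.parser import HTMLParser
from urllib.parse import urlsplit

class LinkCollector(HTMLParser):
    """Collect (href, visible text) pairs for every <a> element."""

    def __init__(self):
        super().__init__()
        self.links = []
        self._href = None
        self._text = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.links.append((self._href, "".join(self._text).strip()))
            self._href = None

def link_is_explicit(href, text):
    """True when the visible text contains the host name the link points to."""
    host = (urlsplit(href).hostname or "").lower()
    return host != "" and host in text.lower()

sample = (
    '<a href="http://WeWillStealYourMoney.com">the honest and genuine bank</a> '
    '<a href="https://example.org/">https://example.org/</a>'
)
collector = LinkCollector()
collector.feed(sample)
results = [link_is_explicit(href, text) for href, text in collector.links]
print(results)
```

The first link hides its target behind reassuring text and fails the check; the second states its address and passes.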
Your explanation makes sense to me. -----Original Message----- From: ua-discuss-bounces@icann.org [mailto:ua-discuss-bounces@icann.org] On Behalf Of Andre Schappo Sent: Wednesday, May 31, 2017 9:49 AM To: ua-discuss@icann.org Subject: Re: [UA-discuss] SAC095 - SSAC Advisory on the Use of Emoji in Domain
I think emoji is fascinating and potentially interesting to watch, and am not suggesting that emoji, once a bit more settled and standardized by those respective groups, may drift into our orbit. Rather, I am suggesting that we not allow it to distract us while working the existing issues. On May 31, 2017 09:51, "Mark Svancarek via UA-discuss" <ua-discuss@icann.org> wrote:
Your explanation makes sense to me.
-----Original Message----- From: ua-discuss-bounces@icann.org [mailto:ua-discuss-bounces@icann.org] On Behalf Of Andre Schappo Sent: Wednesday, May 31, 2017 9:49 AM To: ua-discuss@icann.org Subject: Re: [UA-discuss] SAC095 - SSAC Advisory on the Use of Emoji in Domain
On 31 May 2017, at 16:59, Andrew Sullivan <ajs@anvilwalrusden.com> wrote:
On Wed, May 31, 2017 at 03:49:21PM +0000, Andre Schappo wrote:
My standard practice is to make, whenever possible, my links WYSIWYG. I think it a good practice. Sometimes it is not possible because of overly long and complex URLs.
It's never actually been a recommendation from hypertext people, however. They've always suggested that you should put links liberally in running text that is in itself nicely readable. So,
<a href="target">In a previous post</a>, we discussed UA…
as opposed to
In a previous post, which you can find at <a href="target">target</a>, we discussed UA …
Why do you think it's a good practice? It makes for very stilted text.
A
User reassurance - knowing the exact address of the website they will visit if they click the link. Transparency - stating clearly and exactly the address of the website they will visit if they click the link. User feedback - Users can visually verify that the address of the website they land on after clicking the link is indeed what was stated.
I consider it makes for better security because the address is upfront for visual inspection/examination and not hidden behind some text string/image.
There is much discussion/arguments on IDNs and phishing/spoofing because of, for instance, confusables.
I consider spoofing/phishing is more easily achieved with links hiding behind text/images without going to the effort of employing and registering IDNs containing confusables.
eg <a href="http://WeWillStealYourMoney.com">the honest and genuine bank</a>
I too used to hide links behind text/images but for about 4/5 years now I have been making links explicit as I consider it better security and better practice. One way in which I retain reading flow is to treat the link as a full stop ie terminating a sentence. Also, one can use links in a similar manner to the way citations are used in academic papers
André Schappo
Dear all,
First, +1 for Jothan. Exactly.
Secondly, people can write and hold a written conversation without using any smileys or emoji-like signs, simply by using their alphabet. And we have a large number of issues with letters, in all fields. On the other hand, emoji exist mainly for two reasons: to express feelings more quickly (but often not clearly) in chats (or live written conversation of any kind), and for fun. Neither reason is as essential to us as using different alphabets in domain names, or using long TLD names.
From my previous life as editor-in-chief, my rule was very simple – there are no smileys in texts. If you are an author and have not expressed the opinion, meaning or joke right, smileys or emoji can’t help you and you must change the sentence/paragraph. Readers won’t laugh because there is ☺ at the end of the sentence, but because you wrote something funny.
When (or maybe if) they come into our orbit, and we have already solved all the essential issues, then we can consider it. Until then, let the ccTLDs who “dare” to use emoji in their tables solve possible issues.
Cheers, Dusan
From: ua-discuss-bounces@icann.org [mailto:ua-discuss-bounces@icann.org] On Behalf Of Jothan Frakes Sent: Thursday, June 1, 2017 2:34 AM To: Mark Svancarek <marksv@microsoft.com> Cc: ua-discuss@icann.org Subject: Re: [UA-discuss] SAC095 - SSAC Advisory on the Use of Emoji in Domain
I think emoji is fascinating and potentially interesting to watch, and am not suggesting that emoji, once a bit more settled and standardized by those respective groups, may drift into our orbit. Rather, I am suggesting that we not allow it to distract us while working the existing issues.
On May 31, 2017 09:51, "Mark Svancarek via UA-discuss" <ua-discuss@icann.org> wrote:
Your explanation makes sense to me.
-----Original Message----- From: ua-discuss-bounces@icann.org [mailto:ua-discuss-bounces@icann.org] On Behalf Of Andre Schappo Sent: Wednesday, May 31, 2017 9:49 AM To: ua-discuss@icann.org Subject: Re: [UA-discuss] SAC095 - SSAC Advisory on the Use of Emoji in Domain
On 31 May 2017, at 16:59, Andrew Sullivan <ajs@anvilwalrusden.com> wrote:
On Wed, May 31, 2017 at 03:49:21PM +0000, Andre Schappo wrote:
My standard practice is to make, whenever possible, my links WYSIWYG. I think it a good practice. Sometimes it is not possible because of overly long and complex URLs.
It's never actually been a recommendation from hypertext people, however. They've always suggested that you should put links liberally in running text that is in itself nicely readable. So,
<a href="target">In a previous post</a>, we discussed UA…
as opposed to
In a previous post, which you can find at <a href="target">target</a>, we discussed UA …
Why do you think it's a good practice? It makes for very stilted text.
A
User reassurance - knowing the exact address of the website they will visit if they click the link. Transparency - stating clearly and exactly the address of the website they will visit if they click the link. User feedback - Users can visually verify that the address of the website they land on after clicking the link is indeed what was stated.
I consider it makes for better security because the address is upfront for visual inspection/examination and not hidden behind some text string/image.
There is much discussion/arguments on IDNs and phishing/spoofing because of, for instance, confusables.
I consider spoofing/phishing is more easily achieved with links hiding behind text/images without going to the effort of employing and registering IDNs containing confusables.
eg <a href="http://WeWillStealYourMoney.com">the honest and genuine bank</a>
I too used to hide links behind text/images but for about 4/5 years now I have been making links explicit as I consider it better security and better practice. One way in which I retain reading flow is to treat the link as a full stop ie terminating a sentence. Also, one can use links in a similar manner to the way citations are used in academic papers
André Schappo
I would encourage you to evaluate each of the points in that article as to how they relate to critical living languages. In our current work to support the full range of African languages, those issues (such as ZWJ) become critical. And, likewise, the rendering differences for the same Unicode code point across scripts are a major issue in the Indic market and with italic Cyrillic. Whether an organization tries to limit Emoji is somewhat irrelevant. The key is that those very same issues need to be solved at the text stack level for many living languages to truly work as EAI or IDNs.
-Stuart
From: ua-discuss-bounces@icann.org [mailto:ua-discuss-bounces@icann.org] On Behalf Of Dusan Stojicevic Sent: Thursday, June 1, 2017 6:18 AM To: 'Jothan Frakes' <jothan@gmail.com>; Mark Svancarek <marksv@microsoft.com> Cc: ua-discuss@icann.org Subject: Re: [UA-discuss] SAC095 - SSAC Advisory on the Use of Emoji in Domain
Dear all,
First, +1 for Jothan. Exactly.
Secondly, people can write and hold a written conversation without using any smileys or emoji-like signs, simply by using their alphabet. And we have a large number of issues with letters, in all fields. On the other hand, emoji exist mainly for two reasons: to express feelings more quickly (but often not clearly) in chats (or live written conversation of any kind), and for fun. Neither reason is as essential to us as using different alphabets in domain names, or using long TLD names.
From my previous life as editor-in-chief, my rule was very simple – there are no smileys in texts. If you are an author and have not expressed the opinion, meaning or joke right, smileys or emoji can’t help you and you must change the sentence/paragraph. Readers won’t laugh because there is ☺ at the end of the sentence, but because you wrote something funny.
When (or maybe if) they come into our orbit, and we have already solved all the essential issues, then we can consider it. Until then, let the ccTLDs who “dare” to use emoji in their tables solve possible issues.
Cheers, Dusan
From: ua-discuss-bounces@icann.org [mailto:ua-discuss-bounces@icann.org] On Behalf Of Jothan Frakes Sent: Thursday, June 1, 2017 2:34 AM To: Mark Svancarek <marksv@microsoft.com> Cc: ua-discuss@icann.org Subject: Re: [UA-discuss] SAC095 - SSAC Advisory on the Use of Emoji in Domain
I think emoji is fascinating and potentially interesting to watch, and am not suggesting that emoji, once a bit more settled and standardized by those respective groups, may drift into our orbit. Rather, I am suggesting that we not allow it to distract us while working the existing issues.
On May 31, 2017 09:51, "Mark Svancarek via UA-discuss" <ua-discuss@icann.org> wrote:
Your explanation makes sense to me.
-----Original Message----- From: ua-discuss-bounces@icann.org [mailto:ua-discuss-bounces@icann.org] On Behalf Of Andre Schappo Sent: Wednesday, May 31, 2017 9:49 AM To: ua-discuss@icann.org Subject: Re: [UA-discuss] SAC095 - SSAC Advisory on the Use of Emoji in Domain
On 31 May 2017, at 16:59, Andrew Sullivan <ajs@anvilwalrusden.com> wrote:
On Wed, May 31, 2017 at 03:49:21PM +0000, Andre Schappo wrote:
My standard practice is to make, whenever possible, my links WYSIWYG. I think it a good practice. Sometimes it is not possible because of overly long and complex URLs.
It's never actually been a recommendation from hypertext people, however. They've always suggested that you should put links liberally in running text that is in itself nicely readable. So,
<a href="target">In a previous post</a>, we discussed UA…
as opposed to
In a previous post, which you can find at <a href="target">target</a>, we discussed UA …
Why do you think it's a good practice? It makes for very stilted text.
A
User reassurance - knowing the exact address of the website they will visit if they click the link. Transparency - stating clearly and exactly the address of the website they will visit if they click the link. User feedback - Users can visually verify that the address of the website they land on after clicking the link is indeed what was stated.
I consider it makes for better security because the address is upfront for visual inspection/examination and not hidden behind some text string/image.
There is much discussion/arguments on IDNs and phishing/spoofing because of, for instance, confusables.
I consider spoofing/phishing is more easily achieved with links hiding behind text/images without going to the effort of employing and registering IDNs containing confusables.
eg <a href="http://WeWillStealYourMoney.com">the honest and genuine bank</a>
I too used to hide links behind text/images but for about 4/5 years now I have been making links explicit as I consider it better security and better practice. One way in which I retain reading flow is to treat the link as a full stop ie terminating a sentence. Also, one can use links in a similar manner to the way citations are used in academic papers
André Schappo
On Wed, May 31, 2017 at 04:49:20PM +0000, Andre Schappo wrote:
User reassurance - knowing the exact address of the website they will visit if they click the link.
But given the semantics of markup languages, you _never_ get that assurance. Indeed, training people to believe the running text as opposed to the target of the link is giving them bad advice, because this is how phishing works.
User feedback - Users can visually verify that the address of the website they land on after clicking the link is indeed what was stated.
If they've already clicked through in a lot of phishing attacks, that's too late.
I consider spoofing/phishing is more easily achieved with links hiding behind text/images without going to the effort of employing and registering IDNs containing confusables.
eg <a href="http://WeWillStealYourMoney.com">the honest and genuine bank</a>
Or, of course, <a href="http://WeWillStealYourMoney.com">http://yoursafebankhere.com</a>. I don't care how you prefer to do this -- it's a stylistic preference -- but I don't believe it is or ought to be part of the UA goals.
Also, one can use links in a similar manner to the way citations are used in academic papers
That very stilted style was precisely what hypertext theorists were opposed to in the first place, though. Best regards, A -- Andrew Sullivan ajs@anvilwalrusden.com
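The point made above, that visible link text gives no assurance about where a link actually goes, can be illustrated with a small sketch. It is not from the thread: it assumes Python's standard `html.parser` and `urllib.parse` modules, and the names `LinkAuditor`, `host_of`, and `misleading_links` are hypothetical. It flags only anchors whose visible text looks like an http(s) URL pointing at a different host than the `href`, and it assumes well-formed `</a>` closing tags.

```python
from html.parser import HTMLParser
from urllib.parse import urlsplit


class LinkAuditor(HTMLParser):
    """Collect (href, visible text) pairs for every <a> element with an href."""

    def __init__(self):
        super().__init__()
        self.links = []     # finished (href, text) pairs
        self._href = None   # href of the anchor currently open, if any
        self._text = []     # text fragments seen inside that anchor

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.links.append((self._href, "".join(self._text).strip()))
            self._href = None


def host_of(value):
    """Return the lower-cased host if value parses as an http(s) URL, else None."""
    try:
        parts = urlsplit(value)
    except ValueError:
        return None
    if parts.scheme in ("http", "https") and parts.hostname:
        return parts.hostname.lower()
    return None


def misleading_links(html):
    """Return (href, text) pairs whose visible text names a different host than the href."""
    auditor = LinkAuditor()
    auditor.feed(html)
    auditor.close()
    flagged = []
    for href, text in auditor.links:
        shown, actual = host_of(text), host_of(href)
        if shown and actual and shown != actual:
            flagged.append((href, text))
    return flagged
```

Fed the two example anchors from this exchange, a check like this would flag the one whose text reads http://yoursafebankhere.com, but not the one labelled "the honest and genuine bank", since descriptive text names no host to compare against. In both cases only the `href` determines where the click lands.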
Just go to any news site, and you'll see that they consistently follow the scheme that Andrew documents. They have dozens of links in their articles and it would completely destroy the readability to provide URLs. I tend to deliberately expose URLs in key situations: where the link is itself the content. Like I'm pointing out a resource, or giving an entry in a list of references. Here the focus is on letting people recover from minor link corruption (allowing them to easily search for part of a URL in order to try to locate a document that might have moved) or to be able to use a document even in printed form. But I regard those as not generally applicable principles so I would support not making it a goal for the UA effort to establish this kind of practice. A./ On 5/31/2017 10:00 AM, Andrew Sullivan wrote:
On Wed, May 31, 2017 at 04:49:20PM +0000, Andre Schappo wrote:
User reassurance - knowing the exact address of the website they will visit if they click the link. But given the semantics of markup languages, you _never_ get that assurance. Indeed, training people to believe the running text as opposed to the target of the link is giving them bad advice, because this is how phishing works.
User feedback - Users can visually verify that the address of the website they land on after clicking the link is indeed what was stated. If they've already clicked through in a lot of phishing attacks, that's too late.
I consider spoofing/phishing is more easily achieved with links hiding behind text/images without going to the effort of employing and registering IDNs containing confusables.
eg <a href="http://WeWillStealYourMoney.com">the honest and genuine bank</a>
Or, of course, <a href="http://WeWillStealYourMoney.com">http://yoursafebankhere.com</a>.
I don't care how you prefer to do this -- it's a stylistic preference -- but I don't believe it is or ought to be part of the UA goals.
Also, one can use links in a similar manner to the way citations are used in academic papers
That very stilted style was precisely what hypertext theorists were opposed to in the first place, though.
Best regards,
A
Sometimes there is value to revealing the details of a URI - perhaps a document will be consumed offline, and the reader will be able to navigate by manually rekeying the address. Or perhaps I want to reveal the site structure to facilitate discovery of other resources. In those cases I would use Andrew's 2nd example. In the majority of cases, though, I would follow the first example. 0.02 -----Original Message----- From: ua-discuss-bounces@icann.org [mailto:ua-discuss-bounces@icann.org] On Behalf Of Andrew Sullivan Sent: Wednesday, May 31, 2017 9:00 AM To: ua-discuss@icann.org Subject: Re: [UA-discuss] SAC095 - SSAC Advisory on the Use of Emoji in Domain On Wed, May 31, 2017 at 03:49:21PM +0000, Andre Schappo wrote:
My standard practice is to make, whenever possible, my links WYSIWYG. I think it a good practice. Sometimes it is not possible because of overly long and complex URLs.
It's never actually been a recommendation from hypertext people, however. They've always suggested that you should put links liberally in running text that is in itself nicely readable. So,
<a href="target">In a previous post</a>, we discussed UA…
as opposed to
In a previous post, which you can find at <a href="target">target</a>, we discussed UA …
Why do you think it's a good practice? It makes for very stilted text.
A
-- Andrew Sullivan ajs@anvilwalrusden.com
participants (10)
- Andre Schappo
- Andrew Sullivan
- Asmus Freytag
- Don Hollander
- Dusan Stojicevic
- Jothan Frakes
- Jothan Frakes
- Mark Svancarek
- Satish Babu
- Stuart Stuple