I was coding with ChatGPT 4o today, and it suggested the following (in Lua):

-- Extract the TLD extension (e.g., 'com' from 'example.com')

local tldExtension = tldName:match("%.([a-zA-Z0-9]+)$")

To which I replied: "This is not IDN-compliant."

To which ChatGPT replied: "You're absolutely right! The pattern %.([a-zA-Z0-9]+)$ is not IDN-compliant because it does not account for Unicode characters in internationalized domain names (IDNs)."

And ChatGPT proceeded to suggest a list of several more UA-Ready approaches.
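
For illustration, here is a minimal sketch of what a more tolerant version could look like. This is my own example and not the code ChatGPT produced; the helper name extractTld is hypothetical. It assumes the input is a UTF-8 string and simply takes everything after the last ASCII dot, so it accepts both U-labels (such as "рф") and A-labels (such as "xn--p1ai", which the original pattern also rejects because of the hyphens). It does no validation, normalization, or Punycode conversion, and it does not handle alternative separators such as the ideographic full stop.

```lua
-- Hypothetical helper: return the last label of a domain name, or nil
-- if the name contains no dot-separated label. Assumes UTF-8 input.
local function extractTld(name)
  if type(name) ~= "string" then
    return nil
  end
  -- Drop a single trailing root dot ("example.com." -> "example.com").
  -- Parentheses keep only the first return value of gsub.
  name = (name:gsub("%.$", ""))
  -- Capture every byte after the last dot. Lua patterns work on bytes,
  -- and the bytes of a multi-byte UTF-8 character are never 0x2E ("."),
  -- so non-ASCII labels pass through untouched.
  return name:match("%.([^%.]+)$")
end

print(extractTld("example.com"))       -- com
print(extractTld("example.xn--p1ai"))  -- xn--p1ai
print(extractTld("пример.рф"))         -- рф
```

In the original snippet this would be used as: local tldExtension = extractTld(tldName). Whether the result is a real, delegated TLD still has to be checked separately, for example against the IANA root zone list.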

WHAT I TOOK AWAY FROM THIS: We need to keep making as many developers as possible aware that relying on these old ASCII-only RegEx patterns is a bad idea. That way, even when an LLM inserts such a pattern into the code, all it takes is one warning from the developer to be pointed in the right direction... but the developer needs to know that this is a problem in the first place!

Best,

--
Mark W. Datysgeld
Director at Governance Primer [governanceprimer.com]
Project Lead Developer at ICANNWiki [icannwiki.org]