emoji domain names with the puny package
Typical Sunday night, lost in several inception layers of I don’t know how I got here, what I am doing here and what I was looking for in the first place.
Wait. We can do that ? https://t.co/FAKpNnhTFT pic.twitter.com/oqoEIxNQRv
— Romain François 🦄 (@romain_francois) February 25, 2018
So, some extensions allow arbitrary utf-8 characters in the domain name, but more importantly you can have arbitrary characters in subdomains. Next thing you know, obviously I’m not going to stop at accents or cedillas, I’m on a mission to spread emojis everywhere.
Emojis are characters, just sequences of utf-8 encoded bytes. I’ve been playing with the emo package to make it easy to include and extract emojis, and the utf8splain package to extract some information about unicode runes (aka code points).
(s <- emo::ji_glue( "emojis :party: "))
## emojis 🎉
unclass(s)
## [1] "emojis \U0001f389 "
utf8splain::runes(s)
## utf-8 encoded string with 9 runes
##
## U+0065 65 01100101 Latin Small Letter E
## U+006D 6D 01101101 Latin Small Letter M
## U+006F 6F 01101111 Latin Small Letter O
## U+006A 6A 01101010 Latin Small Letter J
## U+0069 69 01101001 Latin Small Letter I
## U+0073 73 01110011 Latin Small Letter S
## U+0020 20 00100000 Space
## U+1F389 F0 9F 8E 89 11110000 10011111 10001110 10001001 Party Popper
## U+0020 20 00100000 Space
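The same code point and bytes can be double checked without any extra package. This is just a small base R sketch, not something emo or utf8splain do for you:

```r
# The party popper as a single character (enc2utf8 makes the encoding explicit)
party <- enc2utf8("\U0001f389")

# Its code point, as a number and in the usual U+ notation
utf8ToInt(party)                      # 127881
sprintf("U+%04X", utf8ToInt(party))   # "U+1F389"

# The raw utf-8 bytes behind it
charToRaw(party)                      # f0 9f 8e 89

# One character, four bytes
nchar(party, type = "chars")          # 1
nchar(party, type = "bytes")          # 4
```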
Typically domain names are only made of boring ascii, but punycode gives a way to encode a string that may contain any unicode characters into a string that only has ascii.
There are online tools to perform the encoding and decoding, and after firing up the GitHub search engine, I quickly found the simple enough punycode C library.
I knew this was going to change my day. I only wish we had been able to open the door of my coworking space without the help of a locksmith, but I guess that’s a story for another day, and that I had not agreed to install the updates on my mac. Anyway, once the elements finally gave me a break, I was able to wrap the library in the puny package and its code and decode functions.
puny::code( "crème brûlée" )
## [1] "crme brle-13ar8s"
puny::code( emo::ji_glue("emojis :party: everywhere") )
## [1] "emojis  everywhere-3q59q"
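To see where a suffix like 13ar8s comes from, here is a minimal base R sketch of the encoding side of punycode, following the algorithm described in RFC 3492. This is not how puny does it (puny wraps a C library); the function name punycode_encode is made up for illustration, and the sketch skips overflow checks and anything beyond a single string.

```r
# Illustrative punycode encoder (RFC 3492), base R only.
# `punycode_encode` is a hypothetical name, not part of the puny package.
punycode_encode <- function(x) {
  base <- 36; tmin <- 1; tmax <- 26; skew <- 38; damp <- 700
  digits <- c(letters, 0:9)  # digit values 0..35 map to a-z then 0-9

  # Bias adaptation, run after each encoded delta
  adapt <- function(delta, numpoints, first) {
    delta <- if (first) delta %/% damp else delta %/% 2
    delta <- delta + delta %/% numpoints
    k <- 0
    while (delta > ((base - tmin) * tmax) %/% 2) {
      delta <- delta %/% (base - tmin)
      k <- k + base
    }
    k + ((base - tmin + 1) * delta) %/% (delta + skew)
  }

  cp <- utf8ToInt(enc2utf8(x))
  # ascii characters are copied as is, followed by a "-" delimiter
  basic <- cp[cp < 128]
  out <- if (length(basic) > 0) c(intToUtf8(basic, multiple = TRUE), "-") else character()

  n <- 128; delta <- 0; bias <- 72
  b <- length(basic); h <- b
  while (h < length(cp)) {
    # next smallest code point that has not been handled yet
    m <- min(cp[cp >= n])
    delta <- delta + (m - n) * (h + 1)
    n <- m
    for (ch in cp) {
      if (ch < n) delta <- delta + 1
      if (ch == n) {
        # write delta as a variable length integer in base 36
        q <- delta
        k <- base
        repeat {
          t <- min(max(k - bias, tmin), tmax)
          if (q < t) break
          out <- c(out, digits[t + (q - t) %% (base - t) + 1])
          q <- (q - t) %/% (base - t)
          k <- k + base
        }
        out <- c(out, digits[q + 1])
        bias <- adapt(delta, h + 1, h == b)
        delta <- 0
        h <- h + 1
      }
    }
    delta <- delta + 1
    n <- n + 1
  }
  paste(out, collapse = "")
}

punycode_encode("crème brûlée")   # "crme brle-13ar8s"
punycode_encode("\U0001f4e6")     # "cu8h"
```

The ascii characters come first, untouched, and everything after the last "-" describes which non-ascii characters go where.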
If you add domain=TRUE, the function adds the xn-- prefix and makes something suitable for a domain name.
puny::code( emo::ji("package"), domain = TRUE )
## [1] "xn--cu8h"
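A full host name is encoded label by label, so only the emoji part changes. Here is a hedged sketch of how that could look. The helper to_ascii_host is hypothetical (it is not part of puny), and it assumes puny::code(x, domain = TRUE) returns a single string as in the output above. Labels that are already ascii are left alone.

```r
# Hypothetical helper: encode each non-ascii label of a host name.
# The default encoder assumes puny::code(label, domain = TRUE) returns one string.
to_ascii_host <- function(host,
                          encode = function(label) puny::code(label, domain = TRUE)) {
  # split the host name into its dot separated labels
  labels <- strsplit(enc2utf8(host), ".", fixed = TRUE)[[1]]

  # a label needs encoding if any of its code points is outside ascii
  needs_encoding <- vapply(
    labels, function(l) any(utf8ToInt(l) > 127L), logical(1), USE.NAMES = FALSE
  )

  labels[needs_encoding] <- vapply(
    labels[needs_encoding], encode, character(1), USE.NAMES = FALSE
  )
  paste(labels, collapse = ".")
}
```

With puny installed, to_ascii_host("📦.purrple.cat") should give the name to use in DNS settings, with the first label replaced by its xn-- form.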
So that for example 📦.purrple.cat is (for now) a skeleton hugo site generated by blogdown and deployed on netlify. Eventually, I guess it will contain pkgdown sites for my packages, but I could not get pkgdown to work today, although I have not tried after the issue was fixed … a nice side effect is that I could discover that pkgdown uses my highlight package and I was able to offer a pull request because pkgdown was using the interface of highlight before I broke it (for the greater good) last summer.
Aaaaaaanyway, to do this, I’ve had to use the encoded name (xn--cu8h) in both netlify:
… and the DNS settings on my registrar:
xn--cu8h 1800 IN CNAME dreamy-hypatia-b7499e.netlify.com.
Unfortunately, not all browsers treat punycode the same way. In Safari, everything looks fine: I can browse to 📦.purrple.cat and the emoji stays in the address bar.
But Chrome (at least the version I have) only 🙍 echoes the encoded subdomain.