Emojis at #useR2017
Because of the imminent birth of my second daughter, I did not get to go to useR, so I’ve been feeling a mix of frustration, impatience and FOMO. I was really pleased to find out that, for the second year, part of the conference would be streamed live and most of it would be available later.
#user2017 will be livestreamed starting Weds. Tomorrow's tutorials will be recorded. Details and schedule: https://t.co/bqnH0pYs2R #rstats
— David Smith (@revodavid) July 3, 2017
I combined watching the live stream with constantly updating the column with the #useR2017 filter on my tweetbot, until this became overwhelming enough for me to come up with a way to visualise the tweet storm from a few angles. This was a good opportunity to learn the rtweet package to grab the tweets, and shinydashboard for the visualisation.
A few hours and a few tweets later, we released this as the pet project tweetstorm, a shiny dashboard to glimpse activity on twitter and answer some questions:
- How many tweets came from how many twitter users?
- Which hashtags are related?
- Which tweets were the most fav’ed or retweeted?
- Who tweeted the most, and who was quoted or replied to the most?
- …
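Most of these questions reduce to simple dplyr verbs over the data frame that rtweet returns. A minimal sketch on toy data (the `screen_name` and `favorite_count` columns stand in for the real rtweet output, which has many more columns):

```r
library(dplyr)

# toy stand-in for a tweets data frame; the real one comes from rtweet
tweets <- tibble::tibble(
  screen_name    = c("alice", "alice", "bob"),
  favorite_count = c(3, 10, 5)
)

tweets %>% count(screen_name, sort = TRUE)           # who tweeted the most
tweets %>% arrange(desc(favorite_count)) %>% head(1) # most fav'ed tweet
```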
Some of us use emojis in our tweets, and I’ve learned from Sean Kross how to extract them from a tweet, borrowing a regex from his article: Which Emojis Does Lucy Use in Commit Messages?
See: https://t.co/azdxMF9WtB
— Sean Kross (@seankross) July 6, 2017
And: https://t.co/jyt5JoMNyD
There are now two emoji-related things in the app. The first one packs emojis together based on how many times they have been used, through the tweetstorm::extract_emojis function.
#' Extract emojis
#'
#' @param text text with emojis
#'
#' @return a tibble with the columns `Emoji` and `n`
#'
#' @references
#' Inspired from [](http://seankross.com/2017/05/30/Which-Emojis-Does-Lucy-Use-in-Commit-Messages.html)
#'
#' @export
#' @importFrom stringr str_extract_all str_split
#' @importFrom tibble as_tibble
#' @importFrom purrr flatten_chr
#' @importFrom magrittr %>% set_names
extract_emojis <- function(text){
  str_extract_all(text, emoji_regex) %>%
    flatten_chr() %>%
    str_split("") %>%
    flatten_chr() %>%
    not_equal("-") %>%          # internal helper: drop stray "-" characters
    table() %>%
    sort(decreasing = TRUE) %>%
    as_tibble() %>%
    set_names(c("Emoji", "n"))
}
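The pipeline relies on `not_equal`, an internal helper that the post doesn’t show; a plausible definition (an assumption on my part) simply drops a given value from a vector:

```r
# assumed definition of the internal helper used above:
# keep only the elements of x that differ from value
not_equal <- function(x, value) x[x != value]

not_equal(c("\U0001F600", "-", "\U0001F44D"), "-")
```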
There’s still some work to deal with skin tone modifiers, but it’s great to see that the most popular emojis spread love and package use.
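One way to handle the skin tone issue would be to strip the Fitzpatrick modifiers (U+1F3FB to U+1F3FF) before counting, so that a modified emoji and its base form collapse into one entry. A sketch in base R (`strip_skin_tones` is a hypothetical helper, not part of tweetstorm):

```r
# Fitzpatrick skin tone modifiers live at U+1F3FB..U+1F3FF;
# removing them merges modified emojis with their base form
strip_skin_tones <- function(x) {
  gsub("[\U0001F3FB-\U0001F3FF]", "", x, perl = TRUE)
}

strip_skin_tones("\U0001F44D\U0001F3FD")  # medium-tone thumbs up becomes the plain one
```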
The other emoji-related display on the app groups them by user, so that we can verify that Lucy also makes extensive use of emojis on twitter. That’s the job of the tweetstorm::extract_emojis_users function:
#' Extract information about users that use emojis
#'
#' @param tweets tweets data set
#' @export
#' @importFrom dplyr select mutate filter group_by summarise arrange desc left_join
#' @importFrom purrr map map_int map_chr flatten_chr
#' @importFrom stringr str_extract_all str_split
#' @importFrom rtweet lookup_users
#' @importFrom magrittr %>%
extract_emojis_users <- function(tweets){
  data <- tweets %>%
    select(user_id, text) %>%
    mutate(
      emojis = str_extract_all(text, emoji_regex) %>% map(not_equal, "-")
    ) %>%
    filter(map_int(emojis, length) > 0) %>%
    group_by(user_id) %>%
    summarise(
      emojis = map(emojis, ~ flatten_chr(str_split(., ""))) %>% flatten_chr() %>% table() %>% list()
    ) %>%
    mutate(
      total    = map_int(emojis, sum),
      distinct = map_int(emojis, length),
      emojis   = map_chr(emojis, ~ paste(names(.)[order(., decreasing = TRUE)], collapse = ""))
    ) %>%
    arrange(desc(total))

  left_join(data, lookup_users(data$user_id), by = "user_id") %>%
    mutate(img = sprintf('<img src="%s" />', profile_image_url)) %>%
    select(img, name, total, distinct, emojis)
}
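Running this end to end needs Twitter API credentials, since `lookup_users()` hits the API, but the grouping step can be exercised on toy data. A sketch of that core logic on an already-extracted list-column (the toy columns are assumptions standing in for the intermediate data):

```r
library(dplyr)

# toy list-column of already-extracted emojis, one row per tweet
toy <- tibble::tibble(
  user_id = c("u1", "u1", "u2"),
  emojis  = list(c("\U0001F600", "\U0001F44D"), "\U0001F600", "\U0001F389")
)

toy %>%
  group_by(user_id) %>%
  summarise(
    total    = sum(lengths(emojis)),        # emojis tweeted in total
    distinct = n_distinct(unlist(emojis))   # distinct emojis used
  ) %>%
  arrange(desc(total))
```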
I come first there, but that’s not fair because at some point while developing the app I tweeted the list of all the emojis used so far.
#useR2017 emojis: (a long list of emojis)
— Romain François (@romain_francois) July 7, 2017
But apart from that, there’s no doubt that Lucy uses lots of emojis:
Here’s an example:
so grateful for inclusion initiatives at #useR2017 including
- newbie session
- diversity scholarships
- breast feeding area
- child care
— Lucy (@LucyStats) July 7, 2017
The app is live here; it may still change and/or move to a new url at some point. Right now it always uses the data I extracted, before it was too late, between these two instants:
> tweetstorm::useR2017 %>% pull(created_at) %>% range
[1] "2017-06-29 10:07:14 UTC" "2017-07-11 09:55:06 UTC"
I’ll keep updating the dataset for some time, as new tweets are likely to appear, e.g. when people go through the slides or videos…