In this video, I’ll guide you through using the wordcloud2 package in R. Despite a few minor issues, it comes with a range of unique and practical features that you won’t find in other packages. Throughout the tutorial, I’ll showcase various visual options using the package’s demo data. I will then teach you how to effectively extract word frequencies from text data using the tm package and the Friends TV show data as an example.
library(wordcloud2)
library(tidyverse)
# simple example
head(demoFreq)
wordcloud2(data=demoFreq)
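# As far as I can tell, wordcloud2() takes the first column as the words and
# the second as their frequencies, so any data frame shaped like demoFreq
# should work. A minimal sketch with made-up counts (my_freq is a
# hypothetical name, not part of the package):
my_freq <- data.frame(word = c("pizza", "coffee", "couch", "sandwich"),
                      freq = c(40, 25, 15, 10))
wordcloud2(my_freq)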
# Change the word colors and the background color
wordcloud2(demoFreq, color='random-light', backgroundColor="black")
# angles
wordcloud2(demoFreq, minRotation = -pi/6, maxRotation = -pi/6, rotateRatio = 1)
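# Rotation angles are given in radians, so -pi/6 above means -30 degrees, and
# rotateRatio = 1 rotates every word. A small sketch if you prefer degrees
# (deg2rad is a hypothetical helper, not part of wordcloud2): here roughly
# half of the words get an angle between -45 and 45 degrees.
deg2rad <- function(deg) deg * pi / 180
wordcloud2(demoFreq, minRotation = deg2rad(-45), maxRotation = deg2rad(45),
           rotateRatio = 0.5)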
# Change the shape:
wordcloud2(demoFreq, size = 0.5, shape = 'star')
wordcloud2(slice_max(demoFreq, order_by = freq, n = 50),
           size = 0.5, shape = 'circle')
## More complex setup example using Friends scripts #####
library(friends)
library(tm)
# main characters you can filter on (values of the speaker column):
# Rachel Green, Phoebe Buffay, Monica Geller,
# Joey Tribbiani, Ross Geller, Chandler Bing
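# filter by speaker, pull the dialogue out as a character vector,
# and collapse it into one string (a single document for the corpus)
sentences <- friends %>%
  filter(speaker == "Ross Geller") %>%
  pull(text) %>%
  paste(collapse = " ")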
# use tm package, create a Corpus
docs <- Corpus(VectorSource(sentences))
# use tm to lowercase and remove stopwords first (so contractions such as
# "don't" still match the stopword list), then strip punctuation and numbers
docs <- docs %>%
  tm_map(content_transformer(tolower)) %>%
  tm_map(removeWords, stopwords("english")) %>%
  tm_map(removePunctuation) %>%
  tm_map(removeNumbers) %>%
  tm_map(stripWhitespace)
# create table of word counts
dtm <- TermDocumentMatrix(docs)
term_matrix <- as.matrix(dtm)
words <- sort(rowSums(term_matrix), decreasing = TRUE)
df <- data.frame(word = names(words),freq=words)
# make wordcloud from the 200 most frequent words
wordcloud2(slice_max(df, order_by = freq, n = 200), size = 0.4,
           color = 'random-dark')
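# The Friends steps above can be wrapped in a helper so the cloud is easy to
# redo for another character. This is only a sketch: speaker_freq() is a
# hypothetical name, and it assumes the friends data has the speaker and text
# columns used above. Lowercasing and stopword removal run before punctuation
# is stripped so that contractions such as "don't" still match the stopword list.
speaker_freq <- function(who) {
  txt <- friends %>%
    filter(speaker == who) %>%
    pull(text) %>%
    paste(collapse = " ")
  corpus <- VCorpus(VectorSource(txt)) %>%
    tm_map(content_transformer(tolower)) %>%
    tm_map(removeWords, stopwords("english")) %>%
    tm_map(removePunctuation) %>%
    tm_map(removeNumbers) %>%
    tm_map(stripWhitespace)
  counts <- sort(rowSums(as.matrix(TermDocumentMatrix(corpus))),
                 decreasing = TRUE)
  data.frame(word = names(counts), freq = counts, row.names = NULL)
}
# usage: the same cloud for Chandler
chandler <- speaker_freq("Chandler Bing")
wordcloud2(slice_max(chandler, order_by = freq, n = 200), size = 0.4)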
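# right click in RStudio to save, or use the webshot package
# install.packages("webshot")  # run once if webshot is not installed yet
library(webshot)
webshot::install_phantomjs()   # one-time setup: webshot relies on PhantomJS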
# Make the graph
my_graph <- wordcloud2(demoFreq, size=1.5)
# save it in html
library("htmlwidgets")
saveWidget(my_graph, "tmp.html", selfcontained = FALSE)
# and in png or pdf
webshot("tmp.html", "fig_1.png", delay = 5, vwidth = 480, vheight = 480)
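# webshot picks the output format from the file extension, so the pdf version
# should only need a different file name (fig_1.pdf is an arbitrary name).
# The delay gives the widget time to finish drawing before the capture; you
# may need to raise it if the image comes out blank.
webshot("tmp.html", "fig_1.pdf", delay = 5, vwidth = 480, vheight = 480)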