Raincloud Plots in R

A raincloud plot is a data visualization that combines a density curve of the distribution, a box-and-whisker plot, and a histogram-style dot plot. Because it shows both the density curve and the individual observations, it gives the viewer a much fuller picture of the data than a box plot by itself. The name comes from its appearance: the dots falling below the density curve look like rain falling from a cloud. It works best for smaller datasets, where there aren’t so many observations that the dot plot overwhelms the viewer.

Raincloud Plot in R using the Iris dataset

However, when you have a larger dataset, some of the same code can be used to produce just the density curves and box plots.
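One rough way to decide between the two versions is to count the observations in each group before plotting. The sketch below does that for the two datasets used in this post. The cutoff `max_dots_per_group` is a hypothetical value chosen only for illustration, not a rule from {ggdist} or from this post, so adjust it by eye for your own plots.

```r
library(ggplot2)  # provides the diamonds dataset

# Observations per group: roughly the number of dots each "cloud" would have to show
iris_counts <- table(iris$Species)
diamond_counts <- table(diamonds$cut)

print(iris_counts)
print(diamond_counts)

# Hypothetical cutoff, chosen only for illustration
max_dots_per_group <- 200

# TRUE when every group is small enough for a readable dot plot
use_dots_iris <- all(iris_counts <= max_dots_per_group)
use_dots_diamonds <- all(diamond_counts <= max_dots_per_group)
```

With this cutoff, the iris groups stay under the limit while the diamonds groups do not, which matches the choice made in the code further down.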

In the accompanying video, I demonstrate how to produce each of these graphs and talk through the various pieces of code, all of which is provided below.




library(tidyverse)
library(ggdist)
library(ggthemes)

# small dataset example

iris %>%
  ggplot(aes(x = factor(Species), y = Petal.Length, fill = factor(Species))) +

  # add half-violin (density slab) from {ggdist} package
  stat_halfeye(
    # adjust bandwidth (values below 1 give a less smooth curve)
    adjust = 0.5,
    # shift the slab to the right of the boxplot (before the flip)
    justification = -0.2,
    # hide the point interval (median point and interval line)
    .width = 0,
    point_colour = NA
  ) +

  # boxplot
  geom_boxplot(
    width = 0.12,
    # purple outlier points
    outlier.color = "purple",
    alpha = 0.5
  ) +

  # dots
  stat_dots(
    dotsize = 0.5,
    # plotting on left side
    side = "left",
    # adjusting position
    justification = 1.1,
    # adjust grouping (binning) of observations
    binwidth = 0.1
  ) +

  # themes and labels
  scale_fill_fivethirtyeight() +
  theme_fivethirtyeight() +
  labs(
    title = "Distribution of Iris Petal Length by Species (Raincloud Plot)",
    fill = "Species"
  ) +

  # flip coordinates so the plot runs horizontally
  coord_flip()



## ------------------------------------
# Large dataset, exclude dots

diamonds %>% 
  ggplot(aes(x = factor(cut), y = price, fill = factor(cut))) +
  # add half-violin from {ggdist} package
  stat_halfeye(
    # adjust bandwidth
    adjust = 0.5,
    # move to the right
    justification = -0.2,
    # hide the point interval (median point and interval line)
    .width = 0,
    point_colour = NA) +
  
  # boxplot
  geom_boxplot(
    width = 0.12,
    # exclude outlier points
    outlier.color = NA,
    alpha = 0.5) +
  
  # Labels
  labs(
    title = "Distribution of Diamond Prices by Cut",
    x = "Cut",
    y = "Price",
    fill = "Cut"
  ) +
  
  # flip coordinates so the plot runs horizontally
  coord_flip()
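
Because the two examples above share most of their layers, they could be wrapped in a small function with the dots made optional. This is only a sketch: `raincloud_plot()` and its arguments `show_dots` and `binwidth` are hypothetical names introduced here, not part of {ggdist} or {ggplot2}, and the layer settings are simply copied from the code above.

```r
library(ggplot2)
library(ggdist)

# Hypothetical helper (not part of any package): builds the shared layers
# and adds the dot plot only when show_dots is TRUE
raincloud_plot <- function(data, group, value, show_dots = TRUE, binwidth = 0.1) {
  p <- ggplot(data, aes(x = {{ group }}, y = {{ value }}, fill = {{ group }})) +
    # density slab with the point interval hidden
    stat_halfeye(
      adjust = 0.5,
      justification = -0.2,
      .width = 0,
      point_colour = NA
    ) +
    # narrow boxplot with outlier points hidden
    geom_boxplot(width = 0.12, outlier.color = NA, alpha = 0.5)

  if (show_dots) {
    # the "rain": binned dots on the other side of the boxplot
    p <- p + stat_dots(
      dotsize = 0.5,
      side = "left",
      justification = 1.1,
      binwidth = binwidth
    )
  }

  # flip coordinates so the plot runs horizontally
  p + coord_flip()
}

# Small dataset: density, boxplot, and dots
p_iris <- raincloud_plot(iris, Species, Petal.Length)

# Large dataset: density and boxplot only
p_diamonds <- raincloud_plot(diamonds, cut, price, show_dots = FALSE)
```

Printing `p_iris` or `p_diamonds` draws the plot. Themes and labels can still be added afterwards with `+`, exactly as in the examples above.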