The yarrr package (0.0.8) is (finally!) on CRAN

Great news, R pirates! The yarrr package, which contains the pirateplot, has now been updated to version 0.0.8 and is up on CRAN (after hiding in plain sight on GitHub). Let’s install the latest version and go over some of the updates:

[code language=”r”]
install.packages("yarrr") # Install package from CRAN
library("yarrr") # Load the package
yarrr.guide() # Open the package guide
[/code]

The most important function in the yarrr package is pirateplot(). What the heck is a pirateplot? A pirateplot is a modern way of visualising the relationship between a categorical independent variable and a continuous dependent variable. Unlike traditional plotting methods, like barplots and boxplots, a pirateplot is an RDI plotting trifecta which presents Raw data (all data as points), Descriptive statistics (as a horizontal line at the mean, or at the value of any other function you wish), and Inferential statistics (95% Bayesian Highest Density Intervals, and smoothed densities).

pirateplot-elements
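To make the Raw and Descriptive layers concrete, here is a rough base-R sketch that draws jittered points and a line at each group mean. It is not how pirateplot() is implemented, and it leaves out the inferential layer entirely. The built-in ChickWeight data, the grouping by Diet, and the object names are illustrative choices only.

```r
# Rough base-R sketch of two pirateplot layers (NOT the yarrr implementation):
# raw data as jittered points plus a horizontal line at each group mean.
# ChickWeight ships with base R; grouping by Diet is an illustrative choice.

# Raw data: one semi-transparent point per observation
stripchart(weight ~ Diet,
           data = ChickWeight,
           vertical = TRUE,
           method = "jitter",
           jitter = 0.15,
           pch = 16,
           col = adjustcolor("black", alpha.f = 0.3),
           xlab = "Diet",
           ylab = "weight",
           main = "Raw data + group means (base R sketch)")

# Descriptive statistic: the mean of each group
group.means <- tapply(ChickWeight$weight, ChickWeight$Diet, mean)

# Draw each mean as a short horizontal line centered on its group
group.x <- seq_along(group.means)
segments(x0 = group.x - 0.3, y0 = group.means,
         x1 = group.x + 0.3, y1 = group.means,
         lwd = 3)
```

A pirateplot adds the smoothed density and the 95% HDI band on top of these two layers.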

For a full guide to the package, check out the package guide on CRAN at https://cran.r-project.org/web/packages/yarrr/vignettes/guide.html. For now, here are some examples of pirateplots showing off some of the package updates.

Up to 3 IVs

You can now include up to three independent variables in your pirateplot. The first IV is presented as adjacent beans, the second is presented in different groups of beans in the same plot, and the third IV is shown in separate plots.

Here is a pirateplot of the heights of pirates based on three separate IVs: sex, headband (whether the pirate wears a headband or not), and eyepatch (whether the pirate wears an eye patch or not):

[code language=”r”]
pirateplot(formula = height ~ sex + headband + eyepatch,
point.o = .1,
data = pirates)
[/code]

threeivpp

Here, we can see that male pirates tend to be the tallest, but there doesn’t seem to be a difference between those who wear headbands and those who don’t, or between those who have eye patches and those who don’t.
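One way to check that reading against numbers is to compute the mean height in each cell of the design. This is a minimal sketch that assumes yarrr is installed and that the pirates data has the height, sex, headband, and eyepatch columns used in the plot above. aggregate() is base R, and cell.means is just an illustrative name.

```r
library("yarrr") # Provides the pirates dataset

# Mean height for every sex x headband x eyepatch combination
cell.means <- aggregate(height ~ sex + headband + eyepatch,
                        data = pirates,
                        FUN = mean)

# Sort so the tallest groups come first
cell.means[order(cell.means$height, decreasing = TRUE), ]
```

If the plot is telling the truth, the rows should cluster by sex rather than by headband or eyepatch.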

New color palettes

The updated package has a few fun new color palettes contained in the piratepal() function. The first, called ‘xmen’, is inspired by my 90s Saturday morning cartoon nostalgia.

[code language=”r”]
# Display the xmen palette
piratepal(palette = "xmen",
trans = .1, # Slightly transparent colors
plot.result = TRUE)
[/code]

xmen_display

Here, I’ll use the xmen palette to plot the distribution of the weights of chickens over time (if someone has a more suitable dataset for the xmen palette let me know!):

[code language=”r”]
pirateplot(formula = weight ~ Time,
data = ChickWeight,
main = "Weights of chickens by Time",
pal = "xmen",
gl.col = "gray")

mtext(text = "Using the xmen palette!",
side = 3,
font = 3)

mtext(text = "*The mean and variance of chicken\nweights tend to increase over time.",
side = 1,
adj = 1,
line = 3.5,
font = 3,
cex = .7)
[/code]

xmen_chikens
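The caption’s claim can also be checked numerically. The sketch below uses only base R (ChickWeight is a built-in dataset) to compute the mean and variance of weight at each time point. The object names are illustrative.

```r
# Mean and variance of chick weight at each measurement time
wt.mean <- tapply(ChickWeight$weight, ChickWeight$Time, mean)
wt.var  <- tapply(ChickWeight$weight, ChickWeight$Time, var)

# Collect both statistics into one table for easy comparison
chick.summary <- data.frame(Time = as.numeric(names(wt.mean)),
                            mean = round(as.numeric(wt.mean), 1),
                            variance = round(as.numeric(wt.var), 1))
chick.summary
```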

The second palette, called “pony”, is inspired by the Bronies in our IT department.

[code language=”r”]
# Display the pony palette
piratepal(palette = "pony",
trans = .1, # Slightly transparent colors
plot.result = TRUE)
[/code]

pony_image

Here, I’ll plot the distribution of the lengths of movies as a function of their MPAA ratings (where G is suitable for children, and R is suitable for adults) using the pony palette:

[code language=”r”]
pirateplot(formula = time ~ rating,
data = subset(movies, time > 0 & rating %in% c("G", "PG", "PG-13", "R")),
pal = "pony",
point.o = .05,
bean.o = 1,
main = "Movie times by rating",
bean.lwd = 2,
gl.col = "gray")

mtext(text = "Using the pony palette!",
side = 3,
font = 3)

mtext(text = "*Movies rated for children\n(G and PG) tend to be longer \nthan those rated for adults",
side = 1,
adj = 1,
font = 3,
line = 3.5,
cex = .7)
[/code]

pony_times

I have to be honest: the pony palette colors are not terribly well suited for this pirateplot, but I think they look better in a basic scatterplot. Because the piratepal() function returns a vector of colors (when plot.result = FALSE), you can also use it in other plots. Here, I’ll use the pony palette in a scatterplot:

[code language=”r”]
set.seed(100) # for replicability
x <- rnorm(100, mean = 10, sd = 1)
y <- x + rnorm(100, mean = 0, sd = 1)
point.sizes <- runif(100, min = .2, max = 2) # Just for fun

plot(x, y,
main = "Scatterplot with the pony palette",
pch = 21,
bg = piratepal("pony", trans = .1),
col = "white",
bty = "n",
cex = point.sizes)

grid() # Add gridlines
[/code]

ponyscatter
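Note that the scatterplot has 100 points while the palette holds only a handful of colors, so R recycles the color vector across the points. Here is a small sketch of the same idea with one color per bar, assuming yarrr is installed. pony.cols is an illustrative name and the bar heights are arbitrary.

```r
library("yarrr")

# Per the text above, piratepal() returns a vector of colors
# when plot.result = FALSE
pony.cols <- piratepal(palette = "pony", trans = .1, plot.result = FALSE)

length(pony.cols) # Number of colors in the palette

# One bar per palette color (bar heights are arbitrary)
barplot(height = rep(1, length(pony.cols)),
        col = pony.cols,
        border = NA,
        names.arg = names(pony.cols),
        las = 2,
        main = "Colors in the pony palette")
```

Storing the colors in an object like this also lets you pick individual colors by position, for example pony.cols[1:3].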

To see all of the palettes (including those inspired by movies and a transit map of Basel), just run the function with “all” as the main argument:

[code language=”r”]
piratepal(palette = "all")
[/code]

Of course, if you find that these color palettes give you a headache, you can always set a pirateplot to grayscale (or any other color) by specifying a single color in the pal argument. Here, I’ll create a grayscale pirateplot showing the distribution of movie budgets by creative type:

[code language=”r”]
pirateplot(formula = budget ~ creative.type,
data = subset(movies, budget > 0 &
!(creative.type %in% c("Multiple Creative Types", "Factual"))),
point.o = .02,
xlab = "Movie Creative Type",
main = "Movie budgets (in millions) by creative type",
gl.col = "gray",
pal = "black")

mtext("Using a grayscale pirateplot",
side = 3,
font = 3)

mtext("*Superhero movies tend to have the highest budgets\n…by far!",
side = 1, adj = 1, line = 3,
cex = .8, font = 3)
[/code]

moviebudgetpp

Looks like superhero movies have the highest budgets…by far!

And again, for more tips on how to customise your palettes and pirateplots, check out the main package guide at https://cran.r-project.org/web/packages/yarrr/vignettes/guide.html, or run the following code:

[code language=”r”]
yarrr.guide() # Open the yarrr package guide
[/code]

Acknowledgements and Comments

– The pirateplot is largely inspired by the great beanplot package by Peter Kampstra.
– Bayesian 95% HDIs are calculated using the truly amazing BayesFactor package by Richard Morey [Note: a previous version of this post incorrectly called Richard “Brian” — I blame lack of caffeine].
– The latest development version of yarrr is always available at https://github.com/ndphillips/yarrr. Please post any bugs, issues, or feature requests at https://github.com/ndphillips/yarrr/issues.