A common structure for a ggplot() command with a fair amount of customization might be:
Scale functions follow the naming pattern scale_axisname_type:
scale_x_continuous for a continuous x-axis
scale_x_discrete for a discrete one
scale_y_continuous, scale_color_continuous, scale_color_gradient, scale_fill_discrete, and so on
Each one controls how that axis or aesthetic is drawn (limits, for example)
scale_fill_manual (or scale_color_manual) for discrete plots will let you set things by hand, for example when x is different companies and you want to pick each company’s brand color for its bar
library(tidyverse)
data <- tibble(category = c('Apple','Banana','Carrot','Apple','Banana','Carrot'),
person = c('Me','Me','Me','You','You','You'),
quality = c(.06,.04,.03,.01,.06,.03))
ggplot(data, aes(x = person, y = quality, fill = category)) + geom_col(position = 'dodge')
library(scales)
ggplot(data, aes(x = person, y = quality, fill = category)) + geom_col(position = 'dodge') +
scale_y_continuous(labels = label_percent(), limits = c(0,.1)) +
scale_x_discrete(position = 'top') +
scale_fill_manual(values = c('Apple'='red','Banana'='yellow','Carrot'='orange'))
limits in the scale function can be used to specify the vector of values to be included, or as a range for continuous variables
labels is also handy. labels = c('Red Apple','Yellow Banana','Orange Carrot') would relabel the legend (or axis labels)
scale_x_continuous(labels = scales::label_dollar()) would put it in dollar terms. More on this in a moment
labels can also be a function, as in scale_x_date(labels = function(x) paste(month.abb[month(x)], year(x))), where month() and year() come from lubridate
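The function form of labels can be tried on its own before it goes into a scale. A minimal sketch, assuming lubridate is installed for month() and year(); the helper name month_year and the use of the economics data (which ships with ggplot2) are illustrative choices, not from the slides:

```r
library(ggplot2)
library(lubridate)

# Hypothetical helper: turns a Date into a label made of month abbreviation and year
month_year <- function(x) paste(month.abb[month(x)], year(x))

# economics ships with ggplot2; keep a short window so the labels stay readable
recent <- subset(economics, date >= as.Date('2010-01-01'))

p <- ggplot(recent, aes(x = date, y = unemploy)) +
  geom_line() +
  scale_x_date(labels = month_year)   # the scale calls month_year() on its breaks
p
```

Writing the labeller as a named function makes it easy to check on a single date before plotting.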
There are scale_color_ / scale_fill_ functions that exist solely to help with choosing colors! Especially useful are:
scale_color_gradient() for gradient scales (or _gradient2() for diverging scales with a “middle” in them)
scale_color_viridis() from the viridis package (or ggplot2's built-in scale_color_viridis_c() and scale_color_viridis_d()) also has some great gradient scales (either discrete or continuous!)
scale_color_brewer() / scale_fill_brewer() for discrete values, or _distiller() for continuous values, or _fermenter() for binned
ggplot(data, aes(x = person, y = quality, fill = category)) + geom_col(position = 'dodge') +
scale_y_continuous(labels = label_percent(), limits = c(0,.1)) +
scale_x_discrete(position = 'top') +
scale_fill_brewer(palette = 'Dark2')
ggplot(data, aes(x = person, y = quality, group = category, fill = quality)) + geom_col(position = 'dodge') +
scale_y_continuous(labels = label_percent(), limits = c(0,.1)) +
scale_x_discrete(position = 'top') +
scale_fill_viridis_c()
ggplot(data, aes(x = person, y = quality, group = category, fill = quality)) + geom_col(position = 'dodge') +
scale_y_continuous(labels = label_percent(), limits = c(0,.1)) +
scale_x_discrete(position = 'top') +
scale_fill_gradient2(midpoint = .03)
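The _distiller() and _fermenter() variants are mentioned above but not shown. A rough sketch of how they might be used, assuming the faithfuld data that ships with ggplot2 and the 'Spectral' ColorBrewer palette (both illustrative choices, not from the slides):

```r
library(ggplot2)

# faithfuld has a continuous density column, which suits a continuous fill
base <- ggplot(faithfuld, aes(x = waiting, y = eruptions, fill = density)) +
  geom_raster()

# _distiller(): stretches a ColorBrewer palette over a continuous variable
p_continuous <- base + scale_fill_distiller(palette = 'Spectral')

# _fermenter(): same palette, but the continuous variable is cut into bins first
p_binned <- base + scale_fill_fermenter(palette = 'Spectral')

p_continuous
p_binned
```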
scale_something_continuous entries have a trans option (renamed transform in ggplot2 3.5.0), set to date, log, probability, reciprocal, sqrt, reverse, etc. to perform that transformation before plotting
There are also shortcuts like scale_x_log10 or scale_y_reverse for some common transforms
scale_something_binned() takes a continuous value and puts it in bins, handy sometimes for simplification
ggplot(mtcars, aes(x = mpg, y = hp, color = wt)) +
geom_point() +
scale_x_log10() +
scale_y_continuous(trans='reverse') +
scale_color_binned()
When to use log scales?
Scale functions have a labels= option. labels = c('A','B') gives the categories the names A and B, in that order (tip: check the order first)
Two main types of functions in scales:
dollar(): dollar(10) creates $10 (NOTE: handy sometimes in RMarkdown text! Also note this creates text, not numbers, so don’t use it in aes() unless you want the variable to be a string)
label_dollar(): designed to slot directly into the labels= argument. scale_y_continuous(labels = label_dollar()) turns all your y-axis labels into the dollar equivalent
ggplot(data, aes(x = person, y = quality, fill = category)) + geom_col(position = 'dodge') +
scale_y_continuous(labels = label_percent(), limits = c(0,.1))
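The difference between the two types is easiest to see at the console. A small sketch, assuming only that the scales package is installed; the variable names are illustrative:

```r
library(scales)

# dollar() formats values right away and hands back text
ten <- dollar(10)
class(ten)             # character, not numeric

# label_dollar() does not format anything yet: it returns a function
fmt <- label_dollar()
class(fmt)             # function

# That function is what a scale's labels= argument expects;
# calling it by hand shows what the axis labels would look like
fmt(c(10, 2500))
```

This is why label_dollar() goes inside scale_y_continuous(labels = ...), while dollar() is the one to reach for in inline text.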
The label_ functions have lots of options! You can set the accuracy (precision), decide how to break up big numbers (big.mark), or scale things down to, say, thousands (scale = 1/1000, suffix = 'k')!
data(gapminder, package = 'gapminder')
ggplot(gapminder, aes(x = gdpPercap, y = lifeExp)) + geom_point() +
scale_x_log10(labels = label_dollar(accuracy = 1, scale = 1/1000, suffix = 'k'))
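Because each label_ function returns an ordinary function, its options can be tested on a few numbers before they go into a plot. A minimal sketch; the input values are made up for illustration:

```r
library(scales)

# accuracy: round percentages to one decimal place
pct <- label_percent(accuracy = 0.1)
pct(c(0.0456, 0.5))

# big.mark: choose the thousands separator
big <- label_number(big.mark = ',')
big(1234567)

# scale + suffix: show dollars in thousands with a trailing "k"
thousands <- label_dollar(accuracy = 1, scale = 1/1000, suffix = 'k')
thousands(c(25000, 110000))
```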
Check help(whatever) before using something
Or start typing scales_ or whatever and see what pops up in autocomplete in RStudio
data(mtcars)
mtcars <- mtcars %>% mutate(CarName = row.names(mtcars))
ggplot(mtcars, aes(x = mpg, y = hp, color = wt)) + geom_point() +
scale_x_log10() + scale_y_reverse() + scale_color_binned() +
labs(x = 'Miles per Gallon', y = 'Horsepower', color = 'Car Weight',
title = 'Title', subtitle = 'Subtitle', caption = 'Caption')
ggplot(mtcars, aes(x = mpg, y = hp, color = wt)) + geom_point() +
scale_x_log10() + scale_y_reverse() + scale_color_binned() +
labs(x = 'Miles per Gallon', y = 'Horsepower', color = 'Car Weight',
title = 'Title', subtitle = 'Subtitle', caption = 'Caption') +
theme(legend.position = c(.9,.3), legend.box.background = element_rect(color='black'))
ggplot(mtcars, aes(x = mpg, y = hp, color = wt)) + geom_point() +
scale_x_log10() + scale_y_reverse() + scale_color_binned() +
labs(x = 'Miles per Gallon', y = 'Horsepower', color = 'Car Weight') +
guides(color = 'none')
You can layer on additional geometries (a geom_smooth best fit line over top, for example)
ggplot(mtcars, aes(x = mpg, y = hp, color = wt)) + geom_point() +
scale_x_log10() + scale_y_reverse() + scale_color_binned() +
labs(x = 'Miles per Gallon', y = 'Horsepower', color = 'Car Weight') +
geom_smooth(method='lm', se = FALSE)
library(ggrepel)
ggplot(mtcars, aes(x = mpg, y = hp, color = wt)) + geom_point() +
scale_x_log10() + scale_y_reverse() + scale_color_binned() +
labs(x = 'Miles per Gallon', y = 'Horsepower', color = 'Car Weight') +
geom_text_repel(data = mtcars %>% slice(1:5),aes(label = CarName),hjust=-1)
ggplot(mtcars, aes(x = mpg, y = hp)) + geom_point() +
facet_wrap('cyl') +
labs(x = 'Miles per Gallon', y = 'Horsepower', title = 'Horsepower vs. MPG by Cylinders')
library(ggforce)
ggplot(iris, aes(Petal.Length, Petal.Width, colour = Species)) +
geom_point() +
facet_zoom(x = Species == 'versicolor')
If you find yourself using + to add a bunch of different geometries, you’re almost always better off doing a pivot_longer() first and just using aes().
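A sketch of what that pivot might look like, assuming a small made-up table with one column per series (the column names and numbers here are hypothetical, purely for illustration):

```r
library(tidyverse)

# Made-up wide data: one column per series
wide <- tibble(year  = 2018:2021,
               sales = c(10, 12, 15, 14),
               costs = c(8, 9, 11, 13))

# Instead of one geom_line() per column, stack the columns into long form...
long <- wide %>%
  pivot_longer(cols = c(sales, costs),
               names_to = 'measure', values_to = 'amount')

# ...so a single geometry plus aes() draws every series and builds the legend
p <- ggplot(long, aes(x = year, y = amount, color = measure)) +
  geom_line()
p
```

With the data in long form, adding a third series means adding a column to the pivot, not another geometry.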