A common structure for a ggplot() command with a fair
amount of customization might be:
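One possible shape for such a command, sketched with the built-in mtcars data. This is an illustrative arrangement rather than a fixed recipe; the object name p and the choice of theme_minimal() are assumptions made for the example.

```r
library(ggplot2)

# Illustrative layering order: data and aesthetics, geometry, scales, labels, theme
p <- ggplot(mtcars, aes(x = mpg, y = hp, color = wt)) +
  geom_point() +
  scale_x_log10() +
  scale_color_binned() +
  labs(x = 'Miles per Gallon', y = 'Horsepower', color = 'Car Weight') +
  theme_minimal()
```

Printing p draws the plot. Each + adds one more piece, so customizations can be added or removed one line at a time.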
- scale_axisname_type
- scale_x_continuous for a continuous x-axis
- scale_x_discrete for a discrete one
- scale_y_continuous, scale_color_continuous, scale_color_gradient, scale_fill_discrete, and so on and so on
- (... limits)
- scale_*_manual (for example scale_fill_manual or scale_color_manual) for discrete scales will let you set things by hand
- For example, when x is different companies and you want to pick each company’s brand color for its bar

    library(tidyverse)
    library(scales)

    data <- tibble(category = c('Apple','Banana','Carrot','Apple','Banana','Carrot'),
                   person = c('Me','Me','Me','You','You','You'),
                   quality = c(.06,.04,.03,.01,.06,.03))

    # Default scales, before any customization
    ggplot(data, aes(x = person, y = quality, fill = category)) +
      geom_col(position = 'dodge')

    # Percent labels on the y-axis, x-axis moved to the top, hand-picked fill colors
    ggplot(data, aes(x = person, y = quality, fill = category)) +
      geom_col(position = 'dodge') +
      scale_y_continuous(labels = label_percent(), limits = c(0,.1)) +
      scale_x_discrete(position = 'top') +
      scale_fill_manual(values = c('Apple'='red','Banana'='yellow','Carrot'='orange'))

- limits in the scale function can be used to specify the vector of values to be included, or as a range for continuous variables
- labels is also handy. labels = c('Red Apple', 'Yellow Banana', 'Orange Carrot') would relabel the legend (or axis labels)
- scale_x_continuous(labels = scales::label_dollar()) would put it in dollar terms. More on this in a moment
- labels can also be a function, as in scale_x_date(labels = function(x) paste(month.abb[month(x)], year(x))), where month() and year() come from lubridate
- There are scale_color_/scale_fill_ functions that exist solely to help with choosing colors! Especially useful are:

- scale_color_gradient() for gradient scales (or _gradient2() for diverging scales with a “middle” in them)
- scale_color_viridis_d() and scale_color_viridis_c() also have some great gradient scales (discrete and continuous, respectively; scale_color_viridis() itself comes from the separate viridis package)
- scale_color_brewer()/scale_fill_brewer() functions for discrete values, or _distiller() for continuous values, or _fermenter() for binned

    # Discrete ColorBrewer palette
    ggplot(data, aes(x = person, y = quality, fill = category)) +
      geom_col(position = 'dodge') +
      scale_y_continuous(labels = label_percent(), limits = c(0,.1)) +
      scale_x_discrete(position = 'top') +
      scale_fill_brewer(palette = 'Dark2')

    # Continuous viridis fill: fill now follows quality, and group keeps the bars dodged by category
    ggplot(data, aes(x = person, y = quality, group = category, fill = quality)) +
      geom_col(position = 'dodge') +
      scale_y_continuous(labels = label_percent(), limits = c(0,.1)) +
      scale_x_discrete(position = 'top') +
      scale_fill_viridis_c()

    # Diverging gradient with its middle at quality = .03
    ggplot(data, aes(x = person, y = quality, group = category, fill = quality)) +
      geom_col(position = 'dodge') +
      scale_y_continuous(labels = label_percent(), limits = c(0,.1)) +
      scale_x_discrete(position = 'top') +
      scale_fill_gradient2(midpoint = .03)

- scale_something_continuous entries have a trans option, set to date, log, probability, reciprocal, sqrt, reverse, etc. etc. to perform that transformation before plotting (ggplot2 3.5.0 and later name this argument transform; trans still works there but is deprecated)
- There are shortcuts such as scale_x_log10 or scale_y_reverse for some common transforms
- scale_something_binned() takes a continuous value and puts it in bins, handy sometimes for simplification

    # Log x-axis, reversed y-axis, and binned color
    ggplot(mtcars, aes(x = mpg, y = hp, color = wt)) +
      geom_point() +
      scale_x_log10() +
      scale_y_continuous(trans='reverse') +
      scale_color_binned()

When to use log scales?
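
One rough way to think about that question, offered as an illustration rather than the original's answer: a log scale puts every tenfold step the same distance apart, so it suits values that span several orders of magnitude. The sketch below also checks that scale_x_log10() gives the same positions as scale_x_continuous(trans = 'log10'). The names p_shortcut, p_trans, and tenfold_steps are made up for the example.

```r
library(ggplot2)

# Two ways to ask for the same log10 x-axis
p_shortcut <- ggplot(mtcars, aes(x = mpg, y = hp)) + geom_point() +
  scale_x_log10()
p_trans <- ggplot(mtcars, aes(x = mpg, y = hp)) + geom_point() +
  scale_x_continuous(trans = 'log10')  # newer ggplot2 releases call this argument `transform`

# layer_data() returns the plotted values after the scale transformation is applied
x_shortcut <- layer_data(p_shortcut)$x
x_trans <- layer_data(p_trans)$x

# On a log10 scale, each tenfold increase is the same distance apart
tenfold_steps <- diff(log10(c(1, 10, 100, 1000)))
```

Comparing x_shortcut with x_trans shows that the shortcut and the trans option place the points identically, so the shortcut is purely a convenience.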
- Scale functions have a labels= option
- labels = c('A','B') gives the categories the names A and B, in that order (tip: check the order first)

Two main types of functions in scales:

- dollar(): dollar(10) creates $10 (NOTE: handy sometimes in RMarkdown text! Also note this creates text, not numbers, so don’t use them in aes() unless you want the variable to be a string)
- label_dollar(): designed to slot directly into the labels= argument. scale_y_continuous(labels = label_dollar()) turns all your y-axis labels into the dollar equivalent

    ggplot(data, aes(x = person, y = quality, fill = category)) +
      geom_col(position = 'dodge') +
      scale_y_continuous(labels = label_percent(), limits = c(0,.1))

The label_ functions have lots of options! You can set the accuracy (precision), decide how to break up big numbers (big.mark), or scale things down to, say, thousands! (scale = 1/1000, suffix = 'k')

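
To see what these labellers produce before putting them on an axis, they can be called directly on a few numbers. This is a minimal sketch; the names dollar_text, to_dollars, to_thousands, and to_percent are illustrative only.

```r
library(scales)

# dollar() formats values right away and returns text, not numbers
dollar_text <- dollar(10)

# label_*() functions return a function; a scale calls it on the axis breaks
to_dollars <- label_dollar()
to_thousands <- label_dollar(accuracy = 1, scale = 1/1000, suffix = 'k')
to_percent <- label_percent()

to_dollars(10)                # same text as dollar(10)
to_thousands(c(1000, 25000))  # values are multiplied by 1/1000, then given the 'k' suffix
to_percent(c(.01, .06))       # proportions become percentages
```

Because label_dollar() returns a function rather than text, it is passed to labels= with its parentheses, as in labels = label_dollar().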
    data(gapminder, package = 'gapminder')
    # GDP per capita on a log scale, labeled in thousands of dollars
    ggplot(gapminder, aes(x = gdpPercap, y = lifeExp)) +
      geom_point() +
      scale_x_log10(labels = label_dollar(accuracy = 1, scale = 1/1000, suffix = 'k'))

- Check help(whatever) before using something
- Or type scales_ or whatever and see what pops up in autocomplete in RStudio

    data(mtcars)
    # Keep the car names as a column so they can be used as labels (needs dplyr, loaded with the tidyverse)
    mtcars <- mtcars %>% mutate(CarName = row.names(mtcars))

    # Axis, legend, and title text with labs()
    ggplot(mtcars, aes(x = mpg, y = hp, color = wt)) + geom_point() +
      scale_x_log10() + scale_y_reverse() + scale_color_binned() +
      labs(x = 'Miles per Gallon', y = 'Horsepower', color = 'Car Weight',
           title = 'Title', subtitle = 'Subtitle', caption = 'Caption')

    # Legend placed inside the plot area, with a box around it
    # (ggplot2 3.5.0+ prefers legend.position = 'inside' plus legend.position.inside = c(.9,.3))
    ggplot(mtcars, aes(x = mpg, y = hp, color = wt)) + geom_point() +
      scale_x_log10() + scale_y_reverse() + scale_color_binned() +
      labs(x = 'Miles per Gallon', y = 'Horsepower', color = 'Car Weight',
           title = 'Title', subtitle = 'Subtitle', caption = 'Caption') +
      theme(legend.position = c(.9,.3), legend.box.background = element_rect(color='black'))

    # Drop the color legend entirely
    ggplot(mtcars, aes(x = mpg, y = hp, color = wt)) + geom_point() +
      scale_x_log10() + scale_y_reverse() + scale_color_binned() +
      labs(x = 'Miles per Gallon', y = 'Horsepower', color = 'Car Weight') +
      guides(color = 'none')

(... a geom_smooth best fit line over top, for example)

    ggplot(mtcars, aes(x = mpg, y = hp, color = wt)) + geom_point() +
      scale_x_log10() + scale_y_reverse() + scale_color_binned() +
      labs(x = 'Miles per Gallon', y = 'Horsepower', color = 'Car Weight') +
      geom_smooth(method='lm', se = FALSE)

    library(ggrepel)  # provides geom_text_repel()
    # Label only the first five cars
    ggplot(mtcars, aes(x = mpg, y = hp, color = wt)) + geom_point() +
      scale_x_log10() + scale_y_reverse() + scale_color_binned() +
      labs(x = 'Miles per Gallon', y = 'Horsepower', color = 'Car Weight') +
      geom_text_repel(data = mtcars %>% slice(1:5), aes(label = CarName), hjust = -1)

    # One panel per number of cylinders
    ggplot(mtcars, aes(x = mpg, y = hp)) + geom_point() +
      facet_wrap('cyl') +
      labs(x = 'Miles per Gallon', y = 'Horsepower', title = 'Horsepower vs. MPG by Cylinders')

    library(ggforce)  # provides facet_zoom()
    # Zoom in on the versicolor points
    ggplot(iris, aes(Petal.Length, Petal.Width, colour = Species)) +
      geom_point() +
      facet_zoom(x = Species == 'versicolor')

... + different geometries, you’re almost always better off doing a pivot_longer() first and just using aes().
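
A minimal sketch of that advice, assuming a wide table with one column per series. The table wide and its columns year, sales, and costs are invented for illustration. Reshaping with pivot_longer() lets a single geometry and one aes() mapping draw every series.

```r
library(tidyverse)

# Hypothetical wide data: one column per series
wide <- tibble(year = 2018:2021,
               sales = c(10, 12, 15, 14),
               costs = c(8, 9, 11, 13))

# Wide approach: one geometry per column, with colors set by hand and no legend
p_wide <- ggplot(wide, aes(x = year)) +
  geom_line(aes(y = sales), color = 'blue') +
  geom_line(aes(y = costs), color = 'red')

# Long approach: stack the columns, then map the series name to color once
long <- wide %>%
  pivot_longer(cols = c(sales, costs), names_to = 'series', values_to = 'amount')

p_long <- ggplot(long, aes(x = year, y = amount, color = series)) +
  geom_line()
```

The long version gets a legend automatically, and adding a third series means adding a column to the data rather than another geom_line() call.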