ggblanket is a package of ggplot2 wrapper functions.
The primary objective is to simplify ggplot2 visualisation.
Secondary objectives relate to:
library(dplyr)
library(ggplot2)
library(ggblanket)
library(patchwork)
penguins2 <- palmerpenguins::penguins |>
mutate(sex = stringr::str_to_sentence(sex)) |>
tidyr::drop_na(sex)
Each gg_* function wraps a ggplot2
ggplot(aes(...)) function with the applicable ggplot2
geom_*() function. Each gg_* function is named
after the geom_* function it wraps.
The colour and fill aesthetics of ggplot2 are merged into a single
concept represented by the col argument. This argument
means that everything should be coloured according to it, i.e. all
points, lines and polygon interiors.
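In plain ggplot2 terms, the idea is roughly that of mapping colour and fill to the same variable. The sketch below uses a small made-up data frame `d` (an assumption, not from the original) and only illustrates the concept; it is not a description of how ggblanket implements col internally.

```r
library(ggplot2)

# Hypothetical toy data, for illustration only
d <- data.frame(
  species = c("Adelie", "Chinstrap", "Gentoo"),
  n = c(3, 5, 4)
)

# Map both colour (outlines) and fill (interiors) to the same variable
p <- ggplot(d) +
  geom_col(aes(x = species, y = n, colour = species, fill = species))

# Extract the aesthetics computed for the layer
built <- layer_data(p)

# With the default discrete scales, outline and interior colours match
all(built$colour == built$fill)
```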
# ggplot2
p1 <- penguins2 |>
ggplot() +
geom_point(aes(x = flipper_length_mm,
y = body_mass_g,
colour = species))
p2 <- penguins2 |>
ggplot() +
geom_density(aes(x = flipper_length_mm,
fill = species)) +
labs(fill = "Species")
p1 / p2
# ggblanket
p1 <- penguins2 |>
gg_point(
x = flipper_length_mm,
y = body_mass_g,
col = species)
p2 <- penguins2 |>
gg_density(
x = flipper_length_mm,
col = species)
p1 / p2
The pal argument is used to customise the colours of the
geom. A user can provide a vector of colours to this argument. It can be
named or not. It works in a consistent way - regardless of whether a
col argument is added or not. A named palette can be used
to make individual colours stick to particular values. ggblanket uses
alpha defaults to make outputs look pretty.
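To illustrate what making colours "stick" to particular values means, here is a minimal sketch using ggplot2's scale_colour_manual with a named vector. The data frame `d` and the vector `pal_named` are made up for illustration, and the sketch shows the ggplot2 analogue rather than ggblanket's own handling of pal.

```r
library(ggplot2)

# Hypothetical toy data, for illustration only
d <- data.frame(
  x = 1:4,
  y = c(2, 3, 1, 4),
  sex = c("Female", "Male", "Female", "Male")
)

# A named palette: names are matched to data values, not to position
pal_named <- c(Male = "#9E361B", Female = "#1B9E77")

p <- ggplot(d) +
  geom_point(aes(x = x, y = y, colour = sex)) +
  scale_colour_manual(values = pal_named)

# The colour applied to each row follows the names in the palette
point_colours <- as.character(layer_data(p)$colour)
```

Because matching is by name, the order of the colours in the vector does not matter, which is useful when the same values appear across several plots.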
# ggplot2
p1 <- penguins2 |>
ggplot() +
geom_histogram(aes(x = body_mass_g),
fill = "#414D6B")
p2 <- penguins2 |>
ggplot() +
geom_jitter(aes(x = species,
y = body_mass_g,
colour = sex)) +
scale_colour_manual(values = c("#1B9E77", "#9E361B"))
p1 / p2
# ggblanket
p1 <- penguins2 |>
gg_histogram(
x = body_mass_g,
pal = "#414D6B")
p2 <- penguins2 |>
gg_jitter(
x = species,
y = body_mass_g,
col = sex,
pal = c("#1B9E77", "#9E361B"))
p1 / p2
Faceting is treated as if it were an aesthetic. Users just provide an
unquoted variable to facet by. If a single facet (or facet2) variable is
provided, it’ll default to a “wrap” layout. But users can change this
with a facet_layout = "grid" argument.
# ggplot2
penguins2 |>
ggplot() +
geom_violin(aes(x = sex,
y = body_mass_g)) +
facet_wrap(vars(species))
A facet2 argument is also provided for extra
functionality and flexibility. If both facet and
facet2 variables are provided, then it’ll default to a
“grid” layout of facet by facet2. But users
can change this with a facet_layout = "wrap" argument.
# ggplot2
penguins2 |>
ggplot() +
geom_histogram(aes(x = flipper_length_mm)) +
facet_grid(rows = vars(sex), cols = vars(species))
Unspecified x, y, and col
titles are converted to sentence case with snakecase::to_sentence_case. All
titles can be manually changed using the *_title arguments.
The default conversion is intended to produce titles that can often be
left as is. Use *_title = "" to remove a title.
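As a rough illustration of the default title conversion, the sketch below applies snakecase::to_sentence_case directly to some variable names. It assumes the snakecase package is installed; whether ggblanket applies any further processing to titles is not shown here.

```r
# Variable names as they appear in the penguins data
vars <- c("flipper_length_mm", "body_mass_g", "sex")

# Convert snake_case names to sentence case
titles <- snakecase::to_sentence_case(vars)
titles
```

A title such as "Body mass g" shows why a manual *_title is sometimes still needed, for example to write the unit as "Body mass (g)".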
# ggplot2
penguins2 |>
ggplot() +
geom_point(aes(x = flipper_length_mm,
y = body_mass_g,
colour = sex)) +
facet_wrap(vars(species)) +
scale_x_continuous(breaks = scales::breaks_pretty(n = 3))
# ggblanket
penguins2 |>
gg_point(
x = flipper_length_mm,
y = body_mass_g,
col = sex,
facet = species)
Prefixed arguments are available to customise titles, scales, guides,
and faceting. These prefixes organise the adjustments by whether they
relate to x, y, col or
facet.
# ggplot2
penguins2 |>
ggplot() +
geom_jitter(aes(x = species,
y = body_mass_g,
colour = sex)) +
expand_limits(y = 0) +
scale_x_discrete(labels = \(x) stringr::str_sub(x, 1, 1)) +
scale_y_continuous(breaks = scales::breaks_width(1500),
labels = scales::label_number(big.mark = " "),
expand = expansion(mult = c(0, 0.05)),
trans = "sqrt") +
labs(x = "Species", y = "Body mass (g)", col = NULL) +
theme(legend.position = "top") +
theme(legend.justification = "left") +
scale_colour_manual(values = scales::hue_pal()(2),
guide = ggplot2::guide_legend(title.position = "top"))
# ggblanket
penguins2 |>
gg_jitter(
x = species,
y = body_mass_g,
col = sex,
x_labels = \(x) stringr::str_sub(x, 1, 1),
y_include = 0,
y_breaks = scales::breaks_width(1500),
y_labels = scales::label_number(big.mark = " "),
y_expand = expansion(mult = c(0, 0.05)),
y_trans = "sqrt",
y_title = "Body mass (g)",
col_legend_place = "t",
col_title = "")
These prefixed arguments work nicely with the RStudio autocomplete, if users:
gg_* functions.
With these settings and use of the pipe, users can type the prefix, and then use the tab and arrow keys to assist in finding and selecting the arguments they need to adjust.
Users can use the theme argument in a gg_*
function for the theme of a plot.
Alternatively, users can set the theme globally using the
ggplot2::theme_set function, such that all subsequent plots
will use this by default.
ggblanket provides two complete ggplot2 theme functions called
light_mode (the default) and dark_mode. The
first argument is the base_size. This sets the size of
all text, except that the title is 10% larger and the caption is
10% smaller. In Quarto, it is likely that users will want to set the
*_mode theme to have a larger base_size
(e.g. ggplot2::theme_set(light_mode(11))).
Note that theme_set(theme_grey()) resets the set theme
for ggplot2 code to theme_grey and for ggblanket gg_*
functions to light_mode(). If you want ggblanket
gg_* functions to default to using
theme_grey(), then you must modify the base_size slightly
(e.g. theme_set(theme_grey(11.01))).
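The global-theme mechanics described above can be sketched with ggplot2 alone. The example sets a theme with a slightly modified base_size, reads the size back, and restores the previous theme. It only demonstrates ggplot2::theme_set and ggplot2::theme_get; how ggblanket detects the set theme is not shown and is not assumed here.

```r
library(ggplot2)

# theme_set() returns the previously set theme invisibly, so it can be restored
old_theme <- theme_set(theme_grey(base_size = 11.01))

# The globally set theme now carries the modified base size
set_size <- theme_get()$text$size
set_size

# Restore whatever theme was set before
theme_set(old_theme)
```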
# ggblanket
# theme_set(dark_mode(10))
penguins2 |>
gg_point(
x = flipper_length_mm,
y = body_mass_g,
col = sex,
title = "Penguins body mass by flipper length",
subtitle = "Palmer Archipelago, Antarctica",
caption = "Source: Gorman, 2020",
theme = dark_mode(10))
Note that the gg_* function will by default adjust what
gridlines are present and the placement of the legend. Therefore, if you
are providing a theme other than light_mode or
dark_mode, ggblanket works well if this theme has both
vertical and horizontal gridlines. If users want everything adjusted as
per the theme, then they can + their theme onto the plot
instead.
# ggblanket
p1 <- penguins2 |>
gg_point(
x = flipper_length_mm,
y = body_mass_g,
col = sex,
x_breaks = scales::breaks_pretty(n = 3),
theme = theme_grey(),
title = "theme= theme_grey()")
p2 <- penguins2 |>
gg_point(
x = flipper_length_mm,
y = body_mass_g,
col = sex,
x_breaks = scales::breaks_pretty(n = 3),
title = "+ theme_grey()") +
theme_grey()
p1 + p2
The ... argument is placed in the gg_*
function within the wrapped ggplot2::geom_* function. This
means all other arguments in the geom_* function are
available to users. Common arguments from ... to add are
size, linewidth and width.
# ggblanket
penguins2 |>
gg_smooth(
x = flipper_length_mm,
y = body_mass_g,
col = sex,
linewidth = 0.5, #accessed via geom_smooth
level = 0.99) #accessed via geom_smooth
Where the orientation is normal (i.e. vertical):
It does the opposite where the orientation is horizontal.
Note this symmetry approach does not apply:
* if a transformation other than identity or reverse is applied to x or y scales.
* for gg_raster, gg_contour_filled or gg_density_2d_filled
In some circumstances the ggplot2 approach to default scales may be
preferable. In these cases, users can revert to the ggplot2 approach by
using *_limits = c(NA, NA) and
*_expand = c(0.05, 0.05) (or add
scale_*_continuous()).
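For context, ggplot2's default for continuous scales pads the data range by 5% on each side. The arithmetic can be sketched in base R as below, where `pad_range` is a hypothetical helper written for illustration and is not part of ggplot2 or ggblanket.

```r
# Hypothetical helper: pad a numeric range multiplicatively on both sides,
# mirroring ggplot2's default 5% expansion for continuous scales
pad_range <- function(x, mult = 0.05) {
  limits <- range(x, na.rm = TRUE)
  limits + c(-1, 1) * mult * diff(limits)
}

# A data range of 2700 to 6300 spans 3600, so each side is padded by 180
padded <- pad_range(c(2700, 6300))
padded
```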
# ggplot2
penguins2 |>
group_by(species, sex) |>
summarise(body_mass_g = mean(body_mass_g)) |>
ggplot() +
geom_col(aes(x = body_mass_g,
y = species,
fill = sex),
position = "dodge",
width = 0.66)
# ggblanket
penguins2 |>
group_by(species, sex) |>
summarise(body_mass_g = mean(body_mass_g)) |>
gg_col(
x = body_mass_g,
y = species,
col = sex,
position = "dodge",
width = 0.66)
Sometimes with small plots or faceted plots, the labels can be
too squashed. Making the breaks wider can waste space, due to the
aforementioned approach of ggblanket to making pretty scales. An
alternative approach is to use the str_keep_seq function
with the *_labels arguments to only keep every 2nd (or nth)
label.
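The idea of keeping every nth label can be sketched in base R. The helper `keep_every_nth` below is hypothetical and written only for illustration; it is not ggblanket's str_keep_seq and may differ from it in arguments and behaviour.

```r
# Hypothetical helper: blank out all labels except every nth one,
# starting from the first
keep_every_nth <- function(labels, n = 2) {
  labels <- as.character(labels)
  keep <- (seq_along(labels) - 1) %% n == 0
  labels[!keep] <- ""
  labels
}

# Breaks stay where they are; only the text of alternate labels is dropped
thinned <- keep_every_nth(c("0", "1000", "2000", "3000", "4000"))
thinned
```

Because the breaks themselves are unchanged, gridlines and ticks remain in place while the axis text becomes less crowded.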
# ggblanket
penguins2 |>
group_by(species, sex) |>
summarise(body_mass_g = mean(body_mass_g)) |>
ungroup() |>
gg_col(
y = body_mass_g,
x = species,
col = sex,
position = "dodge",
width = 0.5,
x_labels = \(x) stringr::str_sub(x, 1, 1),
y_labels = \(x) str_keep_seq(x),
title = "Keep every 2nd label",
theme = light_mode(title_face = "plain"))
Users can make plots with multiple layers with ggblanket by adding on
ggplot2::geom_* layers.
The gg_* function puts the aesthetic variables
(i.e. x, y, col) within the
wrapped ggplot function. Therefore, these aesthetics will
inherit to any subsequent layers added.
Where there are multiple geom layers in a desired plot, users should
choose the gg_* function with care:
The gg_* function should be appropriate to be
the bottom layer of the plot. This is because the geoms will plot in
order.
# ggblanket + ggplot2
p1 <- ggplot2::economics |>
slice_min(order_by = date, n = 10) |>
gg_line(
x = date,
y = unemploy,
pal = guardian()[1],
x_title = "",
y_title = "Unemployment",
y_include = 0,
linewidth = 1,
x_breaks = scales::breaks_width("3 months"),
title = "gg_line + geom_point",
theme = light_mode(title_face = "plain")) +
geom_point(colour = guardian()[2])
p2 <- ggplot2::economics |>
slice_min(order_by = date, n = 10) |>
gg_point(
x = date,
y = unemploy,
pal = guardian()[2],
x_title = "",
y_title = "Unemployment",
y_include = 0,
x_breaks = scales::breaks_width("3 months"),
title = "gg_point + geom_line",
theme = light_mode(title_face = "plain")) +
geom_line(colour = guardian()[1], linewidth = 1)
p1 + p2
If some geom layers have a col aesthetic and some do
not, then a gg_* function should be chosen that
has a col argument in it. This will enable ggblanket legend
placement and access to col_* arguments. It is also a more
reliable approach. If later layers do not require the col
aesthetic, then the inherit.aes = FALSE argument should be
used.
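The inheritance behaviour can be seen in plain ggplot2. In the sketch below, aesthetics set in ggplot() are inherited by the first layer, while the second layer opts out with inherit.aes = FALSE and supplies its own. The data frame `d` is made up for illustration, and ggplot() stands in for the gg_* function.

```r
library(ggplot2)

# Hypothetical toy data, for illustration only
d <- data.frame(x = 1:4, y = c(2, 4, 3, 5), g = c("a", "b", "a", "b"))

p <- ggplot(d, aes(x = x, y = y, colour = g)) +
  # Inherits x, y and colour from ggplot()
  geom_point() +
  # Opts out of inheritance, so x and y must be supplied again
  geom_line(aes(x = x, y = y), colour = "grey50", inherit.aes = FALSE)

point_colours <- layer_data(p, 1)$colour
line_colours <- layer_data(p, 2)$colour
```

The points take two colours from the inherited colour aesthetic, while the line is drawn as a single grey path.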
In some situations, gg_blank may be
required.
If users are building a horizontal plot that includes multiple
geoms, it is recommended that users build the plot vertically with
ggblanket - and then use ggplot2::coord_flip to make it
horizontal.
Users need to ensure that the scales built by their gg_*
function are appropriate for subsequent layers. Plot scales are built by
the gg_* function based on the data,
x, y, *_limits,
*_include, stat, position and
coord arguments in the gg_* function.
# ggblanket + ggplot2
d <- penguins2 |>
group_by(species) |>
summarise(body_mass_g = mean(body_mass_g)) |>
mutate(lower = body_mass_g * 0.95) |>
mutate(upper = body_mass_g * 1.2)
p1 <- d |>
gg_col(
x = species,
y = body_mass_g,
col = species,
width = 0.75,
y_include = c(0, max(d$upper)),
y_labels = \(x) x / 1000,
y_title = "Body mass kg",
col_legend_place = "n") +
geom_errorbar(aes(ymin = lower, ymax = upper),
colour = "black",
width = 0.1) +
coord_flip()
p2 <- d |>
gg_col(
x = species,
y = body_mass_g,
col = species,
colour = "#d3d3d3",
fill = "#d3d3d3",
width = 0.75,
y_include = c(0, max(d$upper)),
y_labels = \(x) x / 1000,
y_title = "Body mass kg",
col_legend_place = "n") +
geom_errorbar(aes(ymin = lower, ymax = upper),
width = 0.1) +
coord_flip()
p1 / p2
ggblanket requires unquoted variables only for x,
y, col, facet and
facet2. You cannot wrap these in a function. Instead you
need to apply the function to the relevant variable in the data prior to
plotting. For example, reordering or reversing a factor or dropping
NAs.
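As a small sketch of what these data-side transformations do, the example below shows how forcats::fct_rev and forcats::fct_reorder change the level order of a factor before it reaches a plotting function. The data frame `d` is made up for illustration.

```r
library(forcats)

# Hypothetical toy data, for illustration only
d <- data.frame(
  color = factor(c("D", "E", "F")),
  n = c(30, 10, 20)
)

# Default level order is alphabetical
levels_default <- levels(d$color)

# Reverse the level order
levels_reversed <- levels(fct_rev(d$color))

# Order the levels by ascending n
levels_by_n <- levels(fct_reorder(d$color, d$n))
```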
p1 <- diamonds |>
count(color) |>
gg_col(
x = n,
y = color,
width = 0.75,
x_labels = \(x) x / 1000,
x_title = "Count (thousands)",
title = "Default y",
theme = light_mode(title_face = "plain")
)
p2 <- diamonds |>
count(color) |>
mutate(color = forcats::fct_rev(color)) |>
gg_col(
x = n,
y = color,
width = 0.75,
x_labels = \(x) x / 1000,
x_title = "Count (thousands)",
title = "Reverse y",
theme = light_mode(title_face = "plain")
)
p3 <- diamonds |>
count(color) |>
mutate(color = forcats::fct_reorder(color, n)) |>
gg_col(
x = n,
y = color,
width = 0.75,
x_labels = \(x) x / 1000,
x_title = "Count (thousands)",
title = "Reordered y ascending by x",
theme = light_mode(title_face = "plain")
)
p4 <- diamonds |>
count(color) |>
mutate(color = color |>
forcats::fct_reorder(n) |>
forcats::fct_rev()) |>
gg_col(
x = n,
y = color,
width = 0.75,
x_labels = \(x) x / 1000,
x_title = "Count (thousands)",
title = "Reordered y descending by x",
theme = light_mode(title_face = "plain")
)
(p1 + p2) / (p3 + p4)
ggblanket keeps unused factor levels in the plot. If users wish to drop unused levels they should likewise do it in the data prior to plotting.
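To make the point about unused levels concrete, the sketch below builds a factor whose values cover only some of its levels and then drops the rest with forcats::fct_drop. The factor `f` is made up for illustration.

```r
# A factor with seven declared levels (D to J) but only three values present,
# as happens after filtering rows of a data frame
f <- factor(c("E", "G", "I"), levels = LETTERS[4:10])

# Filtering rows does not remove levels
n_levels_before <- nlevels(f)

# fct_drop() removes the levels that no longer occur in the data
f_dropped <- forcats::fct_drop(f)
n_levels_after <- nlevels(f_dropped)
```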
p1 <- diamonds |>
count(color) |>
filter(color %in% c("E", "G", "I")) |>
gg_point(
x = n,
y = color,
x_labels = \(x) x / 1000,
x_title = "Count (thousands)",
title = "A factor filtered",
theme = light_mode(title_face = "plain"))
p2 <- diamonds |>
count(color) |>
filter(color %in% c("E", "G", "I")) |>
mutate(color = forcats::fct_drop(color)) |>
gg_point(
x = n,
y = color,
x_labels = \(x) x / 1000,
x_title = "Count (thousands)",
title = "A factor filtered & unused levels dropped",
theme = light_mode(title_face = "plain"))
p1 + p2
ggblanket uses different defaults for colouring. The default
pal is:
* #357BA2 or mako(9)[5] for no col variable
* ggblanket::guardian for a discrete col variable with 4 or fewer levels (or unique values if a character). This palette is colourblind safe.
* scales::hue_pal for a discrete col variable with 5 or more levels (or unique values if not ordered).
* viridis::mako reversed for a continuous col variable
* "#bebebe" or grey for NA
ggblanket uses different alpha defaults for the
different gg_* functions. Polygons that generally have no
gap or overlap default to 1: gg_bin_2d,
gg_contour_filled, gg_density_2d_filled,
gg_hex - as well as gg_sf for polygons with a
col aesthetic. Polygons that generally overlap default to
0.5: gg_density. Polygons that generally have key lines
within them also default to 0.5: gg_boxplot,
gg_crossbar, gg_ribbon and
gg_smooth. gg_label defaults to 0.05.
gg_blank has no alpha argument. Other polygons default to
0.9: gg_area, gg_bar, gg_col,
gg_histogram, gg_polygon,
gg_rect, gg_tile and gg_violin.
For all other contexts, alpha defaults to 1.
By default, ggblanket keeps values outside of the limits
(*_oob = scales::oob_keep) in calculating the geoms and
scales to plot. It also does not clip anything outside the
cartesian coordinate space by default
(coord = ggplot2::coord_cartesian(clip = "off")).
ggplot2 by default drops values outside of the limits in calculating
the geoms and scales to plot (scales::oob_censor), and
clips anything outside the cartesian coordinate space
(coord = ggplot2::coord_cartesian(clip = "on")).
Users should be particularly careful when setting limits for stats
other than identity.
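The difference between the two out-of-bounds behaviours can be seen by calling the scales functions directly on a small vector. This is a minimal sketch with made-up values; it shows what the functions return, not how a stat subsequently uses the result.

```r
# Values, two of which fall outside the limits of 0 to 1
x <- c(-1, 0.5, 2)

# ggplot2 default: out-of-bounds values are replaced with NA
censored <- scales::oob_censor(x, range = c(0, 1))

# ggblanket default as described above: out-of-bounds values are kept
kept <- scales::oob_keep(x, range = c(0, 1))
```

For a stat such as smooth, censored values are excluded from the fit, whereas kept values still contribute to it even though they lie outside the limits.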
p1 <- economics |>
gg_smooth(
x = date,
y = unemploy,
y_labels = \(x) str_keep_seq(x),
title = "No x_limits set",
theme = light_mode(title_face = "plain")) +
geom_vline(xintercept = c(lubridate::ymd("1985-01-01", "1995-01-01")),
col = guardian(n = 1),
linetype = 3) +
geom_point(col = guardian(n = 1), alpha = 0.05)
p2 <- economics |>
gg_smooth(
x = date,
y = unemploy,
x_limits = c(lubridate::ymd("1985-01-01", "1995-01-01")),
x_labels = \(x) stringr::str_sub(x, 3, 4),
y_labels = \(x) str_keep_seq(x),
title = "x_limits set",
theme = light_mode(title_face = "plain")) +
geom_point(col = guardian(n = 1), alpha = 0.1)
p3 <- economics |>
gg_smooth(
x = date,
y = unemploy,
x_limits = c(lubridate::ymd("1985-01-01", "1995-01-01")),
x_labels = \(x) stringr::str_sub(x, 3, 4),
y_labels = \(x) str_keep_seq(x),
coord = coord_cartesian(clip = "on"),
title = "x_limits set & cartesian space clipped",
theme = light_mode(title_face = "plain")) +
geom_point(col = guardian(n = 1), alpha = 0.1)
p4 <- economics |>
gg_smooth(
x = date,
y = unemploy,
x_limits = c(lubridate::ymd("1985-01-01", "1995-01-01")),
x_labels = \(x) stringr::str_sub(x, 3, 4),
x_oob = scales::oob_censor,
y_labels = \(x) str_keep_seq(x),
title = "x_limits set & x_oob censored",
theme = light_mode(title_face = "plain")) +
geom_point(col = guardian(n = 1), alpha = 0.1)
p5 <- economics |>
filter(between(date, lubridate::ymd("1985-01-01"), lubridate::ymd("1995-01-01"))) |>
gg_smooth(
x = date,
y = unemploy,
x_labels = \(x) stringr::str_sub(x, 3, 4),
y_labels = \(x) str_keep_seq(x),
title = "x data filtered",
theme = light_mode(title_face = "plain")) +
geom_point(col = guardian(n = 1), alpha = 0.1)
p1 / (p2 + p3) / (p4 + p5)
ggblanket is much slower than ggplot2 in computational speed, due to the hack that underlies ggblanket to make its pretty continuous scales.
bench::mark({
penguins2 |>
gg_point(x = flipper_length_mm,
y = body_mass_g,
col = species)
}, iterations = 10)
#> # A tibble: 1 × 6
#> expression min median `itr/sec` mem_alloc `gc/sec`
#> <bch:expr> <bch> <bch:> <dbl> <bch:byt> <dbl>
#> 1 { gg_point(penguins2, x = flipper_l… 121ms 123ms 8.13 804KB 8.13
bench::mark({
penguins2 |>
ggplot() +
geom_point(aes(x = flipper_length_mm, y = body_mass_g, colour = species))
}, iterations = 10)
#> # A tibble: 1 × 6
#> expression min median `itr/sec` mem_alloc `gc/sec`
#> <bch:expr> <bch:> <bch:> <dbl> <bch:byt> <dbl>
#> 1 { ggplot(penguins2) + geom_point(a… 2.12ms 2.18ms 439. 12.6KB 0
*_title is equivalent to
ggplot2::labs(* = ...) or
ggplot2::scale_*(name = ...).
TRUE always
comes before FALSE.
*_include works in a similar way to
ggplot2::expand_limits(* = ...).
See the ggblanket website for further information, including articles and function reference.