vignettes/ggplot2loon.Rmd
ggplot2loon.Rmd
The ggplot2
graphics package (part of the
tidyverse
package collection) uses the base
grid
graphics package to produce publication quality
graphics for data analysis. Based on a grammar for graphics,
ggplot2
also provides a lot of functionality
(e.g. facet
s) that can be extremely useful in data
analysis.
The loon
graphics package provides
interactive graphics especially valuable in any
exploratory data analysis. This includes programmatic
and direct manipulation of the visualizations to effect interactive
identification, zooming, panning, and linking between any number of
displays. Of course, loon
also provides publication quality
static graphics in grid
via loon’s functions
grid.loon()
and loonGrob()
. In particular, the
standard R
plot()
function has been
specialized to produce a grid
version of any
loon
plot.
The loon.ggplot package brings both these packages
together. Data analysts who value the ease with which
ggplot2
can create meaningful graphics can now turn
these ggplot
s into interactive loon
plots for more dynamic interaction with their data.
ggplot2loon()
Consider the mtcars
data set and suppose we draw a
scatterplot of the mileage mpg
(miles per US gallon) versus
the weight of the car wt
in thousands of pounds. In
ggplot2
this would be constructed as
p1 <- ggplot(mtcars, aes(wt, mpg)) + geom_point()
and displayed by printing the "ggplot"
data structure
p1
as
p1
We might also display a histogram of some other variate, say the
engine’s horsepower hp
. In ggplot2
this would
be constructed as
h1 <- ggplot(mtcars, aes(hp)) + geom_histogram(bins = 30)
and displayed as
h1
ggplot2loon()
and
grid.loon()
or plot.loon()
Using the loon.ggplot
package, the "ggplot"
data structures p1
and h1
can be
turned into interactive loon plots using the
ggplot2loon()
function:
library(loon.ggplot)
l_p1 <- ggplot2loon(p1) # the scatterplot
l_h1 <- ggplot2loon(h1) # the histogram
These loon plots should look very much (though not exactly) like the
ggplot from which it was constructed. Using grid.loon()
grid
graphics versions of the loon plots can be created for
static display, printing, and incorporation into documents:
grid.loon(l_p1)
Alternatively, and more simply, we can use the usual
plot
function in R
which has a method for
loon
objects (plot
is essentially “syntactic
sugar” since it simply calls grid.loon()
)
plot(l_h1) # equivalent to grid.loon(l_h1)
Unlike p1
and h1
, the loon counterparts
l_p1
and l_h1
are both
interactive; their displays can be changed by
direct manipulation using the mouse and/or each loon
plot’s loon inspector.
For example, points in the loon scatterplot l_p1
, can be
selected with a mouse (<SHIFT>
select for multiple
selection) and their glyph changed in shape or color, for example.
Scrolling over the plot causes zooming and
<RIGHT-BUTTON>
(or
<SECONDARY-BUTTON>
, depending on your mouse setup)
can be used to pan the plot (the same can be done via the “World View”
of the inspector).
If, for example, the itemLabels
box is checked in the
inspector, then hovering over any point will pop a little “tool tip”
identifying the point.
Similarly, the loon histogram l_h1
has many interactions
and a specialized loon inspector. For example, the histogram can be
changed from frequency
to density
in the
inspector and the “bin handle” turned on and off; the latter can be
directly manipulated to change the size and location of the bins
whenever it appears on the histogram plot.
Many, many, more interactions are possible. You might want to
experiment with the plot and the inspector to see some of the
possibilities. (Recommended: explore the vignettes
given in the loon
package).
One of the basic features of interactive statistical graphics is the ability to link displays. In loon this is most easily accomplished via the loon inspector.
To see the effect, with the mouse select one of the loon plots
l_p1
or l_h1
and then select the inspector. On
the inspector, type the text mtcars
into the text input box
which appears next to “linking group”; then hit return/enter. As a
result, the text appearing there will be mtcars [1 linked]
;
this shows that the selected plot is participating in the linking group
named “mtcars” and that it is presently the only one so linked.
Now select the other plot l_h1
or l_p1
. On
the loon inspector for this plot, you can again enter “mtcars” as this
plot’s linking group (this can be done as before or selected from the
menu accessible from the down arrow next to the linking group. As soon
as the second plot is linked, the user is prompted to choose how to
synchronize the linked features of the new plot to the previous plot.
The choice is between pushing the display features from the
current plot to the rest of the plots in the group and pulling
the display features from the group to change those of the current plot.
The resulting linking group entry should now show
mtcars [2 linked]
.
Once the both plots are in the same linking group (i.e. “mtcars”),
selections in one plot will be reflected in the other. Try this. To see
colour changes in the histogram, the box stacked colors
must be checked.
Linking is one way to condition on the values in one plot and observe their effect in the others. Try this with “brushing” in either plot.
See the vignette Linking
for more.
The loon plot, l_p1
is itself a data structure of class
"l_plot"
and "loon"
class(l_p1)
l_p1
# [1] "l_plot" "loon"
# [1] ".l0.ggplot.plot"
# attr(,"class")
# [1] "l_plot" "loon"
whose value is a string which uniquely identifies it. As such, it can be interacted with programmatically as well as by direct manipulation using the mouse and/or the plot’s loon inspector.
For example, via direct manipulation, the plot might have zoomed or
panned so that many of the points no longer appear in the plot. The plot
can be returned to its original position and scaling (either through the
loon inspector’s scale to: plot
or) by calling the function
l_scaleto_plot()
programmatically on the loon data
structure l_p1
as in
l_scaleto_plot(l_p1)
As a base loon structure, an l_plot
has a variety of
l_info_states
available that can be accessed and/or set.
The names of these states are had with
names(l_info_states(l_p1))
or more simply with
names(l_p1)
Their values can be accessed using the []
function. For
example, the background grid lines which appear by default on ggplots,
are called “guides” in loon and the logical state determining whether
they appear or not in the loon plot is "showGuides"
,
accessed as
l_p1["showGuides"]
# [1] TRUE
These can be turned off (loon default) or on (ggplot default) by
simply changing this value – either through the guides
check box in the inspector or programmatically as
# Turn the guides off
l_p1["showGuides"] <- FALSE
# Turn the guides on again
l_p1["showGuides"] <- TRUE
All other info states can also be changed programmatically as well.
For example, a title could be added, the axis labels changed, and a
highly informative "itemLabel"
given for each car.
l_p1["title"] <- "1974 Motor Trend Car Road Tests"
l_p1["xlabel"] <- "Curb weight (1000s of lbs)"
l_p1["ylabel"] <- "Gas mileage (miles per US gallon)"
newlabels <- paste0(rownames(mtcars), "\n ",
c("V-", "straight-")[mtcars$vs + 1], mtcars$cyl, " \n ",
mtcars$disp, " cubic inch \n ",
mtcars$gear, " speed \n ",
c("automatic", "manual")[mtcars$am + 1]
)
l_p1["itemLabel"] <- newlabels
l_p1["showItemLabels"] <- TRUE
The last line above ensures its item label appears whenever the mouse hovers over any point. This gives some interesting information to the curious autophile.
The points might also be visibly distinguished by the transmission
type and the number of cylinders of its engine. Glyph type could
distinguish transmissions and colour the number of cylinders. (To better
distinguish shapes, we also make the glyphs fairly large via the
size
state.)
l_p1["size"] <- 10
l_p1["glyph"][mtcars$am == 0] <- "ccircle"
l_p1["glyph"][mtcars$am == 1] <- "triangle"
gears <- sort(unique(mtcars$gear))
ngears <- length(gears)
cols <- c("lightblue", "steelblue", "black")
for (i in 1:ngears) {
l_p1["color"][mtcars$gear == gears[i]] <- cols[i]
}
We might then zoom in on the "black"
coloured points
(note that these are now hex colors
), and maybe remove the
others from the display (temporarily render them inactive) as
follows.
l_p1["active"] <- l_p1["color"] == l_hexcolor("black")
l_scaleto_active(l_p1)
l_p1["title"] <- "5 speed manual transmission"
The plot should now look something like this (a grid
graphics version of l_p1
):
plot(l_p1)
Again, the plot can be returned to its original scale with all points reactivated as follows:
l_p1["active"] <- TRUE
l_scaleto_plot(l_p1)
l_p1["title"] <- "All points again"
plot(l_p1)
loon
In loon, plots are linked only if they are in the same
linkingGroup
(which can be any string that meaningfully
identifies the group of plots). The linkingGroup
can also
be set programmatically. For the first plot in a
linking group, this can be set using the square bracket notation as
above. For example, instead of “mtcars” we might want a more meaningful
string for the linking group for l_p1
.
l_p1["linkingGroup"] <- "Motor Trend 1974"
The two plots l_p1
and l_h1
are now no
longer linked because they are no longer in the same linking group.
To add l_h1
to this new linking group called “Motor
Trend 1974”, the square bracket assignment will no longer
work. This is because the two plots must be synchronized at the
time any plot is added to an existing group. Instead, l_h1
(and any subsequent plot) must be added to the group using the
l_configure()
function as follows:
l_configure(l_h1, linkingGroup = "Motor Trend 1974", sync = "pull")
Notice, by the way, that the (stacked) colours in the histogram will
follow those of l_p1
. For example,
l_h1["showStackedColors"] <- TRUE
plot(l_h1)
ggplot2loon()
It is possible to declare the linking group of the loon plot at the time it is created from the ggplot.
For example, the relationship between the acceleration and the drive
axle ratio could be of interest; these are measured by the time taken to
accelerate to a quarter mile, qsec
, and drat
.
The gplot
is
# First using another ggplot
p2 <- ggplot(mtcars, aes(x = drat, y = qsec)) + geom_point()
which looks like
p2
The corresponding loon plot can be linked immediately when it is created
l_p2 <- ggplot2loon(p2,
linkingGroup = "Motor Trend 1974",
title = "Acceleration measures",
xlabel = "Drive axle ratio",
ylabel = "Quarter mile (seconds)",
itemLabel = newlabels)
as can any other state of a loon plot, such as title
,
xlabel
, ylabel, or
itemLabel` as shown
here.
The points in the loon plot inherit their colours from the linking group
“Motor Trend 1974”:
plot(l_p2)
Note that l_p2
did not inherit the shapes of its points
from the l_p1
. That is because only those states that have
been defined as linkedStates
are shared by any given
plot.
The linkedStates
of any plot are those which that plot
will share (and hence update) with any other plot in the same
linkingGroup
. The default states that are linked for an
l_plot
like l_p1
are
l_getLinkedStates(l_p1)
# [1] "color" "selected" "active" "size"
and a change in any of these states by any plot in the linking group
will cause the same change in these states on l_p1
. These
are the same states linked by l_p2
.
In contrast, the loon histogram l_h1
has default linked
states
l_getLinkedStates(l_h1)
# [1] "color" "selected" "active"
which excludes size
as not a meaningful state for a loon
histogram (class l_hist
).
The linked states can be modified. For example, we might choose to
not have the second plot respond to changes in the “active” state of
other plots in the same linking group. This is effected by reducing the
states for just the plot l_p2
l_setLinkedStates(l_p1, c("color", "selected", "size"))
A more sophisticated change would be also link the glyphs used for
each point between plots. This will typically be the shape of
the glyph. (Warning: but not always. For example, when
images are used for the points, a little more care needs to be taken).
To do this for both l_p1
and l_p2
we add to
the states being linked as follows.
l_setLinkedStates(l_p1, c("glyph", "active", l_getLinkedStates(l_p1)))
l_setLinkedStates(l_p2, c("glyph", l_getLinkedStates(l_p2)))
Now any change to the glyphs in either plot will change the glyphs in both.
triangles <- l_p1["glyph"] == "triangle"
l_p1["glyph"][triangles] <- "ctriangle"
Here now for l_p1
plot(l_p1)
and propagated to l_p2
plot(l_p2)
Rather complex plots can be built up in ggplot2
by
adding layers. For example,
p_fit <- ggplot(mtcars, aes(drat, mpg)) + geom_smooth() + geom_point()
p_fit
The corresponding interactive loon plot is
l_p_fit <- ggplot2loon(p_fit)
As with ggplots (and grid graphics plots), loon plots have layers. These can be identified (and manipulated) by selecting the “Layers” tab of the inspector.
As can be seen, four distinct layers appear. There is the scatterplot
(from geom_point()
), then a group layer called the
smooth
which has been opened up here by toggling the
triangle at its left to reveal it is composed of a line
and
a polygon
layer (in that order).
The order of the layers is important and represent
their visual order in the corresponding plot. By default, these will
follow the order in which the various geom
s appear in the
construction of the ggplot. For example, had geom_point()
appeared before the geom_smooth()
in the left to right
reading of the ggplot construction, then the Scatterplot
layer would have appeared below the smooth
group layer and
visually obscured all points at the same location.
The order of display can be changed from the inspector via the first two buttons at the bottom left of the layers panel. The order can also be changed programmatically. For example, the scatterplot can be moved lower in the display order as
l_layer_lower(l_p_fit, "model")
and raised back up again so that the points are again on top
l_layer_raise(l_p_fit, "model")
Note that the scatterplot layer identified by its “layer id”, namely
the string "model"
. The function
l_layer_ids( l_p_fit )
will return a list of all the layer
ids from its argument l_p_fit
.
Reading left to right, the remaining buttons on the bottom of the
layers tab on the inspector allow a (selected) layer to be moved out of
a group, into a group, made visible, or made invisible, and add a new
group layer, delete any layer, or scale the plot to the selected layer.
All of these can also be achieved programmatically and numerous other
functions exist in loon to add new layers – see
help("l_layer")
or l_help("learn_R_layer")
for
the range of possibilities.
There is always one layer in loon that is the active layer which at present is always identified as the “model” layer. It is this layer which fields mouse events such as selection and brushing; it is also the layer that reacts to changes in linking states.
In loon, scatterplots and histograms can be identified as the “model”
layer (the former produced via the l_plot()
function, the
latter via l_hist()
).
A loon l_plot()
corresponds to a ggplot
geom_point()
and a loon l_hist()
to a ggplot
geom_histogram()
.
In ggplot2
several geom
s might appear in the
construction of a plot, each one providing a layer. But in
loon
only one layer, namely the "model"
layer,
can be active. This means which of the ggplot layers are intended to be
the active "model"
layer in loon must be determined at the
time of construction (the call to ggplot2loon()
).
Consider the following example where three geom
s are
created as layers in the ggplot. Here a histogram (on a density scale)
is overlaid with a scatter of points along the horizontal axis, and then
with a curve from a density estimate.
p_hdp <- ggplot(mtcars, aes(x = wt, y = ..density..)) +
geom_histogram(binwidth = 0.5,
fill = "grey",
color = "red") +
geom_density(color = "firebrick", lwd = 1.5) +
geom_point(data = data.frame(x = mtcars$wt, y = 0),
mapping = aes(x, y),
color = "firebrick", size = 3)
# the ggplot
p_hdp
To render this in loon, which layer will be interactive needs to be determined. By default, this will be the first layer appearing in the construction of the ggplot – in this case the the histogram. The scatter of points will not be interactive and there are presently no interactive curves in a loon plot.
geom
Layer
The default active layer is the first geom
having a
“model” layer counterpart in a loon plot. In this way, the order of
construction of the ggplot implicitly determines the geom
to be active in loon. Alternatively, the active layer can be explicitly
specified when creating the loon plot via the
activeGeomLayers
argument.
# To have the the histogram active (the first geom in p_hdp)
l_p_Hdp <- ggplot2loon(p_hdp, activeGeomLayers = 1,
linkingGroup = "Motor Trend 1974")
# The following creates an error because the line for the density
# is not currently a possible model layer in loon
# ggplot2loon(p_hdp, activeGeomLayers = 2,
# linkingGroup = "Motor Trend 1974")
#
# To have the points be active (the third geom in p_hdp)
l_p_hdP <- ggplot2loon(p_hdp, activeGeomLayers = 3,
linkingGroup = "Motor Trend 1974")
Note that the colouring will reflect the linking and indicate which layer is actually the active model layer. Check also the layers tab in each of these to see the distinction.
geom_point()
When there are two or more geom_point()
layers in the
ggplot, it might sometime be of interest to have two or more of these
merged into a single active layer.
For example, consider the following ggplot construction
pgps <- ggplot(mtcars, aes(x = wt, y = mpg)) +
geom_point(data = subset(mtcars, gear == 3), col = "firebrick") +
geom_point(data = subset(mtcars, gear == 4), col = "steelblue") +
geom_point(data = subset(mtcars, gear == 5), col = "black")
pgps
To create the interactive plot, any of the geom_point
layers could contribute to the active scatterplot model layer.
l_pgps.1 <- ggplot2loon(pgps, linkingGroup = "Motor Trend 1974")
l_pgps.13 <- ggplot2loon(pgps, activeGeomLayers = c(1,3),
linkingGroup = "Motor Trend 1974")
l_pgps.123 <- ggplot2loon(pgps, activeGeomLayers = c(1, 2, 3),
linkingGroup = "Motor Trend 1974")
These plots now have different subsets of the points as their scatterplot “model” or active layer.
WARNING The linking works as expected between these three plots but not with any of the other plots in the same linking group. With a little extra care/work, this too is possible.
linkingKey
One solution to getting the linking correct is to force the
linkingKey
of any new plot to match the keys already
defined by the other plots in the linking group.
The linking key of a loon plot is simply a character vector (default here)
# this group's linking keys can be found from l_p1
groupKeys <- l_p1["linkingKey"]
These values need to be selected and matched to the subsetting used
by the geom_point()
calls in the ggplot.
# the subsets were defined as
dataGeom1 <- mtcars$gear == 3
dataGeom2 <- mtcars$gear == 4
dataGeom3 <- mtcars$gear == 5
As with linkingGroup
, any change to
linkingKey
must also declare whether the existing linked
states are to be pushed to, or pulled from, the group. Hence
l_configure()
must be used.
l_configure(l_pgps.1,
linkingKey = groupKeys[dataGeom1],
sync = "pull")
l_configure(l_pgps.13,
linkingKey = c(groupKeys[dataGeom1],
groupKeys[dataGeom3]),
sync = "pull")
l_configure(l_pgps.123,
linkingKey = c(groupKeys[dataGeom1],
groupKeys[dataGeom2],
groupKeys[dataGeom3]),
sync = "pull")
Alternatively, the linking keys could be set up correctly when the loon plot was created
l_pgps.23 <- ggplot2loon(pgps,
linkingGroup = "Motor Trend 1974",
activeGeomLayers = c(2,3),
linkingKey = c(groupKeys[dataGeom2],
groupKeys[dataGeom3])
)
l_compound
Plots
A significant data analytic feature in the ggplot
pipeline is the notion of faceting the plots, for example, via
facet_wrap()
fwrap <- ggplot(mtcars, aes(x = wt, y = mpg)) +
geom_point() +
facet_wrap(~ gear + am)
fwrap
Faceted plots can also be rendered interactive in the same way as any
other ggplot
objects:
l_fwrap <- ggplot2loon(fwrap, linkingGroup = "Motor Trend 1974")
which will look like
Note that because we set the linkingGroup
at the time of
creation (and because the linkingKey
is the the default in
all plots in this group), the interactive faceted plot has pulled the
colours from the other plots in the same linkingGroup
.
Note the class of l_fwrap
:
class(l_fwrap)
# [1] "l_ggplot" "l_compound" "loon"
is an l_ggplot
because it is a loon rendition of a more
complex ggplot
,and it is an l_compound
which
indicates that it is a composition of individual loon plots.
As an l_compound
plot, other features of the composition
can be extracted. Such as the list of plots involved in the
construction:
l_getPlots(l_fwrap)
as well as their layout in the composition
l_getLocations(l_fwrap)
# [,1] [,2]
# [1,] 1 2
# [2,] 3 4
expressed as a matrix of the location of the plots in the order in
which they appear in getPlots()
.
Each plot in l_fwrap
is an independent loon plot which
can be interacted with either through the inspector or
programmatically.
A patchwork
object extends ggplot2
to
multiple displays.
library(patchwork)
p1 <- ggplot(mtcars) + geom_point(aes(mpg, disp))
p2 <- ggplot(mtcars) + geom_boxplot(aes(gear, disp, group = gear))
patchwork <- p1 + p2 # two plots are placed side by side
patchwork
A patchwork
object can be turned to an
l_compound
object.
l_patchwork <- ggplot2loon(patchwork)
class(l_patchwork)
# [1] "l_patchwork" "l_compound" "loon"
By default, all plots in this l_compound
object are
linked with the default linkingGroup
– the unique path name
(e.g., “.l0.patchwork”). Suppose users want all plots un-linked or
linked with other linkingGroup
, simply set,
ggplot2loon(patchwork, linkingGroup = "none") # un-linked
ggplot2loon(patchwork, linkingGroup = "Motor Trend 1974") # join another group
Currently, in this version, not all designs of a
patchwork
object can be transformed to an
l_compound
object, as in
p3 <- ggplot(mtcars) + geom_smooth(aes(disp, qsec))
pp <- (p1 | p2) / p3
pp
# ERROR
ggplot2loon(pp)
In this expression, p1
and p2
, placed side
by side, are combined first as a patchwork
object, then,
p3
, placed at the bottom, combines the
patchwork
object to become a new patchwork
.
Therefore, pp
(a patchwork
object) has two
children, the first child is a patchwork
(p1
|
p2
) and the second is a ggplot
(p3
). So far, in this version, ggplot2loon()
cannot transform a patchwork
object whose children are
nested patchwork
objects.
Alternatively, one can set pp
by functions
plot_layout()
and area()
(see patchwork),
as follow
design <- c(
area(1,1),
area(1,2),
area(2,1,2,2)
)
pp <- p1 + p2 + p3 + plot_layout(design = design)
# Success!
ggplot2loon(pp)
In this way, pp
is a patchwork
object whose
children are ggplot
objects only.
loon.ggplot()
loon.ggplot()
function in loon.ggplot
package is an S3
method and gathers features of both
loon2ggplot()
and ggplot2loon()
. It can take
either a loon
widget or a gg
object and
transform back and forth.
ggplot to loon: loon.ggplot(gg)
is equivalent to
ggplot2loon(gg)
.
See the vignette A Grammar of Interactive Graphics
for
more.