---
title: 'Zigzag expanded navigation plots in R: The R package zenplots'
author: M. Hofert and R. W. Oldford
date: '`r Sys.Date()`'
output:
  rmarkdown::html_vignette: # lighter version than 'rmarkdown::html_document'; see https://bookdown.org/yihui/rmarkdown/r-package-vignette.html; there is also knitr:::html_vignette but it just calls rmarkdown::html_document with a custom .css
    css: style.css # see 3.8 in https://bookdown.org/yihui/rmarkdown/r-package-vignette.html
vignette: >
  %\VignetteIndexEntry{Zigzag expanded navigation plots in R: The R package zenplots}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---
This vignette accompanies the paper "Zigzag expanded navigation plots in R: The R package zenplots".
Its sections are numbered as in the paper (sections not covered here are omitted).
We recommend reading the paper alongside this vignette.

```{r setup, message = FALSE}
# attaching required packages
library(PairViz)
library(MASS)
library(zenplots)
```

## 2 Zenplots

As example data, we use the `olive` data set:
```{r, message = FALSE}
data(olive, package = "zenplots")
```

Reproducing the plots of Figure 1:
```{r, fig.align = "center", fig.width = 6, fig.height = 8}
zenplot(olive)
```
```{r, fig.align = "center", fig.width = 6, fig.height = 8}
zenplot(olive, plot1d = "layout", plot2d = "layout")
```

Considering the `str()`ucture of `zenplot()` (here formatted for nicer output):
```{r, eval = FALSE}
str(zenplot)
```
```{r, eval = FALSE}
function (x, turns = NULL, first1d = TRUE, last1d = TRUE,
          n2dcols = c("letter", "square", "A4", "golden", "legal"),
          n2dplots = NULL,
          plot1d = c("label", "points", "jitter", "density", "boxplot",
                     "hist", "rug", "arrow", "rect", "lines", "layout"),
          plot2d = c("points", "density", "axes", "label", "arrow",
                     "rect", "layout"),
          zargs = c(x = TRUE, turns = TRUE, orientations = TRUE,
                    vars = TRUE, num = TRUE, lim = TRUE, labs = TRUE,
                    width1d = TRUE, width2d = TRUE,
                    ispace = match.arg(pkg) != "graphics"),
          lim = c("individual", "groupwise", "global"),
          labs = list(group = "G", var = "V", sep = ", ", group2d = FALSE),
          pkg = c("graphics", "grid", "loon"),
          method = c("tidy", "double.zigzag", "single.zigzag"),
          width1d = if (is.null(plot1d)) 0.5 else 1,
          width2d = 10,
          ospace = if (pkg == "loon") 0 else 0.02,
          ispace = if (pkg == "graphics") 0 else 0.037, draw = TRUE, ...)
```


### 2.1 Layout

To investigate the layout options of zenplots a bit more, we need a larger data set. To this end we
simply double the olive data here (obviously only for illustration purposes):
```{r}
olive2 <- cbind(olive, olive) # just for this illustration
```

Reproducing the plots of Figure 2:

```{r, fig.align = "center", fig.width = 8, fig.height = 13.3}
zenplot(olive2, n2dcols = 6, plot1d = "layout", plot2d = "layout",
        method = "single.zigzag")
```
```{r, fig.align = "center", fig.width = 8, fig.height = 8}
zenplot(olive2, n2dcols = 6, plot1d = "layout", plot2d = "layout",
        method = "double.zigzag")
```
```{r, fig.align = "center", fig.width = 8, fig.height = 6.6}
zenplot(olive2, n2dcols = 6, plot1d = "layout", plot2d = "layout",
        method = "tidy")
```

Note that there is also `method = "rectangular"`. It leaves the zigzagging zenplot paradigm but
is useful for laying out 2d plots which are not necessarily connected through a variable. In this
case, we replace the default 1d plots (labels), which are rather confusing in this example, by
arrows:
```{r, fig.align = "center", fig.width = 8, fig.height = 5.4}
zenplot(olive2, n2dcols = 6, plot1d = "arrow", plot2d = "layout",
        method = "rectangular")
```

Reproducing the plots of Figure 3:
```{r, fig.align = "center", fig.width = 6, fig.height = 10}
zenplot(olive, plot1d = "layout", plot2d = "layout", method = "double.zigzag",
        last1d = FALSE, ispace = 0.1)
```
```{r, fig.align = "center", fig.width = 6, fig.height = 7}
zenplot(olive, plot1d = "layout", plot2d = "layout", n2dcols = 4, n2dplots = 8,
        width1d = 2, width2d = 4)
```


## 3 Zenpaths

A very basic path (standing for the sequence of pairs (1,2), (2,3), (3,4), (4,5)):
```{r}
(path <- 1:5)
```
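
To make the correspondence between a path and its pairs explicit, the following base R sketch lists the consecutive pairs of a path. The helper name `path_pairs` is chosen here for illustration only; it is not part of zenplots.

```{r}
# Hypothetical helper (not part of zenplots): consecutive pairs of a path
path_pairs <- function(path)
    cbind(from = path[-length(path)], to = path[-1])
path_pairs(1:5) # rows: (1,2), (2,3), (3,4), (4,5)
```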

A zenpath through all pairs of variables (Eulerian):
```{r}
(path <- zenpath(5))
```

If `dataMat` is a five-column matrix, the zenplot of all pairs would then be constructed as follows:
```{r, eval = FALSE}
zenplot(x = dataMat[,path])
```
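
Since `dataMat` is not defined in this vignette, here is a small base R sketch of what the column indexing does. The simulated matrix `toyMat` and the hand-written path `toyPath` are assumptions made for illustration; `toyPath` is one path through all pairs of five variables and not necessarily the one returned by `zenpath(5)`.

```{r}
set.seed(271) # for reproducibility
toyMat <- matrix(rnorm(20 * 5), ncol = 5) # simulated stand-in for 'dataMat'
toyPath <- c(1, 2, 3, 1, 4, 2, 5, 3, 4, 5, 1) # hand-written path through all pairs
toyPathPairs <- cbind(toyPath[-length(toyPath)], toyPath[-1]) # consecutive pairs
toyPathPairs <- t(apply(toyPathPairs, 1, sort)) # ignore the order within each pair
nrow(unique(toyPathPairs)) == choose(5, 2) # are all 10 pairs visited?
dim(toyMat[, toyPath]) # columns are repeated according to the path
```

Indexing the columns by the path repeats variables as needed, so that neighboring columns of the resulting matrix are exactly the pairs to be plotted.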

The `str()`ucture of `zenpath()` (again formatted for nicer output):
```{r, eval = FALSE}
str(zenpath)
```
```{r, eval = FALSE}
function (x, pairs = NULL,
          method = c("front.loaded", "back.loaded", "balanced",
                     "eulerian.cross", "greedy.weighted", "strictly.weighted"),
          decreasing = TRUE)
```

Here are some methods for five variables:
```{r}
zenpath(5, method = "front.loaded")
zenpath(5, method = "back.loaded")
zenpath(5, method = "balanced")
```

The following method considers two groups: one of size three, the other of size five.
The sequence of pairs is constructed such that the first variable comes from the first group,
the second from the second.
```{r}
zenpath(c(3,5), method = "eulerian.cross")
```
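
For orientation, the set of cross-group pairs such a path has to cover can be listed in base R. This sketch assumes that the variables of the first group are indexed 1 to 3 and those of the second group 4 to 8; the name `crossPairs` is chosen for illustration.

```{r}
# Assumption: variables 1:3 form the first group and 4:8 the second
crossPairs <- expand.grid(first = 1:3, second = 4:8)
nrow(crossPairs) # 3 * 5 cross-group pairs
head(crossPairs)
```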

Reproducing Figure 4:
```{r, fig.align = "center", fig.width = 6, fig.height = 9}
oliveAcids <- olive[, !names(olive) %in% c("Area", "Region")] # acids only
zpath <- zenpath(ncol(oliveAcids)) # all pairs
zenplot(oliveAcids[, zpath], plot1d = "hist", plot2d = "density")
```


## 4 Build your own zenplots

### 4.3 Custom layout and plots -- a spiral of ggplots example

Figure 5 can be reproduced as follows (note that we do not show the plot
here due to a CRAN issue when running this vignette):
```{r, fig.align = "center", fig.width = 8, fig.height = 7.2, eval = FALSE}
path <- c(1,2,3,1,4,2,5,1,6,2,7,1,8,2,3,4,5,3,6,4,7,3,8,4,5,6,7,5,8,6,7,8)
turns <- c("l",
           "d","d","r","r","d","d","r","r","u","u","r","r","u","u","r","r",
           "u","u","l","l","u","u","l","l","u","u","l","l","d","d","l","l",
           "u","u","l","l","d","d","l","l","d","d","l","l","d","d","r","r",
           "d","d","r","r","d","d","r","r","d","d","r","r","d","d")

library(ggplot2) # for ggplot2-based 2d plots
stopifnot(packageVersion("ggplot2") >= "2.2.1") # need 2.2.1 or higher
ggplot2d <- function(zargs) {
  r <- extract_2d(zargs)
  num2d <- zargs$num/2
  df <- data.frame(x = unlist(r$x), y = unlist(r$y))
  p <- ggplot() +
    geom_point(data = df, aes(x = x, y = y), cex = 0.1) +
    theme(axis.line = element_blank(),
          axis.ticks = element_blank(),
          axis.text.x = element_blank(),
          axis.text.y = element_blank(),
          axis.title.x = element_blank(),
          axis.title.y = element_blank())
  if(num2d == 1) p <- p +
    theme(panel.background = element_rect(fill = 'royalblue3'))
  if(num2d == (length(zargs$turns)-1)/2) p <- p +
    theme(panel.background = element_rect(fill = 'maroon3'))
  ggplot_gtable(ggplot_build(p))
}

zenplot(as.matrix(oliveAcids)[,path], turns = turns, pkg = "grid",
        plot2d = function(zargs) ggplot2d(zargs))
```
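
As the chunk above is not evaluated, the following base R sketch checks two properties of its inputs: `turns` should have one entry per plot (two times the number of variables in the path minus one), and the path should visit every pair of the eight acids. The names `spiralPath`, `spiralTurns` and `spiralPairs` are copies made here for illustration.

```{r}
spiralPath <- c(1,2,3,1,4,2,5,1,6,2,7,1,8,2,3,4,5,3,6,4,7,3,8,4,5,6,7,5,8,6,7,8)
spiralTurns <- c("l",
                 "d","d","r","r","d","d","r","r","u","u","r","r","u","u","r","r",
                 "u","u","l","l","u","u","l","l","u","u","l","l","d","d","l","l",
                 "u","u","l","l","d","d","l","l","d","d","l","l","d","d","r","r",
                 "d","d","r","r","d","d","r","r","d","d","r","r","d","d")
length(spiralTurns) == 2 * length(spiralPath) - 1 # one turn per 1d and 2d plot?
spiralPairs <- cbind(spiralPath[-length(spiralPath)], spiralPath[-1]) # consecutive pairs
spiralPairs <- t(apply(spiralPairs, 1, sort)) # ignore the order within each pair
nrow(unique(spiralPairs)) == choose(8, 2) # is every pair of the 8 acids visited?
```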


### 4.4 Data groups

Split the olive data set into three groups (according to their variable `Area`):
```{r}
oliveAcids.by.area <- split(oliveAcids, f = olive$Area)
# Replace the "." by " " in third group's name
names(oliveAcids.by.area)[3] <- gsub("\\.", " ", names(oliveAcids.by.area)[3])
names(oliveAcids.by.area)
```
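
The effect of `split()` is perhaps easiest to see on a toy data frame. The objects `toy` and `toyGroups` below are made up for illustration; the result is a list with one data frame per level of the grouping factor, which is the kind of object passed to `zenplot()` next.

```{r}
toy <- data.frame(value = 1:6,
                  group = factor(c("a", "b", "a", "c", "b", "a")))
toyGroups <- split(toy["value"], f = toy$group) # list of data frames
names(toyGroups) # one component per factor level
sapply(toyGroups, nrow) # group sizes
```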

Reproducing the plots of Figure 6 (note that using `lim = "groupwise"` does not
produce a particularly meaningful plot here):
```{r, fig.align = "center", fig.width = 6, fig.height = 8}
zenplot(oliveAcids.by.area, labs = list(group = NULL))
```
```{r, fig.align = "center", fig.width = 6, fig.height = 8}
zenplot(oliveAcids.by.area, lim = "groupwise", labs = list(sep = " - "),
        plot1d = function(zargs) label_1d_graphics(zargs, cex = 0.8),
        plot2d = function(zargs)
            points_2d_graphics(zargs, group... = list(sep = "\n - \n")))
```


### 4.5 Custom zenpaths

Find the "convexity" scagnostic for each pair of olive acids.
```{r, message = FALSE}
library(scagnostics)
Y <- scagnostics(oliveAcids) # compute scagnostics (scatter-plot diagnostics)
X <- Y["Convex",] # pick out component 'convex'
d <- ncol(oliveAcids)
M <- matrix(NA, nrow = d, ncol = d) # matrix with all 'convex' scagnostics
M[upper.tri(M)] <- X # (i,j)th entry = scagnostic of column pair (i,j) of oliveAcids
M[lower.tri(M)] <- t(M)[lower.tri(M)] # symmetrize
round(M, 5)
```
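
The fill-and-symmetrize idiom used above can be traced on a 3-by-3 toy example. The weights `w` are made up for illustration; note that `upper.tri()` indexing fills the matrix column by column.

```{r}
w <- c(0.1, 0.2, 0.3) # toy weights for the pairs (1,2), (1,3), (2,3)
W <- matrix(NA, nrow = 3, ncol = 3)
W[upper.tri(W)] <- w # filled column by column: (1,2), (1,3), (2,3)
W[lower.tri(W)] <- t(W)[lower.tri(W)] # mirror the upper triangle
W
```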

Show the six pairs with largest "convexity" scagnostic:
```{r}
zpath <- zenpath(M, method = "strictly.weighted") # list of ordered pairs
head(M[do.call(rbind, zpath)]) # show the largest six 'convexity' measures
```
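
The idea of ordering pairs strictly by weight can be sketched in base R as follows. This only illustrates the ordering on a made-up weight matrix `toyW`; it is not the implementation used by `zenpath()`, whose output format (a list of pairs) differs.

```{r}
toyW <- matrix(c( NA, 0.1, 0.7,
                 0.1,  NA, 0.4,
                 0.7, 0.4,  NA), nrow = 3, byrow = TRUE) # toy symmetric weights
ind <- which(upper.tri(toyW), arr.ind = TRUE) # index pairs (i, j) with i < j
ord <- order(toyW[ind], decreasing = TRUE) # largest weight first
cbind(ind[ord, , drop = FALSE], weight = toyW[ind][ord]) # pairs sorted by weight
```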

Extract the corresponding pairs:
```{r}
(ezpath <- extract_pairs(zpath, n = c(6, 0))) # extract the first six pairs
```

Reproducing Figure 7 (visualizing the pairs):
```{r, message = FALSE, fig.align = "center", fig.width = 6, fig.height = 6}
library(graph)
library(Rgraphviz)
plot(graph_pairs(ezpath)) # depict the six most convex pairs (edge = pair)
```

Connect them:
```{r}
(cezpath <- connect_pairs(ezpath)) # keep the same order but connect the pairs
```
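
To convey what connecting pairs means, here is a simplified stand-in written in base R. It assumes that consecutive pairs are merged whenever the second entry of one pair equals the first entry of the next. The function `connect_sketch` and the matrix `toyPairs` are hypothetical, and `connect_pairs()` may differ in its details.

```{r}
# Hypothetical stand-in for illustration only (not the code of connect_pairs())
connect_sketch <- function(pairs) {
    res <- list(pairs[1, ]) # the first pair starts the first group
    for(i in seq_len(nrow(pairs))[-1]) {
        k <- length(res)
        if(tail(res[[k]], 1) == pairs[i, 1]) {
            res[[k]] <- c(res[[k]], pairs[i, 2]) # extend the current group
        } else {
            res[[k + 1]] <- pairs[i, ] # start a new group
        }
    }
    res
}
toyPairs <- rbind(c(1, 2), c(2, 5), c(3, 4), c(4, 1))
connect_sketch(toyPairs) # two groups: 1 2 5 and 3 4 1
```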

Build the corresponding list of matrices:
```{r}
oliveAcids.grouped <- groupData(oliveAcids, indices = cezpath) # group data for (zen)plotting
```

Reproducing Figure 8 (zenplot of the six pairs of acids with largest "convexity" scagnostic):
```{r, fig.align = "center", fig.width = 6, fig.height = 8}
zenplot(oliveAcids.grouped)
```


## 5 Advanced features

### 5.1 The structure of a zenplot

Here is the structure of a return object of `zenplot()`:
```{r}
res <- zenplot(olive, plot1d = "layout", plot2d = "layout", draw = FALSE)
str(res)
```

Let's have a look at the components. The occupancy matrix encodes the occupied
cells in the rectangular layout:
```{r}
res[["path"]][["occupancy"]]
```

The two-column matrix `positions` contains in the *i*th row the row and column
index (in the occupancy matrix) of the *i*th plot:
```{r}
head(res[["path"]][["positions"]])
```
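
The relation between positions and occupied cells can be mimicked with matrix indexing in base R. The toy matrix `toyPos` is made up, and the logical matrix `occupied` is a simplification of the occupancy matrix printed above: it only records whether a cell is used.

```{r}
toyPos <- rbind(c(1, 1), c(1, 2), c(2, 2), c(3, 2)) # (row, column) of plots 1 to 4
occupied <- matrix(FALSE, nrow = max(toyPos[, 1]), ncol = max(toyPos[, 2]))
occupied[toyPos] <- TRUE # matrix indexing: one (row, column) pair per plot
occupied
```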

### 5.2 Tools for writing 1d and 2d plot functions

Example structure of a 2d plot function based on `graphics`:
```{r}
points_2d_graphics
```

For setting up the plot region of plots based on `graphics`:
```{r}
plot_region
```

Determining the indices of the two variables to be plotted in the current 2d plot
(the same function also works for 1d plots):
```{r}
plot_indices
```

Basic check that the return value of `zenplot()` is actually the return value of the
underlying `unfold()` (note that `res` and the output of `unfold()` are not identical since `res` carries specific class attributes):
```{r}
n2dcols <- ncol(olive) - 1 # number of faces of the hypercube
uf <- unfold(nfaces = n2dcols)

identical(res, uf) # returns FALSE (the class attributes differ)
for(name in names(uf)) {
   stopifnot(identical(res[[name]], uf[[name]]))
}
```