---
title: "The persistence class"
vignette: >
  %\VignetteIndexEntry{The persistence class}
  %\VignetteEngine{quarto::html}
  %\VignetteEncoding{UTF-8}
knitr:
  opts_chunk:
    collapse: true
    comment: '#>'
---

```{r}
#| label: setup
library(phutil)
```

## Structure of the class

An object of class `persistence` is a list of 2 elements:

- `pairs`: A list of 2-column matrices containing birth-death pairs. The $i$-*th* element of the list corresponds to the $(i-1)$-*th* homology dimension. If there are no pairs for a given dimension but there are pairs in higher dimensions, the corresponding element(s) is/are filled with a $0 \times 2$ numeric matrix.
- `metadata`: A list of length 6 containing information about how the data was computed:
    - `orderered_pairs`: A boolean indicating whether the pairs in the `pairs` list are ordered (i.e. the first column is strictly less than the second column).
    - `data`: The name of the object containing the original data on which the persistence data was computed.
    - `engine`: The name of the package and the function of this package that computed the persistence data in the form `"package_name::package_function"`.
    - `filtration`: The filtration used in the computation in a human-readable format (i.e. full names, capitals where needed, etc.).
    - `parameters`: A list of parameters used in the computation.
    - `call`: The exact call that generated the persistence data.

## Supported inputs

The `persistence` class is designed to support a variety of inputs, including:

A single numeric matrix

:   If the user provides a matrix, it must have at least 2 columns and each row represents a topological feature.

    - If it has 2 columns, we assume that the first column corresponds to the birth of a feature and the second column corresponds to the death of a feature, irrespective of the order of the columns. In this case, we assume that the homology dimension of the feature is 0.
    - If it has more than 2 columns, we assume that the first column corresponds to the homology dimension of the feature, the second column corresponds to the birth of a feature, and the third column corresponds to the death of a feature, irrespective of the order of the columns. The remaining columns are ignored.

A list of numeric matrices

:   If the user provides a list of matrices, each list element corresponds to a homology dimension, from 0 to some maximum value. Each matrix must have at least 2 columns and each row represents a topological feature in the corresponding homology dimension (given by the matrix index in the list minus 1). Each matrix is parsed as described above.

A data frame

:   If the user provides an object of class `data.frame`, it must have at least 2 columns and each row represents a topological feature. If it has exactly 2 columns, we add a `dimension` column with all values set to 0. If it has more than 2 columns, we require that `birth` and `death` exist in the column names. The `birth` and `death` columns are parsed as described above. The remaining columns are ignored.

An object of class 'PHom'

:   If the user provides an object of class 'PHom' as typically produced by `ripserr::vietoris_rips()`, it is a `base::data.frame` with columns `dimension`, `birth`, and `death` in that specific order. The `dimension` column is of type integer while the `birth` and `death` columns are of type numeric. The `dimension` column is used to create a list of matrices, where each matrix corresponds to a homology dimension, from 0 to the maximum value in the `dimension` column.

An object of class 'diagram'

:   If the user provides an object of class 'diagram' as typically produced by `TDA::*Diag()` functions in entry `diagram`, it is a `base::matrix` with 3 columns named `dimension`, `Birth` and `Death` in that specific order. The `dimension` column is of type integer while the `Birth` and `Death` columns are of type numeric.
Furthermore, the object stores as attributes the parameters used to compute the diagram and the entire call to the function that produced the diagram. We first convert the column names `Birth` and `Death` to lowercase. Next, the `dimension` column is used to create a list of matrices, where each matrix corresponds to a homology dimension, from 0 to the maximum value in the `dimension` column. The `birth` and `death` columns are parsed as described above. The remaining columns are ignored.

An object of class 'hclust'

:   If the user provides an object of class 'hclust' as typically produced by `stats::hclust()`, it is a `base::list` containing the `height` element, a set of $n-1$ real values (non-decreasing for ultrametric trees) storing the clustering height, that is, the value of the criterion associated with the clustering method for the particular agglomeration. Each height is used as the death of a homological feature, while a birth of `0` is typically used.
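
The parsing rules described in this section can be illustrated with a short base R sketch. It is not the package's implementation: `sketch_pairs()` is a hypothetical helper written only for this example, and it assumes a numeric matrix with at least one row. It shows how a matrix input could be split into the per-dimension `pairs` list, with a 0-row matrix filling a dimension that has no features, and how the `height` element of an `hclust` object could be turned into birth-death pairs with a birth of `0`.

```r
# Hypothetical helper (not part of phutil): split a numeric matrix into a
# list of 2-column birth-death matrices, one per homology dimension.
sketch_pairs <- function(x) {
  stopifnot(is.matrix(x), is.numeric(x), ncol(x) >= 2, nrow(x) >= 1)
  if (ncol(x) == 2) {
    # Two columns: birth and death only, so every feature gets dimension 0.
    x <- cbind(0, x)
  }
  dims <- as.integer(x[, 1])
  lapply(0:max(dims), function(d) {
    # drop = FALSE keeps a 2-column matrix even when no row matches,
    # which yields the 0 x 2 placeholder for an empty dimension.
    m <- x[dims == d, 2:3, drop = FALSE]
    dimnames(m) <- list(NULL, c("birth", "death"))
    m
  })
}

# Three features: two in dimension 0, none in dimension 1, one in dimension 2.
x <- rbind(
  c(0, 0.0, 0.5),
  c(0, 0.0, 1.2),
  c(2, 0.7, 0.9)
)
pairs_list <- sketch_pairs(x)
length(pairs_list)    # one element per dimension from 0 to 2
dim(pairs_list[[2]])  # dimension 1 is empty: a 0 x 2 matrix

# An 'hclust' object stores n - 1 merge heights; each one is read as the
# death of a dimension-0 feature born at 0.
hc <- stats::hclust(stats::dist(c(0, 1, 3, 7)))
h0 <- cbind(birth = 0, death = hc$height)
h0
```

Using `drop = FALSE` in the subsetting step is what keeps the list positions aligned with homology dimensions: the $i$-th element always refers to dimension $i-1$, even when that dimension contributes no pairs.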