---
title: "QuAnTeTrack"
output: rmarkdown::html_vignette
vignette: >
%\VignetteIndexEntry{QuAnTeTrack}
%\VignetteEngine{knitr::rmarkdown}
%\VignetteEncoding{UTF-8}
---
```{r, include = FALSE}
knitr::opts_chunk$set(
collapse = TRUE,
comment = "#>",
fig.width = 16,
fig.height = 8,
fig.retina = NULL,
out.width = "100%"
)
```
```{css, echo=FALSE}
pre {
max-height: 300px;
overflow-y: auto;
}
pre[class] {
max-height: 500px;
}
```
```{r, include = FALSE}
if (Sys.getenv("RGL_USE_NULL") == "" && !interactive()) {
Sys.setenv(RGL_USE_NULL = "TRUE")
}
```
## **Getting Started with QuAnTeTrack**
### **Installation**
To install the **QuAnTeTrack** package, you can choose between installing the **stable version from CRAN** (recommended) or the **development version from GitHub**.
#### **From CRAN (recommended)**
To install the stable version from CRAN, use:
``` r
install.packages("QuAnTeTrack")
```
#### **From GitHub (development version)**
If you want the latest development version, you will need to use the `devtools` package. If you haven't installed `devtools` yet, you can do so with the following command:
``` r
install.packages("devtools")
```
Once `devtools` is installed, you can install **QuAnTeTrack** using:
``` r
devtools::install_github("MacroFunUV/QuAnTeTrack")
```
If you have already installed **QuAnTeTrack** and want to ensure you have the latest version, you can update it with:
``` r
devtools::install_github("MacroFunUV/QuAnTeTrack", force = TRUE)
```
### **Loading the Package**
Once installed, you can load the package using:
```{r setup}
library(QuAnTeTrack)
```
This command will make all the functions from **QuAnTeTrack** available for use. You are now ready to begin your trackway analysis!
## **Overview of the Analytical Workflow in QuAnTeTrack**
The **QuAnTeTrack** package (**Qu**antitative **An**alysis of **Te**trapod **Track**ways) provides a structured and comprehensive workflow for analyzing trackway data, facilitating the assessment of paleoecological and paleoethological hypotheses. The workflow integrates various functions for data digitization, loading, exploratory analysis, statistical testing, simulation, similarity assessment, intersection detection, and clustering. This pipeline aims to help researchers reconstruct, compare, and interpret movement patterns and behavioral dynamics of trackmakers.
### **1. Data Digitization and Preprocessing**
The first step involves **digitizing the trackway data** using the **TPS software suite**, particularly:
- **tpsUtil** (*Rohlf, 2008*): For compiling and converting `.TPS` files.\
- **tpsDig** (*Rohlf, 2009*): For digitizing footprint coordinates from trackways.
The digitization process should ensure that the footprints are consistently recorded across all tracks. This process is essential for converting raw images into structured data for further analysis.
### **2. Loading Data with `tps_to_track()`**
Once digitized, the data is loaded into **QuAnTeTrack** using the `tps_to_track()` function. This function:
- Reads `.TPS` files containing digitized footprints within tracks.
- Extracts and organizes data into structured `track` R objects.
- Handles missing footprints through interpolation if required.
- Converts raw data into real-world measurements using user-specified scales.
The resulting **`track` R objects** contain:
- **Trajectories:** Interpolated pathways derived from midpoints between footprints.\
- **Footprints:** Original digitized points and metadata for each track.
Additionally, if the dataset is extensive, users can utilize the **`subset_track()`** function to isolate specific tracks for focused analysis. This step helps avoid computational overhead and allows customized analyses of selected trajectories.
### **3. Exploratory Analysis of Track Parameters**
Before testing specific hypotheses, users should perform an initial exploration of the data. This includes:
- **Visual Inspection of Tracks (`plot_track()`)**:\
Generates visualizations of trackways and footprints to inspect their overall structure. The function offers various modes:
- Plotting only footprints
- Plotting only tracks
- Plotting both footprints and tracks
- **Parameter Calculation (`track_param()`)**:\
Calculates essential movement parameters, including:
- Step lengths
- Turning angles
- Total distance and track length
- Sinuosity
- Straightness
- **Velocity Calculation (`velocity_track()`)**:\
Estimates velocities and relative stride lengths for each track, applying formulas based on empirical studies. This step is crucial for understanding speed dynamics and comparing them across different trackmakers or scenarios.
- **Visualization of Velocity Patterns (`plot_velocity()`)**:\
Provides a detailed view of how velocity or relative stride length changes along each track. This visualization is essential for identifying patterns of acceleration, deceleration, or steady movement.
- **Direction Analysis (`plot_direction()`)**:\
Provides various visualization options to explore trackway directionality:
- Boxplots of step directions
- Polar histograms of step directions and average directions
- Faceted plots for comparing multiple tracks
These functions help identify general patterns and irregularities in the data before proceeding with formal statistical testing.
### **4. Testing Directional and Velocity Patterns**
To assess whether tracks exhibit distinct movement patterns, the following statistical tests can be applied:
- **Testing Velocity (`test_velocity()`)**:
- Compares mean velocities across tracks using ANOVA, Kruskal-Wallis, or GLM.
- Performs pairwise comparisons if necessary.
- Provides visualizations of velocity distributions across different tracks.
- **Movement Mode Analysis (`mode_velocity()`)**:
- Applies Spearman’s rank correlation to detect trends of **acceleration**, **deceleration**, or **steady movement** along each track.
- **Testing Direction (`test_direction()`)**:
- Compares mean directions across tracks using ANOVA, Kruskal-Wallis, or GLM.\
- Performs pairwise comparisons if necessary.
These statistical tests allow researchers to rigorously compare and quantify movement characteristics, providing a foundation for hypothesis testing.
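The trend test behind the movement mode analysis can be illustrated with base R alone. The sketch below is not the implementation of `mode_velocity()`; it only shows the underlying idea, using invented step velocities and an assumed significance threshold of 0.05.

```r
# Hypothetical step velocities (m/s) along one track, in walking order
step_velocity <- c(1.10, 1.18, 1.25, 1.31, 1.42, 1.55, 1.63)
step_index <- seq_along(step_velocity)

# Spearman's rank correlation between step order and velocity
trend <- cor.test(step_index, step_velocity, method = "spearman")

# Assumed decision rule: a significant positive rho suggests acceleration,
# a significant negative rho suggests deceleration, otherwise steady movement
alpha <- 0.05
movement_mode <- if (trend$p.value >= alpha) {
  "steady"
} else if (trend$estimate > 0) {
  "acceleration"
} else {
  "deceleration"
}
movement_mode
```

Because the invented velocities increase at every step, the correlation is perfectly positive and the track is labeled as accelerating.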
### **5. Simulation-Based Hypothesis Testing (`simulate_track()`)**
The `simulate_track()` function generates **simulated trajectories** based on different movement models to test specific hypotheses. Three models are available:
- **Directed Model:** Represents highly constrained, purposeful movement along a consistent direction.\
- **Constrained Model:** Generates correlated random walks, suitable for partially directed movement.\
- **Unconstrained Model:** Represents fully random exploratory movement.
These models can be **informed by geological data** (e.g., sedimentology, paleogeomorphology, etc.) to test the influence of environmental constraints on movement. For example, natural barriers or features inferred from geological evidence may restrict the range of simulated paths.
The `plot_sim()` function overlays simulated tracks on the actual trajectories, allowing users to visually assess how well different models replicate observed track patterns. This visual comparison is essential for evaluating the realism of simulated tracks.
### **6. Comparing Simulated and Empirical Tracks**
The **QuAnTeTrack** package offers several functions aimed at comparing similarity and intersection metrics between two or more actual tracks. These metrics are then evaluated against simulated datasets to determine the probability of observing such similarity or intersection counts under scenarios of independent (non-coordinated) movement.
- **Dynamic Time Warping (`simil_DTW_metric()`)**:\
Compares trajectories based on the optimal alignment of points, allowing for variable path lengths.
- **Fréchet Distance (`simil_Frechet_metric()`)**:\
Measures similarity by comparing the overall shape of trajectories, focusing on global rather than local alignment.
- **Track Intersections (`track_intersection()`)**:\
Identifies and counts unique intersections between tracks, which can indicate interaction or coordinated movement.
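To make the idea of optimal point alignment concrete, the following base R sketch implements a textbook Dynamic Time Warping distance between two short, invented trajectories. It is a generic illustration only: the helper names `point_dist()` and `dtw_distance()` are hypothetical, and `simil_DTW_metric()` may compute and report the metric differently.

```r
# Euclidean distance between two points given as numeric vectors c(x, y)
point_dist <- function(a, b) sqrt(sum((a - b)^2))

# Classic dynamic-programming DTW between two trajectories (matrices with
# columns x and y); returns the accumulated cost of the optimal alignment
dtw_distance <- function(traj_a, traj_b) {
  n <- nrow(traj_a)
  m <- nrow(traj_b)
  cost <- matrix(Inf, nrow = n + 1, ncol = m + 1)
  cost[1, 1] <- 0
  for (i in seq_len(n)) {
    for (j in seq_len(m)) {
      d <- point_dist(traj_a[i, ], traj_b[j, ])
      # Extend the cheapest of the three admissible predecessor alignments
      cost[i + 1, j + 1] <- d + min(cost[i, j + 1], cost[i + 1, j], cost[i, j])
    }
  }
  cost[n + 1, m + 1]
}

# Two hypothetical trajectories of different lengths (coordinates in meters)
track_1 <- cbind(x = c(0, 1, 2, 3), y = c(0, 0, 0, 0))
track_2 <- cbind(x = c(0, 1, 2, 3, 4), y = c(1, 1, 1, 1, 1))

dtw_same <- dtw_distance(track_1, track_1)  # identical paths
dtw_diff <- dtw_distance(track_1, track_2)  # parallel paths, one point longer
c(identical = dtw_same, parallel = dtw_diff)
```

The alignment tolerates the extra point in `track_2`, which is why DTW suits trajectories with different numbers of steps.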
### **7. Combining Probability Metrics (`combined_prob()`)**
The `combined_prob()` function integrates *p*-values from multiple similarity metrics and intersection tests to provide a more robust assessment of observed patterns. This approach offers an overall measure of significance, enhancing the reliability of the results by accounting for different aspects of similarity and interaction.
### **8. Clustering Analysis (`cluster_track()`)**
The `cluster_track()` function is an optional but powerful step that can be applied **before formal statistical testing**. It clusters tracks based on calculated movement parameters, identifying groups of tracks with similar behaviors. The clustering process:
- Facilitates targeted testing of specific behavioral hypotheses (e.g., gregarious movement).
- Helps filter relevant datasets before applying similarity metrics.
- Informs the selection of appropriate simulation models by identifying common movement characteristics.
## **Raw Data Format**
**QuAnTeTrack** accepts raw data in the form of **.TPS files** containing footprint coordinates. Each track should be recorded as a different image within the **.TPS file**.
### **Requirements**
- **Footprint coordinates** should be digitized in **equivalent positions** within each footprint.\
- **Tracks with missing footprints** are acceptable and will be **interpolated** as needed by the package functions.\
- It is **recommended** to digitize the coordinates using the **TPS software suite**, particularly:
- **tpsUtil** (*Rohlf, 2008*) - for file manipulation and data conversion.\
- **tpsDig** (*Rohlf, 2009*) - for digitizing landmarks and outlines.
This vignette demonstrates how to **load, process, and analyze trackway data** using the **QuAnTeTrack** package. We will walk through the **Paluxy River** and the **Mount Tom** datasets, representing dinosaur tracks from the Paluxy River site (*Farlow et al., 2012*) and the Mount Tom site (*Ostrom, 1972*), respectively. Examples of `.tps` files of these datasets can be downloaded here:
- [Paluxy River](https://github.com/MacroFunUV/QuAnTeTrack/blob/master/inst/extdata/PaluxyRiver.tps)
- [Mount Tom](https://github.com/MacroFunUV/QuAnTeTrack/blob/master/inst/extdata/MountTom.tps)
## **Loading and Converting Data**
The **`tps_to_track()`** function is an essential component of the **QuAnTeTrack** package, designed to transform raw **`.TPS` files** containing digitized trackway data into structured **`track` R objects**. This tool is particularly useful for reconstructing trackways from footprints digitized using the **TPS software suite**, such as **tpsUtil** and **tpsDig**. The function reads the raw **`.TPS` files**, extracts the coordinate data, and processes it to generate **`track` R objects** that are compatible with the analytical tools provided by **QuAnTeTrack**.
The **`tps_to_track()`** function reads **`.TPS` files** where each track is represented by a series of **(x, y) coordinates** stored as separate images. These data points are then processed to generate **trajectory coordinates** by calculating the **midpoints between consecutive footprints**. These trajectories serve as reconstructed pathways, allowing users to analyze overall movement patterns. When missing footprints are encountered, the function can interpolate their positions based on the locations of adjacent footprints and the specified side (left or right) of the initial footprint.
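The midpoint step can be reproduced by hand. The sketch below uses invented pixel coordinates and an invented scale factor; it illustrates the calculation described above and is not the internal code of `tps_to_track()`.

```r
# Hypothetical digitized footprint coordinates for one track (pixels)
footprints_px <- cbind(x = c(100, 180, 260, 340), y = c(50, 90, 50, 90))

# Hypothetical scale factor in meters per pixel
scale <- 0.005
footprints_m <- footprints_px * scale

# Midpoints between each pair of consecutive footprints
n <- nrow(footprints_m)
trajectory <- (footprints_m[-n, , drop = FALSE] +
                 footprints_m[-1, , drop = FALSE]) / 2
trajectory
```

Four footprints yield three trajectory points, and the alternating left and right positions are smoothed into a single path.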
Several arguments are provided to customize data handling. The **`file`** argument specifies the path to the `.TPS` file, while the **`scale`** argument allows users to define a scale factor (in **meters per pixel**) to convert coordinates to real-world measurements. To account for missing footprints, the **`missing`** argument specifies whether interpolation is required, while the **`NAs`** argument provides a matrix detailing which footprints need interpolation. Additionally, the **`R.L.side`** argument identifies whether the first footprint of each track belongs to the **left or right side**, which is essential when dealing with incomplete trackways.
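As a sketch of how these arguments might be prepared for a dataset with several missing footprints, the example below builds an `NAs` matrix and an `R.L.side` vector for an invented three-track dataset. It assumes that each row of `NAs` describes one missing footprint, with the track number in the first column and the footprint number within that track in the second; check the function's help page before relying on this layout.

```r
# Hypothetical case: footprint 3 of track 2 and footprints 2 and 5 of
# track 3 are missing. One row per missing footprint, assuming
# column 1 = track number and column 2 = footprint number within the track.
NAs <- rbind(
  c(2, 3),
  c(3, 2),
  c(3, 5)
)
NAs

# One label per track, giving the side of the first footprint of each track
# (hypothetical three-track dataset)
R.L.side <- c("R", "L", "R")
```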
The function generates a **`track` R object** consisting of two main components: **`Trajectories`** and **`Footprints`**. The **`Trajectories`** element contains a list of interpolated trajectories, where each trajectory represents a series of midpoints calculated between consecutive footprints. The **`Footprints`** element comprises a list of data frames with the original footprint coordinates, associated metadata (such as image reference and ID), and an indicator specifying whether each footprint is actual or inferred.
The resulting **`track` R object** provides a comprehensive framework for organizing digitized trackway data, making it compatible with the various analytical functions within the **QuAnTeTrack** package. This structured data format enables users to perform advanced analyses such as calculating movement parameters, testing hypotheses about trackmaker behavior, and comparing tracks using similarity metrics. By transforming raw data into structured objects, the **`tps_to_track()`** function serves as a foundational step in the broader analytical pipeline provided by **QuAnTeTrack**.
### **Examples of Usage**
Here, the TPS files (**PaluxyRiver.tps** and **MountTom.tps**) are loaded using the `system.file()` function to ensure compatibility across systems. This approach is necessary because these files are stored as **internal data within the package** (specifically, in the `inst/extdata/` folder). Using `system.file()` ensures that the files can be accessed regardless of the user's operating system or working directory, making the vignette fully portable and reproducible. They are then converted to `track` R objects using the `tps_to_track()` function. The `scale` argument is used to set the coordinate scaling factor. For the **PaluxyRiver** dataset, no footprints are missing, so the `missing` argument is set to `FALSE` and `NAs = NULL`. For the **MountTom** dataset, some footprints are missing, so the `missing` argument is set to `TRUE`, and the missing footprints are specified using the `NAs` matrix. Additionally, the `R.L.side` argument is provided to specify the side of the first footprint of each track (either "R" for right or "L" for left).
For users working with their own data, **replace** `system.file("extdata", "PaluxyRiver.tps", package = "QuAnTeTrack")` and `system.file("extdata", "MountTom.tps", package = "QuAnTeTrack")` with the **file paths to your .TPS files** (e.g., `"C:/path/to/your/PaluxyRiver.tps"` and `"C:/path/to/your/MountTom.tps"`).
```{r, eval=FALSE}
PaluxyRiver <- tps_to_track(
system.file("extdata", "PaluxyRiver.tps", package = "QuAnTeTrack"),
scale = 0.004341493,
missing = FALSE,
NAs = NULL
)
```
```{r echo=FALSE}
PaluxyRiver <- tps_to_track(
system.file("extdata", "PaluxyRiver.tps", package = "QuAnTeTrack"),
scale = 0.004341493,
missing = FALSE,
NAs = NULL
)
```
```{r, eval=FALSE}
MountTom <- tps_to_track(
system.file("extdata", "MountTom.tps", package = "QuAnTeTrack"),
scale = 0.004411765,
missing = TRUE,
NAs = matrix(c(7, 3), nrow = 1, ncol = 2),
R.L.side = c(
"R", "L", "L", "L", "R", "L", "R", "R", "L", "L", "L", "L", "L",
"R", "R", "L", "R", "R", "L", "R", "R", "R", "R"
)
)
```
```{r echo=FALSE}
MountTom <- tps_to_track(
system.file("extdata", "MountTom.tps", package = "QuAnTeTrack"),
scale = 0.004411765,
missing = TRUE,
NAs = matrix(c(7, 3), nrow = 1, ncol = 2),
R.L.side = c(
"R", "L", "L", "L", "R", "L", "R", "R", "L", "L", "L", "L", "L",
"R", "R", "L", "R", "R", "L", "R", "R", "R", "R"
)
)
```
## **Subsetting Tracks from Track Data**
The **`subset_track()`** function is designed to extract specific tracks from a larger dataset of tracks, making it easier to focus on particular **trajectories or footprints** for further analysis or visualization. This function is particularly useful when working with **extensive datasets** where only a subset of tracks is relevant to the research question.
The function operates by taking a **`track` R object**, which contains two elements: **`Trajectories`** and **`Footprints`**. Each of these elements is a **list**, where each list entry corresponds to a separate track. By specifying the desired indices through the **`tracks`** argument, users can isolate particular tracks of interest.
If the **`tracks`** argument is left unspecified (**`NULL`**), the function defaults to returning **all tracks** in the dataset. Otherwise, it subsets the dataset based on the **indices provided**. If any indices are **outside the range of available tracks**, they are ignored with a **warning message** to notify the user. This functionality ensures robustness when working with datasets of varying sizes.
The function returns a modified **`track` R object** with the same structure as the original, but only containing the specified tracks. This approach maintains compatibility with other functions that expect a **`track` R object**, allowing for seamless integration into broader analytical workflows.
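The behavior described above can be mimicked on a mock object. The following sketch is an illustrative re-implementation with a hypothetical helper name (`subset_sketch()`) and invented placeholder data; it is not the package code.

```r
# Mock object with the two-element layout described above (placeholder data)
mock_track <- list(
  Trajectories = list(T1 = "traj 1", T2 = "traj 2", T3 = "traj 3"),
  Footprints   = list(F1 = "prints 1", F2 = "prints 2", F3 = "prints 3")
)

# Illustrative index-based subsetting that ignores out-of-range indices
subset_sketch <- function(track, tracks = NULL) {
  n <- length(track$Trajectories)
  if (is.null(tracks)) {
    return(track)  # no indices supplied: keep every track
  }
  valid <- tracks[tracks >= 1 & tracks <= n]
  if (length(valid) < length(tracks)) {
    warning("Some indices are outside the range of available tracks and were ignored.")
  }
  list(
    Trajectories = track$Trajectories[valid],
    Footprints   = track$Footprints[valid]
  )
}

# Index 5 does not exist, so it is dropped (the warning is silenced here)
kept <- suppressWarnings(subset_sketch(mock_track, tracks = c(1, 3, 5)))
names(kept$Trajectories)
```

Both list elements are subset with the same indices, so trajectories and footprints stay paired.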
### **Examples of Usage**
To prepare a subset of tracks with more than **three footprints** from the **MountTom dataset** for later analyses, you can use the **`subset_track()`** function. This is especially useful for focusing on a selection of tracks of interest before applying similarity metrics, simulations, or statistical tests.
```{r, eval=FALSE}
sbMountTom <- subset_track(MountTom, tracks = c(1, 2, 3, 4, 7, 8, 9, 13, 15, 16, 18))
print(sbMountTom)
```
```{r echo=FALSE}
sbMountTom <- subset_track(MountTom, tracks = c(1, 2, 3, 4, 7, 8, 9, 13, 15, 16, 18))
print(sbMountTom)
```
## **Plotting Tracks**
The **`plot_track()`** function is a versatile tool designed to visualize **track and footprint data** from a **`track` R object** in various ways, providing a flexible approach to examining and presenting trackway datasets. This function generates customizable plots using the **`ggplot2`** package, allowing users to inspect individual tracks, footprints, or a combination of both. By adjusting various plotting parameters, users can tailor their visualizations to highlight specific aspects of the dataset, such as **individual track paths**, **footprint shapes**, and **colors**.
The **`plot_track()`** function allows users to choose between three plotting modes: plotting only the **footprints**, only the **interpolated trackways**, or a **combination of both**. This is controlled by the **`plot`** argument, which can be set to **`"Footprints"`**, **`"Tracks"`**, or **`"FootprintsTracks"`** (default). The footprints and tracks are plotted using different layers, with footprints represented by **points** and tracks represented by **lines**.
Additional customization options include changing **colors**, **sizes**, **shapes**, and **transparency** of the plotted elements. Users can provide a vector of colors via the **`colours`** argument, which allows different tracks to be plotted in different colors. The **`cex.f`** and **`cex.t`** arguments control the sizes of footprint points and track lines, respectively. The **`shape.f`** argument allows users to specify the shapes of footprint points, while the **`alpha.f`**, **`alpha.t`**, and **`alpha.l`** arguments control the transparency of footprints, track lines, and labels, respectively.
The **`plot_track()`** function also supports the addition of **labels** to individual tracks. If the **`plot.labels`** argument is set to **`TRUE`**, labels are displayed at the start of each track, with the label text determined by the **`labels`** argument. If labels are not provided, the function automatically generates labels based on track names in the original TPS file. Users can adjust the label size using the **`cex.l`** argument and control the padding around the labels with the **`box.p`** argument.
The **`plot_track()`** function returns a **`ggplot`** object, which can be further customized using additional **`ggplot2`** functions. This allows users to enhance their plots with additional layers, themes, and annotations as needed.
The function is especially useful for comparing **multiple trackways at once**, providing a comprehensive view of **track distribution**, **direction**, and **spacing**. It also allows users to produce **clean visualizations** suitable for **presentation or publication**.
### **Examples of Usage**
By default, **`plot_track()`** displays both footprints and interpolated trajectories. This is useful for getting a general overview of the track and its corresponding interpolated pathways.
```{r echo=TRUE}
plot_track(PaluxyRiver)
```
```{r echo=TRUE}
plot_track(MountTom)
```
To visualize only the footprint data without the interpolated trajectories, use the `plot = "Footprints"` argument. This is particularly useful when you want to inspect the original footprint positions without the influence of interpolated tracks.
```{r echo=TRUE}
plot_track(PaluxyRiver, plot = "Footprints")
```
```{r echo=TRUE}
plot_track(MountTom, plot = "Footprints")
```
If you want to focus on the interpolated trackways without displaying the footprints, use the `plot = "Tracks"` argument. This visualization helps analyze the continuity and pattern of movement.
```{r echo=TRUE}
plot_track(PaluxyRiver, plot = "Tracks")
```
```{r echo=TRUE}
plot_track(MountTom, plot = "Tracks")
```
The **`plot_track()`** function allows flexible customization to improve the clarity and presentation of trackway data. Users can label tracks, change footprint shapes, adjust colors, and control label size and transparency.
In this first example, tracks from the **Mount Tom** dataset are labeled using `paste()` to generate names like `"Track 1"`, `"Track 2"`, etc. Labels are enlarged with **`cex.l = 4`**, given padding using **`box.p = 0.3`**, and made semi-transparent with **`alpha.l = 0.7`**.
```{r echo=TRUE}
labels <- paste("Track", seq_along(MountTom[[1]]))
plot_track(MountTom, plot.labels = TRUE, labels = labels, cex.l = 4, box.p = 0.3, alpha.l = 0.7)
```
In the second example, we plot only **footprints** from the **Paluxy River** dataset, using **`colours = c("red", "orange")`** to distinguish tracks and **`shape.f = c(15, 18)`** to assign different shapes to footprints—useful for visually comparing trackmakers.
```{r echo=TRUE}
plot_track(PaluxyRiver, plot = "Footprints", colours = c("red", "orange"), shape.f = c(15, 18))
```
## **Extracting Track Parameters**
The **`track_param()`** function is designed to compute and display various parameters related to the **movement patterns of tracks** from a **`track` R object**. This function is essential for extracting detailed information about the **structure of individual tracks** and their **spatial relationships**, providing key metrics that can be used for further analysis, comparison, and visualization. The **`track_param()`** function utilizes several helper functions from the **`trajr`** package, which is commonly applied in **animal movement analysis**.
The **`track_param()`** function works by iterating over each trajectory within the provided track data and computing a set of **movement-related parameters**. These include **turning angles**, **step lengths**, **total distances covered**, **track lengths**, and measures of **sinuosity and straightness**. Such parameters are crucial for understanding the **locomotor patterns of trackmakers** and assessing their **movement efficiency**.
The individual parameters are computed as follows:

- **Turning angles:** calculated with **`trajr::TrajAngles()`**, giving the **directional change** at each step. The **mean turning angle** and its **standard deviation** summarize the overall turning behavior.
- **Distance:** obtained with **`trajr::TrajDistance()`**, which measures the **straight-line distance** between the start and end points of the track.
- **Track length:** calculated with **`trajr::TrajLength()`**, which sums the distances between all consecutive points in the track.
- **Step lengths:** the distances between consecutive points, calculated with **`trajr::TrajStepLengths()`**. The function also computes their **mean and standard deviation**.
- **Sinuosity:** calculated with **`trajr::TrajSinuosity2()`**, which quantifies how much a path deviates from a straight line. It follows the method of *Benhamou (2004)*, which refines previous methods to give more accurate estimates of tortuosity for paths with varying turning angles and step lengths.
- **Straightness index:** calculated with **`trajr::TrajStraightness()`** as the ratio between the beeline distance (start to end) and the total path length. It is based on the work of *Batschelet (1981)* and indicates how direct or meandering the movement of the trackmaker was.
The calculation of **sinuosity** is based on the formula:\
$$
S = 2 \left[ p \left( \frac{1 + c}{1 - c} + b^2 \right) \right]^{-0.5}
$$
where:
- $p$ is the mean step length (in meters),
- $c$ is the mean cosine of the turning angles (angles in radians; $c$ itself is dimensionless), and
- $b$ is the coefficient of variation of the step length (dimensionless).
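The formula can be evaluated directly in base R. The sketch below uses invented step lengths and turning angles and transcribes the expression term by term; `trajr::TrajSinuosity2()` may differ in implementation details, so treat the result as illustrative.

```r
# Hypothetical step lengths (m) and turning angles (radians) for one track
step_lengths <- c(1.00, 1.10, 0.90, 1.05, 0.95)
turning_angles <- c(0.10, -0.05, 0.08, -0.12)

p <- mean(step_lengths)                      # mean step length
c_mean <- mean(cos(turning_angles))          # mean cosine of turning angles
b <- sd(step_lengths) / mean(step_lengths)   # coefficient of variation

# Sinuosity, transcribed from the formula above
S <- 2 * (p * ((1 + c_mean) / (1 - c_mean) + b^2))^(-0.5)
S
```

Small turning angles push $c$ towards 1, which inflates the bracketed term and drives the sinuosity towards zero, as expected for a nearly straight path.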
The **straightness index** is calculated as the ratio $D/L$, where $D$ is the beeline distance between the first and last points of the trajectory, and $L$ is the total path length. This index is particularly useful for comparing the efficiency of directed walks, but it is not suitable for random trajectories, where the index tends towards zero with increasing steps.
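A small worked example with invented coordinates shows the ratio in practice. This is a base R sketch of the $D/L$ definition, not a call to `trajr::TrajStraightness()`.

```r
# Hypothetical trajectory points (m)
traj <- cbind(x = c(0, 3, 3, 6), y = c(0, 0, 4, 4))

steps <- diff(traj)                                  # differences between consecutive points
L <- sum(sqrt(rowSums(steps^2)))                     # total path length
D <- sqrt(sum((traj[nrow(traj), ] - traj[1, ])^2))   # beeline distance, start to end

straightness <- D / L
straightness
```

Here the path covers 10 m to achieve a net displacement of about 7.2 m, giving a straightness index of roughly 0.72.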
The **`track_param()`** function returns a **list of lists**, where each sublist contains the **computed parameters** for a corresponding track. The parameters include: **turning angles**, **mean turning angle**, **standard deviation of turning angles**, **distance**, **length**, **step lengths**, **mean step length**, **standard deviation of step length**, **sinuosity** and **straightness**.
The **reference direction for calculating angles** is considered to be along the positive x-axis, with angles measured counterclockwise. The computed parameters are returned in a structured format, allowing users to further process or visualize the data as needed.
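The angle convention can be checked on a toy trajectory. The sketch below uses `atan2()` on invented coordinates to obtain step directions relative to the positive x-axis and the turning angles between consecutive steps; it illustrates the convention and is not the code used by `trajr::TrajAngles()`.

```r
# Hypothetical trajectory points (m)
traj <- cbind(x = c(0, 1, 2, 2), y = c(0, 0, 1, 2))

steps <- diff(traj)
# Direction of each step, measured counterclockwise from the positive x-axis
step_direction <- atan2(steps[, "y"], steps[, "x"])

# Turning angle = change in direction between consecutive steps,
# wrapped to the interval (-pi, pi]
turning <- diff(step_direction)
turning <- atan2(sin(turning), cos(turning))

round(step_direction * 180 / pi)  # step directions in degrees
round(turning * 180 / pi)         # turning angles in degrees
```

The three steps head east, north-east, and north, so each turn is a 45 degree counterclockwise (positive) change in direction.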
The **`track_param()`** function provides valuable insights into the **structure and efficiency of trackmaker movements**, making it a crucial tool for analyzing **fossil trackways**.
### **Examples of Usage**
The **`track_param()`** function extracts movement parameters such as turning angles, step lengths, distances, track lengths, sinuosity, and straightness from **`track` R objects**. The examples below calculate these parameters for the **Paluxy River** and **Mount Tom** datasets.
```{r, eval=FALSE}
params_paluxy <- track_param(PaluxyRiver)
```
```{r echo=FALSE}
params_paluxy <- track_param(PaluxyRiver)
```
```{r, eval=FALSE}
params_mount <- track_param(MountTom)
```
```{r echo=FALSE}
params_mount <- track_param(MountTom)
```
## **Calculating Velocities and Relative Stride Lengths**
The **`velocity_track()`** function calculates the **velocities** and **relative stride lengths** for each step within a series of tracks. It requires a **`track` R object** as input, which contains both **trajectories and footprints**, and uses the **height at the hip**, `H`, of each trackmaker to estimate speed. The `H` argument should be supplied as a numeric vector giving the hip height in meters for each track, in the same order as the tracks. If the hip height is unknown, it must be estimated from skeletal proportions or other anatomical information. The accuracy of the velocity estimates depends heavily on providing a realistic value for this parameter. The function supports two calculation methods, **Method A** (*Alexander, 1976*) and **Method B** (*Ruiz & Torices, 2013*), which are selected via the `method` argument. If no method is specified, **Method A** is applied to all tracks. The **gravitational acceleration**, `G`, is set to **9.8 m/s^2^** by default.
This function works by first extracting the track data and then calculating the **Euclidean distance** between consecutive footprints to determine the **stride length**. For each step, the **velocity** is calculated using one of the two methods.
**Method A** applies the formula (*Alexander, 1976*):
$$
v = 0.25 \cdot \sqrt{G} \cdot S^{1.67} \cdot H^{-1.17}
$$
where $v$ is the **velocity (m/s)**, $G$ is **gravitational acceleration (m/s^2^)**, $S$ is **stride length (m)**, and $H$ is the **hip height (m)**. This method is based on empirical studies that model the relationship between stride length, body size, and speed for general terrestrial vertebrates. The coefficients $0.25$, $1.67$, and $-1.17$ have been derived from studies focused on scaling relationships in **bipedal and quadrupedal animals**.
**Method B** follows a similar approach but with a coefficient of $0.226$ instead of $0.25$, which provides a refinement for **bipedal locomotion**. The formula is (*Ruiz & Torices, 2013*):
$$
v = 0.226 \cdot \sqrt{G} \cdot S^{1.67} \cdot H^{-1.17}
$$
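Both formulas differ only in their leading coefficient, which the following base R sketch makes explicit. The helper `velocity_sketch()` and the input values are hypothetical; the function simply transcribes the two equations and is not the implementation of `velocity_track()`.

```r
# Speed estimate from stride length S (m) and hip height H (m).
# The coefficient is 0.25 for Method A and 0.226 for Method B.
velocity_sketch <- function(S, H, method = c("A", "B"), G = 9.8) {
  method <- match.arg(method)
  k <- if (method == "A") 0.25 else 0.226
  k * sqrt(G) * S^1.67 * H^(-1.17)
}

# Hypothetical trackmaker: stride length 2.5 m, hip height 1.5 m
v_a <- velocity_sketch(S = 2.5, H = 1.5, method = "A")
v_b <- velocity_sketch(S = 2.5, H = 1.5, method = "B")
c(A = v_a, B = v_b)
```

For the same stride length and hip height, Method B always returns 90.4% of the Method A estimate (0.226 / 0.25).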
The **relative stride length** is calculated as the ratio between **stride length and hip height** ($S / H$), which allows distinguishing between different **gaits** according to *Thulborn & Wade (1984)*. The classification is as follows:
- **Walk:** Relative stride $< 2.0$ (locomotor performance equivalent to walking in mammals).\
- **Trot:** Relative stride $2.0 \leq S/H \leq 2.9$ (locomotor performance equivalent to trotting or racking in mammals).\
- **Run:** Relative stride $> 2.9$ (locomotor performance equivalent to cantering, galloping, or sprinting in mammals).
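The thresholds above translate into a simple classification rule. The sketch below applies them to invented stride lengths for a single hypothetical trackmaker; the variable names are illustrative and do not correspond to package output.

```r
# Hypothetical stride lengths (m) for a trackmaker with hip height H (m)
S <- c(2.4, 3.0, 3.6, 4.8)
H <- 1.5
relative_stride <- S / H

# Thresholds from the classification above:
# < 2.0 walk, 2.0 to 2.9 trot, > 2.9 run
gait <- ifelse(relative_stride < 2.0, "Walk",
               ifelse(relative_stride <= 2.9, "Trot", "Run"))
data.frame(S, relative_stride, gait)
```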
The function returns a **`track velocity` R object**, structured as a **list of lists** in which each sublist corresponds to an individual track. For each track, the output includes:

- **`Step_velocities`**: a vector of the velocities calculated for each step, in meters per second (**m/s**).
- **`Mean_velocity`** and **`Standard_deviation_velocity`**: the average speed across all steps and the variation around it.
- **`Maximum_velocity`** and **`Minimum_velocity`**: the highest and lowest calculated velocities.
- **`Step_relative_stride`**: a vector of the relative stride lengths calculated for each step.
- **`Mean_relative_stride`** and **`Standard_deviation_relative_stride`**: the average relative stride length and its variation.
- **`Maximum_relative_stride`** and **`Minimum_relative_stride`**: the highest and lowest calculated relative stride lengths.

This output allows users to assess both the speed and the locomotion style of the trackmakers under study. The function is particularly useful for estimating the **speed of ancient trackmakers** from their footprints and evaluating their **locomotion style (walking, trotting, or running)**.
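Because the output is a list of lists, per-track summaries can be gathered with `vapply()`. The sketch below works on a mock object with invented values: the element names follow those listed in this section, while the track names (`Track_1`, `Track_2`) and the omission of the remaining elements are simplifying assumptions.

```r
# Mock object mimicking two entries of the output described in this section
# (values are invented; only the element names come from the text)
mock_velocity <- list(
  Track_1 = list(Step_velocities = c(1.2, 1.4, 1.3), Mean_velocity = 1.3,
                 Maximum_velocity = 1.4, Minimum_velocity = 1.2),
  Track_2 = list(Step_velocities = c(2.1, 2.5), Mean_velocity = 2.3,
                 Maximum_velocity = 2.5, Minimum_velocity = 2.1)
)

# Collect one summary value per track into a data frame
summary_table <- data.frame(
  track = names(mock_velocity),
  mean_velocity = vapply(mock_velocity, function(tr) tr$Mean_velocity, numeric(1)),
  max_velocity = vapply(mock_velocity, function(tr) tr$Maximum_velocity, numeric(1)),
  row.names = NULL
)
summary_table
```

The same pattern should apply to a real result of `velocity_track()`, provided its elements are accessed by the names listed in this section.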
### **Examples of Usage**
This example calculates velocities for the **Paluxy River dataset** using **Method A** (the default) for both tracks. The hip height of each trackmaker is supplied in `H_paluxyriver`.
```{r, eval=FALSE}
H_paluxyriver <- c(3.472, 2.200)
velocity_paluxyriver <- velocity_track(PaluxyRiver, H = H_paluxyriver)
```
```{r echo=FALSE}
H_paluxyriver <- c(3.472, 2.200)
velocity_paluxyriver <- velocity_track(PaluxyRiver, H = H_paluxyriver)
```
Velocities for the **Mount Tom dataset** are calculated in the same way, using **Method A** for all tracks. The vector `H_mounttom` provides one hip height per track in the dataset.
```{r, eval=FALSE}
H_mounttom <- c(
1.380, 1.404, 1.320, 1.736, 1.364, 1.432, 1.508, 1.768, 1.600,
1.848, 1.532, 1.532, 0.760, 1.532, 1.688, 1.620, 0.636, 1.784, 1.676, 1.872,
1.648, 1.760, 1.612
)
velocity_mounttom <- velocity_track(MountTom, H = H_mounttom)
```
```{r echo=FALSE}
H_mounttom <- c(
1.380, 1.404, 1.320, 1.736, 1.364, 1.432, 1.508, 1.768, 1.600,
1.848, 1.532, 1.532, 0.760, 1.532, 1.688, 1.620, 0.636, 1.784, 1.676, 1.872,
1.648, 1.760, 1.612
)
velocity_mounttom <- velocity_track(MountTom, H = H_mounttom)
```
Comparing velocities for the **Paluxy River dataset** using **different methods**: **Method A** for the sauropod trackway and **Method B** for the theropod trackway. This demonstrates how to apply distinct calculation methods to different trackmakers within the same dataset.
```{r, eval=FALSE}
H_paluxyriver <- c(3.472, 2.200)
Method_paluxyriver <- c("A", "B")
velocity_paluxyriver_diff <- velocity_track(PaluxyRiver, H = H_paluxyriver, method = Method_paluxyriver)
```
```{r echo=FALSE}
H_paluxyriver <- c(3.472, 2.200)
Method_paluxyriver <- c("A", "B")
velocity_paluxyriver_diff <- velocity_track(PaluxyRiver, H = H_paluxyriver, method = Method_paluxyriver)
```
## **Plotting Velocity Data**
The **`plot_velocity()`** function provides a powerful visualization tool for examining trajectories colored by either **velocity** or **relative stride length**. By applying color gradients, it highlights how these parameters change along the paths of various tracks, providing valuable insights into locomotor dynamics.
The function takes as inputs a **`track` R object** and a **`track velocity` R object**, where the latter contains the calculated velocities and relative stride lengths for each track. The user can specify the parameter to be visualized via the **`param`** argument, choosing between **`"V"`** for velocity or **`"RSL"`** for relative stride length. If not specified, the function defaults to visualizing **velocity**.
The plotting process is handled by the **`ggplot2`** package, using **`ggplot2::geom_path()`** to plot the tracks and **`ggplot2::scale_color_gradientn()`** to apply a color gradient representing the selected parameter. Users can customize the color palette via the **`colours`** argument and adjust the line width with the **`lwd`** argument.
The function also allows the user to include or exclude a legend from the plot by setting the **`legend`** argument to **`TRUE`** or **`FALSE`**, respectively. This flexibility ensures that the plots can be tailored to the user's preferences and presentation requirements.
The resulting plot provides a visually appealing and informative representation of how **velocity** or **relative stride length** changes along each trajectory. Such plots are particularly useful for comparing the locomotor patterns of different track makers or assessing how environmental or anatomical factors influence movement.
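To illustrate the plotting approach described above, here is a minimal sketch that colors a single toy trajectory by a per-step value using **`ggplot2::geom_path()`** and **`ggplot2::scale_color_gradientn()`**. The data frame `toy_track` and its values are hypothetical, and the code is not the internal implementation of **`plot_velocity()`**. The `linewidth` argument assumes a recent version of **`ggplot2`** (3.4.0 or later).

```r
# Toy trajectory with a per-step value to map to colour (made-up data).
toy_track <- data.frame(
  x = c(0, 1, 2, 3, 4, 5),
  y = c(0, 0.5, 0.7, 1.5, 1.8, 2.6),
  velocity = c(1.0, 1.3, 1.9, 2.4, 2.0, 1.5)
)

# Only build the plot when ggplot2 is available.
if (requireNamespace("ggplot2", quietly = TRUE)) {
  p <- ggplot2::ggplot(
    toy_track,
    ggplot2::aes(x = x, y = y, colour = velocity)
  ) +
    # Path drawn in row order, coloured by the mapped value.
    ggplot2::geom_path(linewidth = 1.5) +
    # Continuous gradient across the supplied colours.
    ggplot2::scale_color_gradientn(
      colours = c("blue", "green", "yellow", "red")
    ) +
    # Equal scaling on both axes so the path is not distorted.
    ggplot2::coord_fixed()
}
```

Printing `p` draws the plot; adding `ggplot2::theme(legend.position = "none")` hides the legend in plain **`ggplot2`** code.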
### **Examples of Usage**
Plotting trajectories colored by **relative stride length (RSL)** for the **PaluxyRiver dataset** using the previously calculated `velocity_paluxyriver_diff` object.
```{r echo=TRUE}
plot_velocity(PaluxyRiver, velocity_paluxyriver_diff, param = "RSL")
```
Plotting trajectories colored by **velocity** for the **MountTom dataset** using the previously calculated `velocity_mounttom` object.
```{r echo=TRUE}
plot_velocity(MountTom, velocity_mounttom, param = "V")
```
Generating a clean visualization of **relative stride length (RSL)** for the **PaluxyRiver dataset** without displaying a legend.
```{r echo=TRUE}
plot_velocity(PaluxyRiver, velocity_paluxyriver_diff, param = "RSL", lwd = 1.5,
colours = c("purple", "orange", "pink", "gray"), legend = FALSE)
```
Applying **custom colors and increased line width** to enhance visualization of **velocity patterns** for the **MountTom dataset**.
```{r echo=TRUE}
plot_velocity(MountTom, velocity_mounttom, param = "V", lwd = 2,
colours = c("blue", "green", "yellow", "red"))
```
## **Plotting Direction Data**
The **`plot_direction()`** function provides a comprehensive approach to visualizing **direction data** from **`track` R objects**. It allows users to generate various types of plots to effectively compare and examine directionality within their datasets. The available plotting styles are highly customizable, making this function versatile for different types of directional analysis.
This function supports four primary **plotting styles**, specified by the **`plot_type`** argument. The **`"boxplot"`** option displays the distribution of step directions across tracks as boxplots, providing an overview of directionality variations by showing medians, quartiles, and potential outliers. The **`"polar_steps"`** option generates **polar histograms** that visualize the frequency of steps within various directional bins, making it particularly useful for examining the spread and density of step directions around a central point and highlighting dominant movement trends. The **`"polar_average"`** style also produces **polar histograms**, but it focuses on average directions per track rather than individual steps. This summarization approach offers a simplified comparison of overall trends across multiple tracks. Finally, the **`"faceted"`** option creates **faceted polar histograms** where each track is displayed separately within a grid of plots, providing a clear visual comparison of step directions across tracks and making it especially effective for analyzing individual trackmaker behaviors.
The **`plot_direction()`** function allows users to **customize visualizations** through several arguments. The **`angle_range`** argument controls the width of the bins used in **polar histograms**, allowing users to specify the desired **angular resolution**. The **`y_labels_position`** argument is useful for **positioning the labels of the y-axis**, especially in polar plots, to enhance clarity and presentation. Users can also provide **custom breaks for the y-axis** using the **`y_breaks_manual`** argument, which defines where the labels should be placed for better visualization of frequency data. This flexibility ensures that the user can **tailor the output** to suit specific analytical needs, whether examining **general trends**, comparing **individual tracks**, or highlighting particular aspects of **directional data**.
By generating high-quality visualizations as **`ggplot` R objects**, the **`plot_direction()`** function allows for further customization using additional functions from the **`ggplot2`** package. This integration makes it easy to enhance plots with annotations, themes, and other graphical elements.
The function is particularly useful for analyzing **trackway direction data**, providing valuable insights into **movement patterns**, **orientation preferences**, and **potential group behavior**.
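The effect of the bin width on a polar histogram can be sketched in base R. The example below groups a few hypothetical step directions (in degrees) into bins of width `angle_range`. It only illustrates the binning idea and is not the code used inside **`plot_direction()`**; the vector `directions` is made up, and the sketch assumes that `angle_range` divides 360 evenly.

```r
# Hypothetical step directions in degrees (not taken from any dataset).
directions <- c(10, 20, 35, 95, 100, 185, 350)
angle_range <- 30  # bin width in degrees

# Wrap angles into [0, 360) and assign each one to a left-closed bin.
breaks <- seq(0, 360, by = angle_range)
bins <- cut(directions %% 360, breaks = breaks, right = FALSE)

# Frequency of steps per directional bin; show only the occupied bins.
bin_counts <- table(bins)
bin_counts[bin_counts > 0]
```

A smaller `angle_range` gives a finer angular resolution but fewer steps per bin, which is the trade-off to keep in mind for short tracks.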
### **Examples of Usage**
The **`boxplot`** option generates a summary of directional data distribution, highlighting central tendency, variability, and potential outliers across tracks.
```{r echo=TRUE, results='hide'}
plot_direction(PaluxyRiver, plot_type = "boxplot")
```
```{r echo=TRUE, results='hide'}
plot_direction(MountTom, plot_type = "boxplot")
```
The **`polar_steps`** option creates a polar histogram of individual steps radiating from a central point, revealing the angular spread of movement and dominant directions.
```{r echo=TRUE, results='hide'}
plot_direction(PaluxyRiver, plot_type = "polar_steps")
```
```{r echo=TRUE, results='hide'}
plot_direction(MountTom, plot_type = "polar_steps")
```
The **`polar_average`** option generates a simplified polar plot by averaging step directions for each track, providing a general overview of dominant movement trends.
```{r echo=TRUE, results='hide'}
plot_direction(PaluxyRiver, plot_type = "polar_average")
```
```{r echo=TRUE, results='hide'}
plot_direction(MountTom, plot_type = "polar_average")
```
The **`faceted`** option displays individual step directions separately for each track using faceted panels, allowing detailed comparison of movement patterns across multiple tracks.
```{r echo=TRUE, results='hide'}
plot_direction(PaluxyRiver, plot_type = "faceted")
```
```{r echo=TRUE, results='hide'}
plot_direction(MountTom, plot_type = "faceted")
```
Customization options include setting custom breaks on the radial axis with **`y_breaks_manual`** and adjusting the position of y-axis labels with **`y_labels_position`** for better presentation.
```{r echo=TRUE, results='hide'}
plot_direction(PaluxyRiver, plot_type = "polar_average", y_breaks_manual = c(1, 2))
```
```{r echo=TRUE, results='hide'}
plot_direction(PaluxyRiver, plot_type = "polar_steps", y_labels_position = -90)
```
## **Testing for Differences in Velocity**
The **`test_velocity()`** function evaluates differences in **velocities** across different tracks within a **`track` R object**. It provides three statistical methods, selected with the **`analysis`** argument: **ANOVA**, **Kruskal-Wallis test**, and **Generalized Linear Models (GLM)**, allowing users to compare velocity data and identify significant differences between tracks. The function also includes diagnostic tests to check assumptions of **normality** and **homogeneity of variances** before proceeding with the analysis. When more than two tracks are present, it performs **pairwise comparisons** to identify specific differences between tracks.
The **`test_velocity()`** function requires that each track contains more than **three footprints** to be included in the analysis. This is necessary because statistical tests for comparing mean velocities rely on having a sufficient number of data points to provide meaningful results. When a track contains only three or fewer footprints, the sample size is too small to accurately estimate mean velocity and its variability, making statistical comparisons unreliable. By setting this threshold, the function ensures that the results are **statistically robust and meaningful**.
The function accepts a **`track velocity` R object**, which is an output of the **`velocity_track()`** function. This object contains **calculated velocities** and other related parameters for each track, including **individual step velocities**, **mean velocities**, and **relative stride lengths**. This information serves as the input for the statistical comparisons performed by **`test_velocity()`**.
If **`"ANOVA"`** is selected, the function checks for **normality** (using the **Shapiro-Wilk test**) and **homogeneity of variances** (using **Levene’s test**). If assumptions are violated, it issues warnings suggesting the use of **`"Kruskal-Wallis"`** or **`"GLM"`** instead. **`"ANOVA"`** compares mean velocities across tracks, and if significant differences are detected, **Tukey’s HSD** is used for **post-hoc pairwise comparisons**. When **`"Kruskal-Wallis"`** is chosen, the function performs a **non-parametric test** that compares **median velocities** across tracks. If significant differences are detected, **Dunn's test** is used for **post-hoc pairwise comparisons**. If **`"GLM"`** is specified, the function uses a **Generalized Linear Model (GLM)** with a **Gaussian family** to compare **mean velocities** across tracks. **Pairwise comparisons** are conducted using the **`emmeans` package**, which computes differences between group means and adjusts for multiple comparisons using **Tukey’s method**. This approach is useful when the data does not meet the assumptions of **ANOVA** but still requires a parametric approach.
If the argument **`plot = TRUE`** is specified, a **boxplot of velocities by track** is generated for visual comparison of **velocity distributions** across tracks. The boxplot allows the user to visually assess differences in **central tendency** and **variability** across tracks, complementing the statistical analyses.
The function returns a list of results that includes: **`normality_results`**, a matrix containing the test statistic and *p*-value for the **Shapiro-Wilk normality test** for each track; **`homogeneity_test`**, the result of **Levene's test**, including the *p*-value for testing **homogeneity of variances** across tracks; **`ANOVA`**, if selected, containing the **ANOVA table** and **Tukey HSD** post-hoc results; **`Kruskal_Wallis`**, if selected, containing the **Kruskal-Wallis test result** and **Dunn's test** post-hoc results; **`GLM`**, if selected, providing a summary of the **GLM fit** and pairwise comparisons from the **`emmeans` package**; and finally, the **`plot`** if requested, displaying a **boxplot of velocities by track**.
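The sequence of checks and tests described above can be sketched with base R on toy data. This is an illustration under stated assumptions, not the implementation of **`test_velocity()`**: the data frame `toy` is simulated, and `bartlett.test()` is used as a base-R stand-in for Levene's test, which requires an add-on package. Dunn's test and the **`emmeans`** comparisons are omitted for the same reason.

```r
set.seed(42)  # reproducible toy data

# Hypothetical step velocities (m/s) for three tracks with eight steps each.
toy <- data.frame(
  track = factor(rep(c("T1", "T2", "T3"), each = 8)),
  velocity = c(rnorm(8, 1.0, 0.15), rnorm(8, 1.1, 0.15), rnorm(8, 2.5, 0.15))
)

# Assumption checks: Shapiro-Wilk per track, then a variance-homogeneity test.
# bartlett.test() is a stand-in here; it is NOT Levene's test.
normality <- sapply(split(toy$velocity, toy$track), function(v) {
  res <- shapiro.test(v)
  c(statistic = unname(res$statistic), p_value = res$p.value)
})
homogeneity <- bartlett.test(velocity ~ track, data = toy)

# Parametric route: one-way ANOVA followed by Tukey's HSD.
fit_aov <- aov(velocity ~ track, data = toy)
anova_p <- summary(fit_aov)[[1]][["Pr(>F)"]][1]
tukey <- TukeyHSD(fit_aov)

# Non-parametric route: Kruskal-Wallis rank-sum test.
kw <- kruskal.test(velocity ~ track, data = toy)

# GLM route: Gaussian family with track as the only predictor.
fit_glm <- glm(velocity ~ track, family = gaussian(), data = toy)

normality
anova_p
kw$p.value
```

In this toy data the third track is much faster than the other two, so both the ANOVA and the Kruskal-Wallis test should flag a difference, and `tukey` shows which pairs of tracks drive it.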
### **Examples of Usage**
The **ANOVA** method is suitable for comparing mean velocities when data meet the assumptions of normality and homogeneity of variances.
```{r, eval=FALSE}
test_velocity(PaluxyRiver, velocity_paluxyriver_diff, analysis = "ANOVA")
```
```{r echo=FALSE}
test_velocity(PaluxyRiver, velocity_paluxyriver_diff, analysis = "ANOVA")
```
```{r, eval=FALSE}
test_velocity(MountTom, velocity_mounttom, analysis = "ANOVA")
```
```{r echo=FALSE}
test_velocity(MountTom, velocity_mounttom, analysis = "ANOVA")
```
The **Kruskal-Wallis** test is a non-parametric method that compares median velocities, useful when normality or homogeneity of variances cannot be assumed.
```{r, eval=FALSE}
test_velocity(PaluxyRiver, velocity_paluxyriver_diff, analysis = "Kruskal-Wallis")
```
```{r echo=FALSE}
test_velocity(PaluxyRiver, velocity_paluxyriver_diff, analysis = "Kruskal-Wallis")
```
```{r, eval=FALSE}
test_velocity(MountTom, velocity_mounttom, analysis = "Kruskal-Wallis")
```
```{r echo=FALSE}
test_velocity(MountTom, velocity_mounttom, analysis = "Kruskal-Wallis")
```
The **GLM** approach offers a flexible alternative when data do not meet the assumptions required for ANOVA. This method fits a linear model with a Gaussian family and performs pairwise comparisons using `emmeans`.
```{r, eval=FALSE}
test_velocity(PaluxyRiver, velocity_paluxyriver_diff, analysis = "GLM")
```
```{r echo=FALSE}
test_velocity(PaluxyRiver, velocity_paluxyriver_diff, analysis = "GLM")
```
```{r, eval=FALSE}
test_velocity(MountTom, velocity_mounttom, analysis = "GLM")
```
```{r echo=FALSE}
test_velocity(MountTom, velocity_mounttom, analysis = "GLM")
```
## **Testing for Acceleration, Deceleration, or Steady Movement along Trajectories**
The **`mode_velocity()`** function evaluates whether a track maker is **accelerating**, **decelerating**, or maintaining a **steady speed** along its trajectory by applying **Spearman’s rank correlation test**. This test is particularly suitable for analyzing trends in velocity because it does not assume normality or linearity in the relationship between step number and velocity. Instead, it detects **monotonic relationships based on ranks**, making it robust to **outliers** and effective for identifying general trends.
The function accepts a **`track velocity` R object** and processes each trajectory separately. For each trajectory, the function calculates the **Spearman correlation coefficient** and its associated *p*-value, comparing **velocity values** to their corresponding **step numbers**. If the *p*-value is less than **0.05**, the trend is classified as **“acceleration”** if the correlation coefficient is positive or **“deceleration”** if it is negative. If the *p*-value is greater than or equal to **0.05**, the trend is labeled as **“steady,”** indicating no significant monotonic relationship between **velocity and step number**. This approach allows the detection of gradual changes in velocity over the course of a track, which may reflect shifts in **locomotor performance or behavior**.
Trajectories with fewer than **three steps** are considered insufficient for reliable statistical analysis, and the function returns a message indicating that the data is inadequate for **correlation analysis**. This is because the calculation of a meaningful correlation requires a minimum of **three data points**.
The **`mode_velocity()`** function provides a straightforward way to classify **velocity trends** based on statistical significance, making it a useful tool for examining how track makers **modulate their speed** along their paths. However, it only identifies **monotonic trends** and may not detect more complex, **non-monotonic changes in speed**. Furthermore, the classification is **qualitative**, providing information about the general nature of the trend rather than quantifying the rate of change.
This approach draws from established **non-parametric statistical techniques** for measuring association between variables. The robustness of the method to **non-normal data** and its resistance to **outliers** makes it well-suited for **paleontological and biomechanical applications** where data quality and quantity can be limited.
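The classification rule described above can be written as a short base-R sketch. The helper `classify_trend()` and the velocity vectors are hypothetical. The function mirrors the rule in the text (Spearman correlation between step number and velocity, with a 0.05 threshold), but it is not the package's own code, and the `"insufficient data"` return value is an assumption made for illustration.

```r
# Classify a velocity sequence as acceleration, deceleration, or steady.
classify_trend <- function(velocities, alpha = 0.05) {
  # Too few values for a meaningful correlation.
  if (length(velocities) < 3) {
    return("insufficient data")
  }
  # Spearman's rank correlation between step number and velocity.
  test <- cor.test(seq_along(velocities), velocities, method = "spearman")
  if (test$p.value >= alpha) {
    "steady"
  } else if (test$estimate > 0) {
    "acceleration"
  } else {
    "deceleration"
  }
}

classify_trend(c(1.0, 1.2, 1.5, 1.9, 2.4, 3.0))   # "acceleration"
classify_trend(c(3.0, 2.4, 1.9, 1.5, 1.2, 1.0))   # "deceleration"
classify_trend(c(1.5, 1.2, 1.6, 1.3, 1.55, 1.4))  # "steady"
classify_trend(c(1.0, 1.1))                       # "insufficient data"
```

Note that with very short sequences even a perfectly monotonic trend cannot reach significance, so short tracks will tend to be labeled as steady.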
### **Examples of Usage**
The **`velocity_paluxyriver_diff`** object contains the velocities calculated for the **PaluxyRiver dataset**, with **Method A** applied to the sauropod trackway and **Method B** applied to the theropod trackway.
```{r, eval=FALSE}
mode_velocity(velocity_paluxyriver_diff)
```
```{r echo=FALSE}
mode_velocity(velocity_paluxyriver_diff)
```
The **`velocity_mounttom`** object contains calculated velocities for the **MountTom dataset**.
```{r, eval=FALSE}
mode_velocity(velocity_mounttom)
```
```{r echo=FALSE}
mode_velocity(velocity_mounttom)
```
## **Testing for Differences in Direction**
The **`test_direction()`** function provides a powerful statistical framework for comparing **directions** across different tracks within a **`track` R object**. It offers three distinct methods for this purpose, which can be selected using the **`analysis`** argument: **ANOVA**, **Kruskal-Wallis test**, and **Generalized Linear Models (GLM)**, ensuring robust analysis even when assumptions about data distribution and variance homogeneity are violated. The function requires that each track contains more than **three footprints** to be included in the analysis, as statistical tests for comparing mean directions require a sufficient sample size to generate meaningful results. This threshold ensures that the comparisons are statistically reliable and robust.
When using the **`"ANOVA"`** option, the function first checks the **normality of the data** through the **Shapiro-Wilk test** and assesses **homogeneity of variances** using **Levene’s test**. If these assumptions are violated, the user is advised to consider the **Kruskal-Wallis** or **GLM** methods instead. The **ANOVA** method compares **mean directions across tracks**, and if significant differences are detected, **Tukey’s HSD test** is applied to perform **post-hoc pairwise comparisons**. The **`"Kruskal-Wallis"`** option, in contrast, offers a non-parametric approach that compares **median directions** across tracks and applies **Dunn's test** for pairwise comparisons if significant differences are found. This method is particularly suitable when data do not meet the assumptions required for parametric tests. Alternatively, if the **`"GLM"`** option is selected, the function applies a **Generalized Linear Model (GLM)** with a **Gaussian family** to compare **mean directions**. The **`emmeans` package** is then used to compute **pairwise comparisons**, with adjustments for multiple comparisons following **Tukey’s method**, offering a flexible approach when dealing with complex data distributions or deviations from normality.
The **`test_direction()`** function returns a list with a detailed set of results. It provides a **normality results matrix** containing the **Shapiro-Wilk test statistic and *p*-value** for each track, allowing the user to evaluate whether the data follow a normal distribution. The function also outputs the result of **Levene’s test**, including the ***p*-value** used to assess whether variances across tracks are homogeneous. When the **ANOVA** method is selected, the function delivers the **ANOVA table** along with **Tukey HSD post-hoc results**, enabling a thorough examination of differences between groups. For the **Kruskal-Wallis** test, the function returns the overall test result alongside **Dunn’s test post-hoc comparisons**, providing a non-parametric alternative for analyzing differences in central tendency. In the case of **GLM**, the output includes a summary of the model fit along with the **pairwise comparisons** calculated via the **`emmeans` package**.
The flexibility and comprehensiveness of the **`test_direction()`** function make it a versatile tool for comparing **directional data** across multiple tracks. Its ability to perform **parametric, non-parametric, and generalized linear model analyses** ensures that researchers can robustly test hypotheses related to **movement patterns, group behavior, and ecological interactions**, regardless of the underlying data structure.
### **Examples of Usage**
Testing with **ANOVA** checks for differences in **mean directions** across tracks, assuming the data is normally distributed and variances are homogeneous. Post-hoc pairwise comparisons are conducted if significant differences are found.
```{r, eval=FALSE}
test_direction(PaluxyRiver, analysis = "ANOVA")
```
```{r include=FALSE}
test_dir_paluxy_anova <- test_direction(PaluxyRiver, analysis = "ANOVA")
```
```{r, echo=FALSE}
print(test_dir_paluxy_anova)
```
```{r, eval=FALSE}
test_direction(MountTom, analysis = "ANOVA")
```
```{r include=FALSE}
test_dir_mount_anova <- test_direction(MountTom, analysis = "ANOVA")
```
```{r, echo=FALSE}
print(test_dir_mount_anova)
```
Testing with **Kruskal-Wallis** provides a non-parametric alternative for comparing **median directions** across tracks when the data does not meet the assumptions of ANOVA. Significant differences are further examined using Dunn’s test.
```{r, eval=FALSE}
test_direction(PaluxyRiver, analysis = "Kruskal-Wallis")
```
```{r include=FALSE}
test_dir_paluxy_Kruskal <- test_direction(PaluxyRiver, analysis = "Kruskal-Wallis")
```
```{r, echo=FALSE}
print(test_dir_paluxy_Kruskal)
```
```{r, eval=FALSE}
test_direction(MountTom, analysis = "Kruskal-Wallis")
```
```{r include=FALSE}
test_dir_mount_Kruskal <- test_direction(MountTom, analysis = "Kruskal-Wallis")
```
```{r, echo=FALSE}
print(test_dir_mount_Kruskal)
```
Testing with **Generalized Linear Model (GLM)** offers a flexible approach for comparing **mean directions** even when data distribution assumptions are violated. Pairwise comparisons are computed using the `emmeans` package.
```{r, eval=FALSE}
test_direction(PaluxyRiver, analysis = "GLM")
```
```{r include=FALSE}
test_dir_paluxy_GLM <- test_direction(PaluxyRiver, analysis = "GLM")
```
```{r, echo=FALSE}
print(test_dir_paluxy_GLM)
```
```{r, eval=FALSE}
test_direction(MountTom, analysis = "GLM")
```
```{r include=FALSE}
test_dir_mount_GLM <- test_direction(MountTom, analysis = "GLM")
```
```{r, echo=FALSE}
print(test_dir_mount_GLM)
```
## **Simulating Tracks Using Different Movement Models**
The **`simulate_track()`** function generates **simulated movement trajectories** based on an existing set of tracks. It offers three distinct movement models—**Directed**, **Constrained**, and **Unconstrained**—each representing varying levels of constraint in movement patterns. This flexibility allows users to model scenarios reflecting **biological or environmental constraints**, such as directed movement towards resources, movement along geographical features, or free exploratory behavior.
The **`Directed`** model is the most constrained, simulating trajectories that follow a specific direction based on the original track's **angular orientation**. It aims to maintain a consistent overall direction with minor deviations to reflect natural variability. This model is useful for scenarios where movement is **highly directional**, such as an animal navigating toward a known resource.
The **`Constrained`** model represents a **correlated random walk**, where movement is not strictly directional but is influenced by certain **angular and linear properties** of the original track. This model allows for random exploration while retaining some of the characteristics of the original movement pattern. It is suitable for situations where animals navigate with **limited knowledge of their surroundings**, such as navigating within a **bounded area without a specific destination**.
The **`Unconstrained`** model provides the most flexibility, simulating trajectories that represent **exploratory or dispersal behavior without directional bias**. It is based on a **fully random walk** with the starting direction randomly determined for each simulation. This model is appropriate for scenarios where animals are moving through **unfamiliar or homogeneous environments**.
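The difference between the three models can be illustrated with a simplified base-R sketch. The function `simulate_walk()` and all of its arguments are hypothetical. It is one possible reading of the descriptions above (a fixed heading with noise, a walk whose deviations accumulate from the original heading, and the same walk with a random starting direction) and should not be taken as the algorithm used by **`simulate_track()`**.

```r
# Simplified walk generator; angles are in radians.
simulate_walk <- function(n_steps, heading, sd_angle, mean_step, sd_step,
                          model = c("Directed", "Constrained", "Unconstrained")) {
  model <- match.arg(model)
  noise <- rnorm(n_steps, mean = 0, sd = sd_angle)
  angles <- switch(model,
    # Every step is drawn around the same fixed heading.
    Directed = heading + noise,
    # Deviations accumulate, starting from the original heading.
    Constrained = heading + cumsum(noise),
    # Same accumulation, but the starting direction is random.
    Unconstrained = runif(1, 0, 2 * pi) + cumsum(noise)
  )
  # Step lengths drawn around the mean; abs() keeps them positive.
  steps <- abs(rnorm(n_steps, mean = mean_step, sd = sd_step))
  # Cumulative displacement from the origin gives the footprint positions.
  data.frame(
    x = c(0, cumsum(steps * cos(angles))),
    y = c(0, cumsum(steps * sin(angles)))
  )
}

set.seed(1)
walk_directed <- simulate_walk(20, heading = pi / 4, sd_angle = 0.1,
                               mean_step = 1, sd_step = 0.1,
                               model = "Directed")
walk_free <- simulate_walk(20, heading = pi / 4, sd_angle = 0.1,
                           mean_step = 1, sd_step = 0.1,
                           model = "Unconstrained")

# Net direction of travel in radians; the directed walk stays near pi / 4.
atan2(tail(walk_directed$y, 1), tail(walk_directed$x, 1))
```

In this sketch the directed walk keeps its overall heading because the angular noise never accumulates, whereas the other two variants can drift away from their starting direction.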
The **`simulate_track()`** function requires several arguments to define the simulation process. The **`data`** argument specifies the input as a `track` R object. The **`nsim`** argument defines the number of simulations to run, with a default value of **1000** if not specified. This number represents a **recommended minimum** to ensure stable and reliable estimation of *p*-values, particularly when assessing the statistical significance of observed trajectory features. Using fewer simulations may result in imprecise or biased *p*-value estimates due to insufficient sampling of the null distribution. The **`model`** argument determines the type of movement model to be used, which can be **"Directed"**, **"Constrained"**, or **"Unconstrained"**, with the default being **"Unconstrained"** if not provided.
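The link between the number of simulations and the precision of a simulation-based *p*-value can be shown with a short sketch. The estimator below, with its common "+1" correction, is an assumption made for illustration and is not necessarily the one used by the package. The helper names `mc_p_value()` and `min_p()` and the toy values are hypothetical.

```r
# Monte Carlo p-value: share of simulated statistics at least as large as
# the observed one, with the common "+1" correction (an assumption here).
mc_p_value <- function(observed, simulated) {
  (1 + sum(simulated >= observed)) / (1 + length(simulated))
}

# Smallest p-value that can be obtained with a given number of simulations.
min_p <- function(nsim) 1 / (nsim + 1)
min_p(c(10, 100, 1000))

# Toy check: the observed statistic is exceeded by 2 of 99 simulated values.
simulated <- c(rep(0.2, 97), 0.9, 0.95)
mc_p_value(observed = 0.8, simulated = simulated)  # (1 + 2) / (1 + 99) = 0.03
```

With only 10 simulations the smallest attainable value is about 0.09, so a result could never fall below 0.05, which is one reason a larger `nsim` is advisable.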
The function ensures that simulations are only applied to **trajectories with at least four steps**, as calculating the **standard deviations of angles and step lengths**—essential for the simulation process—is not feasible for shorter trajectories. In cases where tracks are too short, the user is advised to use the **`subset_track()`** function to exclude those tracks from the simulation process.
The **`simulate_track()`** function returns a **`track simulation` R object**, which is a list of simulated trajectories. Each simulation is stored separately, providing users with the flexibility to **visualize, analyze, and compare these simulations** against the original tracks to evaluate **movement constraints** or explore **alternative movement hypotheses**. This approach is particularly useful for testing **paleoecological and paleoethological hypotheses**, such as assessing whether observed movement patterns are consistent with group behavior, resource-driven navigation, or independent exploratory movement.
### **Examples of Usage**
Simulating tracks from the **Paluxy River** and **Mount Tom** datasets using the **Unconstrained**, **Directed**, and **Constrained** models. For the **Mount Tom** dataset, tracks were preprocessed using **`subset_track()`** to retain only those containing at least four steps. Although the simulations were performed with **100 simulated tracks**, the examples below display **only a subset of 10 simulated tracks** to prevent the vignette from becoming excessively large. This approach ensures the vignette remains efficient, clear, and easy to navigate.
```{r, eval=FALSE}
sim_unconstrained_paluxy <- simulate_track(PaluxyRiver, nsim = 100, model = "Unconstrained")
print(sim_unconstrained_paluxy[1:10])
```
```{r echo=FALSE}
sim_unconstrained_paluxy <- simulate_track(PaluxyRiver, nsim = 100, model = "Unconstrained")
print(sim_unconstrained_paluxy[1:10])
```
```{r, eval=FALSE}
sim_directed_paluxy <- simulate_track(PaluxyRiver, nsim = 100, model = "Directed")
print(sim_directed_paluxy[1:10])
```
```{r echo=FALSE}
sim_directed_paluxy <- simulate_track(PaluxyRiver, nsim = 100, model = "Directed")
print(sim_directed_paluxy[1:10])
```
```{r, eval=FALSE}
sim_constrained_paluxy <- simulate_track(PaluxyRiver, nsim = 100, model = "Constrained")
print(sim_constrained_paluxy[1:10])
```
```{r echo=FALSE}
sim_constrained_paluxy <- simulate_track(PaluxyRiver, nsim = 100, model = "Constrained")
print(sim_constrained_paluxy[1:10])
```
```{r, eval=FALSE}
sim_unconstrained_mount <- simulate_track(sbMountTom, nsim = 100, model = "Unconstrained")
print(sim_unconstrained_mount[1:10])
```
```{r echo=FALSE}
sim_unconstrained_mount <- simulate_track(sbMountTom, nsim = 100, model = "Unconstrained")
print(sim_unconstrained_mount[1:10])
```
```{r, eval=FALSE}
sim_directed_mount <- simulate_track(sbMountTom, nsim = 100, model = "Directed")
print(sim_directed_mount[1:10])
```
```{r echo=FALSE}
sim_directed_mount <- simulate_track(sbMountTom, nsim = 100, model = "Directed")
print(sim_directed_mount[1:10])
```
```{r, eval=FALSE}
sim_constrained_mount <- simulate_track(sbMountTom, nsim = 100, model = "Constrained")
print(sim_constrained_mount[1:10])
```
```{r echo=FALSE}
sim_constrained_mount <- simulate_track(sbMountTom, nsim = 100, model = "Constrained")
print(sim_constrained_mount[1:10])
```
## **Plotting Simulated and Actual Tracks**
The **`plot_sim()`** function provides a **visual comparison** between **simulated movement trajectories** and the **actual observed tracks**, allowing users to evaluate how closely the simulations replicate **real movement patterns**. The function requires two main inputs: a **`track` R object** representing the original track data and a **`track simulation` R object** generated by the **`simulate_track()`** function, which contains the simulated trajectories to be compared against the original tracks.
The function relies on the **`ggplot2`** package to create plots with two primary components: **simulated and actual trajectories**. **Simulated trajectories** are displayed with paths colored according to the user-specified colors via the **`colours_sim`** argument. The user can also adjust **transparency** and **line width** through the **`alpha_sim`** and **`lwd_sim`** arguments. The default color for simulated tracks is **black**, and the default transparency level is set to a low value (**`0.1`**) to avoid visual clutter when many simulated tracks are plotted. The **original trajectories** are plotted on top of the simulated trajectories using the colors specified by the **`colours_act`** argument. Users can also adjust **transparency** (**`alpha_act`**) and **line width** (**`lwd_act`**) to enhance visibility. This overlay helps in visually comparing the **similarity between actual and simulated tracks**.
The function **returns a `ggplot` R object** that overlays the original and simulated tracks, allowing for further customization using additional **`ggplot2`** functions if desired. This visualization tool is essential for assessing whether the chosen simulation model (**Directed**, **Constrained**, or **Unconstrained**) adequately captures the **movement dynamics** represented by the original tracks and for comparing how well a specific model replicates the original track patterns under various conditions.
### **Examples of Usage**
Visualizing the **Unconstrained simulation for the Paluxy River dataset** using default settings.
This plot provides a basic comparison of the simulated tracks over the original tracks.
```{r echo=TRUE}
plot_sim(PaluxyRiver, sim_unconstrained_paluxy)
```
Visualizing the **Directed simulation for the Paluxy River dataset** with customized colors, transparency, and line width. Simulated tracks are plotted using light blue and orange colors with moderate transparency, making them easily distinguishable from the actual tracks plotted in black.
```{r echo=TRUE}
plot_sim(PaluxyRiver, sim_directed_paluxy,
colours_sim = c("#E69F00", "#56B4E9"),
alpha_sim = 0.4, lwd_sim = 1,
colours_act = c("black", "black"), alpha_act = 0.7, lwd_act = 2
)
```
Visualizing the **Constrained simulation for the Paluxy River dataset** with very thin simulated tracks (`lwd_sim = 0.1`) drawn at moderate opacity (`alpha_sim = 0.6`). The thicker actual tracks stand out clearly, emphasizing the contrast between real and simulated data.
```{r echo=TRUE}
plot_sim(PaluxyRiver, sim_constrained_paluxy,
colours_sim = c("#E69F00", "#56B4E9"),
alpha_sim = 0.6, lwd_sim = 0.1,
alpha_act = 0.5, lwd_act = 2
)
```
Visualizing the **Unconstrained simulation for the MountTom dataset** using default settings.
This plot provides a straightforward comparison of the simulated tracks over the actual tracks without any customization.
```{r echo=TRUE}
plot_sim(sbMountTom, sim_unconstrained_mount)
```
Visualizing the **Directed simulation for the MountTom dataset** with a custom color palette, higher opacity than the default (`alpha_sim = 0.3` instead of `0.1`), and thicker lines for simulated tracks. This setup allows for better differentiation between the various simulated tracks and the original ones.
```{r echo=TRUE}
plot_sim(sbMountTom, sim_directed_mount,
colours_sim = c("#6BAED6", "#FF7F00", "#1F77B4", "#D62728",
"#2CA02C", "#9467BD", "#8C564B", "#E377C2",
"#7F7F7F", "#BCBD22", "#17BECF"),
alpha_sim = 0.3, lwd_sim = 1.5,
alpha_act = 0.8, lwd_act = 2
)
```
Visualizing the **Constrained simulation for the MountTom dataset** with a diverse color palette and thinner simulated tracks for better clarity. Because the simulated tracks are thin and semi-transparent (`alpha_sim = 0.5`, `lwd_sim = 0.2`), the original tracks remain clearly visible.
```{r echo=TRUE}
plot_sim(sbMountTom, sim_constrained_mount,
colours_sim = c("#6BAED6", "#FF7F00", "#1F77B4", "#D62728",
"#2CA02C", "#9467BD", "#8C564B", "#E377C2",
"#7F7F7F", "#BCBD22", "#17BECF"),
alpha_sim = 0.5, lwd_sim = 0.2,
alpha_act = 0.6, lwd_act = 2
)
```
## **Similarity Metric Calculation using Dynamic Time Warping (DTW) and Fréchet Distance**
The **`simil_DTW_metric()`** and **`simil_Frechet_metric()`** functions provide robust tools for comparing **movement trajectories** by quantifying their similarity through distinct mathematical approaches. Both functions operate on a **`track` R object**, allowing users to evaluate whether observed patterns deviate from random expectations by comparing real tracks to simulated ones.
The **`simil_DTW_metric()`** function applies the **Dynamic Time Warping (DTW)** algorithm, which is especially suitable for analyzing **animal movement patterns** with **temporal distortions or varying lengths**. By minimizing the **cumulative distance between corresponding points**, it effectively aligns sequences even when timing or pacing differs between trajectories. The calculation uses the **`dtw::dtw()`** function from the **`dtw`** package, employing **Euclidean distance** to compute local distances between points. This method excels at assessing how well simulated tracks replicate real movements, although it can be sensitive to noise and outliers, particularly when comparing trajectories of different lengths.
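To make the idea of minimizing the cumulative distance concrete, the following base-R sketch implements the textbook DTW recursion with Euclidean local distances. It is an illustration only and not the code used by `simil_DTW_metric()`: the function name `dtw_distance()` and the toy tracks are hypothetical, and the step pattern and normalization used by `dtw::dtw()` may differ from this simple recursion, so the values need not match.

``` r
# Textbook DTW between two trajectories given as two-column (x, y) matrices.
# Hypothetical helper for illustration; not part of QuAnTeTrack or dtw.
dtw_distance <- function(a, b) {
  n <- nrow(a)
  m <- nrow(b)

  # Local cost: Euclidean distance between every pair of points
  local_cost <- matrix(0, n, m)
  for (i in seq_len(n)) {
    for (j in seq_len(m)) {
      local_cost[i, j] <- sqrt(sum((a[i, ] - b[j, ])^2))
    }
  }

  # Accumulated cost with an extra border row/column set to Inf
  acc <- matrix(Inf, n + 1, m + 1)
  acc[1, 1] <- 0
  for (i in seq_len(n)) {
    for (j in seq_len(m)) {
      acc[i + 1, j + 1] <- local_cost[i, j] +
        min(acc[i, j], acc[i, j + 1], acc[i + 1, j])
    }
  }
  acc[n + 1, m + 1]
}

# Two parallel toy tracks, one unit apart
track_a <- cbind(x = 0:3, y = 0)
track_b <- cbind(x = 0:3, y = 1)

# Same path as track_a, but with a repeated first point (different length)
track_c <- cbind(x = c(0, 0, 1, 2, 3), y = 0)

dtw_distance(track_a, track_b)  # four aligned pairs, each one unit apart: 4
dtw_distance(track_a, track_c)  # warping absorbs the repeated point: 0
```

The second call shows why DTW suits tracks of different lengths: the repeated point is matched to the same point on the other track at no extra cost.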
In contrast, the **`simil_Frechet_metric()`** function calculates similarity based on the **Fréchet distance**, a metric that evaluates the **overall shape of trajectories** by considering both the **order and location of points**. Unlike DTW, which focuses on pointwise alignment, the Fréchet distance measures similarity by assessing how closely two paths follow each other over their entire length. This approach is often illustrated by the analogy of a **person walking a dog on a leash**, where both can adjust their speed independently but must remain connected. It is particularly effective for comparing trajectories where the overall path shape matters, such as **migration routes, travel paths, or animal tracks**. However, because the Fréchet distance evaluates all points along the path, this method is particularly **sensitive to noise** and may return an invalid measurement (**-1**) when comparing highly disparate tracks, especially those generated under fully **Unconstrained models**. It is essential for users to **check for the presence of -1 values** in the resulting **`track similarity` R object**, as these invalid measurements cannot be used in further analyses. Ignoring this warning may lead to incorrect conclusions or errors when processing the results.
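The leash analogy can be sketched with the discrete Fréchet distance, which only considers the recorded points of each path. This is a simplified stand-in written in base R, not the routine used by `simil_Frechet_metric()`; the function name `discrete_frechet()` and the toy tracks are hypothetical. Unlike the package function, this sketch never returns `-1`.

``` r
# Discrete Fréchet distance between two (x, y) matrices.
# Hypothetical helper for illustration; not part of QuAnTeTrack.
discrete_frechet <- function(a, b) {
  n <- nrow(a)
  m <- nrow(b)
  d <- function(i, j) sqrt(sum((a[i, ] - b[j, ])^2))

  # ca[i, j]: shortest leash needed to reach point i on a and point j on b
  ca <- matrix(NA_real_, n, m)
  ca[1, 1] <- d(1, 1)
  if (n > 1) for (i in 2:n) ca[i, 1] <- max(ca[i - 1, 1], d(i, 1))
  if (m > 1) for (j in 2:m) ca[1, j] <- max(ca[1, j - 1], d(1, j))
  if (n > 1 && m > 1) {
    for (i in 2:n) {
      for (j in 2:m) {
        ca[i, j] <- max(
          min(ca[i - 1, j], ca[i - 1, j - 1], ca[i, j - 1]),
          d(i, j)
        )
      }
    }
  }
  ca[n, m]
}

track_a <- cbind(x = 0:3, y = 0)
track_b <- cbind(x = 0:3, y = 1)             # parallel, one unit away
track_c <- cbind(x = 0:3, y = c(1, 1, 3, 1)) # same, with one outlying point

discrete_frechet(track_a, track_b)  # 1
discrete_frechet(track_a, track_c)  # 3: the single outlier sets the leash length
```

The second result illustrates the sensitivity to noise mentioned above: one displaced point determines the whole distance, whereas DTW would spread its effect over the cumulative sum.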
Both functions include the **`superposition`** argument, which provides options for aligning trajectories before calculating similarity metrics: **`"None"`**, **`"Centroid"`**, and **`"Origin"`**. The **`"None"`** option compares trajectories as they are, while **`"Centroid"`** alignment shifts trajectories to their centroids, eliminating positional differences while preserving shape. The **`"Origin"`** option aligns trajectories to their starting points, making comparisons independent of absolute positions. These alignment methods ensure comparisons are based on **movement patterns** rather than arbitrary spatial differences.
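The two alignment options amount to simple translations of the coordinates. The sketch below shows one way to express them in base R, assuming each trajectory is a two-column matrix of x and y coordinates. The helper names `align_centroid()` and `align_origin()` are hypothetical, and moving each track to the coordinate origin is an assumption about how the alignment could be done, not a description of the package internals.

``` r
# Hypothetical toy trajectory (x, y)
track_a <- cbind(x = c(10, 11, 12, 13), y = c(5, 5.5, 6.5, 8))

# "Centroid"-style alignment: subtract the mean x and mean y from every point
align_centroid <- function(xy) sweep(xy, 2, colMeans(xy))

# "Origin"-style alignment: subtract the first point from every point
align_origin <- function(xy) sweep(xy, 2, xy[1, ])

centred <- align_centroid(track_a)
rooted  <- align_origin(track_a)

colMeans(centred)  # centroid now at (0, 0)
rooted[1, ]        # starting point now at (0, 0)

# Step vectors, and therefore the shape of the path, are unchanged
all.equal(diff(track_a), diff(rooted))
```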
Statistical testing is supported when **`test = TRUE`**, requiring a **`track simulation` R object** to be provided through the **`sim`** argument. This allows users to compare similarity metrics between real tracks and simulated trajectories, determining whether observed similarities are significantly greater than those expected under random conditions. The functions provide two types of ***p*-values**: **pairwise *p*-values**, which represent the proportion of simulated distances smaller than the observed distance for each trajectory pair, and **combined *p*-values**, which reflect the overall proportion of simulations where all observed distances are smaller than the simulated ones. This statistical framework offers a comprehensive approach for evaluating similarity metrics and testing hypotheses about track similarity.
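The pairwise *p*-value described above is a simple proportion over simulations. The following sketch computes it from a hypothetical observed distance matrix and a hypothetical list of simulated distance matrices (all values invented for illustration). Whether ties are counted, and whether any correction is applied, is not stated here, so the strict `<` comparison is an assumption.

``` r
# Hypothetical observed distances between two tracks
observed <- matrix(c(0, 2.5,
                     2.5, 0), nrow = 2, byrow = TRUE)

# Hypothetical distances from five simulated datasets
sim_values <- c(1, 2, 3, 4, 0.5)
simulated <- lapply(sim_values, function(v) {
  matrix(c(0, v,
           v, 0), nrow = 2, byrow = TRUE)
})

# Proportion of simulations whose distance is smaller than the observed one
smaller_count <- Reduce(`+`, lapply(simulated, function(s) s < observed))
p_pairwise <- smaller_count / length(simulated)
diag(p_pairwise) <- NA  # a track is not compared with itself

p_pairwise[1, 2]  # 3 of 5 simulated distances are below 2.5: 0.6
```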
The **`simil_DTW_metric()`** and **`simil_Frechet_metric()`** functions return a **`track similarity` R object**, which contains valuable information about the similarity between trajectories. This object includes a matrix called **`DTW_distance_metric`** (for the DTW metric) or **`Frechet_distance_metric`** (for the Fréchet metric), which stores the pairwise distances calculated between all trajectories. If the **`test`** argument is set to **`TRUE`**, additional components are included in the output. Specifically, a matrix of **`DTW_distance_metric_p_values`** or **`Frechet_distance_metric_p_values`** is provided, containing the *p*-values for the pairwise distances obtained by comparing the observed tracks to those generated through simulations. The **`DTW_metric_p_values_combined`** and **`Frechet_metric_p_values_combined`** elements report the overall *p*-value, summarizing the significance of the similarity between all tracks when considering the simulated datasets. Moreover, the output includes **`DTW_distance_metric_simulations`** or **`Frechet_distance_metric_simulations`**, a list of matrices with the similarity metrics calculated from each simulated dataset. This comprehensive output structure allows users to thoroughly evaluate whether the observed similarity between trajectories is greater than expected under random conditions, providing robust statistical support for their analyses.
The combination of flexible alignment options, statistical testing, and compatibility with **simulated datasets** makes these functions highly valuable for investigating **animal movement patterns**, testing hypotheses about **track similarity**, and assessing the accuracy of simulations.
### **Examples of Usage**
Comparing **Paluxy River** tracks against **Directed model simulations** using **Centroid superposition**.
```{r, eval=FALSE}
simil_dtw_directed_paluxy <- simil_DTW_metric(PaluxyRiver, test = TRUE,
sim = sim_directed_paluxy,
superposition = "Centroid")
```
```{r include=FALSE}
simil_dtw_directed_paluxy <- simil_DTW_metric(PaluxyRiver, test = TRUE,
sim = sim_directed_paluxy,
superposition = "Centroid")
```
```{r, echo=FALSE}
print(simil_dtw_directed_paluxy)
```
```{r, eval=FALSE}
simil_frechet_directed_paluxy <- simil_Frechet_metric(PaluxyRiver, test = TRUE,
sim = sim_directed_paluxy,
superposition = "Centroid")
```
```{r include=FALSE}
simil_frechet_directed_paluxy <- simil_Frechet_metric(PaluxyRiver, test = TRUE,
sim = sim_directed_paluxy,
superposition = "Centroid")
```
```{r, echo=FALSE}
print(simil_frechet_directed_paluxy)
```
Comparing **MountTom** tracks (after subsetting) against **Constrained model simulations** using **Origin superposition**.
```{r, eval=FALSE}
simil_dtw_constrained_mount <- simil_DTW_metric(sbMountTom, test = TRUE,
sim = sim_constrained_mount,
superposition = "Origin")
```
```{r include=FALSE}
simil_dtw_constrained_mount <- simil_DTW_metric(sbMountTom, test = TRUE,
sim = sim_constrained_mount,
superposition = "Origin")
```
```{r, echo=FALSE}
print(simil_dtw_constrained_mount)
```
```{r, eval=FALSE}
simil_frechet_constrained_mount <- simil_Frechet_metric(sbMountTom, test = TRUE,
sim = sim_constrained_mount,
superposition = "Origin")
```
```{r include=FALSE}
simil_frechet_constrained_mount <- simil_Frechet_metric(sbMountTom, test = TRUE,
sim = sim_constrained_mount,
superposition = "Origin")
```
```{r, echo=FALSE}
print(simil_frechet_constrained_mount)
```
The resulting **`track similarity` R objects** can then be explored to check **pairwise distances**, ***p*-values**, and **combined *p*-values**. It is important to check the results for invalid measurements (**-1**), especially when working with highly disparate tracks generated by **Unconstrained models**.
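A check for invalid measurements might look like the sketch below. To keep it self-contained, it uses a hypothetical mock list in place of a real result; only the component name `Frechet_distance_metric` comes from the description above, while the matrix layout and values are invented for illustration.

``` r
# Hypothetical mock of a Fréchet result; layout and values are assumptions
frechet_result <- list(
  Frechet_distance_metric = matrix(
    c(NA, 1.8,  -1,
      NA,  NA, 2.4,
      NA,  NA,  NA),
    nrow = 3, byrow = TRUE,
    dimnames = list(paste0("Track_", 1:3), paste0("Track_", 1:3))
  )
)

dist_mat <- frechet_result$Frechet_distance_metric

# Locate invalid measurements (which() ignores the NA cells)
invalid <- which(dist_mat == -1, arr.ind = TRUE)
invalid

# One option: flag them as missing before any further analysis
dist_clean <- dist_mat
dist_clean[which(dist_clean == -1)] <- NA
```

With a real result, the same lines could be applied to `simil_frechet_directed_paluxy$Frechet_distance_metric`.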
## **Intersection Metric Calculation for Tracks**
The **`track_intersection()`** function is designed to detect and quantify **unique intersections** between real **movement trajectories**, helping to identify behavioral interactions such as **coordinated movement**, **chasing**, or **random exploration**. By comparing actual tracks against simulated trajectories generated with the **`simulate_track()`** function, users can statistically assess whether observed intersections occur more or less frequently than expected under random conditions.
The input to this function is a **`track` R object**. If statistical testing is desired, the **`test`** argument must be set to **`TRUE`** and a **`track simulation` R object** must be provided via the **`sim`** argument.
The function offers several options for modifying the **starting positions of simulated tracks**, controlled by the **`origin.permutation`** argument. When set to **`"None"`**, simulated trajectories retain their original starting points, making them comparable to the actual data without spatial modification. The **`"Min.Box"`** option randomly places the starting points of simulated tracks within the **minimum bounding box** surrounding all original starting points, simulating movement within a defined area. The **`"Conv.Hull"`** option provides a more precise alternative by placing simulated tracks within the **convex hull** encompassing all original starting points, reflecting the actual space occupied by the tracks. The **`"Custom"`** option allows users to specify an area of interest by providing a set of coordinates (**`custom.coord`**) defining the region's vertices, which is particularly useful when prior knowledge of terrain features or environmental factors suggests that movement may be spatially constrained.
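The difference between the bounding-box and convex-hull regions can be sketched in base R. The starting points, the helper names, and the rejection-sampling approach below are all hypothetical; they illustrate the geometry only and are not the package's implementation.

``` r
# Hypothetical starting points of four tracks (x, y)
starts <- cbind(x = c(0, 4, 2, 2), y = c(0, 0, 3, 1))

# Bounding-box region: axis-aligned box around the starting points
box_x <- range(starts[, "x"])
box_y <- range(starts[, "y"])

# Convex-hull region: hull vertices in order (here a triangle)
hull <- starts[grDevices::chull(starts), , drop = FALSE]

# TRUE if point p lies inside (or on the border of) the convex polygon 'hull':
# p must be on the same side of every edge
in_convex_hull <- function(p, hull) {
  nxt <- hull[c(seq_len(nrow(hull))[-1], 1), , drop = FALSE]
  cross <- (nxt[, 1] - hull[, 1]) * (p[2] - hull[, 2]) -
    (nxt[, 2] - hull[, 2]) * (p[1] - hull[, 1])
  all(cross >= 0) || all(cross <= 0)
}

# Uniform random points in the bounding box
sample_in_box <- function(n) {
  cbind(x = runif(n, box_x[1], box_x[2]),
        y = runif(n, box_y[1], box_y[2]))
}

# Uniform random points in the hull, by rejecting box points outside it
sample_in_hull <- function(n) {
  out <- matrix(numeric(0), ncol = 2, dimnames = list(NULL, c("x", "y")))
  while (nrow(out) < n) {
    cand <- sample_in_box(n)
    keep <- apply(cand, 1, in_convex_hull, hull = hull)
    out <- rbind(out, cand[keep, , drop = FALSE])
  }
  out[seq_len(n), , drop = FALSE]
}

set.seed(123)
new_starts_box  <- sample_in_box(5)
new_starts_hull <- sample_in_hull(5)
```

In this toy case the hull covers half the area of the box, so the hull-based option excludes starting positions in corners where no original track began.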
The **`H1`** argument specifies the **alternative hypothesis** to be tested. When set to **`"Lower"`**, the function evaluates whether the observed intersections are significantly fewer than those generated by simulations, which may indicate **coordinated or gregarious movement**. When set to **`"Higher"`**, it tests whether the observed intersections are significantly greater, which could be indicative of **predatory or chasing interactions**. Users must provide a value for **`H1`** when the **`test`** argument is set to **`TRUE`**.
The output of the **`track_intersection()`** function is a **`track intersection` R object** that provides a comprehensive assessment of how tracks intersect and, when applicable, how these intersections compare to simulated datasets. The core component of this object is the **`Intersection_metric`**, a matrix that details the number of unique intersection points between pairs of trajectories. When **`test = TRUE`**, which requires a **`track simulation` R object** generated via **`simulate_track()`** provided through the **`sim`** argument, the function produces additional outputs aimed at statistically evaluating the observed intersections. One of these is the **`Intersection_metric_p_values`** matrix, which compares the observed intersections to those derived from simulations. Each *p*-value in this matrix indicates the proportion of simulated datasets with intersection counts as extreme or more extreme than the observed ones, depending on the selected **`H1`** hypothesis (whether intersections are expected to be fewer or more frequent than random expectation). The function also returns the **`Intersection_metric_p_values_combined`**, a single value that summarizes the overall statistical significance of the intersection metrics across all trajectory pairs. This combined *p*-value provides a general indication of whether the total number of intersections observed is significantly different from those generated through simulations, either supporting or rejecting the hypothesis under investigation. Additionally, the function provides the **`Intersection_metric_simulations`**, a list containing matrices of intersection counts for each simulation iteration. This detailed output allows users to explore how the number of intersections varies across multiple randomized scenarios, providing deeper insights into the robustness and consistency of the observed patterns. 
By comparing these results to simulated tracks, researchers can better evaluate whether the observed intersections are genuinely meaningful or simply the result of random chance.
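The role of **`H1`** in the *p*-value can be sketched as a one-sided proportion. The helper `intersection_p()` and the counts below are hypothetical; the sketch follows the "as extreme or more extreme" wording above and assumes no further correction is applied.

``` r
# One-sided Monte Carlo p-value for an intersection count.
# Hypothetical helper for illustration; not part of QuAnTeTrack.
intersection_p <- function(observed, simulated, H1 = c("Lower", "Higher")) {
  H1 <- match.arg(H1)
  if (H1 == "Lower") {
    mean(simulated <= observed)  # simulations with as few or fewer intersections
  } else {
    mean(simulated >= observed)  # simulations with as many or more intersections
  }
}

# Hypothetical counts for one pair of tracks
observed_count   <- 1
simulated_counts <- c(0, 2, 3, 1, 4, 5, 2, 3, 0, 6)

intersection_p(observed_count, simulated_counts, H1 = "Lower")   # 3 of 10: 0.3
intersection_p(observed_count, simulated_counts, H1 = "Higher")  # 8 of 10: 0.8
```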
### **Examples of Usage**
Detecting intersections between **Paluxy River** tracks and simulated tracks generated using the **Directed model**, without modifying the starting positions, and testing for **reduced intersections**.
```{r, eval=FALSE}
int_directed_paluxy <- track_intersection(PaluxyRiver, test = TRUE, H1 = "Lower",
sim = sim_directed_paluxy,
origin.permutation = "None")
print(int_directed_paluxy)
```
```{r include=FALSE}
int_directed_paluxy <- track_intersection(PaluxyRiver, test = TRUE, H1 = "Lower",
sim = sim_directed_paluxy,
origin.permutation = "None")
```
```{r, echo=FALSE}
print(int_directed_paluxy)
```
Detecting intersections between **MountTom** tracks (after subsetting) and simulated tracks generated using the **Constrained model**, with **convex hull permutation** of starting points, and testing for **increased intersections**.
```{r, eval=FALSE}
int_constrained_mount <- track_intersection(sbMountTom, test = TRUE, H1 = "Higher",
sim = sim_constrained_mount,
origin.permutation = "Conv.Hull")
print(int_constrained_mount)
```
```{r include=FALSE}
int_constrained_mount <- track_intersection(sbMountTom, test = TRUE, H1 = "Higher",
sim = sim_constrained_mount,
origin.permutation = "Conv.Hull")
```
```{r, echo=FALSE}
print(int_constrained_mount)
```
## **Combining *P*-values from Multiple Similarity and Intersection Metrics**
The **`combined_prob()`** function is designed to **combine *p*-values** obtained from various **similarity and intersection metrics** to provide a more comprehensive assessment of how well the **observed trajectories compare to those generated through simulation**. The function is useful when multiple metrics are applied to the same set of data, as it allows users to **integrate the results and obtain an overall measure of significance**.
The basic idea behind this function is to assess how the **observed similarity or intersection metrics** compare to those generated under various **simulated scenarios**. It starts by receiving a **list of `track similarity` and/or `track intersection` R objects**, obtained from **`simil_DTW_metric()`**, **`simil_Frechet_metric()`**, and/or **`track_intersection()` functions**. To make meaningful comparisons, the function ensures that all metrics provided in the **`metrics` list** have been derived using the same number of simulations. If the metrics are incompatible, the function raises an error to prevent incorrect calculations. The process works by generating a **matrix of *p*-values** where each entry represents the **combined significance of multiple metrics for a particular trajectory pair**. The function also provides an **overall *p*-value** that summarizes the combined significance across all **trajectory pairs**. This value indicates how consistently the **observed data differs from the simulated datasets** across all metrics.
The result is returned as a **list** consisting of two main components. The first component is a **matrix of combined *p*-values for each trajectory pair**, where the entries indicate the **probability of observing the combined metrics across all simulations**. This matrix provides a detailed comparison of how **individual track pairs** compare to the simulated datasets. The second component is a **single overall *p*-value** that summarizes the **combined significance of all metrics across all pairs of trajectories**, offering a **comprehensive assessment of the dataset as a whole**.
The **`combined_prob()`** function provides an efficient way to **aggregate multiple similarity or intersection metrics**, making it an essential tool for users who wish to **comprehensively evaluate their models and hypotheses**. By allowing for the integration of different metrics, the function offers a **holistic approach to analyzing and comparing track data**.
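The exact combination rule is not spelled out here, so the following is only one plausible reading, shown for a single pair of tracks: count the simulations in which every metric is simultaneously at least as extreme as its observed value. All object names and numbers are hypothetical, and the sketch should not be taken as the algorithm used by `combined_prob()`. It also shows why the metrics must share the same number of simulations.

``` r
# Hypothetical observed values for one pair of tracks
obs_dtw <- 2
obs_int <- 1

# Hypothetical values from the same five simulations
sim_dtw <- c(1, 3, 1.5, 4, 0.5)
sim_int <- c(0, 0, 2, 1, 1)

# Metrics can only be combined simulation by simulation if lengths match
stopifnot(length(sim_dtw) == length(sim_int))

# Per-simulation indicators: simulated value as small as or smaller than observed
dtw_extreme <- sim_dtw <= obs_dtw
int_extreme <- sim_int <= obs_int

# Joint proportion: simulations in which both conditions hold at once
p_combined <- mean(dtw_extreme & int_extreme)
p_combined  # simulations 1 and 5 satisfy both: 0.4
```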
### **Examples of Usage**
Combining *p*-values for the **Paluxy River** dataset based on analyses performed with the **Directed model simulations**. In this example, we combine results from three different methods: **Dynamic Time Warping (DTW)**, **Fréchet distance**, and **intersection counts**. All of them were computed using the same simulated dataset (`sim_directed_paluxy`) and are stored in the objects `simil_dtw_directed_paluxy`, `simil_frechet_directed_paluxy`, and `int_directed_paluxy`.
```{r, eval=FALSE}
combined_metrics_paluxy <- combined_prob(PaluxyRiver, metrics = list(
simil_dtw_directed_paluxy,
simil_frechet_directed_paluxy,
int_directed_paluxy
))
```
```{r include=FALSE}
combined_metrics_paluxy <- combined_prob(PaluxyRiver, metrics = list(
simil_dtw_directed_paluxy,
simil_frechet_directed_paluxy,
int_directed_paluxy
))
```
```{r echo=FALSE}
print(combined_metrics_paluxy)
```
## **Clustering Tracks Based on Movement Parameters**
**Clustering tracks** is a powerful approach for detecting **patterns of movement** that may reveal specific **behaviors or strategies employed by trackmakers**. The **`cluster_track()`** function is designed to process a set of tracks by **calculating various movement parameters** and then classifying them into groups based on their **similarities**. Instead of just comparing tracks pairwise, this method allows for the identification of broader **behavioral patterns shared across multiple tracks**.
The process begins with **preparing the data**, where tracks containing **fewer than four steps** are filtered out. Short tracks lack sufficient information about movement patterns and would introduce noise into the clustering process. Once filtered, the function calculates multiple **movement parameters** to characterize the trackmaker’s behavior. These parameters include metrics such as **turning angles**, **step lengths**, **sinuosity**, **straightness**, and **velocity**.
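Some of these parameters can be computed from footprint coordinates with a few lines of base R. The sketch below uses conventional definitions (step length as the distance between consecutive points, turning angle as the change in heading, straightness as net displacement divided by path length). These definitions, the helper name `track_parameters()`, and the toy track are assumptions for illustration and may not match how `cluster_track()` computes them.

``` r
# Basic movement parameters from a two-column (x, y) matrix.
# Hypothetical helper for illustration; not part of QuAnTeTrack.
track_parameters <- function(xy) {
  steps       <- diff(xy)                       # step vectors between points
  step_length <- sqrt(rowSums(steps^2))
  heading     <- atan2(steps[, 2], steps[, 1])  # direction of each step (radians)
  turn        <- diff(heading)
  turn        <- atan2(sin(turn), cos(turn))    # wrap to [-pi, pi]

  path_length <- sum(step_length)
  net_disp    <- sqrt(sum((xy[nrow(xy), ] - xy[1, ])^2))

  list(
    step_length       = step_length,
    turning_angle_deg = turn * 180 / pi,
    path_length       = path_length,
    straightness      = net_disp / path_length
  )
}

# Hypothetical L-shaped track: two steps east, then one step north
toy_track <- cbind(x = c(0, 1, 2, 2), y = c(0, 0, 0, 1))
params <- track_parameters(toy_track)

params$step_length        # 1 1 1
params$turning_angle_deg  # 0 90
params$straightness       # sqrt(5) / 3, about 0.745
```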
The **`cluster_track()`** function uses the **`mclust` package** to perform **model-based clustering**, which is particularly effective because it identifies **natural clusters** by fitting various **Gaussian mixture models** to the data. Unlike methods that impose arbitrary groupings, **model-based clustering** selects the most suitable model by optimizing the **Bayesian Information Criterion (BIC)**. This approach is versatile, supporting **single-parameter and multi-parameter clustering**, depending on the user’s analytical needs. When **only one parameter** is selected, the function uses simpler models with **equal or variable variance**. When **multiple parameters** are specified, it examines the entire range of **Gaussian models** provided by **`mclust`**.
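Model-based clustering with **`mclust`** can be tried directly on a small table of movement parameters. The sketch below calls `Mclust()` on hypothetical toy data with two invented behavioral groups. It illustrates the BIC-based model selection described above rather than the internals of `cluster_track()`, and the choice of `G = 1:3` is arbitrary.

``` r
library(mclust)

set.seed(42)

# Hypothetical parameters for 20 tracks: 10 straight and fast, 10 sinuous and slow
toy <- data.frame(
  Sinuosity = c(rnorm(10, mean = 0.05, sd = 0.01), rnorm(10, mean = 0.30, sd = 0.03)),
  Velocity  = c(rnorm(10, mean = 3.0,  sd = 0.2),  rnorm(10, mean = 1.0,  sd = 0.2))
)

# Fit Gaussian mixture models with 1 to 3 components; the best BIC is retained
fit <- Mclust(toy, G = 1:3, verbose = FALSE)

fit$G               # number of clusters selected
fit$modelName       # covariance structure of the selected model
fit$classification  # cluster assigned to each track
round(fit$z, 3)     # membership probabilities per track and cluster
```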
The parameters used for clustering are specified via the **`variables`** argument. Users can choose from a variety of options, including **`"TurnAng"`** (mean turning angle), **`"sdTurnAng"`** (standard deviation of turning angles), **`"Distance"`** (total distance covered), **`"Length"`** (total length of the trajectory), **`"StLength"`** (mean step length), **`"sdStLength"`** (standard deviation of step length), **`"Sinuosity"`** (path tortuosity), **`"Straightness"`** (straight-line efficiency), **`"Velocity"`** (mean velocity), **`"sdVelocity"`** (standard deviation of velocity), **`"MaxVelocity"`** (maximum velocity), and **`"MinVelocity"`** (minimum velocity). This flexibility allows users to focus on specific aspects of movement or combine multiple characteristics to create a comprehensive profile of trackmaker behavior.
The **`data`** argument expects a **`track` R object**, and the **`veltrack`** argument requires a **`track velocity` R object**. Supplying both allows the clustering to incorporate speed-related information alongside spatial metrics.
The results of the clustering process have important **biological implications**. By grouping tracks with similar **movement parameters**, researchers can infer the **behaviors that produced those tracks**. For instance, tracks characterized by **low sinuosity and high velocity** might indicate animals moving directly toward a target, while **high sinuosity and variable step lengths** could suggest **foraging or exploratory behavior**.
Additionally, clustering can reveal **coordinated movement**. If multiple tracks fall within the same cluster and also exhibit high similarity when analyzed with methods like **Dynamic Time Warping (DTW)** or **Fréchet distance**, it could indicate **group movement or coordinated hunting strategies**.
Moreover, the clustering approach offers a practical way to **refine datasets** before applying more computationally demanding tests. By first identifying groups of tracks that share similar characteristics, users can focus on the most relevant comparisons, enhancing the **robustness of their hypothesis testing**. This **pre-clustering step** can significantly improve the efficiency and reliability of subsequent analyses, such as **similarity metrics or intersection tests**.
The output of the **`cluster_track()`** function is a **`track clustering` R object**, which provides detailed information about the clustering process and the resulting classification of tracks based on their movement parameters. The output is structured as a list containing two primary components: **`matrix`** and **`clust`**. The **`matrix`** element is a data frame that holds the calculated movement parameters for each track, including metrics such as turning angles, step lengths, sinuosity, straightness, and various velocity parameters. This matrix serves as the input data for the clustering process, providing a comprehensive summary of the tracks' movement characteristics. The **`clust`** element is an **`Mclust` R object** generated by the **`mclust` package**. This object contains all the results of the model-based clustering analysis, including the optimal model selected according to the **Bayesian Information Criterion (BIC)**. The **`clust`** object includes essential information such as the number of clusters identified (**`G`**), the specific model used (**`modelName`**), and, within its **`parameters`** element, the mixing proportions of each component (**`pro`**), the means of each cluster (**`mean`**), and the variance parameters (**`variance`**). Additionally, it provides the **log-likelihood**, **BIC**, and **ICL (Integrated Complete-data Likelihood)** values corresponding to the best-fitting model. The **classification** of tracks is provided within the **`clust`** object as the **`classification`** element, which assigns each track to a specific cluster. The **`z`** matrix within this object gives the probability of each track belonging to each cluster, while the **`uncertainty`** element quantifies the uncertainty of each classification (one minus the highest membership probability for that track). The **`cluster_track()`** function's output allows users to assess how tracks are grouped based on their movement patterns, providing valuable insights into potential behavioral strategies.
This clustering result can be directly used to guide further analyses, such as comparing clusters with similarity metrics or testing whether clusters correspond to specific ecological or behavioral hypotheses.
The **`cluster_track()`** function provides a flexible and powerful framework for identifying **behavioral patterns** from trackway data, enabling researchers to draw meaningful conclusions about the **behavioral ecology of ancient trackmakers**.
### **Examples of Usage**
First, compute velocities for the subsetted MountTom dataset using the **`velocity_track()`** function. This object will later be passed to **`cluster_track()`**:
```{r, eval=FALSE}
H_mounttom_subset <- c(
1.380, 1.404, 1.320, 1.736, 1.432, 1.508, 1.768, 0.760, 1.688, 1.620, 1.784
)
velocity_sbmounttom <- velocity_track(sbMountTom, H = H_mounttom_subset)
```
```{r echo=FALSE}
H_mounttom_subset <- c(
1.380, 1.404, 1.320, 1.736, 1.432, 1.508, 1.768, 0.760, 1.688, 1.620, 1.784
)
velocity_sbmounttom <- velocity_track(sbMountTom, H = H_mounttom_subset)
```
Next, perform clustering using key variables related to movement behavior: **sinuosity**, **straightness**, **velocity**, and **turning angles**.
```{r, eval=FALSE}
clustering_mounttom <- cluster_track(
data = sbMountTom,
veltrack = velocity_sbmounttom,
variables = c("Sinuosity", "Straightness", "Velocity", "TurnAng")
)
```
```{r include=FALSE}
clustering_mounttom <- cluster_track(
data = sbMountTom,
veltrack = velocity_sbmounttom,
variables = c("Sinuosity", "Straightness", "Velocity", "TurnAng")
)
```
```{r echo=FALSE}
print(clustering_mounttom)
```
We then check the classification of tracks.
```{r, eval=FALSE}
clustering_mounttom$clust$classification
```
```{r echo=FALSE}
clustering_mounttom$clust$classification
```
And the probability of each track belonging to each cluster.
```{r, eval=FALSE}
clustering_mounttom$clust$z
```
```{r echo=FALSE}
clustering_mounttom$clust$z
```
The resulting clusters can help infer **behavioral strategies** (e.g., direct vs. exploratory movement), and serve as a **pre-selection step** for deeper hypothesis testing. Tracks in the same cluster can be further analyzed with **similarity metrics**, **intersection metrics**, and the **`combined_prob()`** function to test whether observed movement patterns reflect non-random, coordinated, or functionally distinct behaviors. Narrowing the scope of comparisons in this way also **reduces computational cost**.
## **References**
- **Alexander, R. M. (1976).** Estimates of speeds of dinosaurs. *Nature*, 261(5556), 129-130.
- **Batschelet, E. (1981).** Circular statistics in biology. *Academic Press*, 388 pp.\
- **Benhamou, S. (2004).** How to reliably estimate the tortuosity of an animal's path: straightness, sinuosity, or fractal dimension? *Journal of Theoretical Biology*, 229(2), 209-220.
- **Farlow, J. O., O’Brien, M., Kuban, G. J., Dattilo, B. F., Bates, K. T., Falkingham, P. L., & Piñuela, L. (2012).** Dinosaur Tracksites of the Paluxy River Valley (Glen Rose Formation, Lower Cretaceous), Dinosaur Valley State Park, Somervell County, Texas. *In Proceedings of the V International Symposium about Dinosaur Palaeontology and their Environment* (pp. 41-69). Burgos: Salas de los Infantes.\
- **Ostrom, J. H. (1972).** Were some dinosaurs gregarious? *Palaeogeography, Palaeoclimatology, Palaeoecology*, 11(4), 287-301. DOI: [https://doi.org/10.1016/0031-0182(72)90049-1](https://doi.org/10.1016/0031-0182(72)90049-1).\
- **Rohlf, F. J. (2008).** tpsUtil. Version 1.40. Department of Ecology and Evolution, State University of New York.\
- **Rohlf, F. J. (2009).** tpsDig. Version 2.14. Department of Ecology and Evolution, State University of New York.\
- **Ruiz, J., & Torices, A. (2013).** Humans running at stadiums and beaches and the accuracy of speed estimations from fossil trackways. *Ichnos*, 20(1), 31-35.\
- **Thulborn, R. A., & Wade, M. (1984).** Dinosaur trackways in the Winton Formation (mid-Cretaceous) of Queensland. *Memoirs of the Queensland Museum*, 21(2), 413-517.