The STSMotifs package searches for motifs in spatial-time series. Its main purpose is to handle large amounts of data, so the search is designed to be fast and efficient. Motifs are found with the CSAMiningProcess, which consists of three steps: normalization and SAX indexing, the search for spatial-time motifs, and the ranking of the motifs found.
The functions of this package require the following inputs. The quality of the output depends strongly on these parameters.
dataset: Data frame containing numeric values. Columns represent space and rows represent time.
#>      1    2    3     4     5     6    7     8    9   10
#> 1  737 1350  869   750  1138   758 1006  1095   99  -83
#> 2  283  565  504   317  1849   944  -80  -895 -936  906
#> 3 -118 -375 -564  -803   870   472 -922 -1009 -698  741
#> 4 -696 -844 -654 -1303  -474  -591 -262  1034 1012  376
#> 5 -251 -622  -14  -587 -1108 -1401  404  1545 1696  247
#> 6  645  -10   -4   411  -858 -1261 -574  -329 -367 -680
alpha: The size of the alphabet used to encode the numerical values into a string with SAX.
word: The length of the motif.
sb and tb: The spatial and temporal block sizes.
Part of the process is applied to blocks, which are subsets of the original dataset. With tb (the “time slice”, the number of rows in each block) and sb (the “space slice”, the number of columns in each block), the user specifies the size and shape of the blocks.
kappa: Threshold on the minimum number of distinct spatial-time series in which a motif must occur.
sigma: Threshold on the minimum total number of occurrences of each motif.
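To make the role of tb and sb concrete, the following base R sketch cuts a data frame into blocks of tb rows by sb columns. The helper `split_into_blocks` and the toy data are hypothetical and are not part of the package. The sketch also assumes that blocks at the edges are simply smaller when the dimensions are not exact multiples of tb and sb, which may differ from what the package does.

```r
# Hypothetical helper (not part of STSMotifs): cut a dataset into blocks
# of `tb` rows (time) by `sb` columns (space).
split_into_blocks <- function(dataset, tb, sb) {
  row_starts <- seq(1, nrow(dataset), by = tb)
  col_starts <- seq(1, ncol(dataset), by = sb)
  blocks <- list()
  for (r in row_starts) {
    for (cc in col_starts) {
      # Assumption: edge blocks are truncated rather than dropped or padded.
      rows <- r:min(r + tb - 1, nrow(dataset))
      cols <- cc:min(cc + sb - 1, ncol(dataset))
      blocks[[length(blocks) + 1]] <- dataset[rows, cols, drop = FALSE]
    }
  }
  blocks
}

# Toy data: 20 time steps (rows) by 12 spatial positions (columns).
set.seed(1)
toy <- as.data.frame(matrix(rnorm(20 * 12), nrow = 20, ncol = 12))

blocks <- split_into_blocks(toy, tb = 10, sb = 4)
length(blocks)    # 2 time slices x 3 space slices = 6 blocks
dim(blocks[[1]])  # 10 rows, 4 columns
```

A larger tb with a smaller sb gives tall, narrow blocks that favor long temporal context over spatial neighborhood, and the reverse gives wide, short blocks.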
The first step, implemented by the NormSAX function, applies z-score normalization to the entire dataset. Right after normalization, the SAX indexing method is applied with the given alphabet size alpha.
See more at Normalization and SAX Indexing
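The normalization and encoding step can be illustrated with a few lines of base R. This is a minimal sketch, not the package's NormSAX implementation: the function name `norm_sax_sketch` is hypothetical, each value is encoded individually (without the piecewise aggregation used in some SAX variants), and the breakpoints are assumed to be the usual equiprobable quantiles of the standard normal distribution. The sample values are the first three columns of the dataset printed above.

```r
# Hypothetical sketch of z-score normalization followed by SAX encoding.
norm_sax_sketch <- function(dataset, alpha) {
  values <- as.matrix(dataset)
  # One global mean and standard deviation for the entire dataset.
  z <- (values - mean(values)) / sd(as.vector(values))
  # Breakpoints that split the standard normal into `alpha` equiprobable regions.
  breaks <- qnorm(seq(0, 1, length.out = alpha + 1))
  # Map each normalized value to a letter: "a" for the lowest region, and so on.
  codes <- cut(as.vector(z), breaks = breaks, labels = letters[1:alpha])
  matrix(as.character(codes), nrow = nrow(values), dimnames = dimnames(values))
}

sample_data <- data.frame(
  `1` = c(737, 283, -118, -696, -251, 645),
  `2` = c(1350, 565, -375, -844, -622, -10),
  `3` = c(869, 504, -564, -654, -14, -4),
  check.names = FALSE
)

encoded <- norm_sax_sketch(sample_data, alpha = 4)
print(encoded)
```

With alpha = 4, every cell of the result is one of the letters a to d, and the matrix keeps the shape of the input.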
In this step, the sb and tb parameters are used to create blocks from the original spatial-time dataset. All subsequences inside each block are combined into a single time series. Motifs found in this combined time series are validated against the kappa and sigma thresholds. Finally, all occurrences of a motif in neighboring blocks are grouped.
See more at Search for Spatial-time Motifs
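The way the two thresholds act inside a single block can be sketched as follows. This is an illustration under stated assumptions, not the package's search routine: the function `find_block_motifs` is hypothetical, windows are enumerated column by column (equivalent to a combined series in which windows crossing a column boundary are discarded), and the grouping across neighboring blocks is left out.

```r
# Hypothetical sketch: find candidate motifs in one SAX-encoded block.
# block: character matrix of SAX symbols (rows = time, columns = space).
find_block_motifs <- function(block, word, sigma, kappa) {
  n_time <- nrow(block)
  stopifnot(word <= n_time)
  found <- list()
  for (col in seq_len(ncol(block))) {
    for (start in seq_len(n_time - word + 1)) {
      # Subsequence of length `word` starting at row `start` of this column.
      key <- paste(block[start:(start + word - 1), col], collapse = "")
      found[[length(found) + 1]] <- data.frame(
        motif = key, column = col, start = start, stringsAsFactors = FALSE
      )
    }
  }
  occ <- do.call(rbind, found)
  # sigma: total number of occurrences of each motif.
  n_occ <- tapply(occ$start, occ$motif, length)
  # kappa: number of distinct columns (spatial series) containing the motif.
  n_cols <- tapply(occ$column, occ$motif, function(x) length(unique(x)))
  keep <- names(n_occ)[n_occ >= sigma & n_cols >= kappa]
  occ[occ$motif %in% keep, ]
}

block <- matrix(
  c("a", "b", "a", "b", "a",
    "a", "b", "c", "a", "b",
    "c", "c", "a", "b", "c"),
  nrow = 5
)

motifs <- find_block_motifs(block, word = 2, sigma = 3, kappa = 2)
print(motifs)  # only "ab" passes: 5 occurrences across 3 columns
```

In this toy block, "ba" occurs twice but only in one column, and "bc" and "ca" each occur twice across two columns, so all three fail the sigma = 3 threshold.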
The last step, implemented by the RankSTMotifs function, balances the distance among the occurrences of a motif against the information encoded in the motif itself and the number of its occurrences. It explores all motifs and their occurrences.
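The three ingredients named in this step can be computed for a single motif as in the sketch below. It is only an illustration: the function `motif_criteria` is hypothetical, Shannon entropy of the motif's symbols is an assumed stand-in for "encoded information", the mean Euclidean distance between occurrence positions is an assumed distance measure, and how RankSTMotifs actually combines these quantities into a rank is not shown here.

```r
# Hypothetical sketch: compute three ranking ingredients for one motif.
# positions: data.frame with columns `column` (space) and `start` (time).
motif_criteria <- function(motif, positions) {
  symbols <- strsplit(motif, "")[[1]]
  p <- as.numeric(table(symbols)) / length(symbols)
  # Assumption: Shannon entropy (in bits) as the information in the word.
  entropy <- -sum(p * log2(p))
  # Assumption: mean pairwise Euclidean distance between occurrences.
  mean_distance <- if (nrow(positions) > 1) {
    mean(as.vector(dist(positions[, c("column", "start")])))
  } else {
    0
  }
  c(entropy = entropy,
    occurrences = nrow(positions),
    mean_distance = mean_distance)
}

positions <- data.frame(column = c(1, 1, 2), start = c(1, 3, 1))
criteria <- motif_criteria("ab", positions)
print(criteria)
```

A motif made of a single repeated letter has zero entropy under this measure, which is one reason a ranking may place it below a more varied word with the same number of occurrences.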
There are three ways to visualize the result:
For an example of the output, see Output Example.