Authors: Heraldo Borges, Murillo Dutra, Rafaelli Coutinho, Fábio Perosi, Amin Bazaz, Florent Masseglia, Esther Pacitti, Fábio Porto, Eduardo Ogasawara
Abstract: Motif discovery in time series has been widely explored, and many techniques address it. For spatial-time series, however, the literature shows a clear gap. This paper addresses that gap with the Combined Series Approach (CSA), a method to discover and rank motifs in spatial-time series. CSA partitions the spatial-time series into blocks. Inside each block, subsequences of the spatial-time series are combined so that a hash-based motif discovery algorithm can be applied. Candidate motifs are validated against both temporal and spatial constraints and then ranked according to their entropy, their number of occurrences, and the proximity of their occurrences. The approach was evaluated on both synthetic and seismic datasets. CSA outperforms traditional methods designed only for time series. It also prioritized motifs that were meaningful both in the synthetic data and according to seismic specialists.
Synthetic dataset:
An example with 12 spatial time series. Using a traditional approach, only a single motif, in ST3, is found.
The CSA approach creates combined series from all the time series, which enables the motif discovery algorithm to find candidate motifs that take into account both the spatial and the temporal properties of the data.
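The combination step can be illustrated with a minimal Python sketch. It is not the STMotif implementation: the fixed breakpoints, the three-letter alphabet, and the function names `discretize`, `combined_subsequences`, and `hash_candidates` are assumptions made for illustration. The sketch turns every series in one block into symbols, pools the sliding-window words of all series, and groups identical words in a hash table.

```python
from collections import defaultdict


def discretize(value, breakpoints=(-0.5, 0.5), alphabet="abc"):
    """Map a number to a letter (hypothetical fixed binning, assumed here)."""
    for i, bp in enumerate(breakpoints):
        if value < bp:
            return alphabet[i]
    return alphabet[len(breakpoints)]


def combined_subsequences(block, word_size):
    """Yield (word, series_index, time_index) for each window of each series.

    `block` is a list of equally long series, one per spatial position.
    """
    for s, series in enumerate(block):
        symbols = "".join(discretize(v) for v in series)
        for t in range(len(symbols) - word_size + 1):
            yield symbols[t:t + word_size], s, t


def hash_candidates(block, word_size, min_occurrences=2):
    """Group identical words in a hash table and keep the repeated ones."""
    table = defaultdict(list)
    for word, s, t in combined_subsequences(block, word_size):
        table[word].append((s, t))
    return {w: occ for w, occ in table.items() if len(occ) >= min_occurrences}


# Toy block with three neighboring series of five observations each.
block = [
    [0, 1, -1, 0, 0],
    [0, 1, -1, 0, 1],
    [1, 0, 0, 0, 0],
]
candidates = hash_candidates(block, word_size=3)
print(candidates["bca"])  # [(0, 0), (1, 0)]
```

In this toy block the word "bca" appears at the same time index in two different series, so pooling the series exposes it as a candidate even though it occurs only once in each individual series.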
The discovered motifs are mapped back onto the original time series and checked to confirm that they are, in fact, spatial-time motifs.
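A hedged sketch of such a check is shown below. The thresholds `min_occurrences` and `min_series` and the overlap rule are assumptions chosen for illustration and are not taken from the CSA definition. Each occurrence is a hypothetical `(series_index, time_index)` pair. The check discards overlapping occurrences within a series (a temporal constraint) and then requires enough occurrences spread over enough distinct series (a spatial constraint).

```python
def drop_overlaps(occurrences, word_size):
    """Within each series, keep only occurrences at least word_size apart."""
    kept = []
    last_start = {}
    for s, t in sorted(occurrences):
        if s not in last_start or t - last_start[s] >= word_size:
            kept.append((s, t))
            last_start[s] = t
    return kept


def is_spatial_time_motif(occurrences, word_size,
                          min_occurrences=2, min_series=2):
    """Return True if the candidate meets both assumed constraints."""
    occ = drop_overlaps(occurrences, word_size)
    distinct_series = {s for s, _ in occ}
    return len(occ) >= min_occurrences and len(distinct_series) >= min_series


# Two occurrences in two different series: accepted.
print(is_spatial_time_motif([(0, 0), (1, 0)], word_size=3))  # True
# Two overlapping occurrences in a single series: rejected.
print(is_spatial_time_motif([(2, 1), (2, 2)], word_size=3))  # False
```

The second call fails on both counts: after removing the overlap only one occurrence remains, and it lies in a single series.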
Seismic dataset:
Top motifs discovered according to CSA ranking function.
Top motifs discovered according to the number of occurrences.
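To illustrate why the two rankings above can differ, the sketch below computes the three quantities named in the abstract for a single motif. It is only an assumption-laden illustration: entropy is taken here as the Shannon entropy of the symbols in the motif word, proximity as the mean pairwise Euclidean distance between `(series_index, time_index)` occurrences, and the function names are hypothetical. How CSA combines these quantities into one score is not stated in this text, so the sketch stops at the individual features.

```python
import math
from collections import Counter
from itertools import combinations


def word_entropy(word):
    """Shannon entropy (in bits) of the symbol distribution in a word."""
    counts = Counter(word)
    n = len(word)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def mean_pairwise_distance(occurrences):
    """Mean Euclidean distance between all pairs of occurrences."""
    pairs = list(combinations(occurrences, 2))
    if not pairs:
        return 0.0
    return sum(math.dist(a, b) for a, b in pairs) / len(pairs)


def motif_features(word, occurrences):
    """Collect the three ranking ingredients for one motif."""
    return {
        "entropy": word_entropy(word),
        "occurrences": len(occurrences),
        "mean_distance": mean_pairwise_distance(occurrences),
    }


flat = motif_features("bbb", [(2, 0), (2, 3), (2, 6), (2, 9)])
varied = motif_features("bca", [(0, 0), (1, 0)])
print(flat["occurrences"], flat["entropy"] == 0)        # 4 True
print(varied["occurrences"], varied["entropy"] > 1.5)   # 2 True
```

A flat word such as "bbb" can occur often and so rank highly by count alone, yet its entropy is zero. A ranking that also weighs entropy and proximity would place the more varied, spatially close "bca" higher.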
Available at CRAN: https://CRAN.R-project.org/package=STMotif
Code repository at GitHub: https://github.com/eogasawara/CSA
Acknowledgments: The authors thank CAPES, CNPq, and FAPERJ for partially sponsoring this work.