Spatial-Time Motifs Discovery

Authors: Heraldo Borges, Murillo Dutra, Rafaelli Coutinho, Fábio Perosi, Amin Bazaz, Florent Masseglia, Esther Pacitti, Fábio Porto, Eduardo Ogasawara
Abstract: Discovering motifs in time series data has been widely explored, and various techniques have been developed to tackle this problem. However, the literature review reveals a clear gap when it comes to spatial-time series. This paper addresses that gap by presenting an approach to discover and rank motifs in spatial-time series, called the Combined Series Approach (CSA). CSA partitions the spatial-time series into blocks. Inside each block, subsequences of the spatial-time series are combined so that a hash-based motif discovery algorithm can be applied. Motifs are validated according to both temporal and spatial constraints and are then ranked according to their entropy, their number of occurrences, and the proximity of their occurrences. The approach was evaluated using both synthetic and seismic datasets. CSA outperforms traditional methods designed only for time series. It was also able to prioritize motifs that were meaningful both in the synthetic data and according to seismic specialists.
Synthetic dataset:
An example with 12 spatial-time series. Using a traditional approach, only a single motif, in ST3, is found.

CSA creates combined series from all the time series, which enables the motif discovery algorithm to find candidate motifs that exploit both the spatial and the temporal properties of the data.
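The combination step can be illustrated with a minimal Python sketch. It is not taken from the CSA repository: the function names, the equal-frequency discretization, the column-by-column concatenation order, and the parameters `word_len`, `time_block`, and `space_block` are all assumptions made for illustration. The data is assumed to be a 2-D array with one row per time step and one column per spatial position.

```python
from collections import defaultdict

import numpy as np


def discretize(data, alphabet_size=4):
    """Map each value to an integer symbol using equal-frequency bins (assumed encoding)."""
    cuts = np.quantile(data, np.linspace(0, 1, alphabet_size + 1)[1:-1])
    return np.digitize(data, cuts)


def combined_series(block):
    """Concatenate the columns of a block (one per spatial position) into one 1-D series."""
    return block.T.reshape(-1)


def candidate_motifs(symbols, word_len, time_block, space_block):
    """Hash every word of length word_len inside each block.

    Returns {(block_row, block_col): {word: [(series_index, start_time), ...]}}.
    """
    n_time, n_space = symbols.shape
    result = {}
    for t0 in range(0, n_time, time_block):
        for s0 in range(0, n_space, space_block):
            block = symbols[t0:t0 + time_block, s0:s0 + space_block]
            rows = block.shape[0]
            combined = combined_series(block)
            table = defaultdict(list)  # hash table: word -> occurrences
            for i in range(len(combined) - word_len + 1):
                offset = i % rows
                if offset + word_len > rows:
                    continue  # window would straddle two different series
                word = tuple(int(x) for x in combined[i:i + word_len])
                # Map the position in the combined series back to (series, time).
                table[word].append((s0 + i // rows, t0 + offset))
            result[(t0 // time_block, s0 // space_block)] = dict(table)
    return result
```

Because all series in a block share one hash table, a word that appears once in each of several neighbouring series is counted as a repeated candidate, which a per-series search would miss.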

The discovered motifs are mapped back onto the original time series and checked to determine whether they are, in fact, spatial-time motifs.
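A minimal sketch of such a check is shown below. The abstract states only that motifs are validated against temporal and spatial constraints, so the two thresholds used here (`min_occurrences` for how often a candidate must occur and `min_series` for how many distinct series it must appear in) are hypothetical stand-ins, as are the function names. Each occurrence is assumed to be a `(series_index, start_time)` pair.

```python
def is_spatial_time_motif(occurrences, min_occurrences=2, min_series=2):
    """Return True if a candidate occurs often enough and in enough distinct series."""
    distinct_series = {series for series, _ in occurrences}
    return len(occurrences) >= min_occurrences and len(distinct_series) >= min_series


def validate(candidates, min_occurrences=2, min_series=2):
    """Keep only the candidates that satisfy both (assumed) constraints.

    candidates: {word: [(series_index, start_time), ...]}
    """
    return {
        word: occurrences
        for word, occurrences in candidates.items()
        if is_spatial_time_motif(occurrences, min_occurrences, min_series)
    }
```

Under these assumed thresholds, a pattern that repeats many times within a single series is rejected, because it is an ordinary time series motif rather than a spatial-time one.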

Seismic dataset:
Top motifs discovered according to CSA ranking function.
Top motifs discovered according to the number of occurrences.
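The abstract names three ranking criteria: entropy, number of occurrences, and proximity of occurrences. The sketch below shows one way such criteria could be combined. It is an illustration only: the Shannon entropy over the symbols of a word, the mean pairwise Euclidean distance over `(series_index, start_time)` pairs (which mixes spatial and temporal units), the min-max scaling, and the unweighted sum are all assumptions, not the ranking function used by CSA.

```python
import math
from collections import Counter
from itertools import combinations


def word_entropy(word):
    """Shannon entropy (in bits) of the symbol distribution inside a word."""
    counts = Counter(word)
    n = len(word)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def mean_pairwise_distance(occurrences):
    """Mean Euclidean distance between all pairs of (series_index, start_time) occurrences."""
    pairs = list(combinations(occurrences, 2))
    if not pairs:
        return 0.0
    total = sum(math.hypot(s1 - s2, t1 - t2) for (s1, t1), (s2, t2) in pairs)
    return total / len(pairs)


def rank_motifs(motifs):
    """Return the words of {word: occurrences} sorted from best to worst (assumed score)."""
    words = list(motifs)
    if not words:
        return []
    entropy = [word_entropy(w) for w in words]
    count = [len(motifs[w]) for w in words]
    closeness = [-mean_pairwise_distance(motifs[w]) for w in words]  # closer is better

    def scale(values):
        # Min-max scale to [0, 1]; a constant criterion contributes nothing.
        lo, hi = min(values), max(values)
        return [0.0 if hi == lo else (v - lo) / (hi - lo) for v in values]

    scores = [sum(parts) for parts in zip(scale(entropy), scale(count), scale(closeness))]
    ranked = sorted(zip(scores, words), key=lambda pair: pair[0], reverse=True)
    return [word for _, word in ranked]
```

Including entropy penalizes flat words such as a run of identical symbols, which can occur very often without being informative. This may explain why a ranking by number of occurrences alone can surface different top motifs.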
Code repository on GitHub: https://github.com/eogasawara/CSA
Acknowledgments: The authors thank CAPES, CNPq, and FAPERJ for partially sponsoring this work.

Eduardo Ogasawara

Eduardo Ogasawara has been a professor at the Department of Computer Science at the Federal Center for Technological Education of Rio de Janeiro (CEFET/RJ) since 2010. He holds a D.Sc. in Systems and Computer Engineering from COPPE/UFRJ. Between 2000 and 2007, he worked in the Information Technology (IT) sector, gaining extensive experience in workflows and project management. With a strong background in Data Science, he is currently focused on Data Mining and Time Series Analysis. He is a member of IEEE, ACM, and SBC. Throughout his career, he has authored numerous published articles and led projects funded by agencies such as CNPq and FAPERJ. Currently, he heads the Data Analytics Lab (DAL) at CEFET/RJ, where he continues to advance research in Data Science.