Spatial-Time Motifs Discovery

Authors: Heraldo Borges, Murillo Dutra, Rafaelli Coutinho, Fábio Perosi, Amin Bazaz, Florent Masseglia, Esther Pacitti, Fábio Porto, Eduardo Ogasawara
Abstract: Discovering motifs in time series data has been widely explored, and various techniques have been developed to tackle this problem. However, the literature review reveals a clear gap when it comes to spatial-time series. This paper addresses that gap by presenting an approach to discover and rank motifs in spatial-time series, called the Combined Series Approach (CSA). CSA is based on partitioning the spatial-time series into blocks. Inside each block, subsequences of the spatial-time series are combined so that a hash-based motif discovery algorithm can be applied. Motifs are validated according to both temporal and spatial constraints. Motifs are then ranked according to their entropy, their number of occurrences, and the proximity of their occurrences. The approach was evaluated using both synthetic and seismic datasets. CSA outperforms traditional methods designed only for time series. CSA was also able to prioritize motifs that were meaningful both in the context of the synthetic data and according to seismic specialists.
Synthetic dataset:
An example with 12 spatial-time series. A traditional approach finds only a single motif, in ST3.

CSA creates combined series from all the time series, which enables the motif discovery algorithm to find candidate motifs that exploit both the spatial and the temporal properties of the data.
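
The idea can be illustrated with a minimal sketch. It assumes that the series are already discretized into symbols (for example, SAX letters) and that a block is a rectangular window over time and neighbouring series. The function names, the block shape, and the pooling of windows into one hash table are illustrative assumptions, not the code in the CSA repository.

```python
from collections import defaultdict

def partition_blocks(n_time, n_series, tb, sb):
    """Yield (t0, t1, s0, s1) bounds of blocks with at most tb time steps and sb series."""
    for t0 in range(0, n_time, tb):
        for s0 in range(0, n_series, sb):
            yield t0, min(t0 + tb, n_time), s0, min(s0 + sb, n_series)

def candidate_words(data, block, w):
    """Hash every length-w window of every series inside one block.

    data[s][t] is the symbol of series s at time t (assumed already discretized).
    Returns {word: [(series, start_time), ...]} pooled over all series of the
    block, which mimics running a hash-based search on a combined series.
    """
    t0, t1, s0, s1 = block
    table = defaultdict(list)
    for s in range(s0, s1):
        for t in range(t0, t1 - w + 1):
            word = "".join(data[s][t:t + w])
            table[word].append((s, t))
    return table

# Toy data: the pattern "abc" appears once in series 0, 1 and 3.
# A per-series search sees a single occurrence each time, so it finds no
# repetition; the pooled table sees three occurrences of the same word.
data = [
    "aabcaaaa",
    "aaaabcaa",
    "aaaaaaaa",
    "abcaaaaa",
]
block = next(partition_blocks(n_time=8, n_series=4, tb=8, sb=4))
table = candidate_words(data, block, w=3)
print(table["abc"])  # [(0, 1), (1, 3), (3, 0)]
```

Keeping the (series, start_time) pair with each hashed window is what later allows a candidate to be mapped back to the original series.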

The discovered motifs are mapped back onto the original time series and checked to confirm that they are, in fact, spatial-time motifs.
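
A hedged sketch of such a check follows. It assumes that the temporal constraint is a minimum number of occurrences inside a block and that the spatial constraint is a minimum number of distinct series containing an occurrence. The thresholds `min_occurrences` and `min_series` are hypothetical names chosen for this example.

```python
def is_spatial_time_motif(occurrences, min_occurrences, min_series):
    """Check a candidate against assumed temporal and spatial constraints.

    occurrences: list of (series, start_time) pairs found inside one block.
    min_occurrences: hypothetical lower bound on the number of occurrences.
    min_series: hypothetical lower bound on the number of distinct series.
    """
    distinct_series = {s for s, _ in occurrences}
    return len(occurrences) >= min_occurrences and len(distinct_series) >= min_series

# Three occurrences spread over three series pass both constraints.
spread = [(0, 1), (1, 3), (3, 0)]
# Three occurrences in a single series repeat in time but not in space.
single = [(0, 1), (0, 5), (0, 9)]

print(is_spatial_time_motif(spread, min_occurrences=3, min_series=2))  # True
print(is_spatial_time_motif(single, min_occurrences=3, min_series=2))  # False
```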

Seismic dataset:
Top motifs discovered according to the CSA ranking function.
Top motifs discovered according to the number of occurrences.
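
The two rankings above differ because the CSA ranking also considers the entropy of a motif and the proximity of its occurrences. The sketch below shows one way these three criteria could be combined. The combination formula, the use of mean pairwise Manhattan distance as a proximity measure, and all function names are assumptions made for illustration; they are not taken from the paper or the repository.

```python
import math
from collections import Counter
from itertools import combinations

def word_entropy(word):
    """Shannon entropy (in bits) of the symbol distribution inside a motif word."""
    counts = Counter(word)
    n = len(word)
    return -sum(c / n * math.log2(c / n) for c in counts.values())

def mean_pairwise_distance(occurrences):
    """Mean Manhattan distance between occurrences in (series, time) space."""
    pairs = list(combinations(occurrences, 2))
    if not pairs:
        return 0.0
    total = sum(abs(s1 - s2) + abs(t1 - t2) for (s1, t1), (s2, t2) in pairs)
    return total / len(pairs)

def score(word, occurrences):
    """Hypothetical score: higher entropy, more occurrences, closer occurrences."""
    return word_entropy(word) * len(occurrences) / (1.0 + mean_pairwise_distance(occurrences))

def rank(candidates):
    """candidates: {word: occurrences}. Returns the words from best to worst."""
    return sorted(candidates, key=lambda w: score(w, candidates[w]), reverse=True)

candidates = {
    "aaa": [(0, 0), (0, 1), (1, 0), (1, 1), (2, 0)],  # flat, very frequent
    "abc": [(0, 1), (1, 2), (2, 1)],                   # varied, less frequent
}
print(rank(candidates))  # ['abc', 'aaa']
```

In this toy case a count-only ranking would place the flat word "aaa" first, while the entropy term pushes it to the bottom because a constant pattern carries no information.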
Code repository on GitHub: https://github.com/eogasawara/CSA
Acknowledgments: The authors thank CAPES, CNPq, and FAPERJ for partially sponsoring this work.

Eduardo Ogasawara

I have been a Professor in the Computer Science Department of the Federal Center for Technological Education of Rio de Janeiro (CEFET / RJ) since 2010. I hold a PhD in Systems Engineering and Computer Science from COPPE / UFRJ. Between 2000 and 2007 I worked in the Information Technology (IT) field, where I acquired extensive experience in workflows and project management. I have a solid background in databases, and my primary interest is Data Science. I currently study space-time series, parallel and distributed processing, and data preprocessing methods. I am a member of the IEEE, ACM, INNS, and SBC. Throughout my career I have maintained a consistent record of published articles and of projects approved by funding agencies such as CNPq and FAPERJ. I am also a reviewer for several international journals, such as VLDB Journal, IEEE Transactions on Services Computing, and The Journal of Systems and Software. Currently, I head the Post-Graduate Program in Computer Science (PPCIC) of CEFET / RJ.