Authors: Antonio Castro, Heraldo Borges, Ricardo Campisano, Fabio Porto, Reza Akbarinia, Florent Masseglia, Esther Pacitti, Rafaelli Coutinho, and Eduardo Ogasawara
Abstract: Finding patterns is an important task in many domains. Spatio-temporal patterns bring knowledge about the time and position where a pattern is frequent. However, not all patterns are frequent over an entire dataset: some are constrained to specific spatial positions and a time range. The objective of mining tight space-time sequences is to discover frequent sequences together with the time range and the set of positions in which these sequences are frequent.
Based on the Apriori algorithm and using the concepts of ranged group, greedy-ranged group, and solid-ranged group, this paper proposes the STSM-2S1T algorithm as a solution for discovering frequent sequences that are constrained in one dimension in time and in two dimensions in space. Using a real-world spatio-temporal seismic dataset, STSM-2S1T was compared with a simple approach and extensively evaluated to analyze its sensitivity. As a result, STSM-2S1T presented better performance and low variation in resource usage as input parameters change.
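To make the idea of a sequence that is frequent only inside a restricted block of space and time more concrete, the following is a minimal Python sketch. It is not the STSM-2S1T algorithm and does not implement ranged groups. It assumes the series are already discretized into symbols, uses a single spatial axis instead of two, and uses a simplified, hypothetical notion of support (the fraction of positions where the sequence occurs contiguously inside the time range). The names `occurs` and `block_support` and the toy data are illustrative only.

```python
from typing import Iterable, Sequence


def occurs(series: Sequence[str], seq: Sequence[str], t_start: int, t_end: int) -> bool:
    """Return True if seq appears contiguously within series[t_start..t_end] (inclusive)."""
    window = list(series[t_start:t_end + 1])
    target = list(seq)
    n = len(target)
    return any(window[i:i + n] == target for i in range(len(window) - n + 1))


def block_support(data: Sequence[Sequence[str]], seq: Sequence[str],
                  positions: Iterable[int], t_start: int, t_end: int) -> float:
    """Fraction of the given positions where seq occurs inside the time range.

    This is a simplified, assumed definition of support for illustration.
    """
    positions = list(positions)
    hits = sum(occurs(data[p], seq, t_start, t_end) for p in positions)
    return hits / len(positions)


# Toy data (hypothetical): data[p] is the symbol series observed at position p.
data = [
    "aabca",
    "babca",
    "ccbca",
    "aaaaa",
]

# Support over every position and the full time range.
whole = block_support(data, "bc", positions=range(4), t_start=0, t_end=4)

# Support restricted to positions 0..2 and time steps 2..3.
tight = block_support(data, "bc", positions=range(3), t_start=2, t_end=3)

print(whole, tight)
```

In this toy example the sequence "bc" is more frequent inside the restricted block than over the whole dataset, which is the kind of situation that motivates searching for the time range and set of positions along with the sequence itself.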
Acknowledgments: The authors would like to thank CAPES, CNPq, and FAPERJ for partially funding this paper.
T401 dataset: The Netherlands seismic spatial-time series dataset, named F3 Block, was produced by the seismic reflection method in a region located in the Dutch sector of the North Sea. The seismic data is obtained by sending high-energy sound waves into the ground or seabed, as the case may be. The amplitude of the reflected sound waves is registered: the later a reflected sound wave arrives, the deeper in the soil it was reflected.
The dataset is available in: dataset.RData
As a result, this dataset is a set of time series: its observations correspond to the time at which the sound wave arrives, and its attributes correspond to the position of the hydrophone that registered the reflected sound wave.
The results presented in this work focus on the public data of inline 401.
It is composed of 951 spatial-time series with 462 observations each.
Patterns previously set by experts: The location of these patterns is of key importance for oil and gas prospects.
The file that contains the positions of the patterns is available in: patterns.RData