Master's dissertation defense of Riccardo Campisano. The work is entitled “Sequence Mining in Spatial-Time Series” and is advised by Prof. Eduardo Ogasawara (CEFET/RJ) and co-advised by Prof. Florent Masseglia (INRIA / LIRMM). The defense will take place in Auditorium V on July 7, 2017, at 9 a.m. and will be conducted entirely in English.
In addition to the advisors, the examination committee consists of the following professors:
Esther Pacitti (INRIA / LIRMM)
Fabio Porto (LNCC)
Diego Carvalho (CEFET/RJ)
The problem of discovering sequential patterns in large datasets affects a wide range of scientific and industrial applications. In a growing number of applications, data are collected as spatio-temporal sequences that associate a time and a spatial position with each item in the sequence. This leads to an appealing new challenge for this domain: finding, in a single process, (i) frequent sequences constrained in space and time that may not be frequent in the entire dataset and (ii) the time interval and space range where these sequences are frequent. Discovering such patterns along with their constraints may reveal important knowledge that remains hidden with traditional methods, since the support of these sequences is extremely low over the entire dataset. We introduce a new Spatio-Temporal Sequence Miner (STSM) algorithm for this purpose. STSM builds on our novel sequential pattern mining principle to detect spatial ranges where sequences are frequent. Next, it composes all detected sequences inside each range to discover the ones that are frequent when constrained in space and time. Even though our solution is generic, we evaluate STSM on a seismic use case and illustrate its ability to detect frequent sequences constrained in space and time. We identified 1,500 constrained sequences under a high support threshold, which would not have been found using current techniques. Moreover, the sequences identified by STSM correspond to candidate areas for seismic horizons and bright spots, which are of high value to domain experts. To the best of our knowledge, this is the first solution to tackle the problem of identifying frequent sequences constrained in space and time.