Rafaelli Coutinho – PPCIC – Programa de Pós-graduação em Ciência da Computação

Dissertation (March 18, 2026): Nathália Carvalho Tito

Student: Nathália Carvalho Tito

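Title: Análise de Desempenho e Características de Corredores para Predição de Resultados e Geração de Feedbacks Personalizados (Analysis of Runners' Performance and Characteristics for Result Prediction and Personalized Feedback Generation)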

Advisors: Glauco Fiorott Amorim (Advisor) and Eduardo Bezerra da Silva (Co-advisor)

Committee: Glauco Fiorott Amorim (Cefet/RJ), Eduardo Bezerra da Silva (Cefet/RJ), Diego Nunes Brandão (Cefet/RJ) and Cláudio Miceli de Farias (COPPE/UFRJ).

Day/Hour: March 18, 2026 / 8 a.m.

Room: https://teams.microsoft.com/meet/22404162370488?p=2x3lfHhn8JsNjYtQEL

Abstract: The increasing number of recreational runners has intensified the demand for solutions capable of providing individualized training support, especially among amateur athletes who often lack continuous professional guidance. In this context, this study proposes an integrated performance analysis model based on machine learning techniques, aiming to understand training patterns, predict runners’ performance, and generate personalized recommendations based on controllable variables. Data were obtained from a questionnaire and activity records exported from a running application, involving 26 athletes with different levels of experience, allowing a multidimensional view of training habits and sports experience. The methodology was structured in three main phases. In the first phase, a clustering analysis was performed using the classical algorithms K-means, DBSCAN, and Agglomerative Clustering, applied to principal components explaining 80% of the data variability. The selected model identified three distinct profiles: “Young Experienced,” “Less Experienced,” and “Veteran Experienced,” differentiated by age, sports maturity, and training patterns. In the second phase, a predictive model based on gradient boosting was developed using the XGBoost algorithm, both in a general configuration and in versions specific to each cluster. Linear regression models were also tested as a reference approach; however, XGBoost achieved superior performance across all evaluated scenarios, demonstrating a greater ability to capture nonlinear relationships and complex interactions among variables. The results also indicated that each group responded differently to training variables, reinforcing that segmentation substantially improves predictive performance and the adequacy of personalized recommendations. Model interpretability was investigated using SHAP values, which enabled the identification of the most influential variables in the predictions.
In general, variables directly related to training structure and performance stood out, such as minimum and maximum speed, distance covered, pace variability, and recent performance history, as well as contextual factors such as temperature. The segmented analysis revealed distinct patterns across groups, indicating that different runner profiles present specific sensitivities to aspects such as intensity, regularity, and training consistency, further supporting the importance of personalized recommendations. In the third phase, an optimization algorithm was applied to identify combinations of controllable variables capable of maximizing the predicted performance for each profile. The evaluation of results was based on comparisons with baseline scenarios, in which decision variables were not adjusted, allowing objective quantification of the gains achieved through optimization. The resulting recommendations showed internal coherence and alignment with cluster characteristics: greater emphasis on intensity and training variety for young experienced runners; focus on regularity, consistency, and strength training for less experienced runners; and strategies centered on maintenance, balance, and load control for veteran runners. These findings demonstrate that the integration of clustering, predictive modeling, and optimization provides a consistent and promising approach for developing data-driven intelligent sports recommendation systems. Despite limitations related to sample size and the absence of more granular physiological indicators, the study provides initial evidence that computational models can effectively support personalized training in an efficient, accessible, and scalable manner. Future research may expand the dataset, incorporate additional informational dimensions, validate the model in larger populations, and explore its implementation in digital platforms. 
In conclusion, the combination of data science techniques and optimization methods significantly contributes to understanding running performance and to developing individualized recommendations that promote improvement, adherence, and safety in sports practice.
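As an illustration of the first phase, the sketch below shows how one might pick the number of principal components needed to explain 80% of the data variability. It is only a minimal example: the function name `components_for_variance` and the eigenvalues are hypothetical and do not come from the dissertation.

```python
from itertools import accumulate


def components_for_variance(eigenvalues, target=0.80):
    """Return the smallest number of leading principal components whose
    cumulative explained-variance ratio reaches `target`."""
    ordered = sorted(eigenvalues, reverse=True)
    total = sum(ordered)
    for k, cumulative in enumerate(accumulate(ordered), start=1):
        if cumulative / total >= target:
            return k
    return len(ordered)


# Hypothetical covariance-matrix eigenvalues (illustrative, not from the study).
example_eigenvalues = [4.2, 2.6, 1.4, 1.0, 0.5, 0.3]
n_components = components_for_variance(example_eigenvalues)
print(n_components)  # 3: the first three components explain 82% of the variance
```

The clustering algorithms would then be applied to the data projected onto those leading components rather than to the raw variables.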

Dissertation (February 18, 2026): Matheus dos Santos Moura

Student: Matheus dos Santos Moura

Title: Hybrid Anomaly and Change Point Detection for Pump-and-Dump Schemes in Centralized Cryptocurrency Exchanges

Advisor: Diogo Silveira Mendonça

Committee: Diogo Silveira Mendonça (Cefet/RJ), Eduardo Soares Ogasawara (Cefet/RJ) and Igor Machado Coelho (UFF)

Day/Hour: February 18, 2026 / 3 p.m.

Room: https://teams.microsoft.com/l/meetup-join/19%3ameeting_ZGFkNGM1MzMtOGMzNi00OWU5LTkzYjUtY2JhNGQxZmQzZjBl%40thread.v2/0?context=%7b%22Tid%22%3a%228eeca404-a47d-4555-a2d4-0f3619041c9c%22%2c%22Oid%22%3a%226821740b-ed93-4582-b3a3-b3bfbff6624e%22%7d

Abstract: The rapid growth of cryptocurrency markets has intensified concerns regarding market manipulation practices, particularly pump-and-dump schemes. Detecting such schemes remains challenging due to the high volatility of cryptocurrencies and the limited availability of reliable ground-truth data. Prior work has predominantly relied on anomaly detection techniques, which often exhibit limited precision and adaptability. In this work, we propose two offline statistical methods that explore a hybrid framework combining anomaly detection and change point detection for pump-and-dump detection. The first method, HD Pump, integrates volatility anomaly detection in price time series with change point detection applied to trading volume. The second method, HD Pump Plus, extends this approach by replacing the price time series with a rush-order-based time series. Experimental evaluation on a dataset of 178 confirmed pump-and-dump events from the Binance exchange shows that HD Pump Plus outperforms prior statistical approaches, achieving a precision of 96.4%, recall of 89.3%, and F1-score of 92.7%. These results demonstrate the effectiveness of hybrid detection strategies in advancing the state of the art while maintaining methodological simplicity.
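The reported F1-score can be checked from the reported precision and recall, since F1 is their harmonic mean. The short sketch below does that arithmetic; the helper name `f1_score` is chosen here for illustration only.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Precision and recall as reported in the abstract for HD Pump Plus.
f1 = f1_score(0.964, 0.893)
print(f"{f1:.1%}")  # 92.7%
```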

Dissertation (January 21, 2026): Edson Paulo da Silva Pinto Sobrinho

Student: Edson Paulo da Silva Pinto Sobrinho

Title: Fine-tuning detection criteria to enhance anomaly identification in time series

Advisors: Eduardo Soares Ogasawara (advisor) and Kele Teixeira Belloze (co-advisor)

Committee: Eduardo Soares Ogasawara (CEFET/RJ), Kele Teixeira Belloze (CEFET/RJ), Rafaelli de Carvalho Coutinho (CEFET/RJ) and Esther Pacitti (INRIA / University of Montpellier)

Day/Hour: January 10, 2026 / 10:30 a.m.

Room: https://teams.microsoft.com/meet/24688731415374?p=YVHyDYwIW66rCysKq0

Abstract: Anomaly Detection (AD) is the problem of identifying observations that do not conform to typical ones in a time series. Detection methods implicitly define detection criteria, such as deviation measures, filter thresholds, and candidate anomaly selection strategies. Choosing inappropriate criteria results in inaccurate outputs, generating spurious alerts or missing events. Adjusting these criteria is essential for monitoring systems. To address this challenge, this study explores the fine-tuning of deviation measures, filter thresholds, and candidate selection strategies. Experimental results show that the proper choice of criteria significantly improves AD performance, often with greater impact than changing the detection methods.
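To make the role of detection criteria concrete, the sketch below applies one simple deviation measure (distance from the mean in units of standard deviation) with three different filter thresholds to the same series. It is an assumed, simplified detector written for illustration, not one of the methods evaluated in the dissertation; the series and thresholds are made up.

```python
from statistics import mean, pstdev


def detect(series, threshold):
    """Flag indices whose deviation from the mean, measured in population
    standard deviations, exceeds `threshold`."""
    mu = mean(series)
    sigma = pstdev(series)
    if sigma == 0:
        return []
    return [i for i, x in enumerate(series) if abs(x - mu) / sigma > threshold]


# Hypothetical series with one clear anomaly at index 5.
series = [10, 11, 10, 12, 11, 30, 10, 11, 15, 10]

too_strict = detect(series, 3.0)  # threshold too high: the event is missed
balanced = detect(series, 2.5)    # only the anomaly is flagged
too_loose = detect(series, 0.5)   # threshold too low: spurious alerts

print(too_strict, balanced, too_loose)  # [] [5] [0, 2, 5, 6, 9]
```

The detection method is identical in all three calls; only the threshold changes, yet the output ranges from a missed event to several spurious alerts.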

Dissertation (January 08, 2026): Luiz Cláudio Lemos de Oliveira

Student: Luiz Cláudio Lemos de Oliveira

Title: Motif Detection in Time Series Using Autoencoders: An Analysis of Their Application to ECG Data

Advisor: Eduardo Soares Ogasawara

Committee: Eduardo Soares Ogasawara (CEFET/RJ), Laura Silva de Assis (CEFET/RJ), Helga Dolorico Balbi (CEFET/RJ) and Rebecca Pontes Salles (INRIA/FRA)

Day/Hour: January 08, 2026 / 10 a.m.

Room: https://teams.microsoft.com/meet/2899124353229?p=ykqgR5NeTJfPfXOhSF

Abstract: The discovery of motifs in biomedical time series, such as electrocardiograms (ECGs), involves identifying recurrent patterns that may contain valuable diagnostic information. Traditional methods, such as SAX, are limited by strong statistical assumptions, which are particularly inadequate for complex physiological signals. In parallel, autoencoders have demonstrated superior ability to learn nonlinear representations, but their application to motif discovery in ECG data remains unexplored, constituting a significant methodological gap. This work proposes a framework that replaces SAX discretization with neural encoding while preserving the discovery pipeline based on Shannon’s entropy and frequency of occurrence. The methodology was developed in three stages: (i) validation of the autoencoder’s reconstruction ability, (ii) training models with data from the MIT-BIH Arrhythmia Database, and (iii) systematic experimental comparison with the traditional SAX method through detection experiments, parametric sensitivity analysis, and evaluation of generalization capacity. It is concluded that replacing traditional discretization with neural encoding is feasible and provides quantitative and qualitative gains in motif discovery in ECG signals, establishing a methodological basis for developing automated biomedical signal analysis tools.

Dissertation (December 18, 2025): Michel Siqueira Reis

Student: Michel Siqueira Reis

Title: Matching Detections to Events in Time Series with Computational Efficiency and Guaranteed Optimality

Advisors: Rafaelli Coutinho (Advisor) and Eduardo Ogasawara (Co-advisor)

Committee: Rafaelli Coutinho (Cefet/RJ), Eduardo Ogasawara (Cefet/RJ), Laura Assis (Cefet/RJ) and Rebecca Salles (INRIA)

Day/Hour: December 18, 2025 / 1 p.m.

Room: https://teams.microsoft.com/l/meetup-join/19%3ae20c8697654543fc9dd1e9924de5c2c0%40thread.tacv2/1763159776096?context=%7b%22Tid%22%3a%228eeca404-a47d-4555-a2d4-0f3619041c9c%22%2c%22Oid%22%3a%2254af42a0-5f30-4905-ac8d-10b96c6db26b%22%7d

Abstract: This work presents SmartSoftED, an optimized metric for evaluating the detection of point events in time series. The original metric, SoftED, introduces a “soft” evaluation based on temporal tolerance, assigning gradual scores to detections that occur near actual events. However, its current formulation relies on a greedy approach that does not guarantee optimality in all cases and incurs a quadratic computational cost, limiting its applicability in large-scale or real-time processing environments. SmartSoftED overcomes these limitations by introducing a strategy that decomposes the problem into manageable, disjoint subproblems: some can be solved efficiently without loss of optimality, while others are modeled as maximum-weight matching problems on unbalanced bipartite graphs. This approach preserves the optimality of correspondences between detections and events while significantly reducing computational cost. In practice, the method achieves an average speedup of two orders of magnitude, making it suitable for certain large-scale applications and systems with strict temporal constraints.
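The gap between a greedy assignment and an optimal matching can be seen on a tiny example. The sketch below is an assumption-laden illustration only: the linear scoring function, the order-based greedy rule, and the brute-force search are stand-ins written for this example, not the actual SoftED or SmartSoftED algorithms, and the timestamps are made up.

```python
from itertools import permutations


def soft_score(detection, event, tolerance):
    """Linearly decaying score: 1 at the event time, 0 at or beyond the tolerance."""
    return max(0.0, 1.0 - abs(detection - event) / tolerance)


def greedy_total(detections, events, tolerance):
    """Each detection, in input order, claims its best still-unmatched event."""
    free = set(range(len(events)))
    total = 0.0
    for d in detections:
        if not free:
            break
        best = max(free, key=lambda j: soft_score(d, events[j], tolerance))
        score = soft_score(d, events[best], tolerance)
        if score > 0:
            total += score
            free.remove(best)
    return total


def optimal_total(detections, events, tolerance):
    """Brute-force maximum-weight matching; factorial time, tiny inputs only."""
    small, large = sorted((detections, events), key=len)
    return max(
        sum(soft_score(a, b, tolerance) for a, b in zip(small, chosen))
        for chosen in permutations(large, len(small))
    )


# Hypothetical timestamps: two events, two detections, tolerance of 10 time units.
events = [20, 28]
detections = [23, 18]

print(greedy_total(detections, events, 10))   # about 0.7
print(optimal_total(detections, events, 10))  # about 1.3
```

Here the first detection greedily claims the event at time 20, leaving the second detection with nothing within tolerance, whereas the optimal matching pairs 23 with 28 and 18 with 20. Per the abstract, a dedicated maximum-weight matching formulation on decomposed subproblems is what makes the optimal result affordable at scale, unlike the brute-force search used here.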

Dissertation (December 8, 2025): Fernando Henrique de Jesus Fraga da Silva

Student: Fernando Henrique de Jesus Fraga da Silva

Title: Aprendizado por Reforço Profundo Aplicado à Negociação Intradiária de Múltiplas Ações (Deep Reinforcement Learning Applied to Intraday Trading of Multiple Stocks)

Advisors: Eduardo Bezerra da Silva (advisor) and Pedro Henrique González Silva (co-advisor)

Committee: Eduardo Bezerra da Silva (Cefet/RJ), Pedro Henrique González Silva (UFRJ), Aline Marins Paes Carvalho (UFF) and Glauco Fiorott Amorim (Cefet/RJ)

Day/Hour: December 8, 2025 / 3 p.m.

Room: https://teams.microsoft.com/v2/?meetingjoin=true#/l/meetup-join/19:PKOJTuK7mfHSDE6QkCWQCYp71f0xOMNoRgSUj4wjMKc1@thread.tacv2/1763760050816?context=%7b%22Tid%22%3a%228eeca404-a47d-4555-a2d4-0f3619041c9c%22%2c%22Oid%22%3a%22c03d6068-4733-48a6-bbb4-aa78f351d9cf%22%7d&anon=true&deeplinkId=91733be2-9804-4f09-ac6a-f1a362e67de8

Abstract: The stock market is a dynamic and volatile environment in which shares of publicly traded companies are bought and sold, subject to continuous price fluctuations influenced by economic, political, and social factors. Anticipating these fluctuations is a complex task, especially in the context of intraday trading, where buy and sell decisions must be made within very short time intervals based on rapidly changing data. In this scenario, Reinforcement Learning (RL) emerges as a promising paradigm capable of developing adaptive strategies through the continuous interaction between agent and environment. This dissertation investigates the use of Deep Reinforcement Learning (DRL) techniques in financial trading, focusing on intraday scenarios involving multiple stocks. It proposes a DRL-based approach to estimate buy and sell actions simultaneously across various assets, using high-granularity market data to better approximate real trading conditions. Experimental analyses were conducted using the Proximal Policy Optimization (PPO) algorithm. The results indicate that the proposed agent outperformed traditional benchmark strategies, achieving gains exceeding 10 percentage points in certain cases.

Algorithms and Graph Based Models

The field of Graph Theory studies the relationships between elements, called nodes, and their connections, known as edges. This area encompasses models ranging from technological networks to social and air transportation networks. Its main subfields include Network Science, which analyzes interactions in complex systems, and Computer Networks, which provide the technological infrastructure for global communication.

Network Science investigates how the structure and dynamics of connections influence the global behavior of a network. Topics such as centrality, robustness, and structural patterns are analyzed to better understand social, economic, and biological networks. The growth of technology and the explosion of data in recent decades have further increased the relevance of this field.

In Computer Networks, defining the network topology is essential for efficient monitoring. This process can be modeled as an optimization problem or analyzed as a Complex Network, using graph-theoretic concepts to study its properties and performance. Moreover, infrastructure management and data communication rely on specific protocols tailored to different applications, such as environmental monitoring, mobile networks, and biomedical systems. The efficiency of these protocols is evaluated using metrics such as packet delivery rate, network throughput, and energy consumption.

This project aims to develop graph-based applications across various domains, combining computational simulation with practical experiments. It also seeks to improve the design and communication within these graph structures, exploring new protocols to make information transmission more efficient and resilient.

Faculty Members Involved:

Diego Nunes Brandão (coordinator)
Felipe da Rocha Henriques
Glauco Fiorott Amorim
Helga Dolorico Balbi
Laura Silva de Assis

Smart Applications

Intelligent Applications have become essential for optimizing processes and enabling informed decision-making. Their integration with Robotics, Multimedia, and the Internet of Things (IoT) drives significant innovation across multiple domains.

In Robotics, intelligent applications enhance machine autonomy and interaction, enabling solutions that range from personal assistant robots to advanced surgical systems. A special focus is placed on educational robotics, which combines state-of-the-art technology with playful, interactive approaches to develop intelligent embedded systems and perception algorithms. These solutions are often tested in technology competitions to refine their performance before being applied in educational contexts.

Multimedia has transformed the way information is consumed by integrating video, audio, images, and text with intelligent algorithms. This enables personalized user experiences, speech and image recognition, and immersive virtual reality environments, resulting in more intuitive and multisensory interactions.

In IoT, Artificial Intelligence allows everyday objects to collect and analyze data to create more efficient and secure environments. The convergence of IoT and AI gives rise to AIoT (Artificial Intelligence of Things), which incorporates advanced learning and decision-making capabilities into connected devices.

This research project explores how these technologies can transform teaching and learning, synchronize multisensory effects, and support environmental monitoring, enabling the development of more autonomous and efficient systems.

Faculty Members Involved:

Joel André Ferreira dos Santos (coordinator)
João Roberto de Toledo Quadros
Glauco Fiorott Amorim
Diego Nunes Brandão

Data Analysis

Data Analysis is a multidisciplinary field focused on interpreting large volumes of information to support decision-making, strategy development, and innovation. Statistical and machine learning techniques are employed to identify patterns and forecast future events, encompassing structured, semi-structured, and unstructured data.

For structured data, the main challenges involve analyzing time series and spatiotemporal data, including prediction, pattern discovery, and adaptation to data drift. Methods such as filtering and decomposition are used to build robust models for forecasting. The detection of events in time series, such as anomalies and regime changes, is relevant for both retrospective and real-time analysis.

When dealing with semi-structured and unstructured data, challenges include text mining and natural language processing (NLP). Text mining aims to uncover patterns and trends through statistical learning and text vectorization, supporting applications such as sentiment analysis and affective computing, which studies emotions in texts and human interactions. In this project, text mining is closely linked to affective computing and behavioral analysis, also encompassing image and video processing.

Behavioral analysis examines individuals within social networks, using graph-based models to identify communities and understand interaction dynamics. Applications include targeted marketing and information diffusion, providing insights into collective and emotional patterns within human interactions.

Faculty Members Involved:

Eduardo Soares Ogasawara (coordinator)
Eduardo Bezerra da Silva
Gustavo Paiva Guedes e Silva
Jorge de Abreu Soares
Kele Teixeira Belloze

Software Engineering

Software Engineering is the field that studies and applies scientific and technological methods to the software life cycle, ensuring systematic and disciplined approaches to development. With the growing reliance on software in smartphones, computers, and wearable devices, the quality and security of these systems have become fundamental. Furthermore, emerging technologies such as Artificial Intelligence, the Internet of Things (IoT), Blockchain, and Virtual Reality impose new challenges on software engineering.

This research project investigates how software engineering can be applied to these technologies to maximize their societal benefits. In the context of Blockchain, for instance, smart contracts enable innovative services, but code vulnerabilities can lead to million-dollar losses, making security a critical concern. In IoT, security is equally essential, as failures can compromise hardware or even endanger human lives. Developing secure, scalable, and reliable systems thus becomes a central challenge within Software Engineering.

Educational games are another important application, supporting learning through exploration within the game environment. The use of data provenance makes it possible to analyze player actions, revealing their behavior and strategies.

This project also welcomes additional investigations into emerging technologies and their societal impact, exploring innovative approaches to software development.

Faculty Members Involved:

Diogo Silveira Mendonça (coordinator)
Joel André Ferreira dos Santos

Machine Learning and Optimization

Machine Learning (ML) is a branch of Artificial Intelligence dedicated to developing new algorithms and methodologies capable of identifying patterns and making decisions without explicit programming. Beyond practical applications, progress in this field depends on creating novel theoretical and computational approaches that enhance the efficiency, interpretability, and generalization capacity of models.

This research project investigates advanced ML methods, ranging from traditional techniques, such as deep neural networks and probabilistic models, to emerging approaches including self-supervised learning, generative models, federated learning, and reinforcement learning. Additionally, the project aims to improve strategies for explainability and interpretability to make models more transparent and trustworthy, especially in critical applications.

A second fundamental pillar of this project is Optimization, a field that integrates with ML to improve model performance and solve complex problems across different domains. The project focuses on the design and application of methods for solving problems using linear, nonlinear, integer, and mixed-integer programming (through exact and/or heuristic methods), as well as bio-inspired metaheuristics such as ant colony optimization, genetic algorithms, and particle swarm optimization. Optimization techniques are applied to tasks such as tuning machine learning model parameters, feature selection, and neural network architecture design.

Finally, Affective Computing explores how ML algorithms can interpret, process, and respond to human emotional states. This includes investigating new methods for fusing physiological and emotional signals. The goal is to advance the development of systems capable of adapting their responses in more natural and empathetic ways, with applications ranging from conversational interfaces to interactive robotics.

Faculty Members Involved:

Eduardo Bezerra da Silva (coordinator)
Gustavo Paiva Guedes e Silva
Diogo Silveira Mendonça
Diego Moreira de Araújo Carvalho
Laura Silva de Assis

Database Management and Administration

The growing volume of data requires organizations to develop strategies for extracting valuable insights and gaining competitive advantage. This process involves the collection, storage, integration, and analysis of structured, semi-structured, and unstructured data. The research investigates methodologies for managing and transforming these data into useful knowledge to support decision-making.

The focus lies on data-centric artificial intelligence (Data-Centric AI) for data preparation and on large-scale processing techniques. One of the challenges addressed is the parallel and distributed processing of massive volumes of heterogeneous data, common in fields such as bioinformatics, astronomy, and engineering. Scientific workflows are essential for these experiments and are frequently executed on clusters, supercomputers, and cloud environments.

The project also explores frameworks such as Apache Spark, optimizing workflows for large-scale data analysis and management. In addition, it investigates conceptual modeling techniques, ontologies, preprocessing, indexing, and querying in Big Data systems. The research considers approaches based on distributed storage (HDFS), NoSQL databases, NewSQL systems, and object-relational databases, aiming to enhance the efficiency of data handling and analysis.

Faculty Members Involved:

Rafaelli de Carvalho Coutinho (coordinator)
Eduardo Soares Ogasawara
Diego Moreira de Araújo Carvalho
Jorge de Abreu Soares
Kele Teixeira Belloze

Author: Rafaelli Coutinho