Study of data mining techniques, i.e., extraction of knowledge from large volumes of data. The knowledge extraction process includes exploratory analysis, data preprocessing, clustering, and prediction.
This short course is offered once a year at LNCC as part of a collaboration between CEFET/RJ and LNCC.
Fill out this form to request access to the course.
Slides and schedule are available on Moodle.
- Han, J., Kamber, M., and Pei, J. Data Mining: Concepts and Techniques, Morgan Kaufmann Publishers, Burlington, MA, USA, 3rd Edition, 2011.
- Zaki, M.J. and Meira Jr., W. Data Mining and Analysis: Fundamental Concepts and Algorithms, Cambridge University Press, Cambridge, United Kingdom, 1st Edition, 2014.
- Witten, I.H., Frank, E., and Hall, M.A. Data Mining: Practical Machine Learning Tools and Techniques, Morgan Kaufmann Publishers, Burlington, MA, USA, 3rd Edition, 2011.
- Hastie, T., Tibshirani, R., and Friedman, J. The Elements of Statistical Learning: Data Mining, Inference, and Prediction, Springer, New York, USA, 2nd Edition, 2009.
- James, G., Witten, D., Hastie, T., and Tibshirani, R. An Introduction to Statistical Learning: with Applications in R, Springer, New York, USA, 1st Edition, 2013.
- Lantz, B. Machine Learning with R, Packt Publishing, United Kingdom, 1st Edition, 2013.
- Leskovec, J., Rajaraman, A., and Ullman, J.D. Mining of Massive Datasets, Cambridge University Press, Cambridge, United Kingdom, 2nd Edition, 2015.
- Shumway, R.H. and Stoffer, D.S. Time Series Analysis and Its Applications: With R Examples, Springer, New York, USA, 3rd Edition, 2010.
Title: Oportunidades na Ciência da Computação: Uma visão na perspectiva de Ciência de Dados (Opportunities in Computer Science: A View from the Data Science Perspective) Venue: Escola Municipal Victor Hugo Date: December / 2018 Location: Rio de Janeiro, RJ Abstract: Brazil is currently the world's sixth-largest information and communication technology (ICT) market (ABES 2016). The ICT sector is estimated to have moved US$ 152 […]
Title: Comparing Motif Discovery Techniques with Sequence Mining in the Context of Space-Time Series Venue: INRIA / LIRMM / University of Montpellier Date: November / 2018 Location: Montpellier, France Abstract: A relevant area being explored in the time series analysis community is finding patterns. Patterns are sub-sequences of time series that are related to some special […]
The LADaS 2018 Workshop (Latin America Data Science Workshop) was held in conjunction with VLDB 2018 (Very Large Data Bases) in Rio de Janeiro on August 27. Scope: Dealing with the data deluge produced nowadays in different areas, ranging from basic sciences to billions of users of Global Internet services, emerges as one of […]
Title: Rumo à Otimização de Operadores sobre UDF no Spark Venue: CSBC 2018 – BreSci 2018 Date: July / 2018 Location: Natal, RN – Brazil Abstract: Large-scale data analysis has gained much importance in the scientific community due to the Big Data phenomenon. In this context, user-defined functions (UDFs) are commonly implemented in frameworks such as Apache […]
Title: Evaluating Data Preprocessing Methods for Machine Learning Models for Flight Delays Venue: IJCNN 2018 Date: July / 2018 Location: Rio de Janeiro, RJ – Brazil Abstract: Flight delays cause various inconveniences for airlines, airports, and passengers. According to data provided by the Brazilian National Civil Aviation Agency (ANAC), between 2009 and 2015, about 22% of domestic […]
Study of data mining techniques, i.e., knowledge discovery from data (KDD). The KDD process includes exploratory data analysis, preprocessing, identification of outliers, clustering, prediction, frequent patterns, and data warehouses. G. James, D. Witten, T. Hastie, and R. Tibshirani, 2013, An Introduction to Statistical Learning: with Applications in R. 1 ed. Springer. J. Han, […]
Title: Evaluating the Complementarity of Communication Tools for Learning Platforms Venue: CSEDU 2018 Date: March / 2018 Location: Funchal, Madeira – Portugal Abstract: Due to the constant innovations in communication tools, several educational institutions are continually evaluating the adoption of new communication tools (NCT) for their adopted learning platforms (LP). Notably, many educational institutions are interested in […]
Title: Orthographic Educational Game for Portuguese Language Countries Venue: CSEDU 2018 Date: March / 2018 Location: Funchal, Madeira – Portugal Abstract: The new orthographic agreement introduces some changes in the vocabulary of the Portuguese language. Although these changes have modified a small percentage of the vocabulary words, people are struggling to adapt to some of the new […]
The First Latin America Workshop on Data Science (LADaS 2018) A VLDB 2018 workshop August 27, 2018 Rio de Janeiro, Brazil http://eic.cefet-rj.br/ladas2018 CALL FOR PAPERS Dealing with the data deluge produced nowadays in different areas, ranging from basic sciences to billions of users of Global Internet services, emerges as one of the major challenges of […]