Educational data mining with machine learning
DOI: https://doi.org/10.15536/reducarmais.5.2021.2417
Keywords: Educational Data Mining, Performance Prediction, Machine Learning
Abstract
With the growing availability of data, especially in educational contexts, specific fields have emerged for extracting relevant information from it. One of them is Educational Data Mining (EDM), which brings together numerous techniques for capturing, processing, and analyzing these sets of records. The main technique associated with EDM is Machine Learning, which has been used for decades to process data in different contexts; with the advent of Big Data, its use intensified as a way to extract relevant information from very large volumes of data. This study aims to predict student performance using a public data set, to compare the Machine Learning algorithms used and identify the most effective one, and to indicate the main predictive attributes of student performance. To this end, an EAW process based on four steps was implemented: 1) data collection; 2) feature extraction and data cleaning (pre-processing and transformation); 3) analytical processing and algorithms; and 4) analysis and/or interpretation of results. For the data set used in this study, the Decision Trees algorithm was the most accurate, with an accuracy of 87%. The results also show that attributes related to school activities are more predictive of student performance than demographic and socioeconomic characteristics.
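The four steps named in the abstract can be illustrated with a minimal sketch. It is not the study's code or data: the records, the feature names (`first_period_grade`, `absences`), the hold-out split, and the two models (a majority-class baseline and a depth-1 decision tree, often called a decision stump) are all assumptions chosen to keep the example self-contained and dependent only on the Python standard library.

```python
from collections import Counter

FEATURES = ("first_period_grade", "absences")  # hypothetical attribute names

# Step 1: data collection. Toy records invented for illustration only,
# NOT the public data set used in the study.
RAW = [
    (14, 2, "pass"), (12, 0, "pass"), (6, 10, "fail"), (8, 4, "fail"),
    (15, 1, "pass"), (11, 8, "pass"), (9, 12, "fail"), (5, 3, "fail"),
    (10, 6, "pass"), (16, 0, "pass"), (7, 5, "fail"), (13, None, "pass"),
]

def preprocess(rows):
    """Step 2: drop records with missing values, split features from labels."""
    clean = [r for r in rows if None not in r]
    X = [r[:-1] for r in clean]
    y = [r[-1] for r in clean]
    return X, y

def fit_majority(X, y):
    """Baseline model: always predict the most frequent training label."""
    majority = Counter(y).most_common(1)[0][0]
    return lambda x: majority

def fit_stump(X, y):
    """Depth-1 decision tree: keep the single (feature, threshold) split
    that classifies the most training records correctly."""
    best = None
    for f in range(len(X[0])):
        for t in sorted({x[f] for x in X}):
            for low, high in (("fail", "pass"), ("pass", "fail")):
                hits = sum((low if x[f] <= t else high) == label
                           for x, label in zip(X, y))
                if best is None or hits > best[0]:
                    best = (hits, f, t, low, high)
    _, f, t, low, high = best

    def predict(x):
        return low if x[f] <= t else high

    predict.split = (FEATURES[f], t)  # which attribute the model relies on
    return predict

def accuracy(model, X, y):
    """Fraction of records whose predicted label matches the true label."""
    return sum(model(x) == label for x, label in zip(X, y)) / len(y)

# Step 3: analytical processing. Train each model on a hold-out split
# (even-indexed records for training, odd-indexed records for testing).
X, y = preprocess(RAW)
X_train, y_train = X[::2], y[::2]
X_test, y_test = X[1::2], y[1::2]

models = {
    "majority baseline": fit_majority(X_train, y_train),
    "decision stump": fit_stump(X_train, y_train),
}

# Step 4: analysis of results. Compare test accuracy across models.
results = {name: accuracy(m, X_test, y_test) for name, m in models.items()}
best_name = max(results, key=results.get)

for name, acc in results.items():
    print(f"{name}: accuracy = {acc:.2f}")
print("most accurate:", best_name)
print("split used by the stump:", models["decision stump"].split)
```

Any accuracy printed by this sketch reflects only the toy records above and says nothing about the 87% reported in the abstract. In practice, a library implementation of decision trees and a proper cross-validation scheme would replace the hand-written stump and the fixed split.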
License
Copyright (c) 2021 Vanessa Faria de Souza
This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.
DECLARATION OF RESPONSIBILITY: I hereby certify that I partially or fully participated in the conception of the work and that I did not hide any links or financial agreements between the authors and companies that may have an interest in the publication of this article. I certify that the text is original and that the work, in part or in full, or any other work with substantially similar content written by me, has not been sent to any other journal and will not be sent while my submission is being considered by Revista Educar Mais, whether in printed or electronic format.
The author responsible for the submission represents all the authors of the manuscript and, when sending the article to the journal, guarantees s/he has obtained the permission to do so, as well as s/he guarantees the article does not infringe upon anyone’s copyright nor violate any proprietary rights. The journal is not responsible for the opinions expressed.
Revista Educar Mais is Open Access, does not charge any fees, whether for submission or article processing. The journal adopts Budapest Open Access Initiative (BOAI)’s definition, i.e., any users are permitted to read, download, copy, distribute, print, search and link to the full texts of these articles.
All articles are published under the Creative Commons Attribution-NonCommercial 4.0 International license. The authors retain the copyright to their work and must therefore be contacted directly regarding any interest in commercial use of it.