Data mining approach to predict student's selection of program majors

Students in higher education do not have access to sufficient information when selecting their program major. Program administrators cannot easily predict majors that will be undersubscribed early enough to take corrective actions. At the same time, institutional databases have large volumes of data...

Bibliographic Details
Main Author: SIDDARTHA, SHARMILA (author)
Published: 2019
Subjects:
Online Access: https://bspace.buid.ac.ae/handle/1234/1509
_version_ 1862980616447852544
author SIDDARTHA, SHARMILA
author_facet SIDDARTHA, SHARMILA
author_role author
dc.creator.none.fl_str_mv SIDDARTHA, SHARMILA
dc.date.none.fl_str_mv 2019-10-30T07:42:51Z
2019-10-30T07:42:51Z
2019-06
dc.format.none.fl_str_mv application/pdf
dc.identifier.none.fl_str_mv 20160199
https://bspace.buid.ac.ae/handle/1234/1509
dc.language.none.fl_str_mv en
dc.publisher.none.fl_str_mv The British University in Dubai (BUiD)
dc.subject.none.fl_str_mv higher education
artificial neural networks
program administrators
institutional databases
academic performance
data mining
dc.title.none.fl_str_mv Data mining approach to predict student's selection of program majors
dc.type.none.fl_str_mv Dissertation
description Students in higher education do not have access to sufficient information when selecting their program major. Program administrators cannot easily predict which majors will be undersubscribed early enough to take corrective action. At the same time, institutional databases hold large volumes of data relating to student demographic profiles, course grades and academic performance. There is an opportunity to apply data mining to arrive at a model that predicts student selection of a major. Academic data relating to student majors is multiclass and imbalanced: there is always a niche major with few students enrolled. This calls for special consideration within data mining. The purpose of this study is to develop a data mining approach for predicting students' selection of program majors. The approach includes a methodology to manage data mining projects, sampling techniques to handle imbalanced and multiclass data, a set of classification algorithms for prediction, and measures to evaluate model performance. The study uses a systematic literature review to source, evaluate and synthesize current information in this domain, and CRISP-DM to deploy data mining activities. Several data mining techniques, such as data exploration, visualization, sampling and evaluation, are presented and applied to the academic data. Data mining experiments are deployed in RapidMiner using Decision Trees, Naïve Bayes, Random Forest, Support Vector Machines, Artificial Neural Networks and Gradient Boosted Trees. Balanced sampling and SMOTE (oversampling of minority classes) are applied, and results are compared using the confusion matrix, F1-score and balanced accuracy. Cross-validation is applied to train the models and test their performance. Naïve Bayes and Decision Trees offered the best predictions across the different sampling techniques.
This study presents an approach to design and deploy a data mining project that can be used as a basis for developing systems to enable the selection of student majors.
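The description names the confusion matrix, F1-score and balanced accuracy as its evaluation measures for imbalanced, multiclass data. The following is a minimal Python sketch of how those measures relate, not the dissertation's RapidMiner workflow. The class labels, the toy predictions and all function names are hypothetical, and macro-averaging of F1 is an assumption, since the record does not say which averaging was used.

```python
from typing import List, Sequence


def confusion_matrix(y_true: Sequence[str], y_pred: Sequence[str],
                     labels: Sequence[str]) -> List[List[int]]:
    """Count outcomes; rows are true classes, columns are predicted classes."""
    index = {label: i for i, label in enumerate(labels)}
    matrix = [[0] * len(labels) for _ in labels]
    for t, p in zip(y_true, y_pred):
        matrix[index[t]][index[p]] += 1
    return matrix


def balanced_accuracy(matrix: List[List[int]]) -> float:
    """Mean of per-class recall, so a small class weighs as much as a large one."""
    recalls = []
    for i, row in enumerate(matrix):
        total = sum(row)
        recalls.append(row[i] / total if total else 0.0)
    return sum(recalls) / len(recalls)


def macro_f1(matrix: List[List[int]]) -> float:
    """Unweighted mean of per-class F1 (assumed averaging; not stated in the source)."""
    n = len(matrix)
    scores = []
    for i in range(n):
        tp = matrix[i][i]
        fn = sum(matrix[i]) - tp
        fp = sum(matrix[r][i] for r in range(n)) - tp
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / n


# Hypothetical toy data: "C" plays the role of a niche major with few students.
labels = ["A", "B", "C"]
y_true = ["A"] * 4 + ["B"] * 4 + ["C"] * 2
y_pred = ["A", "A", "A", "B", "B", "B", "B", "B", "A", "C"]

matrix = confusion_matrix(y_true, y_pred, labels)
plain_accuracy = sum(matrix[i][i] for i in range(len(labels))) / len(y_true)

print("confusion matrix:", matrix)
print("plain accuracy:", plain_accuracy)
print("balanced accuracy:", balanced_accuracy(matrix))
print("macro F1:", round(macro_f1(matrix), 4))
```

On this toy data the plain accuracy is higher than the balanced accuracy because the niche class is only half recovered, which is the kind of gap that motivates using class-aware measures on imbalanced data.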
id budr_61693fb08bb110a2b8a925dacdadd29b
identifier_str_mv 20160199
language_invalid_str_mv en
network_acronym_str budr
network_name_str The British University in Dubai repository
oai_identifier_str oai:bspace.buid.ac.ae:1234/1509
publishDate 2019
publisher.none.fl_str_mv The British University in Dubai (BUiD)
repository.mail.fl_str_mv
repository.name.fl_str_mv
repository_id_str
spellingShingle Data mining approach to predict student's selection of program majors
SIDDARTHA, SHARMILA
higher education
artificial neural networks
program administrators
institutional databases
academic performance
data mining
title Data mining approach to predict student's selection of program majors
title_full Data mining approach to predict student's selection of program majors
title_fullStr Data mining approach to predict student's selection of program majors
title_full_unstemmed Data mining approach to predict student's selection of program majors
title_short Data mining approach to predict student's selection of program majors
title_sort Data mining approach to predict student's selection of program majors
topic higher education
artificial neural networks
program administrators
institutional databases
academic performance
data mining
url https://bspace.buid.ac.ae/handle/1234/1509