Predicting Dropouts among a Homogeneous Population using a Data Mining Approach

Student retention is one of the biggest challenges facing academic institutions worldwide. Failure to retain students not only affects the student in a negative way but also hinders institutional quality and reputation. While there are several theoretical perspectives of retention, which study the fact...

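Bibliographic details
Main author: BILQUISE, GHAZALA (author)
Published: 2019
Subjects: student retention; homogeneous population; data mining; United Arab Emirates (UAE); academic institutions; machine learning; machine learning algorithms
Online access: https://bspace.buid.ac.ae/handle/1234/1448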
_version_ 1862980612396154880
author BILQUISE, GHAZALA
author_facet BILQUISE, GHAZALA
author_role author
dc.creator.none.fl_str_mv BILQUISE, GHAZALA
dc.date.none.fl_str_mv 2019-08-27T06:59:42Z
2019-08-27T06:59:42Z
2019-03
dc.format.none.fl_str_mv application/pdf
dc.identifier.none.fl_str_mv 2016228206
https://bspace.buid.ac.ae/handle/1234/1448
dc.language.none.fl_str_mv en
dc.publisher.none.fl_str_mv The British University in Dubai (BUiD)
dc.subject.none.fl_str_mv student retention
homogeneous population
data mining
United Arab Emirates (UAE)
academic institutions
machine learning
machine learning algorithms
dc.title.none.fl_str_mv Predicting Dropouts among a Homogeneous Population using a Data Mining Approach
dc.type.none.fl_str_mv Dissertation
description Student retention is one of the biggest challenges facing academic institutions worldwide. Failure to retain students not only affects the student in a negative way but also hinders institutional quality and reputation. While there are several theoretical perspectives of retention, which study the factors that cause students to drop out, more recent studies rely on a data mining and machine learning approach to explore the problem of retention. In this research, we present a novel data mining approach to predict retention among a homogeneous group of students, with similar social and cultural backgrounds, at an academic institution based in the UAE. Our model successfully identifies dropouts at an early stage. It provides an early warning system that enables the institution to promptly intervene with assertive measures. Moreover, our model also effectively determines the top predictive variables of retention. Several researchers study retention by focusing on student persistence from one term to another, while our study builds a predictive model to study retention until graduation. Moreover, other works use additional student data for predictions, thereby reducing the dataset size, which is counterproductive to data mining. Our research relies solely on pre-college and college performance data available in the institutional database. Our research reveals that Gradient Boosted Trees is a robust algorithm that predicts dropouts with an accuracy of 79.31% and an AUC of 88.4% using only pre-enrollment data. High School Average and High School stream of study are observed to be the top predictive variables of on-time graduation when a student joins college. Our study also reveals that ensemble machine learning algorithms are more reliable and outperform standard algorithms.
id budr_653a22dda7a71073db6f363d7cb190f2
identifier_str_mv 2016228206
language_invalid_str_mv en
network_acronym_str budr
network_name_str The British University in Dubai repository
oai_identifier_str oai:bspace.buid.ac.ae:1234/1448
publishDate 2019
publisher.none.fl_str_mv The British University in Dubai (BUiD)
repository.mail.fl_str_mv
repository.name.fl_str_mv
repository_id_str
title Predicting Dropouts among a Homogeneous Population using a Data Mining Approach
topic student retention
homogeneous population
data mining
United Arab Emirates (UAE)
academic institutions
machine learning
machine learning algorithms
url https://bspace.buid.ac.ae/handle/1234/1448