A Methodology of Rule Discovery from Large-scale Multi-tier Noisy Educational Data

A Master of Science thesis in Computer Engineering by Tasneem Yousuf entitled, “A Methodology of Rule Discovery from Large-scale Multi-tier Noisy Educational Data”, submitted in May 2018. Thesis advisor is Dr. Imran Ahmed Zualkernan. Soft and hard copy available.

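Bibliographic Details
Main Author: Yousuf, Tasneem (author)
Format: doctoralThesis
Published: 2018
Subjects: educational analytics; association mining; rule discovery; Apriori; market basket analysis; developing countries; Education; Data processing; Data mining
Online Access: http://hdl.handle.net/11073/9357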
_version_ 1864513434979139584
author Yousuf, Tasneem
author_facet Yousuf, Tasneem
author_role author
dc.contributor.none.fl_str_mv Zualkernan, Imran
dc.creator.none.fl_str_mv Yousuf, Tasneem
dc.date.none.fl_str_mv 2018-06-05T06:52:02Z
2018-06-05T06:52:02Z
2018-05
dc.format.none.fl_str_mv application/pdf
dc.identifier.none.fl_str_mv 35.232-2018.12
http://hdl.handle.net/11073/9357
dc.language.none.fl_str_mv en_US
dc.subject.none.fl_str_mv educational analytics
association mining
rule discovery
Apriori
market basket analysis
developing countries
Education
Data processing
Data mining
dc.title.none.fl_str_mv A Methodology of Rule Discovery from Large-scale Multi-tier Noisy Educational Data
dc.type.none.fl_str_mv info:eu-repo/semantics/publishedVersion
info:eu-repo/semantics/doctoralThesis
description A Master of Science thesis in Computer Engineering by Tasneem Yousuf entitled, “A Methodology of Rule Discovery from Large-scale Multi-tier Noisy Educational Data”, submitted in May 2018. Thesis advisor is Dr. Imran Ahmed Zualkernan. Soft and hard copy available.
format doctoralThesis
id aus_d1c3b35780a83d22c38b202be0193ff5
identifier_str_mv 35.232-2018.12
language_invalid_str_mv en_US
network_acronym_str aus
network_name_str aus
oai_identifier_str oai:repository.aus.edu:11073/9357
publishDate 2018
repository.mail.fl_str_mv
repository.name.fl_str_mv
repository_id_str
abstract Recent availability of very large amounts of educational data in digital format often leads to data overload, where it is difficult to determine important trends and patterns beyond those provided by traditional statistical techniques. Educational data mining (EDM) has emerged in response. Association mining is a type of EDM technique that is well known for discovering relationships in data with high scale and velocity but low variety and veracity. This analysis can be performed at the micro-level (e.g., for teachers), the meso-level (e.g., for cohorts of schools), or the macro-level (e.g., at region, province, or country level). This thesis proposes a methodology for applying association mining to multi-tier, sparse, and error-ridden educational data. The methodology uses rule templates and is organized around the four analytical dimensions of people, process, environment, and outcomes. It defines Extract, Transform, and Load (ETL) processes for this type of data and shows how data from lower levels is aggregated into baskets at higher levels. The proposed methodology was applied to data collected from a large-scale continuous professional development (CPD) process for 2,613 teachers in a developing country. The methodology was used to mine interesting rules, whose quality was evaluated using the objective metrics of Support, Confidence, and Lift. The minimum Confidence at each level was set to 0.85.
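The three rule-quality metrics named in the abstract can be illustrated with a minimal Python sketch. The baskets and item names below are invented for illustration and are not taken from the thesis data; only the 0.85 Confidence threshold comes from the abstract. The definitions used are the standard ones from association mining, which the thesis is assumed to follow.

```python
# Minimal sketch of Support, Confidence, and Lift for a rule A -> C.
# The baskets and item names are hypothetical, not thesis data.

MIN_CONFIDENCE = 0.85  # threshold stated in the abstract


def support(baskets, items):
    """Fraction of baskets that contain every item in `items`."""
    items = frozenset(items)
    return sum(1 for basket in baskets if items <= basket) / len(baskets)


def rule_metrics(baskets, antecedent, consequent):
    """Return Support, Confidence, and Lift for antecedent -> consequent."""
    a = frozenset(antecedent)
    c = frozenset(consequent)
    s_rule = support(baskets, a | c)  # Support of the whole rule
    s_a = support(baskets, a)
    s_c = support(baskets, c)
    confidence = s_rule / s_a if s_a else 0.0
    lift = confidence / s_c if s_c else 0.0
    return {"support": s_rule, "confidence": confidence, "lift": lift}


# Hypothetical teacher-level baskets.
baskets = [
    frozenset({"attended_training", "high_score"}),
    frozenset({"attended_training", "high_score"}),
    frozenset({"attended_training", "high_score", "mentor_visit"}),
    frozenset({"mentor_visit"}),
    frozenset({"low_score"}),
]

metrics = rule_metrics(baskets, {"attended_training"}, {"high_score"})
keep_rule = metrics["confidence"] >= MIN_CONFIDENCE
print(metrics, keep_rule)
```

A Lift above 1 indicates that the antecedent and consequent co-occur more often than they would if they were independent, which is why Lift is reported alongside Support and Confidence.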
The micro-level analysis (n = 2613 teachers) yielded few or no rules, with a very low mean Support of 0.00345 (sd. = 0.00214) and a mean Lift of 6.98 (sd. = 4.63). The situation remained much the same at the meso-level (n = 1391 schools), with a mean Support of 0.0059 (sd. = 0.00051) and a mean Lift of 5.46 (sd. = 3.23). The results were significantly better at the macro-level (n = 59 clusters), with a mean Support of 0.089 (sd. = 0.021) and a mean Lift of 5.925 (sd. = 2.5). The mined rules revealed several anomalies and fidelity violations in the CPD process at various levels. The methodology was also useful in identifying small groups of teachers (6-8 teachers), schools (8-10 schools), and clusters (4-7 clusters) with common characteristics that can be further administered to help improve the CPD process.
College of Engineering
Department of Computer Science and Engineering
Master of Science in Computer Engineering (MSCoE)
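The abstract states that data from lower levels is aggregated into baskets at higher levels (teachers, then schools, then clusters), but it does not say how. The sketch below assumes the simplest possible rule, taking the union of all items observed under each higher-level unit. The record layout, field names, and item names are hypothetical and are not drawn from the thesis.

```python
# Minimal sketch of rolling teacher-level items up into school-level and
# cluster-level baskets. Union aggregation is an assumption made here;
# the thesis may use a different rule.
from collections import defaultdict


def aggregate_baskets(records, level_key):
    """Group records by `level_key` and union their items into one basket."""
    baskets = defaultdict(set)
    for record in records:
        baskets[record[level_key]].update(record["items"])
    return {unit: frozenset(items) for unit, items in baskets.items()}


# Hypothetical teacher-level records with their school and cluster.
teacher_records = [
    {"teacher": "t1", "school": "s1", "cluster": "c1",
     "items": {"attended_training"}},
    {"teacher": "t2", "school": "s1", "cluster": "c1",
     "items": {"mentor_visit"}},
    {"teacher": "t3", "school": "s2", "cluster": "c1",
     "items": {"attended_training", "low_score"}},
    {"teacher": "t4", "school": "s3", "cluster": "c2",
     "items": {"mentor_visit"}},
]

school_baskets = aggregate_baskets(teacher_records, "school")    # meso-level
cluster_baskets = aggregate_baskets(teacher_records, "cluster")  # macro-level
print(school_baskets)
print(cluster_baskets)
```

Because each higher level has fewer and fuller baskets, item combinations tend to reach higher Support there, which is consistent with the higher mean Support the abstract reports at the macro-level.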
status_str publishedVersion
title A Methodology of Rule Discovery from Large-scale Multi-tier Noisy Educational Data
url http://hdl.handle.net/11073/9357