The details of the publicly available dataset.
Saved in:
| Main Author: | Ailian Gao |
|---|---|
| Other Authors: | Zenglei Liu |
| Published: | 2025 |
| Subjects: | Cancer; Science Policy; Biological Sciences not elsewhere classified |
| _version_ | 1852016801639038976 |
|---|---|
| author | Ailian Gao (20629841) |
| author2 | Zenglei Liu (20629838) |
| author2_role | author |
| author_facet | Ailian Gao (20629841) Zenglei Liu (20629838) |
| author_role | author |
| dc.creator.none.fl_str_mv | Ailian Gao (20629841) Zenglei Liu (20629838) |
| dc.date.none.fl_str_mv | 2025-09-09T17:32:49Z |
| dc.identifier.none.fl_str_mv | 10.1371/journal.pone.0330433.t001 |
| dc.relation.none.fl_str_mv | https://figshare.com/articles/dataset/The_details_of_the_publicly_available_dataset_/30088486 |
| dc.rights.none.fl_str_mv | CC BY 4.0 info:eu-repo/semantics/openAccess |
| dc.subject.none.fl_str_mv | Cancer; Science Policy; Biological Sciences not elsewhere classified; knowledge tracing; machine learning; time-series prediction; bidirectional LSTM; Informer |
| dc.title.none.fl_str_mv | The details of the publicly available dataset. |
| dc.type.none.fl_str_mv | Dataset info:eu-repo/semantics/publishedVersion dataset |
| description | <div><p>Knowledge tracing can reveal students’ level of knowledge in relation to their learning performance. Recently, many machine learning algorithms have been proposed to implement knowledge tracing and have achieved promising outcomes. However, most previous approaches could not cope with long-sequence time-series prediction, which is more valuable than the short-sequence prediction extensively used in current knowledge-tracing studies. In this study, we propose a long-sequence time-series forecasting pipeline for knowledge tracing that leverages both time stamps and exercise sequences. First, we introduce a bidirectional LSTM model to produce embeddings of the exercise-answering records. Second, we combine each student’s exercise record and its time stamp into a single vector per record. Next, the resulting sequence of vectors is fed to the proposed Informer model, which uses a probability-sparse self-attention mechanism. Note that the probability-sparse self-attention module addresses the quadratic computational complexity of the canonical encoder-decoder architecture. Finally, we integrate temporal information and individual knowledge states to predict the answers to a sequence of target exercises. To evaluate the performance of the proposed LSTKT model, we conducted comparison experiments with state-of-the-art knowledge-tracing algorithms on publicly available datasets. The model demonstrates quantitative improvements over existing models: on the Assistments2009 dataset it achieved an accuracy of 78.49% and an AUC of 78.81%; on the Assistments2017 dataset, an accuracy of 74.22% and an AUC of 72.82%; and on the EdNet dataset, an accuracy of 68.17% and an AUC of 70.78%.</p></div> |
| eu_rights_str_mv | openAccess |
| id | Manara_157ecbb77e85febc420b48a6c19d232a |
| identifier_str_mv | 10.1371/journal.pone.0330433.t001 |
| network_acronym_str | Manara |
| network_name_str | ManaraRepo |
| oai_identifier_str | oai:figshare.com:article/30088486 |
| publishDate | 2025 |
| repository.mail.fl_str_mv | |
| repository.name.fl_str_mv | |
| repository_id_str | |
| rights_invalid_str_mv | CC BY 4.0 |
| status_str | publishedVersion |
| title | The details of the publicly available dataset. |
| title_full | The details of the publicly available dataset. |
| title_fullStr | The details of the publicly available dataset. |
| title_full_unstemmed | The details of the publicly available dataset. |
| title_short | The details of the publicly available dataset. |
| title_sort | The details of the publicly available dataset. |
| topic | Cancer; Science Policy; Biological Sciences not elsewhere classified; knowledge tracing; machine learning; time-series prediction; bidirectional LSTM; Informer |
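The abstract above leans on the Informer's probability-sparse self-attention to sidestep the quadratic cost of full attention over long exercise sequences. As a rough, illustrative sketch of that idea (not the paper's implementation; function and parameter names here are assumptions), the following NumPy snippet scores each query with the max-minus-mean sparsity measurement from the Informer paper, grants full attention only to the top-u most "active" queries, and falls back to the mean of the values for the rest:

```python
import numpy as np


def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)


def prob_sparse_attention(Q, K, V, c=5):
    """Single-head ProbSparse self-attention sketch.

    Q, K, V: (L, d) arrays. Only the top-u queries ranked by the
    max-minus-mean sparsity score receive full attention; the
    remaining "lazy" queries output the mean of V.
    """
    L, d = Q.shape
    scores = Q @ K.T / np.sqrt(d)                 # (L, L) scaled query-key scores
    # Sparsity measurement M(q_i, K): max over keys minus mean over keys.
    M = scores.max(axis=1) - scores.mean(axis=1)
    u = min(L, max(1, int(np.ceil(c * np.log(L)))))
    top = np.argsort(M)[-u:]                      # indices of the u most active queries
    # Lazy queries default to the mean of V.
    out = np.repeat(V.mean(axis=0, keepdims=True), L, axis=0)
    # Active queries get ordinary softmax attention over all keys.
    out[top] = softmax(scores[top], axis=-1) @ V
    return out
```

In the actual Informer, the sparsity scores are estimated from a random sample of keys to reach O(L log L) complexity; this sketch computes all L × L scores for clarity, so it only illustrates the query-selection step.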