The details of the publicly available dataset.

Bibliographic Details
Main Author: Ailian Gao (20629841) (author)
Other Authors: Zenglei Liu (20629838) (author)
Published: 2025
_version_ 1852016801639038976
author Ailian Gao (20629841)
author2 Zenglei Liu (20629838)
author2_role author
author_facet Ailian Gao (20629841)
Zenglei Liu (20629838)
author_role author
dc.creator.none.fl_str_mv Ailian Gao (20629841)
Zenglei Liu (20629838)
dc.date.none.fl_str_mv 2025-09-09T17:32:49Z
dc.identifier.none.fl_str_mv 10.1371/journal.pone.0330433.t001
dc.relation.none.fl_str_mv https://figshare.com/articles/dataset/The_details_of_the_publicly_available_dataset_/30088486
dc.rights.none.fl_str_mv CC BY 4.0
info:eu-repo/semantics/openAccess
dc.subject.none.fl_str_mv Cancer
Science Policy
Biological Sciences not elsewhere classified
students’
integrate temporal information
conducted comparison experiments
bidirectional lstm model
series forecasting pipeline
machine learning algorithms
proposed lstkt model
proposed informer model
publicly available dataset
individual knowledge states
informer
achieved promising outcomes
short sequence prediction
probability sparse self
implement knowledge tracing
long sequence time
sparse self
series prediction
knowledge tracing
sequence time
tracing studies
time stamps
time stamp
ednet dataset
assistments2017 dataset
assistments2009 dataset
current knowledge
target exercises
previous approaches
learning performance
extensively utilized
existing models
exercising recordings
decoder architecture
canonical encoder
attention module
attention mechanism
answering records
dc.title.none.fl_str_mv The details of the publicly available dataset.
dc.type.none.fl_str_mv Dataset
info:eu-repo/semantics/publishedVersion
dataset
description <div><p>Knowledge tracing can reveal students’ level of knowledge in relation to their learning performance. Recently, many machine learning algorithms have been proposed to implement knowledge tracing and have achieved promising outcomes. However, most previous approaches were unable to cope with long-sequence time-series prediction, which is more valuable than the short-sequence prediction extensively used in current knowledge-tracing studies. In this study, we propose a long-sequence time-series forecasting pipeline for knowledge tracing that leverages both timestamp and exercise sequences. First, we introduce a bidirectional LSTM model to learn embeddings of exercise-answering records. Second, we combine each student’s exercise record and its timestamp into a single vector per record. Next, the sequence of vectors is fed into the proposed Informer model, which uses a probability-sparse self-attention mechanism. Note that the probability-sparse self-attention module addresses the quadratic computational complexity of the canonical encoder-decoder architecture. Finally, we integrate temporal information and individual knowledge states to predict the answers to a sequence of target exercises. To evaluate the proposed LSTKT model, we conducted comparison experiments with state-of-the-art knowledge-tracing algorithms on publicly available datasets. The model demonstrates quantitative improvements over existing models: on the Assistments2009 dataset it achieved an accuracy of 78.49% and an AUC of 78.81%; on the Assistments2017 dataset, an accuracy of 74.22% and an AUC of 72.82%; and on the EdNet dataset, an accuracy of 68.17% and an AUC of 70.78%.</p></div>
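The description above credits the Informer component's efficiency to probability-sparse self-attention: only queries whose score distribution deviates strongly from uniform attend in full, while the rest fall back to a cheap default. The following is a minimal NumPy sketch of that general idea, not the authors' LSTKT implementation; the function name, the exact sparsity measure, and the mean-of-values fallback are illustrative assumptions.

```python
import numpy as np

def prob_sparse_attention(Q, K, V, u):
    """Illustrative sketch of probability-sparse self-attention.

    Each query gets a sparsity measure
        M(q, K) = max(qK^T / sqrt(d)) - mean(qK^T / sqrt(d));
    only the u queries with the largest M attend via full softmax,
    while the remaining "lazy" queries receive the mean of V.
    This avoids computing softmax attention for every query.
    """
    d = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d)                  # (L_q, L_k) scaled scores
    M = scores.max(axis=1) - scores.mean(axis=1)   # sparsity measure per query
    top = np.argsort(M)[-u:]                       # u most "active" queries

    out = np.tile(V.mean(axis=0), (Q.shape[0], 1))  # lazy queries -> mean(V)
    s = scores[top]
    w = np.exp(s - s.max(axis=1, keepdims=True))    # numerically stable softmax
    w /= w.sum(axis=1, keepdims=True)
    out[top] = w @ V                                # active queries -> attention
    return out
```

With `u` equal to the number of queries, the sketch degenerates to ordinary scaled dot-product attention; smaller `u` trades accuracy on low-information queries for reduced computation, which is the point of the sparse variant.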
eu_rights_str_mv openAccess
id Manara_157ecbb77e85febc420b48a6c19d232a
identifier_str_mv 10.1371/journal.pone.0330433.t001
network_acronym_str Manara
network_name_str ManaraRepo
oai_identifier_str oai:figshare.com:article/30088486
publishDate 2025
repository.mail.fl_str_mv
repository.name.fl_str_mv
repository_id_str
rights_invalid_str_mv CC BY 4.0
status_str publishedVersion
title The details of the publicly available dataset.
title_full The details of the publicly available dataset.
title_fullStr The details of the publicly available dataset.
title_full_unstemmed The details of the publicly available dataset.
title_short The details of the publicly available dataset.
title_sort The details of the publicly available dataset.