Data Sheet 2_Evaluation of the accuracy and repeatability of Deepseek V3, Doubao, and Kimi1.5 in answering knowledge-related queries about chronic non-bacterial osteitis.zip

Background: There are significant differences in the diagnosis and treatment of chronic non-bacterial osteitis (CNO), and there is an urgent need for health education efforts to enhance awareness of this condition. Deepseek V3, Doubao, and Kimi1.5 are highly popular language models in China t...

Bibliographic Details
Main Author:	Zhenxing Zhu (509266) (author)
Other Authors:	Jun Xie (53987) (author), Longxin Zhou (22328278) (author), Chaoran Yang (11009245) (author), Feng Li (30515) (author)
Published:	2025
Subjects:	Knowledge Representation and Machine Learning chronic non-bacterial osteitis Chinese AI chatbots knowledge retrieval Deepseek V3 Doubao Kimi1.5

_version_	1852016193286701056
author	Zhenxing Zhu (509266)
author2	Jun Xie (53987) Longxin Zhou (22328278) Chaoran Yang (11009245) Feng Li (30515)
author2_role	author author author author
author_facet	Zhenxing Zhu (509266) Jun Xie (53987) Longxin Zhou (22328278) Chaoran Yang (11009245) Feng Li (30515)
author_role	author
dc.creator.none.fl_str_mv	Zhenxing Zhu (509266) Jun Xie (53987) Longxin Zhou (22328278) Chaoran Yang (11009245) Feng Li (30515)
dc.date.none.fl_str_mv	2025-09-29T05:25:26Z
dc.identifier.none.fl_str_mv	10.3389/frai.2025.1629149.s002
dc.relation.none.fl_str_mv	https://figshare.com/articles/dataset/Data_Sheet_2_Evaluation_of_the_accuracy_and_repeatability_of_Deepseek_V3_Doubao_and_Kimi1_5_in_answering_knowledge-related_queries_about_chronic_non-bacterial_osteitis_zip/30230389
dc.rights.none.fl_str_mv	CC BY 4.0 info:eu-repo/semantics/openAccess
dc.subject.none.fl_str_mv	Knowledge Representation and Machine Learning chronic non-bacterial osteitis Chinese AI chatbots knowledge retrieval Deepseek V3 Doubao Kimi1.5
dc.title.none.fl_str_mv	Data Sheet 2_Evaluation of the accuracy and repeatability of Deepseek V3, Doubao, and Kimi1.5 in answering knowledge-related queries about chronic non-bacterial osteitis.zip
dc.type.none.fl_str_mv	Dataset info:eu-repo/semantics/publishedVersion dataset
description	Background: There are significant differences in the diagnosis and treatment of chronic non-bacterial osteitis (CNO), and there is an urgent need for health education efforts to enhance awareness of this condition. Deepseek V3, Doubao, and Kimi1.5 are highly popular language models in China that can provide knowledge related to diseases. This article aims to investigate the accuracy and reproducibility of the responses provided by these three artificial intelligence (AI) language models in answering questions about CNO. Methods: According to the latest expert consensus, 16 questions related to CNO were collected. The three AI language models were each asked these questions at three different times. The answers were independently evaluated by two orthopedic experts. Results: Among the responses of the three AI models to 16 CNO-related questions across three rounds of testing, only Doubao received “Completely incorrect” ratings (accounting for 6.25%), in the third round of scoring by Reviewer 2. During the answering process, Doubao had the shortest response time and provided the most words in its answers. In the first and third rounds of scoring by the first expert, Kimi1.5 scored the highest (3.938 ± 0.342 and 3.875 ± 0.873, respectively), while in the second round, Doubao scored the highest (3.875 ± 0.5). In the scoring by the second expert, Doubao received the highest score in the second round (3.812 ± 0.403), and Kimi1.5 received the highest score in the first and third rounds (3.812 ± 0.602 and 3.812 ± 0.704, respectively). Conclusion: Deepseek V3, Doubao, and Kimi1.5 are capable of answering most questions related to CNO with good accuracy and reproducibility, with no significant differences among them.
eu_rights_str_mv	openAccess
id	Manara_229e4e148d406e739879fbbd080c8b99
identifier_str_mv	10.3389/frai.2025.1629149.s002
network_acronym_str	Manara
network_name_str	ManaraRepo
oai_identifier_str	oai:figshare.com:article/30230389
publishDate	2025
repository.mail.fl_str_mv
repository.name.fl_str_mv
repository_id_str
rights_invalid_str_mv	CC BY 4.0
spellingShingle	Data Sheet 2_Evaluation of the accuracy and repeatability of Deepseek V3, Doubao, and Kimi1.5 in answering knowledge-related queries about chronic non-bacterial osteitis.zip Zhenxing Zhu (509266) Knowledge Representation and Machine Learning chronic non-bacterial osteitis Chinese AI chatbots knowledge retrieval Deepseek V3 Doubao Kimi1.5
status_str	publishedVersion
title	Data Sheet 2_Evaluation of the accuracy and repeatability of Deepseek V3, Doubao, and Kimi1.5 in answering knowledge-related queries about chronic non-bacterial osteitis.zip
title_full	Data Sheet 2_Evaluation of the accuracy and repeatability of Deepseek V3, Doubao, and Kimi1.5 in answering knowledge-related queries about chronic non-bacterial osteitis.zip
title_fullStr	Data Sheet 2_Evaluation of the accuracy and repeatability of Deepseek V3, Doubao, and Kimi1.5 in answering knowledge-related queries about chronic non-bacterial osteitis.zip
title_full_unstemmed	Data Sheet 2_Evaluation of the accuracy and repeatability of Deepseek V3, Doubao, and Kimi1.5 in answering knowledge-related queries about chronic non-bacterial osteitis.zip
title_short	Data Sheet 2_Evaluation of the accuracy and repeatability of Deepseek V3, Doubao, and Kimi1.5 in answering knowledge-related queries about chronic non-bacterial osteitis.zip
title_sort	Data Sheet 2_Evaluation of the accuracy and repeatability of Deepseek V3, Doubao, and Kimi1.5 in answering knowledge-related queries about chronic non-bacterial osteitis.zip
topic	Knowledge Representation and Machine Learning chronic non-bacterial osteitis Chinese AI chatbots knowledge retrieval Deepseek V3 Doubao Kimi1.5

Similar Items