ChatGPT’s scorecard after the performance in a series of tests conducted at the multi-country level: A pattern of responses of generative artificial intelligence or large language models

Recently, researchers have raised concerns about ChatGPT-derived answers. Here, individual researchers in multiple countries conducted a series of tests using ChatGPT to understand the pattern of its answer accuracy, reproducibility, answer length, plagiarism, and depth using two question...

Bibliographic Details
Main Author: Manojit, Bhattacharya (author)
Other Authors: Pal, Soumen (author), Chatterjee, Srijan (author), Alshammari, Abdulrahman (author), Albekairi, Thamer H. (author), Jagga, Supriya (author), Ige Ohimain, Elijah (author), Zayed, Hatem (author), Byrareddy, Siddappa N. (author), Lee, Sang-Soo (author), Wen, Zhi-Hong (author), Agoramoorthy, Govindasamy (author), Bhattacharya, Prosun (author), Chakraborty, Chiranjib (author)
Format: article
Published: 2024
Subjects:
Online Access: http://dx.doi.org/10.1016/j.crbiot.2024.100194
https://www.sciencedirect.com/science/article/pii/S2590262824000200
http://hdl.handle.net/10576/56120
dc.date.none.fl_str_mv 2024-06-12T10:59:04Z
2024-03-02
dc.format.none.fl_str_mv application/pdf
dc.identifier.none.fl_str_mv http://dx.doi.org/10.1016/j.crbiot.2024.100194
Bhattacharya, M., Pal, S., Chatterjee, S., Alshammari, A., Albekairi, T. H., Jagga, S., ... & Chakraborty, C. (2024). ChatGPT’s scorecard after the performance in a series of tests conducted at the multi-country level: A pattern of responses of generative artificial intelligence or large language models. Current Research in Biotechnology, 100194.
https://www.sciencedirect.com/science/article/pii/S2590262824000200
http://hdl.handle.net/10576/56120
7
2590-2628
dc.language.none.fl_str_mv en
dc.publisher.none.fl_str_mv Elsevier
dc.rights.none.fl_str_mv http://creativecommons.org/licenses/by-nc-nd/4.0/
info:eu-repo/semantics/openAccess
dc.subject.none.fl_str_mv ChatGPT
Accuracy
Reproducibility
Plagiarism
Answer length
dc.title.none.fl_str_mv ChatGPT’s scorecard after the performance in a series of tests conducted at the multi-country level: A pattern of responses of generative artificial intelligence or large language models
dc.type.none.fl_str_mv Article
info:eu-repo/semantics/publishedVersion
info:eu-repo/semantics/article
description Recently, researchers have raised concerns about ChatGPT-derived answers. Here, individual researchers in multiple countries conducted a series of tests using ChatGPT to understand the pattern of its answer accuracy, reproducibility, answer length, plagiarism, and depth using two questionnaires (the first set with 15 MCQs and the second with 15 KBQs). Among 15 MCQ-generated answers, 13 ± 70 were correct (median: 82.5; coefficient of variation: 4.85), 3 ± 0.77 were incorrect (median: 3; coefficient of variation: 25.81), 1 to 10 were reproducible, and 11 to 15 were not. Among 15 KBQs, the length of each question (in words) is about 294.5 ± 97.60 (mean range varies from 138.7 to 438.09), and the mean similarity index (in words) is about 29.53 ± 11.40 (coefficient of variation: 38.62) for each question. Statistical models were also developed using the analyzed parameters of the answers. The study shows a pattern of correct and incorrect ChatGPT-derived answers and calls for an error-free, next-generation LLM to avoid misleading users.
eu_rights_str_mv openAccess
format article
id qu_6acea3c4a67239d915e60d1a3c1137f4
language_invalid_str_mv en
network_acronym_str qu
network_name_str Qatar University repository
oai_identifier_str oai:qspace.qu.edu.qa:10576/56120
publishDate 2024
publisher.none.fl_str_mv Elsevier
rights_invalid_str_mv http://creativecommons.org/licenses/by-nc-nd/4.0/
spelling This work was funded by the Researchers Supporting Project number (RSP2024R491), King Saud University, Riyadh, Saudi Arabia.
2024-07-23T15:53:58Z
status_str publishedVersion
title ChatGPT’s scorecard after the performance in a series of tests conducted at the multi-country level: A pattern of responses of generative artificial intelligence or large language models