ChatGPT’s scorecard after the performance in a series of tests conducted at the multi-country level: A pattern of responses of generative artificial intelligence or large language models

Bibliographic details
Main author:	Bhattacharya, Manojit (author)
Other authors:	Pal, Soumen (author), Chatterjee, Srijan (author), Alshammari, Abdulrahman (author), Albekairi, Thamer H. (author), Jagga, Supriya (author), Ige Ohimain, Elijah (author), Zayed, Hatem (author), Byrareddy, Siddappa N. (author), Lee, Sang-Soo (author), Wen, Zhi-Hong (author), Agoramoorthy, Govindasamy (author), Bhattacharya, Prosun (author), Chakraborty, Chiranjib (author)
Format:	article
Published:	2024
Subjects:	ChatGPT; Accuracy; Reproducibility; Plagiarism; Answer length
Online access:	http://dx.doi.org/10.1016/j.crbiot.2024.100194 https://www.sciencedirect.com/science/article/pii/S2590262824000200 http://hdl.handle.net/10576/56120

Description
Abstract:	Researchers have recently raised concerns about ChatGPT-derived answers. Here, individual researchers in multiple countries conducted a series of tests with ChatGPT to understand the pattern of its answer accuracy, reproducibility, answer length, plagiarism, and depth, using two questionnaires (the first with 15 MCQs and the second with 15 KBQs). Among the 15 MCQ-generated answers, 13 ± 70 were correct (median: 82.5; coefficient of variation: 4.85), 3 ± 0.77 were incorrect (median: 3; coefficient of variation: 25.81), 1 to 10 were reproducible, and 11 to 15 were not. Among the 15 KBQs, the answer length for each question (in words) was about 294.5 ± 97.60 (means ranging from 138.7 to 438.09), and the mean similarity index (in words) was about 29.53 ± 11.40 (coefficient of variation: 38.62) for each question. Statistical models were also developed from the analyzed answer parameters. The study shows a pattern of correct and incorrect ChatGPT-derived answers and calls for an error-free, next-generation LLM to avoid misguiding users.
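
Note on the reported statistics: the coefficient figures quoted in the abstract appear to be coefficients of variation expressed as percentages, that is, standard deviation divided by mean, times 100. The record does not state the formula, so this reading is an assumption. The sketch below (the function name `coefficient_of_variation` is illustrative, not taken from the article) recomputes the figure from two of the printed mean ± SD pairs.

```python
def coefficient_of_variation(mean: float, sd: float) -> float:
    """Return the coefficient of variation as a percentage (SD / mean * 100).

    Assumption: this is the definition behind the abstract's figures;
    the catalog record itself does not say how they were computed.
    """
    if mean == 0:
        raise ValueError("coefficient of variation is undefined for a zero mean")
    return sd / mean * 100.0


# Mean and standard deviation exactly as printed in the abstract.
reported = {
    "incorrect MCQ answers": (3.0, 0.77),
    "KBQ similarity index": (29.53, 11.40),
}

for label, (mean, sd) in reported.items():
    print(f"{label}: CV = {coefficient_of_variation(mean, sd):.2f}%")
```

Under this assumption the recomputed values are about 25.67% and 38.60%, close to the printed 25.81 and 38.62; the small gaps would be consistent with the printed means and standard deviations having been rounded.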
