[Figure: Score Distributions by Rater]

Bibliographic Details
Main Author: Tianhui Chen (author)
Other Authors: Sanjun Sun (author)
Published: 2025
Description
Summary: Automated evaluation systems (AESs) for spoken language assessment are increasingly adopted in global educational settings, yet their validity in non-Western contexts remains underexplored. This study addresses this gap by examining three widely used Chinese-developed AES tools in their assessment of spoken English proficiency among 30 Chinese undergraduates. The study employed an IELTS-adapted speaking test, assessed simultaneously by AESs and human raters, with scoring alignment analyzed through intra-class correlation coefficients, Pearson correlations, and linear regression. Results revealed that two systems demonstrated strong agreement with human ratings, while the third exhibited systematic score inflation, likely due to algorithmic discrepancies and limited consideration of nuanced language features. Our findings suggest the potential of AESs as valuable complements to traditional language assessment methods, while highlighting the necessity for calibration and validation procedures. This research has significant implications for integrating AESs in educational contexts, particularly in English as a Foreign Language (EFL) settings, where they can enhance efficiency and standardization.
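
The sketch below illustrates the three agreement analyses the abstract names (intra-class correlation, Pearson correlation, and linear regression). It is not the authors' code: the synthetic scores, the column names (examinee, rater, score), and the choice of libraries (pingouin, scipy) are all illustrative assumptions, since the record does not specify the data layout or software used.

```python
"""Minimal sketch of human-vs-AES score agreement analyses: ICC,
Pearson correlation, and linear regression on simulated band scores."""
import numpy as np
import pandas as pd
import pingouin as pg
from scipy import stats

rng = np.random.default_rng(0)

# Simulate 30 examinees: human band scores on a 0-9 scale, plus one
# hypothetical AES whose scores track the human ratings with noise.
human = np.clip(rng.normal(6.0, 1.0, 30), 0, 9).round(1)
aes = np.clip(human + rng.normal(0.0, 0.5, 30), 0, 9).round(1)

# Intra-class correlation, using the long format pingouin expects.
long = pd.DataFrame({
    "examinee": np.tile(np.arange(30), 2),
    "rater": ["human"] * 30 + ["aes"] * 30,
    "score": np.concatenate([human, aes]),
})
icc = pg.intraclass_corr(data=long, targets="examinee",
                         raters="rater", ratings="score")
print(icc[["Type", "ICC"]])

# Pearson correlation between the two sets of scores.
r, p = stats.pearsonr(human, aes)
print(f"Pearson r = {r:.3f} (p = {p:.3g})")

# Regressing AES scores on human scores: an intercept well above 0
# with a slope near 1 would signal the kind of systematic score
# inflation the study reports for one of the three systems.
fit = stats.linregress(human, aes)
print(f"slope = {fit.slope:.3f}, intercept = {fit.intercept:.3f}")
```

Note that high Pearson correlation alone cannot detect inflation, since it is insensitive to additive shifts; the ICC and the regression intercept are what reveal systematic disagreement in score level, which is presumably why the study combines all three measures.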