Human vs. AI: Assessing Scale Development for Perceived Risks of ChatGPT in Academic Settings
DOI: https://doi.org/10.53103/cjess.v6i2.469

Keywords: Psychometric Scales, Reliability Assessment, Factor Analysis, Applications of Artificial Intelligence, Educational Technology

Abstract
In this study, we conducted an in-depth comparison of psychometric instrument development, evaluating how human subject-matter experts and artificial intelligence systems assess the perceived risks of ChatGPT in academic settings. Two professor-level experts in research methodology and two AI systems, ChatGPT and Claude, each produced an eighteen-item scale, all addressing three risk dimensions: psychological, ethical, and practical. A panel of twenty experienced evaluators rated item quality using established evaluation criteria, after which the four scales were administered to a sample of 120 respondents representing a wide range of academic specialties. Statistical analyses included Exploratory Factor Analysis (EFA), reliability assessment via Cronbach's alpha, and calculation of Average Variance Extracted (AVE) and composite reliability. The results showed that the AI-generated scales had superior psychometric quality; in particular, Claude's scale achieved the highest reliability (α = 0.942) and composite reliability (CR = 0.930). ANOVA revealed no statistically significant differences between the scales (F = 2.183, p = 0.168), but effect size analysis highlighted meaningful differences; notably, the comparison between Claude and Professor 1 yielded a large effect (d = -1.578). Factor loading analyses supported the construct validity of all measures, with AI-generated items showing a slightly cleaner factor structure. Overall, the AI-generated scales, and particularly Claude's, outperformed the human-created scales in clarity, specificity, and overall quality, while the human-created scales remained competitive in relevance.
These outcomes suggest that AI systems can produce high-quality psychometric instruments that equal or even surpass conventional human-developed scales, indicating that they could improve the efficiency of instrument development in psychological research while maintaining high psychometric standards.
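The two reliability indices reported above can be made concrete with a short computational sketch. The formulas below are the standard ones (Cronbach's alpha from item and total-score variances; composite reliability from standardized factor loadings); the data and loading values are synthetic illustrations, not the study's actual results.

```python
import numpy as np

def cronbach_alpha(items: np.ndarray) -> float:
    """Cronbach's alpha for an (n_respondents, n_items) score matrix:
    alpha = k/(k-1) * (1 - sum of item variances / variance of total score)."""
    k = items.shape[1]
    item_var_sum = items.var(axis=0, ddof=1).sum()
    total_var = items.sum(axis=1).var(ddof=1)
    return k / (k - 1) * (1 - item_var_sum / total_var)

def composite_reliability(loadings: np.ndarray) -> float:
    """Composite reliability from standardized loadings, assuming
    error variance 1 - lambda^2 per item:
    CR = (sum lambda)^2 / ((sum lambda)^2 + sum(1 - lambda^2))."""
    lam_sum_sq = loadings.sum() ** 2
    error_var = (1.0 - loadings ** 2).sum()
    return lam_sum_sq / (lam_sum_sq + error_var)

# Synthetic example mirroring the study's design: 120 respondents,
# one 18-item scale driven by a single latent factor plus noise.
rng = np.random.default_rng(0)
latent = rng.normal(size=(120, 1))
items = latent + rng.normal(scale=0.6, size=(120, 18))

print(f"alpha = {cronbach_alpha(items):.3f}")

# Hypothetical standardized loadings (all 0.85) for the CR formula.
loadings = np.full(18, 0.85)
print(f"CR    = {composite_reliability(loadings):.3f}")
```

With strongly intercorrelated items, alpha approaches 1; the study's reported values (α = 0.942, CR = 0.930 for Claude's scale) fall in the range conventionally labeled excellent.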
License
Copyright (c) 2026 Abdelouahd Bouzar, Khaoula El Idrissi, Samia Moustaghfir

This work is licensed under a Creative Commons Attribution 4.0 International License.
