Equivalence of Parallel Tests in a Basic Statistics Course in Higher Education Using Classical Measurement Theory
DOI: https://doi.org/10.53103/cjess.v1i2.11
Keywords: Parallel Test, Validity, Congeneric Form, Tau-Equivalent, Classically Parallel
Abstract
Developing and administering parallel test forms to students in higher education offsets the cost of obtaining assessment scores with low validity. This research examined the validity and equivalence of parallel tests in a Basic Statistics course. Among other things, the study: (1) established and compared the item specifications of the items on the different test forms developed, and (2) determined the extent of parallelism of the alternate test forms. Three carefully designed alternate forms of an achievement test (built from item specifications and a test specification table) were administered to 504 second-year students. In addition, an academic resilience scale was administered to the same students to help ascertain the criterion validity of the alternate forms. The study revealed some similarity in the statistical specifications of the alternate test forms. Further analysis showed that the three alternate test forms were parallel only in the congeneric sense. The authors concluded that developing classically parallel forms of the test is not feasible, but that having congeneric parallel test forms offsets the cost of less valid scores that do not represent students’ attainment levels. Faculty members are encouraged to use parallel test forms when assessing students in higher education.
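The abstract names three degrees of parallelism without defining them. As a reading aid, the standard textbook hierarchy from classical measurement theory can be sketched as below. The notation is an assumption introduced here, not taken from the article: X_i is the observed score on form i, T_i its true score, E_i its error, T a common true score, and a_i, b_i are form-specific constants.

```latex
\begin{align*}
\text{Classical model:} \quad & X_i = T_i + E_i, \qquad i = 1, 2, 3 \\
\text{Classically parallel:} \quad & T_i = T, \qquad \sigma^2_{E_i} = \sigma^2_{E} \\
\text{Tau-equivalent:} \quad & T_i = T, \qquad \sigma^2_{E_i} \text{ free} \\
\text{Essentially tau-equivalent:} \quad & T_i = a_i + T, \qquad \sigma^2_{E_i} \text{ free} \\
\text{Congeneric:} \quad & T_i = a_i + b_i T, \qquad b_i > 0, \; \sigma^2_{E_i} \text{ free}
\end{align*}
```

Each model relaxes the one above it. Classically parallel forms share equal true scores and equal error variances, and therefore equal means, variances, and reliabilities. Congeneric forms are only required to measure the same underlying trait, possibly on different scales and with different precision, which is the weakest of these conditions and the one the abstract reports for the three forms.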
References
Allen, M. J., & Yen, W. M. (2002). Introduction to measurement theory. Illinois: Waveland Press.
Case, S., & Swanson, D. B. (1998). Constructing written test questions for the basic and clinical sciences. Philadelphia, PA: National Board of Medical Examiners.
Cassidy, S. (2016). The academic resilience scale (ARS-30): A new multidimensional construct measure. Frontiers in Psychology, 7, Article 1787. https://doi.org/10.3389/fpsyg.2016.01787
Crocker, L., & Algina, J. (2008). Introduction to classical and modern test theory. Ohio: Cengage Learning Press.
Danner, D. (2016). Reliability – The precision of a measurement. GESIS Survey Guidelines. Mannheim, Germany: GESIS – Leibniz Institute for the Social Sciences. https://doi.org/10.15465/gesis-sg_en_011
Diego, A. (2017). Friends with benefits: causes and effects of learners’ cheating practices during examination. IAFOR Journal of Education, 5(2), 121–138. https://files.eric.ed.gov/fulltext/EJ1156266.pdf
Downing, S. M. (2003). Validity: On the meaningful interpretation of assessment data. Medical Education, 37, 830-837. https://doi.org/10.1046/j.1365-2923.2003.01594.x
Feldt, L. S. (1980). A test of the hypothesis that Cronbach’s alpha reliability coefficient is the same for two tests administered to the same sample. Psychometrika, 45, 99-105. https://doi.org/10.1007/BF02293600
Feldt, L. S., & Brennan, R. L. (1989). Reliability. In R. L. Linn (Ed.), Educational measurement (3rd ed., pp. 105-146). Phoenix, AZ: Oryx.
Field, A. (2009). Discovering statistics using SPSS (3rd ed.). Thousand Oaks, CA: SAGE Publications.
Forkuor, J. B., Amarteifio, J., & Attoh, D. O. (2019). Students’ perception of cheating and the best time to cheat during examinations. The Urban Review, 51(3), 424–443. https://doi.org/10.1007/s11256-018-0491-8
Fowell, S. L., Southgate, L. J., & Bligh, J. G. (1999). Evaluating assessment: The missing link? Medical Education, 33, 276-281. https://doi.org/10.1046/j.1365-2923.1999.00405.x
Graham, J. M. (2006). Congeneric and (essentially) Tau-equivalent estimates of score reliability. Educational and Psychological Measurement, 66(6), 930-944. https://doi.org/10.1177/0013164406288165
Kane, M. (2006). Content-related validity evidence in test development. In S. M. Downing, & T. M. Haladyna (Eds.), Handbook of test development (pp. 131-153). Mahwah, NJ: Lawrence Erlbaum Associates.
Liepmann, D., Beauducel, A., Brocke, B., & Amthauer, R. (2007). I-S-T 2000 R - Intelligenz-Struktur-Test 2000 R (2nd ed.). Göttingen: Hogrefe.
Malau-Aduli, B. S., Walls, J., & Zimitat, C. (2012). Validity, reliability and equivalence of parallel examinations in a university setting. Creative Education, 3, 923-930. http://dx.doi.org/10.4236/ce.2012.326140
McCabe, D. L., Butterfield, K. D., & Treviño, L. K. (2006). Academic dishonesty in graduate business programs: Prevalence, causes, and proposed action. Academy of Management Learning and Education, 5(3), 294–305. https://doi.org/10.5465/amle.2006.22697018
Messick, S. (1989). Validity. In R. L. Linn (Ed.), Educational measurement (3rd ed., pp. 13-104). New York: American Council on Education and Macmillan.
Nitko, A. J. (2001). Educational assessment of students. New Jersey: Prentice Hall.
Norcini, J., Anderson, B., Bollela, V., Burch, V., Costa, M. J., Duvivier, R., Galbraith, R., Hays, R., Kent, A., Perrott, V., & Roberts, T. (2011). Criteria for good assessment: Consensus statement and recommendations from the Ottawa 2010 Conference. Medical Teacher, 33, 206-214. https://doi.org/10.3109/0142159X.2011.551559
Odongo, D. A., Agyemang, E., & Forkuor, J. (2021). Innovative approaches to cheating: An exploration of examination cheating techniques among tertiary students. Hindawi Education Research International, 1, 1-7. https://doi.org/10.1155/2021/6639429
Pallant, J. (2010). SPSS survival manual: A step by step guide to data analysis using SPSS (4th ed.). Crows Nest: Allen & Unwin.
Schmale, H. (2001). Berufseignungstest (BET). Tabellenband (4th revised and enlarged ed.). Bern: Hans Huber.
Schuwirth, L., Colliver, J., Gruppen, L., Kreiter, C., Mennin, S., Onishi, H., Pangaro, L., Ringsted, C., Swanson, D., Van der Vleuten, C. P. M., & Wagner-Menghin, M. (2011). Research in assessment: Consensus statement and recommendations from Ottawa 2010 Conference. Medical Teacher, 33, 224-233. https://doi.org/10.3109/0142159X.2011.551558
Spaan, M. (2013). Test and item specification development. Language Assessment Quarterly, 3(1), 71-79. https://doi.org/10.1207/s15434311laq0301_5
Teixeira, A. A. C., & Rocha, M. F. (2010). Cheating by economics and business undergraduate students: An exploratory international assessment. Higher Education, 59(6), 663–701. https://doi.org/10.1007/s10734-009-9274-1
Wagner-Menghin, M., Preusche, I., & Schmidts, M. (2013). The effects of reusing written test items: A study using the Rasch model. ISRN Education, 1, 1-7. http://dx.doi.org/10.1155/2013/585420
License
Copyright (c) 2021 Frank Quansah, Andrews Cobbinah
This work is licensed under a Creative Commons Attribution 4.0 International License.
All articles published by CJESS are licensed under the Creative Commons Attribution 4.0 International License. This license permits third parties to copy, redistribute, remix, transform, and build upon the original work, provided that the original work and source are appropriately cited.