Equivalence of Parallel Tests in a Basic Statistics Course in Higher Education Using Classical Measurement Theory

Authors

  • Frank Quansah, University of Education, Ghana
  • Andrews Cobbinah, University of Cape Coast, Ghana

DOI:

https://doi.org/10.53103/cjess.v1i2.11

Keywords:

Parallel Test, Validity, Congeneric Form, Tau-Equivalent, Classically Parallel

Abstract

Developing and administering parallel test forms to students in higher education offsets the cost of obtaining assessment scores with low validity. This research demonstrated the validity and equivalence of parallel tests in a Basic Statistics course. Among other things, the study (1) established and compared the item specifications of the items on the different test forms, and (2) determined the extent of parallelism of the alternate test forms. Three carefully designed alternate forms of an achievement test (built from item specifications and a test specification table) were administered to 504 second-year students. In addition, an academic resilience scale was administered to the same students to help ascertain the criterion validity of the alternate forms. The study revealed some similarities in the statistical specifications of the alternate test forms. Further analysis showed that the three alternate forms met the congeneric level of parallelism. The authors concluded that developing classically parallel forms of the test is not feasible, but that congeneric parallel forms offset the cost of less valid scores that do not represent students' attainment levels. Faculty members are encouraged to use parallel test forms when assessing students in higher education.
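
The levels of parallelism named in the keywords and abstract (classically parallel, tau-equivalent, congeneric) can be summarised with the standard classical test theory decomposition. The block below is a textbook-style sketch, not the authors' own notation: the symbols X_j, T_j, E_j, a_j, b_j and the index j over the three forms are assumptions introduced here for illustration, and the essentially tau-equivalent case is included only for completeness.

```latex
% Observed score on form j = true score + error (j = 1, 2, 3)
X_j = T_j + E_j, \qquad \operatorname{Cov}(T_j, E_j) = 0

% Congeneric: true scores are linearly related to a common true score T;
% means, true-score variances and error variances may all differ.
T_j = a_j + b_j T

% Essentially tau-equivalent: congeneric with unit slopes (b_j = 1).
T_j = a_j + T

% Tau-equivalent: identical true scores (a_j = 0, b_j = 1);
% error variances may still differ across forms.
T_j = T

% Classically parallel: tau-equivalent with equal error variances.
T_j = T, \qquad \operatorname{Var}(E_j) = \sigma_E^2 \ \text{for all } j
```

Under these definitions, classically parallel forms must have equal observed means, equal observed variances, and equal correlations with one another and with any external criterion. Congeneric forms only need true scores that are perfectly linearly related, which is the weakest of the conditions listed.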

Published

2021-11-01

How to Cite

Quansah, F., & Cobbinah, A. (2021). Equivalence of Parallel Tests in a Basic Statistics Course in Higher Education Using Classical Measurement Theory. Canadian Journal of Educational and Social Studies, 1(2), 13–28. https://doi.org/10.53103/cjess.v1i2.11
