The Washback of the IELTS Speaking Test on Chinese Test-Takers' Perceptions and Preparation Behaviours
Abstract
In the Chinese EFL context, a paradox persists: despite widespread enrolment in IELTS preparation courses, test-takers continue to struggle to achieve satisfactory speaking scores. This study investigates the issue through the lens of washback, that is, the influence of tests on learning. Drawing on a mixed-methods questionnaire completed by 236 Chinese IELTS test-takers, the study employed descriptive statistics, structural equation modelling (SEM), and thematic analysis. The quantitative results revealed a significant misalignment between test-takers' interpretations of the test design and the official assessment criteria. Although the SEM indicated no significant direct relationship between the key mediating factors (self-perceived proficiency, academic expectations, and individual differences) and test scores, the qualitative findings offered an explanation: many learners' misinterpretations led them to rely on rote memorisation rather than to develop the communicative competence the test aims to measure. The study extends washback research by demonstrating that mediating factors shape preparation behaviours through indirect pathways (processes) rather than directly determining outcomes (products). The findings underscore the need for test developers and educators to bridge the gap between test-takers' beliefs and assessment objectives in order to promote effective learning.