Conference Paper: Combining quantitative and qualitative measures to validate a group speaking assessment test
Title | Combining quantitative and qualitative measures to validate a group speaking assessment test |
---|---|
Authors | Crosthwaite, PR; Boynton, SD; Cole III, SF |
Issue Date | 2016 |
Citation | The 2nd International Conference on Linguistics and Language Studies (ICLLS 2016), Hong Kong, 23-24 June 2016 |
Abstract | All too often, the shift from norm-referenced to criterion-referenced assessment results in tests that reflect holistic (and teacher-biased) expectations of student performance without actually determining whether the criteria meet sufficient standards of validity and reliability, resulting in considerable inter-rater variance. This is particularly felt in L2 contexts, where students struggle to achieve grades under criteria that are not produced in the same language (or spirit) as their L1, and where teachers from different backgrounds may interpret individual criteria according to their own ideologies and beliefs. However, most validation studies focus only on quantitative matters (losing the personal, holistic focus) or only on qualitative concerns (missing the general picture). The present study validates a criterion-referenced group tutorial discussion speaking assessment for undergraduate EAP at a leading university in Hong Kong in terms of inter-rater variance and criterion validity. I present three complementary quantitative measures (Intraclass Correlation Coefficient, Cronbach's Alpha and Exploratory Factor Analysis) which suggest a number of criteria could safely be removed from the rubric, and show that the grading of some criteria frequently overlaps with that of other criteria. However, in attempting to explain the reasons behind the statistical results, qualitative interviews with test raters suggest that 1) raters bring with them their own interpretations of the rubric criteria, and 2) there are specific linguistic considerations regarding individual test takers' performance and rater variance on said performance. The results have implications for improving the validity and reliability of in-house criterion-referenced assessment rubrics, and the paper is intended to serve as a 'how-to' for language assessment practitioners. |
Persistent Identifier | http://hdl.handle.net/10722/227761 |
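The abstract above names two of its three quantitative measures, Cronbach's Alpha and the Intraclass Correlation Coefficient, without showing how they are computed. The paper publishes no code or data, so the following is a minimal illustrative sketch only: the ratings matrix, function names, and all numbers are invented for illustration, and Exploratory Factor Analysis is omitted. It shows the textbook formulas for Cronbach's alpha and Shrout & Fleiss's ICC(2,1), the two-way random-effects, absolute-agreement variant commonly used for inter-rater studies like this one.

```python
import numpy as np

def cronbach_alpha(scores: np.ndarray) -> float:
    """Cronbach's alpha for an (n_subjects, k_raters) score matrix."""
    k = scores.shape[1]
    item_vars = scores.var(axis=0, ddof=1)       # variance of each rater's scores
    total_var = scores.sum(axis=1).var(ddof=1)   # variance of the summed scores
    return (k / (k - 1)) * (1 - item_vars.sum() / total_var)

def icc2_1(scores: np.ndarray) -> float:
    """Shrout & Fleiss ICC(2,1): two-way random effects, absolute agreement."""
    n, k = scores.shape
    grand = scores.mean()
    row_means = scores.mean(axis=1)              # per-subject means
    col_means = scores.mean(axis=0)              # per-rater means
    msr = k * ((row_means - grand) ** 2).sum() / (n - 1)   # between-subjects MS
    msc = n * ((col_means - grand) ** 2).sum() / (k - 1)   # between-raters MS
    sse = ((scores - row_means[:, None] - col_means[None, :] + grand) ** 2).sum()
    mse = sse / ((n - 1) * (k - 1))                         # residual MS
    return (msr - mse) / (msr + (k - 1) * mse + k * (msc - mse) / n)

# Hypothetical data: 6 candidates scored by 3 raters on one rubric criterion.
ratings = np.array([
    [4, 4, 3],
    [5, 5, 5],
    [2, 3, 2],
    [3, 3, 4],
    [5, 4, 5],
    [1, 2, 1],
], dtype=float)

print(f"Cronbach's alpha: {cronbach_alpha(ratings):.3f}")
print(f"ICC(2,1):         {icc2_1(ratings):.3f}")
```

In a validation study of this kind, a matrix like `ratings` would be built per rubric criterion; a low ICC or alpha on one criterion relative to the others is the kind of statistical signal the abstract describes as grounds for removing or merging criteria, with rater interviews then probing why agreement broke down.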
DC Field | Value | Language |
---|---|---|
dc.contributor.author | Crosthwaite, PR | - |
dc.contributor.author | Boynton, SD | - |
dc.contributor.author | Cole III, SF | - |
dc.date.accessioned | 2016-07-18T09:12:40Z | - |
dc.date.available | 2016-07-18T09:12:40Z | - |
dc.date.issued | 2016 | - |
dc.identifier.citation | The 2nd International Conference on Linguistics and Language Studies (ICLLS 2016), Hong Kong, 23-24 June 2016. | - |
dc.identifier.uri | http://hdl.handle.net/10722/227761 | - |
dc.description.abstract | All too often, the shift from norm-referenced to criterion-referenced assessment results in tests that reflect holistic (and teacher-biased) expectations of student performance without actually determining whether the criteria meet sufficient standards of validity and reliability, resulting in considerable inter-rater variance. This is particularly felt in L2 contexts, where students struggle to achieve grades under criteria that are not produced in the same language (or spirit) as their L1, and where teachers from different backgrounds may interpret individual criteria according to their own ideologies and beliefs. However, most validation studies focus only on quantitative matters (losing the personal, holistic focus) or only on qualitative concerns (missing the general picture). The present study validates a criterion-referenced group tutorial discussion speaking assessment for undergraduate EAP at a leading university in Hong Kong in terms of inter-rater variance and criterion validity. I present three complementary quantitative measures (Intraclass Correlation Coefficient, Cronbach's Alpha and Exploratory Factor Analysis) which suggest a number of criteria could safely be removed from the rubric, and show that the grading of some criteria frequently overlaps with that of other criteria. However, in attempting to explain the reasons behind the statistical results, qualitative interviews with test raters suggest that 1) raters bring with them their own interpretations of the rubric criteria, and 2) there are specific linguistic considerations regarding individual test takers' performance and rater variance on said performance. The results have implications for improving the validity and reliability of in-house criterion-referenced assessment rubrics, and the paper is intended to serve as a 'how-to' for language assessment practitioners. | - |
dc.language | eng | - |
dc.relation.ispartof | International Conference on Linguistics and Language Studies, ICLLS 2016 | - |
dc.title | Combining quantitative and qualitative measures to validate a group speaking assessment test | - |
dc.type | Conference_Paper | - |
dc.identifier.email | Crosthwaite, PR: drprc80@hku.hk | - |
dc.identifier.email | Boynton, SD: sboynton@hku.hk | - |
dc.identifier.email | Cole III, SF: samcole@hku.hk | - |
dc.identifier.authority | Crosthwaite, PR=rp01961 | - |
dc.identifier.hkuros | 258879 | - |