Appears in Collections: postgraduate thesis: Individualized feedback to raters : effects on rating severity, inconsistency, and bias in the context of Chinese as a second language writing assessment
Title | Individualized feedback to raters : effects on rating severity, inconsistency, and bias in the context of Chinese as a second language writing assessment |
---|---|
Authors | Huang, Jing (黄敬) |
Advisors | Chen, G; Loh, EKY |
Issue Date | 2018 |
Publisher | The University of Hong Kong (Pokfulam, Hong Kong) |
Citation | Huang, J. [黄敬]. (2018). Individualized feedback to raters : effects on rating severity, inconsistency, and bias in the context of Chinese as a second language writing assessment. (Thesis). University of Hong Kong, Pokfulam, Hong Kong SAR. |
Abstract | Performance-based language assessment commonly requires human raters to assign scores to language learners’ performances. The subjectivity of human ratings inevitably introduces rater variability, which has been identified as a main source of construct-irrelevant variance. Individualized feedback has been used in rater training to limit such variance and increase the reliability of performance-based language assessment. This training method, however, has not yet proved effective in improving raters’ rating outcomes (i.e., severity, inconsistency, and bias towards a rating scale category). Moreover, few previous studies have investigated the effects of feedback frequency within a given time period on raters’ rating outcomes.
The present study examined the immediate and retention effects of individualized feedback on raters’ rating outcomes, as well as the effects of raters’ perceptions of the individualized feedback on rating outcomes. The participants were 93 native Chinese speakers without previous rating experience, randomly assigned to one of three treatment groups. The three groups differed in how often they received individualized feedback within a given time period: (a) a control group receiving no feedback, (b) a single-feedback group receiving the feedback once, and (c) a double-feedback group receiving the feedback twice. Each participant rated 100 writing scripts on Day 1 as the pre-feedback ratings and received one of the feedback treatments on Day 2. The post-feedback ratings were conducted immediately after the feedback session on Day 2 by assigning each participant 100 new writing scripts to rate. This was followed by a questionnaire and interview for the double-feedback and single-feedback groups on the same day. Raters’ retention of the feedback was measured by assigning each participant 100 new writing scripts to rate as the delayed post-feedback ratings one week later. Raters’ rating outcomes were derived through multi-faceted Rasch modeling. One-way ANCOVAs, repeated measures ANOVAs, Kruskal-Wallis H tests, and two-way ANCOVAs were run to test the hypotheses.
Based on the results of this study, the following main conclusions were drawn. First, individualized feedback significantly affected raters’ rating severity and inconsistency. The rating severities of the double-feedback and single-feedback groups were superior to that of the control group. In addition, the rating inconsistency of the double-feedback group was superior to that of the single-feedback group. Second, with regard to the retention effects, individualized feedback helped raters in the double-feedback group retain their improvements in rating severity, and helped raters in the single-feedback group retain their improvements in rating inconsistency. Third, in terms of the effects of raters’ perceptions of the individualized feedback, the results revealed that raters’ perceptions of the usefulness of the individualized feedback affected rating severity, inconsistency, and bias towards coherence. Furthermore, raters’ perceptions of recall of the individualized feedback during subsequent ratings affected rating inconsistency. In addition, raters’ perceptions of incorporation of the individualized feedback into subsequent ratings affected rating severity. These findings may shed light on the application of individualized feedback in the design of face-to-face and online training programs. |
Degree | Doctor of Philosophy |
Subject | Second language acquisition - Ability testing; Educational tests and measurements |
Dept/Program | Education |
Persistent Identifier | http://hdl.handle.net/10722/295612 |
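The severity, inconsistency, and bias measures discussed in the abstract are the kinds of rater parameters produced by multi-faceted Rasch modeling. As a general sketch (the thesis may specify its model differently), the standard many-facet Rasch formulation for rating-scale data is:

```latex
\log \frac{P_{nijk}}{P_{nij(k-1)}} = B_n - D_i - C_j - F_k
```

Here \(P_{nijk}\) is the probability that rater \(j\) awards category \(k\) (rather than \(k-1\)) to examinee \(n\) on criterion \(i\); \(B_n\) is examinee ability, \(D_i\) is criterion difficulty, \(C_j\) is rater severity, and \(F_k\) is the step difficulty of category \(k\). In this framework, rating severity corresponds to the estimated \(C_j\), inconsistency to rater fit statistics, and bias towards a rating scale category to rater-by-category interaction terms.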
DC Field | Value | Language |
---|---|---|
dc.contributor.advisor | Chen, G | - |
dc.contributor.advisor | Loh, EKY | - |
dc.contributor.author | Huang, Jing | - |
dc.contributor.author | 黄敬 | - |
dc.date.accessioned | 2021-02-02T03:05:16Z | - |
dc.date.available | 2021-02-02T03:05:16Z | - |
dc.date.issued | 2018 | - |
dc.identifier.citation | Huang, J. [黄敬]. (2018). Individualized feedback to raters : effects on rating severity, inconsistency, and bias in the context of Chinese as a second language writing assessment. (Thesis). University of Hong Kong, Pokfulam, Hong Kong SAR. | - |
dc.identifier.uri | http://hdl.handle.net/10722/295612 | - |
dc.description.abstract | Performance-based language assessment commonly requires human raters to assign scores to language learners’ performances. The subjectivity of human ratings inevitably introduces rater variability, which has been identified as a main source of construct-irrelevant variance. Individualized feedback has been used in rater training to limit such variance and increase the reliability of performance-based language assessment. This training method, however, has not yet proved effective in improving raters’ rating outcomes (i.e., severity, inconsistency, and bias towards a rating scale category). Moreover, few previous studies have investigated the effects of feedback frequency within a given time period on raters’ rating outcomes. The present study examined the immediate and retention effects of individualized feedback on raters’ rating outcomes, as well as the effects of raters’ perceptions of the individualized feedback on rating outcomes. The participants were 93 native Chinese speakers without previous rating experience, randomly assigned to one of three treatment groups. The three groups differed in how often they received individualized feedback within a given time period: (a) a control group receiving no feedback, (b) a single-feedback group receiving the feedback once, and (c) a double-feedback group receiving the feedback twice. Each participant rated 100 writing scripts on Day 1 as the pre-feedback ratings and received one of the feedback treatments on Day 2. The post-feedback ratings were conducted immediately after the feedback session on Day 2 by assigning each participant 100 new writing scripts to rate. This was followed by a questionnaire and interview for the double-feedback and single-feedback groups on the same day. Raters’ retention of the feedback was measured by assigning each participant 100 new writing scripts to rate as the delayed post-feedback ratings one week later. Raters’ rating outcomes were derived through multi-faceted Rasch modeling. One-way ANCOVAs, repeated measures ANOVAs, Kruskal-Wallis H tests, and two-way ANCOVAs were run to test the hypotheses. Based on the results of this study, the following main conclusions were drawn. First, individualized feedback significantly affected raters’ rating severity and inconsistency. The rating severities of the double-feedback and single-feedback groups were superior to that of the control group. In addition, the rating inconsistency of the double-feedback group was superior to that of the single-feedback group. Second, with regard to the retention effects, individualized feedback helped raters in the double-feedback group retain their improvements in rating severity, and helped raters in the single-feedback group retain their improvements in rating inconsistency. Third, in terms of the effects of raters’ perceptions of the individualized feedback, the results revealed that raters’ perceptions of the usefulness of the individualized feedback affected rating severity, inconsistency, and bias towards coherence. Furthermore, raters’ perceptions of recall of the individualized feedback during subsequent ratings affected rating inconsistency. In addition, raters’ perceptions of incorporation of the individualized feedback into subsequent ratings affected rating severity. These findings may shed light on the application of individualized feedback in the design of face-to-face and online training programs. | - |
dc.language | eng | - |
dc.publisher | The University of Hong Kong (Pokfulam, Hong Kong) | - |
dc.relation.ispartof | HKU Theses Online (HKUTO) | - |
dc.rights | The author retains all proprietary rights (such as patent rights) and the right to use in future works. | - |
dc.rights | This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License. | - |
dc.subject.lcsh | Second language acquisition - Ability testing | - |
dc.subject.lcsh | Educational tests and measurements | - |
dc.title | Individualized feedback to raters : effects on rating severity, inconsistency, and bias in the context of Chinese as a second language writing assessment | - |
dc.type | PG_Thesis | - |
dc.description.thesisname | Doctor of Philosophy | - |
dc.description.thesislevel | Doctoral | - |
dc.description.thesisdiscipline | Education | - |
dc.description.nature | published_or_final_version | - |
dc.date.hkucongregation | 2019 | - |
dc.identifier.mmsid | 991044340095803414 | - |