Article: Can large language models write reflectively

Title: Can large language models write reflectively
Authors: Li, Yuheng; Sha, Lele; Yan, Lixiang; Lin, Jionghao; Raković, Mladen; Galbraith, Kirsten; Lyons, Kayley; Gašević, Dragan; Chen, Guanliang
Keywords: ChatGPT; Generative language model; Natural language processing; Reflective writing
Issue Date: 2023
Citation: Computers and Education: Artificial Intelligence, 2023, v. 4, article no. 100140
Abstract: Generative Large Language Models (LLMs) demonstrate impressive results in different writing tasks and have already attracted much attention from researchers and practitioners. However, limited research has investigated the capability of generative LLMs for reflective writing. To this end, in the present study, we extensively reviewed the existing literature and selected 9 representative prompting strategies for ChatGPT, the chatbot based on state-of-the-art generative LLMs, to generate a diverse set of reflective responses, which were combined with student-written reflections. Next, those responses were evaluated by experienced teaching staff following a theory-aligned assessment rubric designed to evaluate student-generated reflections in several university-level pharmacy courses. Furthermore, we explored the extent to which Deep Learning classification methods can be utilised to automatically differentiate between reflective responses written by students and reflective responses generated by ChatGPT. For this purpose, we harnessed BERT, a state-of-the-art Deep Learning classifier, and compared its performance to that of human evaluators and the AI content detector by OpenAI. Following our extensive experimentation, we found that (i) ChatGPT may be capable of generating high-quality reflective responses in writing assignments administered across different pharmacy courses; (ii) the quality of automatically generated reflective responses was higher than that of student-written reflections on all six assessment criteria; and (iii) a domain-specific BERT-based classifier could effectively differentiate between student-written and ChatGPT-generated reflections, greatly surpassing (by up to 38% across four accuracy metrics) the classification performance of experienced teaching staff and of a general-domain classifier, even when the testing prompts were not known at the time of model training.
Persistent Identifier: http://hdl.handle.net/10722/354276
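The abstract reports that the domain-specific BERT classifier surpassed human evaluators and a general-domain classifier by up to 38% across four accuracy metrics, without naming them. Binary classification work of this kind is typically scored with accuracy, precision, recall, and F1; the following minimal sketch (toy labels, not the paper's data or code; the metric set is an assumption) shows how such scores are computed for the student-vs-ChatGPT labelling task:

```python
# Minimal sketch of the four standard binary-classification metrics
# (accuracy, precision, recall, F1); toy data for illustration only.

def binary_metrics(y_true, y_pred, positive=1):
    """Compute accuracy, precision, recall and F1 for binary labels."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p == positive)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t != positive and p == positive)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p != positive)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t != positive and p != positive)
    accuracy = (tp + tn) / len(y_true)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}

# Hypothetical labels: 1 = ChatGPT-generated, 0 = student-written
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 1, 0, 0, 0, 1, 1, 0]
print(binary_metrics(y_true, y_pred))
# → {'accuracy': 0.75, 'precision': 0.75, 'recall': 0.75, 'f1': 0.75}
```

Comparing a classifier's scores against human evaluators' scores on these same four metrics is how a "38% higher" gap would be quantified.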

 

DC Field: Value

dc.contributor.author: Li, Yuheng
dc.contributor.author: Sha, Lele
dc.contributor.author: Yan, Lixiang
dc.contributor.author: Lin, Jionghao
dc.contributor.author: Raković, Mladen
dc.contributor.author: Galbraith, Kirsten
dc.contributor.author: Lyons, Kayley
dc.contributor.author: Gašević, Dragan
dc.contributor.author: Chen, Guanliang
dc.date.accessioned: 2025-02-07T08:47:36Z
dc.date.available: 2025-02-07T08:47:36Z
dc.date.issued: 2023
dc.identifier.citation: Computers and Education: Artificial Intelligence, 2023, v. 4, article no. 100140
dc.identifier.uri: http://hdl.handle.net/10722/354276
dc.language: eng
dc.relation.ispartof: Computers and Education: Artificial Intelligence
dc.subject: ChatGPT
dc.subject: Generative language model
dc.subject: Natural language processing
dc.subject: Reflective writing
dc.title: Can large language models write reflectively
dc.type: Article
dc.description.nature: link_to_subscribed_fulltext
dc.identifier.doi: 10.1016/j.caeai.2023.100140
dc.identifier.scopus: eid_2-s2.0-85159351677
dc.identifier.volume: 4
dc.identifier.spage: article no. 100140
dc.identifier.epage: article no. 100140
dc.identifier.eissn: 2666-920X
