Article: Comparative Analysis of Chatbot Systems
| Title | Comparative Analysis of Chatbot Systems |
|---|---|
| Authors | Xu, Hengsheng; Wan, Linkun; Li, Yunyin; Liu, Jiaxi; Lau, Adela S.M. |
| Issue Date | 30-Sep-2025 |
| Publisher | IOS Press |
| Citation | Frontiers in Artificial Intelligence and Applications, 2025, v. 412 |
| Abstract | Existing research on chatbot evaluation suffers from inconsistent assessment standards, fragmented criteria, and insufficient coverage of critical dimensions like legal compliance and ethical alignment, which hinders reliable benchmarking of chatbots’ performance. Our study proposes a comprehensive framework for such evaluation and systematically compares five chatbot systems: Tidio (Rule-Based), GPT-4o (AI-Powered), Claude 3.5 Sonnet (LLM), Watson Assistant (Enterprise), and Qwen2.5-Max (Multilingual) in terms of their accuracy, safety, legal compliance, generalizability of performance, and ethical alignment. We conclude that while chatbots enhance efficiency in healthcare (97.34% patient education completeness) and e-commerce (30%–40% cost reduction), critical limitations persist. Recommendations include: (1) retrieval-augmented generation (RAG) for hallucination reduction, (2) ethical governance frameworks (e.g., AILuminate), and (3) domain-specialized tuning. Cross-sector collaboration and standardized evaluations are essential for responsible deployment of AI. |
| Persistent Identifier | http://hdl.handle.net/10722/365937 |
| ISSN | 0922-6389 (2023 SCImago Journal Rankings: 0.281) |

| DC Field | Value | Language |
|---|---|---|
| dc.contributor.author | Xu, Hengsheng | - |
| dc.contributor.author | Wan, Linkun | - |
| dc.contributor.author | Li, Yunyin | - |
| dc.contributor.author | Liu, Jiaxi | - |
| dc.contributor.author | Lau, Adela S.M. | - |
| dc.date.accessioned | 2025-11-12T00:36:38Z | - |
| dc.date.available | 2025-11-12T00:36:38Z | - |
| dc.date.issued | 2025-09-30 | - |
| dc.identifier.citation | Frontiers in Artificial Intelligence and Applications, 2025, v. 412 | - |
| dc.identifier.issn | 0922-6389 | - |
| dc.identifier.uri | http://hdl.handle.net/10722/365937 | - |
| dc.description.abstract | Existing research on chatbot evaluation suffers from inconsistent assessment standards, fragmented criteria, and insufficient coverage of critical dimensions like legal compliance and ethical alignment, which hinders reliable benchmarking of chatbots’ performance. Our study proposes a comprehensive framework for such evaluation and systematically compares five chatbot systems: Tidio (Rule-Based), GPT-4o (AI-Powered), Claude 3.5 Sonnet (LLM), Watson Assistant (Enterprise), and Qwen2.5-Max (Multilingual) in terms of their accuracy, safety, legal compliance, generalizability of performance, and ethical alignment. We conclude that while chatbots enhance efficiency in healthcare (97.34% patient education completeness) and e-commerce (30%–40% cost reduction), critical limitations persist. Recommendations include: (1) retrieval-augmented generation (RAG) for hallucination reduction, (2) ethical governance frameworks (e.g., AILuminate), and (3) domain-specialized tuning. Cross-sector collaboration and standardized evaluations are essential for responsible deployment of AI. | - |
| dc.language | eng | - |
| dc.publisher | IOS Press | - |
| dc.relation.ispartof | Frontiers in Artificial Intelligence and Applications | - |
| dc.rights | This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License. | - |
| dc.title | Comparative Analysis of Chatbot Systems | - |
| dc.type | Article | - |
| dc.description.nature | published_or_final_version | - |
| dc.identifier.doi | 10.3233/FAIA250737 | - |
| dc.identifier.volume | 412 | - |
| dc.identifier.eissn | 1535-6698 | - |
| dc.identifier.issnl | 0922-6389 | - |

