System Usability and Design Evaluation of AI Chatbots: A Comparative Analysis of ChatGPT, Google Bard, and Bing Chat

Sumaiya Nuha Mustafina; Nusrat Kaniz Khan; Muhammad Nazrul Islam; Fatema Siddiqua Nusrat; M Akhtaruzzaman

Authors

Sumaiya Nuha Mustafina, Department of Computer Science and Engineering, Ahsanullah University of Science and Technology, Dhaka, Bangladesh
Nusrat Kaniz Khan, Department of Computer Science and Engineering, Military Institute of Science and Technology, Dhaka, Bangladesh
Muhammad Nazrul Islam, Department of Computer Science and Engineering, Military Institute of Science and Technology, Dhaka, Bangladesh
Fatema Siddiqua Nusrat, Department of Computer Science and Engineering, Military Institute of Science and Technology, Dhaka, Bangladesh
M Akhtaruzzaman, Department of Computer Science and Engineering, Military Institute of Science and Technology, Dhaka, Bangladesh

Keywords:

SUS, System Usability Score, HE, Heuristic Evaluation, HCI, Human Computer Interaction

Abstract

Artificial intelligence (AI) has driven significant advances in technology, and chatbots such as ChatGPT, Google Bard, and Bing Chat are among its most notable innovations. These chatbots help users from diverse backgrounds by generating ideas, providing resources, and supporting overall knowledge management. They are, however, still at an experimental stage of use, so evaluating their usability and user experience is crucial to making them more usable, accessible, and intuitive for end users around the globe. The objective of this research is therefore a comparative usability analysis of three AI chatbots: Google Bard, ChatGPT, and Bing Chat. Two methods were used: first, the System Usability Score (SUS), collected through questionnaire surveys, and second, Heuristic Evaluation (HE), conducted through expert observation. Through HE, we investigated design characteristics, user engagement, and other specific usability shortcomings, together with severity scores that indicate where urgent or gradual usability improvements are needed. The SUS evaluation provided a comprehensive view of user satisfaction: Google Bard and Bing Chat received lower SUS scores, while ChatGPT demonstrated comparatively better usability, with a SUS score above 70. Overall, the comparative analysis reveals that, although all three applications suffer from a notable number of usability problems, ChatGPT demonstrates better usability than Google Bard and Bing Chat.
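For readers unfamiliar with how a SUS value such as "above 70" is obtained, the following is a minimal sketch of the conventional scoring rule for the standard 10-item SUS questionnaire (each item rated 1 to 5, yielding a score from 0 to 100). It assumes the study used that standard instrument, which the abstract does not state explicitly. The function name `sus_score` and the sample responses are hypothetical illustrations, not data from the study.

```python
from statistics import mean


def sus_score(responses):
    """Return the 0-100 SUS score for one respondent.

    `responses` holds ten ratings (1-5) in questionnaire order.
    Conventional rule: odd-numbered items contribute (rating - 1),
    even-numbered items contribute (5 - rating); the sum is scaled by 2.5.
    """
    if len(responses) != 10:
        raise ValueError("SUS requires exactly 10 item ratings")
    if any(r not in (1, 2, 3, 4, 5) for r in responses):
        raise ValueError("each rating must be an integer from 1 to 5")

    total = 0
    for item_number, rating in enumerate(responses, start=1):
        if item_number % 2 == 1:   # positively worded items (1, 3, 5, 7, 9)
            total += rating - 1
        else:                      # negatively worded items (2, 4, 6, 8, 10)
            total += 5 - rating
    return total * 2.5


# Hypothetical respondents for one chatbot (NOT data from the paper).
hypothetical_responses = [
    [4, 2, 4, 1, 5, 2, 4, 2, 4, 1],
    [3, 3, 4, 2, 3, 3, 3, 2, 4, 3],
]

per_user = [sus_score(r) for r in hypothetical_responses]
print(per_user)        # [82.5, 60.0]
print(mean(per_user))  # 71.25
```

A system-level SUS score is typically reported as the mean of the per-respondent scores, as in the last line of the sketch.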

MIJST, Vol. 13, June 2025 : 83-97

DOI: https://doi.org/10.47981/j.mijst.13(01)2025.522(83-97)

