Bangla TTS Performance Evaluation: A Benchmark Study on Synthesized Speech Quality and Intelligibility
DOI: https://doi.org/10.3329/dujs.v74i1.84122

Keywords: Text-to-Speech (TTS), Speech Synthesis, Benchmarking, Objective Evaluation Metrics, Subjective Evaluation Metrics

Abstract
Bangla Text-to-Speech (TTS) systems have seen significant advancements in recent years, yet comprehensive benchmarking of their performance remains limited. This study establishes a robust evaluation framework to compare different Bangla TTS models, including Tacotron2, FastSpeech2, VITS, and Grad-TTS. The benchmarking approach integrates both objective and subjective assessment methodologies. Objective evaluation employs signal-processing metrics such as Mel Cepstral Distortion (MCD), Mel-Spectrogram Mean Squared Error (Mel-MSE), Phoneme Error Rate (PER), Word Error Rate (WER), Signal-to-Noise Ratio (SNR), and Real-Time Factor (RTF). Subjective evaluation involves human perceptual tests, such as a Mean Opinion Score (MOS) test in which native Bangla speakers rate speech quality and intelligibility. The experimental setup ensures a fair comparison by using a standardized dataset, uniform computational conditions, and diverse sentence structures. Results demonstrate the relative strengths and weaknesses of the models, highlighting the need for improved phonetic accuracy and naturalness in Bangla TTS synthesis. This research provides critical insights for advancing Bangla TTS systems and aligning them with state-of-the-art English TTS models.
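As a concrete illustration of one of the objective metrics named above, the sketch below computes an MCD-style score between a reference recording and a synthesized utterance. It is not drawn from the paper: the choice of librosa MFCCs as a stand-in for true mel-cepstral coefficients, the DTW alignment, and all function and file names are assumptions for illustration, so the resulting values are not directly comparable to published SPTK/WORLD-based MCD figures.

```python
import numpy as np
import librosa


def mel_cepstral_distortion(ref_path, syn_path, sr=22050, n_mfcc=13):
    """Approximate MCD (dB) between a reference and a synthesized utterance.

    librosa MFCCs stand in for true mel-cepstral coefficients here, so this
    illustrates the formula rather than reproducing a published pipeline.
    """
    ref, _ = librosa.load(ref_path, sr=sr)
    syn, _ = librosa.load(syn_path, sr=sr)

    # Frame-level cepstra; drop c0 (energy), as is conventional for MCD.
    ref_c = librosa.feature.mfcc(y=ref, sr=sr, n_mfcc=n_mfcc)[1:]
    syn_c = librosa.feature.mfcc(y=syn, sr=sr, n_mfcc=n_mfcc)[1:]

    # Align the two frame sequences with dynamic time warping.
    _, wp = librosa.sequence.dtw(X=ref_c, Y=syn_c)
    diff = ref_c[:, wp[:, 0]] - syn_c[:, wp[:, 1]]

    # Standard scaling: MCD = (10 / ln 10) * sqrt(2 * sum_d (c_d - c'_d)^2),
    # averaged over the aligned frame pairs.
    per_frame = (10.0 / np.log(10)) * np.sqrt(2.0 * (diff ** 2).sum(axis=0))
    return float(np.mean(per_frame))


if __name__ == "__main__":
    # Hypothetical file paths for a ground-truth recording and TTS output.
    print(mel_cepstral_distortion("reference.wav", "synthesized.wav"))
```

A lower score indicates synthesized cepstra closer to the reference; in practice such scores are averaged over a held-out test set for each TTS model being benchmarked.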
Dhaka Univ. J. Sci. 74(1): 10-16, 2026 (January)