Evaluating Automatic Speech Recognition Models: How Well Do They Handle Accents?

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution > peer-review

Abstract

Automatic Speech Recognition (ASR) technology has progressed remarkably and has become an integral component of virtual assistants, transcription services, and accessibility tools. Despite these advances, ASR systems still struggle to accurately recognize speech from individuals with different accents and linguistic features. This work analyzes the performance of various ASR models, including cloud-based, local, and integrated speech recognition systems. For evaluation, we use several accented speech datasets and assess the ASR variants using Word Error Rate (WER) as the primary metric. The datasets include the Speech Accent Archive (SAA), L2-ARCTIC, and an Indian accent dataset. The results show that ASR accuracy varies with the speaker’s language and accent. OpenAI Whisper, Deepgram, and AssemblyAI perform significantly better than conventional models such as Mozilla DeepSpeech. The results indicate that many standalone ASR models are optimized for non-regional standard English, leading to higher error rates for non-native and regionally accented speech. Future developments should focus on augmenting multilingual datasets and refining algorithms to achieve more equitable speech recognition capabilities for diverse accents.
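The abstract names WER as the primary metric but does not show how it is computed. As a minimal sketch, assuming the standard definition WER = (substitutions + deletions + insertions) / number of reference words, it can be obtained from a word-level edit distance. The function name `word_error_rate` and the lowercase-and-split-on-whitespace normalization are illustrative assumptions, not the authors' evaluation code.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the reference length.

    Assumption: text is normalized only by lowercasing and splitting on
    whitespace; the paper's actual normalization is not stated in the abstract.
    """
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words consumed
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,         # deletion: reference word missing
                curr[j - 1] + 1,     # insertion: extra hypothesis word
                prev[j - 1] + cost,  # substitution, or match when cost is 0
            )
        prev = curr
    return prev[-1] / len(ref)


# One substitution ("sat" -> "sit") and one deletion ("the") over 6 words.
print(round(word_error_rate("the cat sat on the mat", "the cat sit on mat"), 3))  # 0.333
```

Because insertions count as errors while the denominator is the reference length, WER can exceed 1.0 when the hypothesis is much longer than the reference.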

Original language: English
Title of host publication: AI Revolution
Subtitle of host publication: Research, Ethics and Society - International Conference, AIR-RES 2025, Proceedings
Editors: Hamid R. Arabnia, Leonidas Deligiannidis, Soheyla Amirian, Farid Ghareh Mohammadi, Farzan Shenavarmasouleh
Publisher: Springer Science and Business Media Deutschland GmbH
Pages: 447-458
Number of pages: 12
ISBN (Print): 9783032129291
DOIs
State: Published - 2026
Event: International conference on AI Revolution: Research, Ethics and Society, AIR-RES 2025 - Las Vegas, United States
Duration: 14 Apr 2025 to 16 Apr 2025

Publication series

Name: Communications in Computer and Information Science
Volume: 2722 CCIS
ISSN (Print): 1865-0929
ISSN (Electronic): 1865-0937

Conference

Conference: International conference on AI Revolution: Research, Ethics and Society, AIR-RES 2025
Country/Territory: United States
City: Las Vegas
Period: 14/04/25 to 16/04/25

Keywords

  • Automatic Speech Recognition
  • Linguistic Diversity
  • Speech Recognition Models
