Vani-Adapt: A Zero-Shot Accent Trans-Adaptation Framework for Robust Indic Speech Recognition

Volume: 12 | Issue: 1 | Year: 2026

International Journal of Image Processing and Pattern Recognition

Received Date: 2025-12-20

Acceptance Date: 2026-01-15

Published On: 2026-02-18

First Page: 17

Last Page: 22

By: Jyotirmoyee Mandal, Kunal Halder, and Kakali Das.

1 Student, Department of Computer Science and Engineering, Greater Kolkata College of Engineering and Management, Sonarpur, Kolkata, West Bengal, India
2 Student, Department of Computer Science and Engineering, Greater Kolkata College of Engineering and Management, Sonarpur, Kolkata, West Bengal, India
3 Assistant Professor, Department of Computer Science and Engineering, Greater Kolkata College of Engineering and Management, Sonarpur, Kolkata, West Bengal, India

Abstract

In multilingual countries such as India, where accents and dialects vary widely, speech-based human–computer interaction is important for making digital services more accessible. Despite recent advances in Automatic Speech Recognition (ASR), existing systems remain highly sensitive to regional accents and non-standard speech patterns, which prevents a seamless user experience. Traditional approaches rely on accent-specific fine-tuning, which requires large amounts of labeled data that are rarely available and is therefore impractical for real-world deployment. In this paper, we present Vani-Adapt, a zero-shot accent trans-adaptation framework that improves ASR robustness on previously unseen accents without retraining or accent-labeled data. The foundation of the proposed method is a Disentangled Phonetic–Prosodic Encoder (DPPE), which separates linguistic content from prosodic features such as intonation, rhythm, and stress. By projecting speech into an accent-invariant phonetic space and independently modifying the prosodic representations, Vani-Adapt performs structured accent normalization while preserving speaker identity and semantic content. A high-fidelity neural vocoder resynthesizes the modified speech, enabling smooth integration with existing ASR backends. Compared with strong baselines, including OpenAI Whisper, experiments show notable improvements, such as a 28% relative reduction in Word Error Rate (WER) on unseen accents. Subjective listening assessments further confirm improvements in naturalness and accessibility. The results show that Vani-Adapt provides a scalable and data-efficient accent-agnostic speech recognition solution, which makes it especially suitable for comprehensive conversational AI systems deployed in linguistically diverse settings.
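To make the headline metric concrete, the following is a minimal sketch of how WER and a relative WER reduction are conventionally computed. It is not taken from the paper: the function names and the example WER values (0.25 and 0.18) are hypothetical and chosen only so that the arithmetic yields a 28% relative reduction.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """WER = word-level edit distance / number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)


def relative_wer_reduction(baseline_wer: float, adapted_wer: float) -> float:
    """Fraction by which WER falls relative to the baseline WER."""
    if baseline_wer <= 0:
        raise ValueError("baseline WER must be positive")
    return (baseline_wer - adapted_wer) / baseline_wer


if __name__ == "__main__":
    # One substitution and one deletion against six reference words.
    print(word_error_rate("the cat sat on the mat", "the cat sit on mat"))
    # Hypothetical WERs, not results reported in the paper.
    print(round(relative_wer_reduction(0.25, 0.18), 2))
```

A relative reduction is measured against the baseline's own error rate, so in this hypothetical case a 28% relative drop corresponds to an absolute drop of 7 percentage points.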

Citation:

How to cite this article: Jyotirmoyee Mandal, Kunal Halder, and Kakali Das. Vani-Adapt: A Zero-Shot Accent Trans-Adaptation Framework for Robust Indic Speech Recognition. International Journal of Image Processing and Pattern Recognition. 2026; 12(1): 17-22p.

How to cite this URL: Jyotirmoyee Mandal, Kunal Halder, and Kakali Das. Vani-Adapt: A Zero-Shot Accent Trans-Adaptation Framework for Robust Indic Speech Recognition. International Journal of Image Processing and Pattern Recognition. 2026; 12(1): 17-22p. Available from: https://journalspub.com/publication/ijippr/article=26330
