Adaptive Phoneme State Learning Architecture for Enhanced Speech Recognition Using Backpropagation Neural Network and Hidden Markov Model [version 2; peer review: 2 approved, 1 not approved]

Publication date: 03-06-2026 12:16:20

Speech remains a primary mode of human communication; however, automated speech recognition (ASR) systems face challenges from accent variability, temporal fluctuations, noise, and data privacy concerns. This paper proposes an enhanced ASR architecture incorporating an Adaptive Phoneme State Learning (APSL) algorithm with a Backpropagation Neural Network (BPNN) and Hidden Markov Model (HMM). APSL dynamically adjusts HMM state probabilities using phoneme confidence scores derived from the BPNN, thereby improving phoneme transition modeling and alignment. The multi-stage ASR pipeline includes noise reduction, speech-pause detection, and feature extraction via framing and windowing. APSL’s adaptive mechanism reduces ambiguities in phoneme transitions, resulting in more accurate speech-to-text conversion. A comparative evaluation framework assesses the baseline HMM, the standalone BPNN, and the integrated APSL-BPNN-HMM model. Experiments were conducted using a custom-built dataset of 2000 audio files alongside five benchmark corpora: BNC, ANC, COCA, Buckeye, and Emu. The key evaluation metrics, namely recall, precision, F-score, and Word Error Rate (WER), demonstrate that the APSL-enhanced model significantly outperforms the baseline systems, achieving 95.7% recall, 92.95% precision, 94.53% F-score, and 96% overall accuracy. Notably, APSL-BPNN-HMM consistently yielded the lowest WER across all datasets, validating its effectiveness. This work highlights the benefits of adaptive learning in probabilistic frameworks for achieving robust and accurate speech recognition.
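For readers unfamiliar with the metrics named in the abstract, the sketch below shows how WER and F-score are conventionally computed. It is an illustration under stated assumptions, not the authors' evaluation code: the abstract does not say how transcripts were tokenized or how scores were aggregated across datasets, so whitespace tokenization and the function names `word_error_rate` and `f_score` are hypothetical choices made here.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """WER = (substitutions + deletions + insertions) / reference word count.

    Assumption: words are split on whitespace; the abstract does not
    specify the tokenization used in the paper.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)


def f_score(precision: float, recall: float) -> float:
    """Balanced F-score: the harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    # One deleted word out of six reference words.
    print(word_error_rate("the cat sat on the mat", "the cat sat on mat"))
    print(f_score(0.5, 0.5))
```

A lower WER means fewer word-level edits are needed to turn the recognizer's output into the reference transcript, which is why the abstract reports the lowest WER as the best result.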

Similar news

# | News title | Sentiment | Informativeness | Publication date
1 | Malware Detection Using RNA Encoding and Convolutional Neural Networks on the Malicious Network Dataset [version 3; peer review: 2 approved] | 0 | 7 | 03-06-2026
2 | Neural audio codecs: powerful audio compression using LLMs | 0 | 0 | 15-06-2026
3 | Some Results of Fermatean Fuzzy Set on Subalgebras and Ideals of Bn-Algebras [version 2; peer review: 2 approved] | 0 | 7 | 07-05-2026
4 | T-Technologies Group open-sources a streaming speech recognition model for Russian | 0 | 0 | 22-07-2025
5 | In Yugra, a neural network has been taught to voice Mansi phrases | 0 | 0 | 20-06-2026
6 | Oral Health–Related Quality of Life and Patient-Reported Outcomes After Implant Rehabilitation Using CAS Kit–Assisted Indirect Maxillary Sinus Augmentation: A Longitudinal Observational Study [version 2; peer review: 2 approved, 1 approved with reservations] | 0 | 7 | 25-05-2026
7 | Canonical unveils the Myna speech recognition system | 0 | 0 | 17-06-2026
8 | NoiseWorks Audio add Mouth De-Click to VoiceAssist | 0 | 5 | 24-06-2026
9 | Improving the scheduled preventive maintenance system at hydroelectric power plants by using algorithms for diagnosing the actual technical condition of equipment | 0 | 0 | 01-01-1970
10 | Next-generation database reduces AI hallucinations and improves accuracy by 78% | 0 | 0 | 19-06-2026

Classification: Press releases. Similar patents: 0. Similar news items: 10. Sentiment: 0. Informativeness: 7. Source: f1000research.com.