Deep Learning-Based Coding Strategy for Improved Cochlear Implant Speech Perception in Noisy Environments

Bibliographic Details
Main Author: Billel Essaid (22047578) (author)
Other Authors: Hamza Kheddar (17337712) (author), Noureddine Batel (22047581) (author), Muhammad E. H. Chowdhury (14150526) (author)
Published: 2025
Subjects:
_version_ 1864513537992294400
author Billel Essaid (22047578)
author2 Hamza Kheddar (17337712)
Noureddine Batel (22047581)
Muhammad E. H. Chowdhury (14150526)
author2_role author
author
author
author_facet Billel Essaid (22047578)
Hamza Kheddar (17337712)
Noureddine Batel (22047581)
Muhammad E. H. Chowdhury (14150526)
author_role author
dc.creator.none.fl_str_mv Billel Essaid (22047578)
Hamza Kheddar (17337712)
Noureddine Batel (22047581)
Muhammad E. H. Chowdhury (14150526)
dc.date.none.fl_str_mv 2025-02-28T18:00:00Z
dc.identifier.none.fl_str_mv 10.1109/access.2025.3542953
dc.relation.none.fl_str_mv https://figshare.com/articles/journal_contribution/Deep_Learning-Based_Coding_Strategy_for_Improved_Cochlear_Implant_Speech_Perception_in_Noisy_Environments/30234532
dc.rights.none.fl_str_mv CC BY 4.0
info:eu-repo/semantics/openAccess
dc.subject.none.fl_str_mv Engineering
Biomedical engineering
Electrical engineering
Cochlear implant
deep learning
sound coding strategy
speech enhancement
transformer
Speech enhancement
Noise measurement
Noise reduction
Noise
Convolutional neural networks
Autoencoders
Biological system modeling
Training
Real-time systems
Feature extraction
dc.title.none.fl_str_mv Deep Learning-Based Coding Strategy for Improved Cochlear Implant Speech Perception in Noisy Environments
dc.type.none.fl_str_mv Text
Journal contribution
info:eu-repo/semantics/publishedVersion
text
contribution to journal
description <p dir="ltr">Automatic speech recognition (ASR) and speech enhancement are essential tools in modern life, aiding not only in machine interaction but also in supporting individuals with hearing impairments. These processes begin with capturing speech in analog form and applying signal processing algorithms to ensure compatibility with devices like cochlear implants (CIs). However, CIs, with their limited number of electrodes, often cause speech distortion, and despite advancements in state-of-the-art signal processing techniques, challenges persist, particularly in noisy environments with multiple speech sources. The rise of artificial intelligence (AI) has introduced innovative strategies to address these limitations. This paper presents a novel deep learning (DL)-based technique that leverages attention mechanisms to improve speech intelligibility through noise suppression. The proposed approach includes two strategies: the first integrates temporal convolutional networks (TCNs) and multi-head attention (MHA) layers to capture both local and global dependencies within the speech signal, enabling precise noise filtering and improved clarity. The second strategy builds on this framework by additionally incorporating bidirectional gated recurrent units (Bi-GRU) alongside TCN and MHA layers, further refining sequence modeling and enhancing noise reduction. The optimal model configuration, using TCN-MHA-Bi-GRU with a kernel size of 16, achieved a compact model size of 788K parameters and recorded training and validation losses of 0.0350 and 0.0446, respectively.
Experimental results on the TIMIT and Harvard Sentences datasets, enriched with diverse noise sources from the DEMAND database, yielded high intelligibility scores with a short-time objective intelligibility (STOI) of 0.8345, word recognition score (WRS) of 99.2636, and a linear correlation coefficient (LCC) of 0.9607, underscoring the model’s capability to enhance speech perception in noisy CI environments, ensuring a balance between model size and speech quality, and surpassing the existing state-of-the-art techniques.</p><h2>Other Information</h2><p dir="ltr">Published in: IEEE Access<br>License: <a href="https://creativecommons.org/licenses/by/4.0/deed.en" target="_blank">https://creativecommons.org/licenses/by/4.0/</a><br>See article on publisher's website: <a href="https://dx.doi.org/10.1109/access.2025.3542953" target="_blank">https://dx.doi.org/10.1109/access.2025.3542953</a></p>
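The TCN layers named in the abstract are built from causal dilated 1-D convolutions, in which each output sample depends only on the current and past input samples. As a minimal illustrative sketch (not the authors' implementation; the kernel values and dilation factor below are arbitrary assumptions for demonstration), the core operation can be written in NumPy as:

```python
import numpy as np

def causal_dilated_conv1d(x, kernel, dilation=1):
    """Causal dilated 1-D convolution, the building block of a
    temporal convolutional network (TCN) layer. Left-padding with
    zeros keeps the output the same length as the input and ensures
    no future samples leak into any output (causality)."""
    k = len(kernel)
    pad = dilation * (k - 1)
    xp = np.concatenate([np.zeros(pad), np.asarray(x, dtype=float)])
    y = np.zeros(len(x))
    for t in range(len(x)):
        # Sum over kernel taps spaced `dilation` samples apart, looking
        # only backwards in time from position t.
        y[t] = sum(kernel[i] * xp[pad + t - i * dilation] for i in range(k))
    return y

# Toy usage: a 3-tap kernel with dilation 2 on a short signal.
signal = [1.0, 2.0, 3.0, 4.0, 5.0]
out = causal_dilated_conv1d(signal, kernel=[0.5, 0.3, 0.2], dilation=2)
```

Stacking such layers with increasing dilation grows the receptive field exponentially, which is how a TCN captures the long-range temporal context that the paper pairs with multi-head attention.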
eu_rights_str_mv openAccess
id Manara2_161ef7e655461c8da44304703f1eff66
identifier_str_mv 10.1109/access.2025.3542953
network_acronym_str Manara2
network_name_str Manara2
oai_identifier_str oai:figshare.com:article/30234532
publishDate 2025
repository.mail.fl_str_mv
repository.name.fl_str_mv
repository_id_str
rights_invalid_str_mv CC BY 4.0
spelling Deep Learning-Based Coding Strategy for Improved Cochlear Implant Speech Perception in Noisy EnvironmentsBillel Essaid (22047578)Hamza Kheddar (17337712)Noureddine Batel (22047581)Muhammad E. H. Chowdhury (14150526)EngineeringBiomedical engineeringElectrical engineeringCochlear implantdeep learningsound coding strategyspeech enhancementtransformerSpeech enhancementNoise measurementNoise reductionNoiseConvolutional neural networksAutoencodersBiological system modelingTrainingReal-time systemsFeature extraction<p dir="ltr">Automatic speech recognition (ASR) and speech enhancement are essential tools in modern life, aiding not only in machine interaction but also in supporting individuals with hearing impairments. These processes begin with capturing speech in analog form and applying signal processing algorithms to ensure compatibility with devices like cochlear implants (CIs). However, CIs, with their limited number of electrodes, often cause speech distortion, and despite advancements in state-of-the-art signal processing techniques, challenges persist, particularly in noisy environments with multiple speech sources. The rise of artificial intelligence (AI) has introduced innovative strategies to address these limitations. This paper presents a novel deep learning (DL)-based technique that leverages attention mechanisms to improve speech intelligibility through noise suppression. The proposed approach includes two strategies: the first integrates temporal convolutional networks (TCNs) and multi-head attention (MHA) layers to capture both local and global dependencies within the speech signal, enabling precise noise filtering and improved clarity. The second strategy builds on this framework by additionally incorporating bidirectional gated recurrent units (Bi-GRU) alongside TCN and MHA layers, further refining sequence modeling and enhancing noise reduction. 
The optimal model configuration, using TCN-MHA-Bi-GRU with a kernel size of 16, achieved a compact model size of 788K parameters and recorded training and validation losses of 0.0350 and 0.0446, respectively. Experimental results on the TIMIT and Harvard Sentences datasets, enriched with diverse noise sources from the DEMAND database, yielded high intelligibility scores with a short-time objective intelligibility (STOI) of 0.8345, word recognition score (WRS) of 99.2636, and a linear correlation coefficient (LCC) of 0.9607, underscoring the model’s capability to enhance speech perception in noisy CI environments, ensuring a balance between model size and speech quality, and surpassing the existing state-of-the-art techniques.</p><h2>Other Information</h2><p dir="ltr">Published in: IEEE Access<br>License: <a href="https://creativecommons.org/licenses/by/4.0/deed.en" target="_blank">https://creativecommons.org/licenses/by/4.0/</a><br>See article on publisher's website: <a href="https://dx.doi.org/10.1109/access.2025.3542953" target="_blank">https://dx.doi.org/10.1109/access.2025.3542953</a></p>2025-02-28T18:00:00ZTextJournal contributioninfo:eu-repo/semantics/publishedVersiontextcontribution to journal10.1109/access.2025.3542953https://figshare.com/articles/journal_contribution/Deep_Learning-Based_Coding_Strategy_for_Improved_Cochlear_Implant_Speech_Perception_in_Noisy_Environments/30234532CC BY 4.0info:eu-repo/semantics/openAccessoai:figshare.com:article/302345322025-02-28T18:00:00Z
spellingShingle Deep Learning-Based Coding Strategy for Improved Cochlear Implant Speech Perception in Noisy Environments
Billel Essaid (22047578)
Engineering
Biomedical engineering
Electrical engineering
Cochlear implant
deep learning
sound coding strategy
speech enhancement
transformer
Speech enhancement
Noise measurement
Noise reduction
Noise
Convolutional neural networks
Autoencoders
Biological system modeling
Training
Real-time systems
Feature extraction
status_str publishedVersion
title Deep Learning-Based Coding Strategy for Improved Cochlear Implant Speech Perception in Noisy Environments
title_full Deep Learning-Based Coding Strategy for Improved Cochlear Implant Speech Perception in Noisy Environments
title_fullStr Deep Learning-Based Coding Strategy for Improved Cochlear Implant Speech Perception in Noisy Environments
title_full_unstemmed Deep Learning-Based Coding Strategy for Improved Cochlear Implant Speech Perception in Noisy Environments
title_short Deep Learning-Based Coding Strategy for Improved Cochlear Implant Speech Perception in Noisy Environments
title_sort Deep Learning-Based Coding Strategy for Improved Cochlear Implant Speech Perception in Noisy Environments
topic Engineering
Biomedical engineering
Electrical engineering
Cochlear implant
deep learning
sound coding strategy
speech enhancement
transformer
Speech enhancement
Noise measurement
Noise reduction
Noise
Convolutional neural networks
Autoencoders
Biological system modeling
Training
Real-time systems
Feature extraction