Deep Learning-Based Coding Strategy for Improved Cochlear Implant Speech Perception in Noisy Environments
<p dir="ltr">Automatic speech recognition (ASR) and speech enhancement are essential tools in modern life, aiding not only in machine interaction but also in supporting individuals with hearing impairments. These processes begin with capturing speech in analog form and applying signal processing algorithms to ensure compatibility with devices like cochlear implants (CIs)…</p>
Saved in:
| Main Author: | Billel Essaid |
|---|---|
| Other Authors: | Hamza Kheddar, Noureddine Batel, Muhammad E. H. Chowdhury |
| Published: | 2025 |
| Subjects: | Cochlear implant; deep learning; sound coding strategy; speech enhancement; transformer |
| _version_ | 1864513537992294400 |
|---|---|
| author | Billel Essaid (22047578) |
| author2 | Hamza Kheddar (17337712) Noureddine Batel (22047581) Muhammad E. H. Chowdhury (14150526) |
| author2_role | author author author |
| author_facet | Billel Essaid (22047578) Hamza Kheddar (17337712) Noureddine Batel (22047581) Muhammad E. H. Chowdhury (14150526) |
| author_role | author |
| dc.creator.none.fl_str_mv | Billel Essaid (22047578) Hamza Kheddar (17337712) Noureddine Batel (22047581) Muhammad E. H. Chowdhury (14150526) |
| dc.date.none.fl_str_mv | 2025-02-28T18:00:00Z |
| dc.identifier.none.fl_str_mv | 10.1109/access.2025.3542953 |
| dc.relation.none.fl_str_mv | https://figshare.com/articles/journal_contribution/Deep_Learning-Based_Coding_Strategy_for_Improved_Cochlear_Implant_Speech_Perception_in_Noisy_Environments/30234532 |
| dc.rights.none.fl_str_mv | CC BY 4.0 info:eu-repo/semantics/openAccess |
| dc.subject.none.fl_str_mv | Engineering; Biomedical engineering; Electrical engineering; Cochlear implant; deep learning; sound coding strategy; speech enhancement; transformer; Speech enhancement; Noise measurement; Noise reduction; Noise; Convolutional neural networks; Autoencoders; Biological system modeling; Training; Real-time systems; Feature extraction |
| dc.title.none.fl_str_mv | Deep Learning-Based Coding Strategy for Improved Cochlear Implant Speech Perception in Noisy Environments |
| dc.type.none.fl_str_mv | Text Journal contribution info:eu-repo/semantics/publishedVersion text contribution to journal |
| description | <p dir="ltr">Automatic speech recognition (ASR) and speech enhancement are essential tools in modern life, aiding not only in machine interaction but also in supporting individuals with hearing impairments. These processes begin with capturing speech in analog form and applying signal processing algorithms to ensure compatibility with devices like cochlear implants (CIs). However, CIs, with their limited number of electrodes, often cause speech distortion, and despite advancements in state-of-the-art signal processing techniques, challenges persist, particularly in noisy environments with multiple speech sources. The rise of artificial intelligence (AI) has introduced innovative strategies to address these limitations. This paper presents a novel deep learning (DL)-based technique that leverages attention mechanisms to improve speech intelligibility through noise suppression. The proposed approach includes two strategies: the first integrates temporal convolutional networks (TCNs) and multi-head attention (MHA) layers to capture both local and global dependencies within the speech signal, enabling precise noise filtering and improved clarity. The second strategy builds on this framework by additionally incorporating bidirectional gated recurrent units (Bi-GRU) alongside TCN and MHA layers, further refining sequence modeling and enhancing noise reduction. The optimal model configuration, using TCN-MHA-Bi-GRU with a kernel size of 16, achieved a compact model size of 788K parameters and recorded training and validation losses of 0.0350 and 0.0446, respectively.
Experimental results on the TIMIT and Harvard Sentences datasets, enriched with diverse noise sources from the DEMAND database, yielded high intelligibility scores with a short-time objective intelligibility (STOI) of 0.8345, a word recognition score (WRS) of 99.2636, and a linear correlation coefficient (LCC) of 0.9607, underscoring the model’s capability to enhance speech perception in noisy CI environments, ensuring a balance between model size and speech quality, and surpassing existing state-of-the-art techniques.</p><h2>Other Information</h2><p dir="ltr">Published in: IEEE Access<br>License: <a href="https://creativecommons.org/licenses/by/4.0/deed.en" target="_blank">https://creativecommons.org/licenses/by/4.0/</a><br>See article on publisher's website: <a href="https://dx.doi.org/10.1109/access.2025.3542953" target="_blank">https://dx.doi.org/10.1109/access.2025.3542953</a></p> |
| eu_rights_str_mv | openAccess |
| id | Manara2_161ef7e655461c8da44304703f1eff66 |
| identifier_str_mv | 10.1109/access.2025.3542953 |
| network_acronym_str | Manara2 |
| network_name_str | Manara2 |
| oai_identifier_str | oai:figshare.com:article/30234532 |
| publishDate | 2025 |
| repository.mail.fl_str_mv | |
| repository.name.fl_str_mv | |
| repository_id_str | |
| rights_invalid_str_mv | CC BY 4.0 |
| status_str | publishedVersion |
| title | Deep Learning-Based Coding Strategy for Improved Cochlear Implant Speech Perception in Noisy Environments |
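The abstract describes an enhancement pipeline that stacks temporal convolutions (local context), multi-head attention (global context), and a Bi-GRU (sequence modeling). The following is a minimal, hypothetical PyTorch sketch of that layer ordering only; the channel widths, block count, and 1600-sample input are illustrative assumptions, not the authors' 788K-parameter configuration (only the kernel size of 16 is taken from the abstract).

```python
import torch
import torch.nn as nn

class TCNBlock(nn.Module):
    """One dilated causal 1-D convolution with a residual connection."""
    def __init__(self, channels, kernel_size, dilation):
        super().__init__()
        self.pad = (kernel_size - 1) * dilation      # left-pad so the conv stays causal
        self.conv = nn.Conv1d(channels, channels, kernel_size, dilation=dilation)
        self.act = nn.ReLU()

    def forward(self, x):                            # x: (batch, channels, time)
        y = self.act(self.conv(nn.functional.pad(x, (self.pad, 0))))
        return x + y                                 # residual connection

class TCNMHABiGRU(nn.Module):
    """Hypothetical TCN -> multi-head attention -> Bi-GRU waveform denoiser."""
    def __init__(self, channels=32, kernel_size=16, heads=4):
        super().__init__()
        self.inp = nn.Conv1d(1, channels, 1)
        self.tcn = nn.Sequential(
            *[TCNBlock(channels, kernel_size, 2 ** d) for d in range(3)])
        self.mha = nn.MultiheadAttention(channels, heads, batch_first=True)
        self.gru = nn.GRU(channels, channels, batch_first=True, bidirectional=True)
        self.out = nn.Linear(2 * channels, 1)        # Bi-GRU doubles the feature dim

    def forward(self, noisy):                        # noisy: (batch, time)
        h = self.tcn(self.inp(noisy.unsqueeze(1)))   # local dependencies via TCN
        h = h.transpose(1, 2)                        # -> (batch, time, channels)
        h, _ = self.mha(h, h, h)                     # global dependencies via MHA
        h, _ = self.gru(h)                           # bidirectional sequence modeling
        return self.out(h).squeeze(-1)               # enhanced waveform estimate

model = TCNMHABiGRU()
clean_estimate = model(torch.randn(2, 1600))         # 2 noisy excerpts, 1600 samples
print(clean_estimate.shape)                          # torch.Size([2, 1600])
```

The second strategy in the paper corresponds to keeping all three stages; dropping the `gru` stage gives the first (TCN + MHA only) strategy.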
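The reported LCC of 0.9607 is a Pearson (linear) correlation coefficient between the reference and the processed signal. A minimal NumPy sketch of that metric, using an arbitrary 440 Hz tone plus noise purely for illustration:

```python
import numpy as np

def lcc(reference, estimate):
    """Pearson linear correlation coefficient between two equal-length signals."""
    return np.corrcoef(reference, estimate)[0, 1]

t = np.linspace(0, 1, 8000)
clean = np.sin(2 * np.pi * 440 * t)                          # clean reference tone
noisy = clean + 0.1 * np.random.default_rng(0).standard_normal(t.size)
print(lcc(clean, noisy))                                     # close to, but below, 1.0
```

An LCC near 1.0 means the enhanced waveform tracks the clean reference up to scale and offset, which is why it complements magnitude-based scores like STOI.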