Decoding silent speech: a machine learning perspective on data, methods, and frameworks

<p dir="ltr">At the nexus of signal processing and machine learning (ML), silent speech recognition (SSR) has evolved as a game-changing technology that allows for communication without audible voice. This study offers a thorough overview of SSR, tracing its evolution from early wave...

Bibliographic Details
Main Author: Adiba Tabassum Chowdhury (19444792) (author)
Other Authors: Mehrin Newaz (21976595) (author), Purnata Saha (17823467) (author), Mohannad Natheef AbuHaweeleh (21842282) (author), Sara Mohsen (18192739) (author), Diala Bushnaq (22330603) (author), Malek Chabbouh (22330606) (author), Raghad Aljindi (22330609) (author), Shona Pedersen (2792278) (author), Muhammad E. H. Chowdhury (14150526) (author)
Published: 2025
_version_ 1864513538066743296
author Adiba Tabassum Chowdhury (19444792)
author2 Mehrin Newaz (21976595)
Purnata Saha (17823467)
Mohannad Natheef AbuHaweeleh (21842282)
Sara Mohsen (18192739)
Diala Bushnaq (22330603)
Malek Chabbouh (22330606)
Raghad Aljindi (22330609)
Shona Pedersen (2792278)
Muhammad E. H. Chowdhury (14150526)
author2_role author
author
author
author
author
author
author
author
author
author_facet Adiba Tabassum Chowdhury (19444792)
Mehrin Newaz (21976595)
Purnata Saha (17823467)
Mohannad Natheef AbuHaweeleh (21842282)
Sara Mohsen (18192739)
Diala Bushnaq (22330603)
Malek Chabbouh (22330606)
Raghad Aljindi (22330609)
Shona Pedersen (2792278)
Muhammad E. H. Chowdhury (14150526)
author_role author
dc.creator.none.fl_str_mv Adiba Tabassum Chowdhury (19444792)
Mehrin Newaz (21976595)
Purnata Saha (17823467)
Mohannad Natheef AbuHaweeleh (21842282)
Sara Mohsen (18192739)
Diala Bushnaq (22330603)
Malek Chabbouh (22330606)
Raghad Aljindi (22330609)
Shona Pedersen (2792278)
Muhammad E. H. Chowdhury (14150526)
dc.date.none.fl_str_mv 2025-02-20T09:00:00Z
dc.identifier.none.fl_str_mv 10.1007/s00521-024-10456-z
dc.relation.none.fl_str_mv https://figshare.com/articles/journal_contribution/Decoding_silent_speech_a_machine_learning_perspective_on_data_methods_and_frameworks/30234088
dc.rights.none.fl_str_mv CC BY 4.0
info:eu-repo/semantics/openAccess
dc.subject.none.fl_str_mv Information and computing sciences
Artificial intelligence
Human-centred computing
Machine learning
Speech recognition
Silent speech
Machine learning
Deep learning
Speech decoding
Waves to words
dc.title.none.fl_str_mv Decoding silent speech: a machine learning perspective on data, methods, and frameworks
dc.type.none.fl_str_mv Text
Journal contribution
info:eu-repo/semantics/publishedVersion
text
contribution to journal
description <p dir="ltr">At the nexus of signal processing and machine learning (ML), silent speech recognition (SSR) has emerged as a game-changing technology that enables communication without audible voice. This study offers a thorough overview of SSR, tracing its evolution from early waveform analysis to the most recent ML methods. We start by examining current ML-based SSR techniques and identifying the essential requirements for efficient SSR systems. We then look at the datasets and data collection techniques currently employed in SSR research, highlighting the difficulties posed by the variety of articulatory movements and the scarcity of data. Examining state-of-the-art SSR frameworks, the paper covers important topics such as signal processing, feature extraction, ML techniques for decoding, and the optimization and assessment of SSR model performance. We emphasize how deep learning (DL) and ML models have evolved to increase SSR resilience and accuracy. The procedures proposed in the field are examined, with an emphasis on sophisticated feature extraction and classification methods. Modern SSR techniques are compared in terms of performance, highlighting the advantages and disadvantages of different models. Ethical issues are also discussed, especially those pertaining to privacy and consent. The integration of multimodal information (visual cues, electromyography signals, and neuroimaging data) to improve SSR systems is also covered. We investigate the roles of transfer learning and domain adaptation in handling cross-subject variability. Lastly, the study offers suggestions and future prospects for SSR research, providing practitioners, engineers, and academics with a road map.
As SSR continues to push the frontiers of human–machine interaction, our study aims to increase our collective understanding of the technological advances and societal effects of SSR in the ML age.</p><h2>Other Information</h2><p dir="ltr">Published in: Neural Computing and Applications<br>License: <a href="https://creativecommons.org/licenses/by/4.0" target="_blank">https://creativecommons.org/licenses/by/4.0</a><br>See article on publisher's website: <a href="https://dx.doi.org/10.1007/s00521-024-10456-z" target="_blank">https://dx.doi.org/10.1007/s00521-024-10456-z</a></p>
eu_rights_str_mv openAccess
id Manara2_604e6834fc52dae2cefd6aef0f76bd97
identifier_str_mv 10.1007/s00521-024-10456-z
network_acronym_str Manara2
network_name_str Manara2
oai_identifier_str oai:figshare.com:article/30234088
publishDate 2025
repository.mail.fl_str_mv
repository.name.fl_str_mv
repository_id_str
rights_invalid_str_mv CC BY 4.0
status_str publishedVersion
title Decoding silent speech: a machine learning perspective on data, methods, and frameworks
topic Information and computing sciences
Artificial intelligence
Human-centred computing
Machine learning
Speech recognition
Silent speech
Machine learning
Deep learning
Speech decoding
Waves to words