A Survey of Audio Enhancement Algorithms for Music, Speech, Bioacoustics, Biomedical, Industrial, and Environmental Sounds by Image U-Net

The recent surge in the use of Deep Neural Networks (DNNs) has also made its mark in the field of Audio Enhancement (AE), providing much better quality than the classical methods. Although there are dedicated audio processing DNNs, many recent models of AE have uti...

Bibliographic Details
Main Author:	Sania Gul (18272227) (author)
Other Authors:	Muhammad Salman Khan (7202543) (author)
Published:	2023
Subjects:	Information and computing sciences; Artificial intelligence; Machine learning; CNNs; image processing deep neural networks; pre-trained networks; spectrogram; U-Net; Convolutional neural networks; Time-domain analysis; Speech enhancement; Spectrogram; Recurrent neural networks; Music; Image processing; Artificial neural networks

_version_	1864513545574547456
author	Sania Gul (18272227)
author2	Muhammad Salman Khan (7202543)
author2_role	author
author_facet	Sania Gul (18272227) Muhammad Salman Khan (7202543)
author_role	author
dc.creator.none.fl_str_mv	Sania Gul (18272227) Muhammad Salman Khan (7202543)
dc.date.none.fl_str_mv	2023-12-27T15:00:00Z
dc.identifier.none.fl_str_mv	10.1109/access.2023.3344813
dc.relation.none.fl_str_mv	https://figshare.com/articles/journal_contribution/A_Survey_of_Audio_Enhancement_Algorithms_for_Music_Speech_Bioacoustics_Biomedical_Industrial_and_Environmental_Sounds_by_Image_U-Net/29445581
dc.rights.none.fl_str_mv	CC BY 4.0 info:eu-repo/semantics/openAccess
dc.subject.none.fl_str_mv	Information and computing sciences; Artificial intelligence; Machine learning; CNNs; image processing deep neural networks; pre-trained networks; spectrogram; U-Net; Convolutional neural networks; Time-domain analysis; Speech enhancement; Spectrogram; Recurrent neural networks; Music; Image processing; Artificial neural networks
dc.title.none.fl_str_mv	A Survey of Audio Enhancement Algorithms for Music, Speech, Bioacoustics, Biomedical, Industrial, and Environmental Sounds by Image U-Net
dc.type.none.fl_str_mv	Text Journal contribution info:eu-repo/semantics/publishedVersion text contribution to journal
description	<p dir="ltr">The recent surge in the use of Deep Neural Networks (DNNs) has also made its mark in the field of Audio Enhancement (AE), providing much better quality than the classical methods. Although there are dedicated audio processing DNNs, many recent AE models have utilized U-Net, a DNN based on the Convolutional Neural Network (CNN) that was originally developed for image segmentation. It is found that the useful features hidden in the time domain are highlighted when the audio signal is converted to a spectrogram, which can be treated as an image. In this article, we review recent work that uses U-Nets for different AE applications. Unlike other published reviews, this review focuses entirely on AE techniques based on image U-Nets. We discuss the need for AE, how U-Net compares to other DNNs, the benefits of converting the audio to 2D, input representations that are useful for different AE applications, the architecture of the vanilla U-Net and the pre-trained models, variations of the vanilla architecture incorporated in different AE models, and the state-of-the-art AE algorithms based on U-Net in various applications. Apart from speech and music, this article discusses a wide range of audio signals, e.g., environmental, biomedical, bioacoustics, and industrial sounds, not covered collectively in a single article in previously published studies. The article ends with a discussion of colored spectrograms in future AE applications.</p><h2>Other Information</h2><p dir="ltr">Published in: IEEE Access<br>License: <a href="https://creativecommons.org/licenses/by/4.0/deed.en" target="_blank">https://creativecommons.org/licenses/by/4.0/</a><br>See article on publisher's website: <a href="https://dx.doi.org/10.1109/access.2023.3344813" target="_blank">https://dx.doi.org/10.1109/access.2023.3344813</a></p>
eu_rights_str_mv	openAccess
id	Manara2_ac8bd2f35c16e243adf065dc34473389
identifier_str_mv	10.1109/access.2023.3344813
network_acronym_str	Manara2
network_name_str	Manara2
oai_identifier_str	oai:figshare.com:article/29445581
publishDate	2023
repository.mail.fl_str_mv
repository.name.fl_str_mv
repository_id_str
rights_invalid_str_mv	CC BY 4.0
status_str	publishedVersion
title	A Survey of Audio Enhancement Algorithms for Music, Speech, Bioacoustics, Biomedical, Industrial, and Environmental Sounds by Image U-Net
topic	Information and computing sciences; Artificial intelligence; Machine learning; CNNs; image processing deep neural networks; pre-trained networks; spectrogram; U-Net; Convolutional neural networks; Time-domain analysis; Speech enhancement; Spectrogram; Recurrent neural networks; Music; Image processing; Artificial neural networks