Unmasking the Fake: Machine Learning Approach for Deepfake Voice Detection
Deepfake voice refers to artificially generated or manipulated audio that mimics a person’s voice, often created using advanced AI techniques. These synthetic voices can be used to convincingly imitate someone, making them nearly indistinguishable from genuine recordings...
| Main Author: | Muhammad Usama Tanveer Gujjar |
|---|---|
| Other Authors: | Kashif Munir, Madiha Amjad, Atiq Ur Rehman, Amine Bermak |
| Published: | 2024 |
| _version_ | 1864513513502801920 |
|---|---|
| author | Muhammad Usama Tanveer Gujjar (22282840) |
| author2 | Kashif Munir (6182237) Madiha Amjad (20036518) Atiq Ur Rehman (8843024) Amine Bermak (1895947) |
| author2_role | author author author author |
| author_facet | Muhammad Usama Tanveer Gujjar (22282840) Kashif Munir (6182237) Madiha Amjad (20036518) Atiq Ur Rehman (8843024) Amine Bermak (1895947) |
| author_role | author |
| dc.creator.none.fl_str_mv | Muhammad Usama Tanveer Gujjar (22282840) Kashif Munir (6182237) Madiha Amjad (20036518) Atiq Ur Rehman (8843024) Amine Bermak (1895947) |
| dc.date.none.fl_str_mv | 2024-12-31T12:00:00Z |
| dc.identifier.none.fl_str_mv | 10.1109/access.2024.3521026 |
| dc.relation.none.fl_str_mv | https://figshare.com/articles/journal_contribution/Unmasking_the_Fake_Machine_Learning_Approach_for_Deepfake_Voice_Detection/30173497 |
| dc.rights.none.fl_str_mv | CC BY 4.0 info:eu-repo/semantics/openAccess |
| dc.subject.none.fl_str_mv | Engineering; Communications engineering; Information and computing sciences; Artificial intelligence; Cybersecurity and privacy; Machine learning; Deep fake voice; machine learning; MFCC-GNB XtractNet; transfer learning; Feature extraction; Deepfakes; Accuracy; Mel frequency cepstral coefficient; Machine learning; Training; Deep learning; Recording; Gaussian processes; Data models |
| dc.title.none.fl_str_mv | Unmasking the Fake: Machine Learning Approach for Deepfake Voice Detection |
| dc.type.none.fl_str_mv | Text Journal contribution info:eu-repo/semantics/publishedVersion text contribution to journal |
| description | <p dir="ltr">Deepfake voice refers to artificially generated or manipulated audio that mimics a person’s voice, often created using advanced AI techniques. These synthetic voices can convincingly imitate someone, making them nearly indistinguishable from genuine recordings. We present a method for deepfake voice detection built on a custom model named MFCC-GNB XtractNet. The method first extracts Mel-Frequency Cepstral Coefficients (MFCC) from audio samples; these serve as the foundational features for distinguishing genuine from fake voices. The MFCC features are then enhanced through a transformation that employs a Gaussian Naive Bayes (GNB) model in conjunction with Non-Negative Factorization, creating a more discriminative feature set for subsequent analysis. These features are fed to our model, MFCC-GNB XtractNet, to identify deepfake voices. To evaluate the effectiveness of our approach, we applied a range of machine learning models, including Random Forest (RF), K-Nearest Neighbors Classifier (KNC), Logistic Regression (LR), and Gaussian Naive Bayes (GNB). Each model’s performance is assessed through k-fold cross-validation, which evaluates it across multiple data splits. Additionally, we performed a computational cost analysis to measure the efficiency of the models in terms of training time and resource usage. In our experiments, the MFCC-GNB XtractNet + GNB model achieved an accuracy of 99.93%. This performance underscores the model’s ability to distinguish between real and deepfake voices, setting a new benchmark in the field of voice authentication.</p><h2>Other Information</h2><p dir="ltr">Published in: IEEE Access<br>License: <a href="https://creativecommons.org/licenses/by/4.0/deed.en" target="_blank">https://creativecommons.org/licenses/by/4.0/</a><br>See article on publisher's website: <a href="https://dx.doi.org/10.1109/access.2024.3521026" target="_blank">https://dx.doi.org/10.1109/access.2024.3521026</a></p> |
| eu_rights_str_mv | openAccess |
| id | Manara2_6bb9dbd8c6d72af5a3dd405767c27bb3 |
| identifier_str_mv | 10.1109/access.2024.3521026 |
| network_acronym_str | Manara2 |
| network_name_str | Manara2 |
| oai_identifier_str | oai:figshare.com:article/30173497 |
| publishDate | 2024 |
| repository.mail.fl_str_mv | |
| repository.name.fl_str_mv | |
| repository_id_str | |
| rights_invalid_str_mv | CC BY 4.0 |
| spellingShingle | Unmasking the Fake: Machine Learning Approach for Deepfake Voice Detection Muhammad Usama Tanveer Gujjar (22282840) Engineering Communications engineering Information and computing sciences Artificial intelligence Cybersecurity and privacy Machine learning Deep fake voice machine learning MFCC-GNB XtractNet transfer learning Feature extraction Deepfakes Accuracy Mel frequency cepstral coefficient Machine learning Training Deep learning Recording Gaussian processes Data models |
| status_str | publishedVersion |
| title | Unmasking the Fake: Machine Learning Approach for Deepfake Voice Detection |
| title_full | Unmasking the Fake: Machine Learning Approach for Deepfake Voice Detection |
| title_fullStr | Unmasking the Fake: Machine Learning Approach for Deepfake Voice Detection |
| title_full_unstemmed | Unmasking the Fake: Machine Learning Approach for Deepfake Voice Detection |
| title_short | Unmasking the Fake: Machine Learning Approach for Deepfake Voice Detection |
| title_sort | Unmasking the Fake: Machine Learning Approach for Deepfake Voice Detection |
| topic | Engineering; Communications engineering; Information and computing sciences; Artificial intelligence; Cybersecurity and privacy; Machine learning; Deep fake voice; machine learning; MFCC-GNB XtractNet; transfer learning; Feature extraction; Deepfakes; Accuracy; Mel frequency cepstral coefficient; Machine learning; Training; Deep learning; Recording; Gaussian processes; Data models |
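
The description above mentions a Gaussian Naive Bayes (GNB) classifier assessed with k-fold cross-validation. The sketch below illustrates only that evaluation loop, using the Python standard library. It is not the authors' MFCC-GNB XtractNet: the record gives no implementation details, so the feature-transformation step is omitted, and `make_synthetic` is a hypothetical stand-in that draws two Gaussian clusters in place of real MFCC vectors. The function names, the 13-feature width, and the fold count are all assumptions made for illustration.

```python
import math
import random


def fit_gnb(X, y):
    """Estimate per-class prior, feature means, and feature variances."""
    model = {}
    for label in sorted(set(y)):
        rows = [x for x, t in zip(X, y) if t == label]
        n = len(rows)
        means = [sum(col) / n for col in zip(*rows)]
        # A small constant keeps every variance strictly positive.
        variances = [
            sum((v - m) ** 2 for v in col) / n + 1e-9
            for col, m in zip(zip(*rows), means)
        ]
        model[label] = (n / len(X), means, variances)
    return model


def predict_gnb(model, x):
    """Return the class with the highest log-posterior for feature vector x."""
    def log_posterior(params):
        prior, means, variances = params
        score = math.log(prior)
        for v, m, s2 in zip(x, means, variances):
            score += -0.5 * math.log(2 * math.pi * s2) - (v - m) ** 2 / (2 * s2)
        return score
    return max(model, key=lambda label: log_posterior(model[label]))


def k_fold_accuracy(X, y, k=5, seed=0):
    """Shuffle indices, split them into k folds, and return per-fold accuracy."""
    idx = list(range(len(X)))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    scores = []
    for held_out in folds:
        test = set(held_out)
        train = [i for i in idx if i not in test]
        model = fit_gnb([X[i] for i in train], [y[i] for i in train])
        correct = sum(predict_gnb(model, X[i]) == y[i] for i in held_out)
        scores.append(correct / len(held_out))
    return scores


def make_synthetic(n_per_class=100, n_features=13, seed=0):
    """Hypothetical stand-in for MFCC vectors: two well-separated Gaussian clusters."""
    rng = random.Random(seed)
    X, y = [], []
    for label, centre in ((0, 0.0), (1, 3.0)):
        for _ in range(n_per_class):
            X.append([rng.gauss(centre, 1.0) for _ in range(n_features)])
            y.append(label)
    return X, y


if __name__ == "__main__":
    X, y = make_synthetic()
    scores = k_fold_accuracy(X, y, k=5)
    print("fold accuracies:", scores)
    print("mean accuracy:", sum(scores) / len(scores))
```

Because the synthetic clusters are deliberately easy to separate, the accuracies this prints say nothing about real deepfake audio. With real data, the rows of `X` would be feature vectors extracted from recordings, and each fold's model would be fitted only on that fold's training portion, as it is here.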