Unmasking the Fake: Machine Learning Approach for Deepfake Voice Detection

Bibliographic Details
Main Author: Muhammad Usama Tanveer Gujjar (22282840) (author)
Other Authors: Kashif Munir (6182237) (author), Madiha Amjad (20036518) (author), Atiq Ur Rehman (8843024) (author), Amine Bermak (1895947) (author)
Published: 2024
_version_ 1864513513502801920
author Muhammad Usama Tanveer Gujjar (22282840)
author2 Kashif Munir (6182237)
Madiha Amjad (20036518)
Atiq Ur Rehman (8843024)
Amine Bermak (1895947)
author2_role author
author
author
author
author_facet Muhammad Usama Tanveer Gujjar (22282840)
Kashif Munir (6182237)
Madiha Amjad (20036518)
Atiq Ur Rehman (8843024)
Amine Bermak (1895947)
author_role author
dc.creator.none.fl_str_mv Muhammad Usama Tanveer Gujjar (22282840)
Kashif Munir (6182237)
Madiha Amjad (20036518)
Atiq Ur Rehman (8843024)
Amine Bermak (1895947)
dc.date.none.fl_str_mv 2024-12-31T12:00:00Z
dc.identifier.none.fl_str_mv 10.1109/access.2024.3521026
dc.relation.none.fl_str_mv https://figshare.com/articles/journal_contribution/Unmasking_the_Fake_Machine_Learning_Approach_for_Deepfake_Voice_Detection/30173497
dc.rights.none.fl_str_mv CC BY 4.0
info:eu-repo/semantics/openAccess
dc.subject.none.fl_str_mv Engineering
Communications engineering
Information and computing sciences
Artificial intelligence
Cybersecurity and privacy
Machine learning
Deep fake voice
machine learning
MFCC-GNB XtractNet
transfer learning
Feature extraction
Deepfakes
Accuracy
Mel frequency cepstral coefficient
Machine learning
Training
Deep learning
Recording
Gaussian processes
Data models
dc.title.none.fl_str_mv Unmasking the Fake: Machine Learning Approach for Deepfake Voice Detection
dc.type.none.fl_str_mv Text
Journal contribution
info:eu-repo/semantics/publishedVersion
text
contribution to journal
description <p dir="ltr">Deepfake voice refers to artificially generated or manipulated audio that mimics a person’s voice, often created using advanced AI techniques. These synthetic voices can convincingly imitate a speaker and can be nearly indistinguishable from genuine recordings. We present a method for deepfake voice detection built on a custom model named MFCC-GNB XtractNet. We first extract Mel-Frequency Cepstral Coefficients (MFCC) from audio samples; these serve as the foundational features for distinguishing genuine from fake voices. The MFCC features are then enhanced through a transformation that employs a Gaussian Naive Bayes (GNB) model in conjunction with Non-Negative Factorization, creating a more discriminative feature set for subsequent analysis. These features are fed to our model, MFCC-GNB XtractNet, to identify deepfake voices. To evaluate the approach, we applied a range of machine learning models, including Random Forest (RF), K-Nearest Neighbors Classifier (KNC), Logistic Regression (LR), and Gaussian Naive Bayes (GNB). Each model’s performance was assessed through k-fold cross-validation, giving an evaluation across multiple data splits. Additionally, we performed a computational cost analysis to measure the efficiency of the models in terms of training time and resource usage. In our experiments, the MFCC-GNB XtractNet + GNB model achieved an accuracy of 99.93%. This performance underscores the model’s ability to distinguish between real and deepfake voices, setting a new benchmark in the field of voice authentication.</p><h2>Other Information</h2><p dir="ltr">Published in: IEEE Access<br>License: <a href="https://creativecommons.org/licenses/by/4.0/deed.en" target="_blank">https://creativecommons.org/licenses/by/4.0/</a><br>See article on publisher's website: <a href="https://dx.doi.org/10.1109/access.2024.3521026" target="_blank">https://dx.doi.org/10.1109/access.2024.3521026</a></p>
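The evaluation step named in the description (a Gaussian Naive Bayes classifier scored with k-fold cross-validation) can be illustrated with a minimal sketch. This is not the authors' MFCC-GNB XtractNet pipeline: the feature vectors are synthetic stand-ins for MFCC features, and every name here (`fit_gnb`, `predict_gnb`, `k_fold_accuracy`, `make_toy_data`), the 13-feature width, and the 5 folds are assumptions made for illustration only.

```python
import math
import random
from statistics import fmean, pvariance


def fit_gnb(X, y):
    """Estimate the prior, per-feature means and variances for each class."""
    model = {}
    for label in set(y):
        rows = [x for x, t in zip(X, y) if t == label]
        cols = list(zip(*rows))
        model[label] = (
            len(rows) / len(X),
            [fmean(c) for c in cols],
            # Small constant guards against a zero variance (assumed value).
            [pvariance(c) + 1e-9 for c in cols],
        )
    return model


def predict_gnb(model, x):
    """Return the class with the highest Gaussian log-posterior for x."""
    def log_posterior(label):
        prior, means, variances = model[label]
        total = math.log(prior)
        for v, m, s2 in zip(x, means, variances):
            total += -0.5 * math.log(2 * math.pi * s2) - (v - m) ** 2 / (2 * s2)
        return total
    return max(model, key=log_posterior)


def k_fold_accuracy(X, y, k=5, seed=0):
    """Shuffle once, split into k folds, and score each held-out fold."""
    idx = list(range(len(X)))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    scores = []
    for held_out in folds:
        held = set(held_out)
        train = [i for i in idx if i not in held]
        model = fit_gnb([X[i] for i in train], [y[i] for i in train])
        hits = sum(predict_gnb(model, X[i]) == y[i] for i in held_out)
        scores.append(hits / len(held_out))
    return scores


def make_toy_data(n_per_class=100, n_features=13, seed=0):
    """Synthetic stand-in for MFCC vectors: label 0 = genuine, 1 = fake."""
    rng = random.Random(seed)
    X, y = [], []
    for label, centre in ((0, 0.0), (1, 3.0)):
        for _ in range(n_per_class):
            X.append([rng.gauss(centre, 1.0) for _ in range(n_features)])
            y.append(label)
    return X, y


X, y = make_toy_data()
scores = k_fold_accuracy(X, y, k=5)
print("fold accuracies:", scores)
print("mean accuracy:", fmean(scores))
```

Reporting the per-fold scores alongside the mean shows how much the accuracy varies between splits, which a single train/test split would hide. With real audio, the toy data would be replaced by MFCC vectors extracted from labelled recordings.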
eu_rights_str_mv openAccess
id Manara2_6bb9dbd8c6d72af5a3dd405767c27bb3
identifier_str_mv 10.1109/access.2024.3521026
network_acronym_str Manara2
network_name_str Manara2
oai_identifier_str oai:figshare.com:article/30173497
publishDate 2024
repository.mail.fl_str_mv
repository.name.fl_str_mv
repository_id_str
rights_invalid_str_mv CC BY 4.0
status_str publishedVersion
title Unmasking the Fake: Machine Learning Approach for Deepfake Voice Detection