Autocleandeepfood: auto-cleaning and data balancing transfer learning for regional gastronomy food computing

Food computing has emerged as a promising research field, employing artificial intelligence, deep learning, and data science methodologies to enhance various stages of food production pipelines. To this end, the food computing community has compiled a variety of data set...

Bibliographic Details
Main Author: Nauman Ullah Gilal (17302714) (author)
Other Authors: Marwa Qaraqe (10135172) (author), Jens Schneider (16885948) (author), Marco Agus (8032898) (author)
Published: 2024
_version_ 1864513541533335552
author Nauman Ullah Gilal (17302714)
author2 Marwa Qaraqe (10135172)
Jens Schneider (16885948)
Marco Agus (8032898)
author2_role author
author
author
author_facet Nauman Ullah Gilal (17302714)
Marwa Qaraqe (10135172)
Jens Schneider (16885948)
Marco Agus (8032898)
author_role author
dc.creator.none.fl_str_mv Nauman Ullah Gilal (17302714)
Marwa Qaraqe (10135172)
Jens Schneider (16885948)
Marco Agus (8032898)
dc.date.none.fl_str_mv 2024-07-09T06:00:00Z
dc.identifier.none.fl_str_mv 10.1007/s00371-024-03560-7
dc.relation.none.fl_str_mv https://figshare.com/articles/journal_contribution/Autocleandeepfood_auto-cleaning_and_data_balancing_transfer_learning_for_regional_gastronomy_food_computing/29899604
dc.rights.none.fl_str_mv CC BY 4.0
info:eu-repo/semantics/openAccess
dc.subject.none.fl_str_mv Agricultural, veterinary and food sciences
Food sciences
Information and computing sciences
Artificial intelligence
Human-centred computing
Machine learning
Food computing
Web-scraping
Traditional cuisines
MENA food data set
Noisy labels
Transfer learning
Auto-cleaning
Data imbalance
dc.title.none.fl_str_mv Autocleandeepfood: auto-cleaning and data balancing transfer learning for regional gastronomy food computing
dc.type.none.fl_str_mv Text
Journal contribution
info:eu-repo/semantics/publishedVersion
text
contribution to journal
description <p dir="ltr">Food computing has emerged as a promising research field that applies artificial intelligence, deep learning, and data science methods to various stages of food production pipelines. To this end, the food computing community has compiled a variety of data sets and developed various deep-learning architectures for automatic classification. However, automated food classification remains a significant challenge, particularly for local and regional cuisines, which are often underrepresented in public-domain data sets. Obtaining high-quality, well-labeled, and well-balanced real-world images is difficult because manual data curation is time-consuming and requires significant human effort. In contrast, the web is a potentially unlimited source of food data, but tapping into it is likely to yield corrupted and wrongly labeled images. In addition, the uneven distribution among food categories may lead to data imbalance problems. All these issues make it challenging to create clean food data sets from web data. To address them, we present <i>AutoCleanDeepFood</i>, a novel end-to-end food computing framework for regional gastronomy that contains the following components: (i) a fully automated pre-processing pipeline for creating custom data sets for a specific regional gastronomy, (ii) a transfer learning-based training paradigm that filters out noisy labels through loss ranking and incorporates a Russian Roulette probabilistic approach to mitigate data imbalance problems, and (iii) a method for deploying the resulting model on smartphones for real-time inference. We assess the performance of our framework on a real-world noisy public-domain data set, ETH Food-101, and two novel web-collected data sets, MENA-150 and Pizza-Styles. We demonstrate the filtering capabilities of our method through embedding visualization of the feature space using t-SNE dimensionality reduction. Our filtering scheme is efficient and improves accuracy in all cases, boosting performance by 0.96%, 0.71%, and 1.29% on MENA-150, ETH Food-101, and Pizza-Styles, respectively.</p><h2>Other Information</h2><p dir="ltr">Published in: The Visual Computer<br>License: <a href="https://creativecommons.org/licenses/by/4.0" target="_blank">https://creativecommons.org/licenses/by/4.0</a><br>See article on publisher's website: <a href="https://dx.doi.org/10.1007/s00371-024-03560-7" target="_blank">https://dx.doi.org/10.1007/s00371-024-03560-7</a></p>
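The description above mentions filtering noisy labels through loss ranking combined with a Russian Roulette probabilistic step. The record does not give the exact procedure, so the following is only a minimal sketch under stated assumptions: per-sample losses are assumed to come from a fine-tuned classifier, the function name `filter_noisy_samples`, the `drop_fraction` parameter, and the rule that a candidate is dropped with probability proportional to its class size are all hypothetical choices made for illustration, not the authors' method.

```python
import random
from collections import defaultdict


def filter_noisy_samples(losses, labels, drop_fraction=0.1, seed=0):
    """Return sorted indices of samples kept after loss-ranked filtering.

    Assumption (not taken from the paper): within each class, the
    `drop_fraction` highest-loss samples are removal candidates, and each
    candidate is dropped with probability class_size / largest_class_size,
    so smaller classes lose fewer images.
    """
    rng = random.Random(seed)

    # Group sample indices by their (possibly noisy) label.
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)
    largest = max(len(members) for members in by_class.values())

    kept = []
    for members in by_class.values():
        # Rank this class's samples from lowest to highest loss.
        ranked = sorted(members, key=lambda i: losses[i])
        n_candidates = int(len(ranked) * drop_fraction)
        split = len(ranked) - n_candidates
        kept.extend(ranked[:split])  # low-loss samples always survive

        drop_prob = len(members) / largest
        for idx in ranked[split:]:
            # Russian Roulette: a candidate survives with prob 1 - drop_prob.
            if rng.random() >= drop_prob:
                kept.append(idx)
    return sorted(kept)


if __name__ == "__main__":
    # Toy data: class "a" has one suspiciously high loss (index 2).
    toy_losses = [0.1, 0.2, 5.0, 0.3, 0.1, 0.2, 0.3, 0.4, 0.15, 0.25, 0.5, 3.0]
    toy_labels = ["a"] * 10 + ["b"] * 2
    print(filter_noisy_samples(toy_losses, toy_labels))
```

In this sketch the largest class always loses its highest-loss candidates, while a class with few images is thinned less aggressively, which is one plausible way a probabilistic drop could reduce imbalance. A real pipeline would compute `losses` from the network's per-image cross-entropy and might repeat the filtering over several training rounds.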
eu_rights_str_mv openAccess
id Manara2_2b8f3b337720a86c811bdb3e9b5235f8
identifier_str_mv 10.1007/s00371-024-03560-7
network_acronym_str Manara2
network_name_str Manara2
oai_identifier_str oai:figshare.com:article/29899604
publishDate 2024
repository.mail.fl_str_mv
repository.name.fl_str_mv
repository_id_str
rights_invalid_str_mv CC BY 4.0
spellingShingle Autocleandeepfood: auto-cleaning and data balancing transfer learning for regional gastronomy food computing
Nauman Ullah Gilal (17302714)
Agricultural, veterinary and food sciences
Food sciences
Information and computing sciences
Artificial intelligence
Human-centred computing
Machine learning
Food computing
Web-scraping
Traditional cuisines
MENA food data set
Noisy labels
Transfer learning
Auto-cleaning
Data imbalance
status_str publishedVersion
title Autocleandeepfood: auto-cleaning and data balancing transfer learning for regional gastronomy food computing
title_full Autocleandeepfood: auto-cleaning and data balancing transfer learning for regional gastronomy food computing
title_fullStr Autocleandeepfood: auto-cleaning and data balancing transfer learning for regional gastronomy food computing
title_full_unstemmed Autocleandeepfood: auto-cleaning and data balancing transfer learning for regional gastronomy food computing
title_short Autocleandeepfood: auto-cleaning and data balancing transfer learning for regional gastronomy food computing
title_sort Autocleandeepfood: auto-cleaning and data balancing transfer learning for regional gastronomy food computing
topic Agricultural, veterinary and food sciences
Food sciences
Information and computing sciences
Artificial intelligence
Human-centred computing
Machine learning
Food computing
Web-scraping
Traditional cuisines
MENA food data set
Noisy labels
Transfer learning
Auto-cleaning
Data imbalance