DISGD: A Distributed Shared-nothing Matrix Factorization for Large Scale Online Recommender Systems

With the web-scale data volumes and high velocity of generation rates, it has become crucial that the training process for recommender systems be a continuous process which is performed on live data, i.e., on data streams. In practice, such systems have to address three main requirements including...

Bibliographic Details
Main Author: Hazem, Heidy (author)
Other Authors: Awad, Ahmed (author), Hassan, Ahmed (author), Sakr, Sherif (author)
Published: 2020
Online Access: https://bspace.buid.ac.ae/handle/1234/2927
https://openproceedings.org/2020/conf/edbt/paper_200.pdf
https://doi.org/10.5441/002/edbt.2020.32
_version_ 1862980619724652544
author Hazem, Heidy
author2 Awad, Ahmed
Hassan, Ahmed
Sakr, Sherif
author2_role author
author
author
author_facet Hazem, Heidy
Awad, Ahmed
Hassan, Ahmed
Sakr, Sherif
author_role author
dc.creator.none.fl_str_mv Hazem, Heidy
Awad, Ahmed
Hassan, Ahmed
Sakr, Sherif
dc.date.none.fl_str_mv 2020
2025-05-06T08:42:43Z
2025-05-06T08:42:43Z
dc.identifier.none.fl_str_mv Hazem H. et al. (2020) “DiSGD: A distributed shared-nothing matrix factorization for large scale online recommender systems,” Advances in Database Technology - EDBT, 2020-March, pp. 359–362.
2367-2005
https://bspace.buid.ac.ae/handle/1234/2927
https://openproceedings.org/2020/conf/edbt/paper_200.pdf
https://doi.org/10.5441/002/edbt.2020.32
dc.language.none.fl_str_mv en
dc.publisher.none.fl_str_mv Open proceedings org
dc.relation.none.fl_str_mv 23rd International Conference on Extending Database Technology
dc.title.none.fl_str_mv DISGD: A Distributed Shared-nothing Matrix Factorization for Large Scale Online Recommender Systems
dc.type.none.fl_str_mv Article
description With the web-scale data volumes and high velocity of generation rates, it has become crucial that the training process for recommender systems be a continuous process which is performed on live data, i.e., on data streams. In practice, such systems have to address three main requirements including the ability to adapt their trained model with each incoming data element, the ability to handle concept drifts and the ability to scale with the volume of the data. In principle, matrix factorization is one of the popular approaches to train a recommender model. Stochastic Gradient Descent (SGD) has been a successful optimization approach for matrix factorization. Several approaches have been proposed that handle the first and second requirements. For the third requirement, in the realm of data streams, distributed approaches depend on a shared memory architecture. This requires obtaining locks before performing updates. In general, the success of main-stream big data processing systems is supported by their shared-nothing architecture. In this paper, we propose DISGD, a distributed shared-nothing variant of an incremental SGD. The proposal is motivated by an observation that with large volumes of data, the overwrite of updates, lock free updates, does not affect the result with sparse user-item matrices. Compared to the baseline incremental approach, our evaluation on several datasets shows not only improvement in processing time but also improved recall by 55%.
id budr_c0fe3ce4743dc26e943f331f76587f58
identifier_str_mv Hazem H. et al. (2020) “DiSGD: A distributed shared-nothing matrix factorization for large scale online recommender systems,” Advances in Database Technology - EDBT, 2020-March, pp. 359–362.
2367-2005
language_invalid_str_mv en
network_acronym_str budr
network_name_str The British University in Dubai repository
oai_identifier_str oai:bspace.buid.ac.ae:1234/2927
publishDate 2020
publisher.none.fl_str_mv Open proceedings org
repository.mail.fl_str_mv
repository.name.fl_str_mv
repository_id_str
title DISGD: A Distributed Shared-nothing Matrix Factorization for Large Scale Online Recommender Systems
url https://bspace.buid.ac.ae/handle/1234/2927
https://openproceedings.org/2020/conf/edbt/paper_200.pdf
https://doi.org/10.5441/002/edbt.2020.32
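
The abstract describes an incremental SGD for matrix factorization that runs in a shared-nothing fashion. The sketch below is only an illustration of that general idea, not the authors' implementation. Everything named here is hypothetical: the class `LocalISGD`, the function `route`, the hyperparameter values, the choice of positive-only feedback with a target of 1, and routing events by a hash of the user id. The record does not state how DISGD partitions the stream, so the routing shown is an assumption made purely for illustration.

```python
import random
import zlib


class LocalISGD:
    """One worker's private model (hypothetical name); nothing is shared."""

    def __init__(self, n_factors=10, lr=0.05, reg=0.01, seed=0):
        self.n_factors = n_factors
        self.lr = lr      # learning rate (assumed value)
        self.reg = reg    # L2 regularization (assumed value)
        self.rng = random.Random(seed)
        self.users = {}   # user id -> latent vector, local to this worker
        self.items = {}   # item id -> latent vector, local to this worker
        self.seen = {}    # user id -> set of items already observed

    def _init_vector(self):
        return [self.rng.gauss(0.0, 0.1) for _ in range(self.n_factors)]

    def predict(self, user, item):
        p, q = self.users.get(user), self.items.get(item)
        if p is None or q is None:
            return 0.0
        return sum(a * b for a, b in zip(p, q))

    def update(self, user, item):
        """One SGD step for a single positive event (assumed target: 1)."""
        p = self.users.setdefault(user, self._init_vector())
        q = self.items.setdefault(item, self._init_vector())
        self.seen.setdefault(user, set()).add(item)
        err = 1.0 - sum(a * b for a, b in zip(p, q))
        for k in range(self.n_factors):
            pk, qk = p[k], q[k]  # use the old values for both updates
            p[k] = pk + self.lr * (err * qk - self.reg * pk)
            q[k] = qk + self.lr * (err * pk - self.reg * qk)

    def recommend(self, user, n=5):
        """Rank locally known items the user has not interacted with."""
        seen = self.seen.get(user, set())
        candidates = [i for i in self.items if i not in seen]
        candidates.sort(key=lambda i: self.predict(user, i), reverse=True)
        return candidates[:n]


def route(user, n_workers):
    """Assumed routing: a stable hash of the user id picks the worker."""
    return zlib.crc32(str(user).encode("utf-8")) % n_workers


# Each event is handled by exactly one worker, so no locks are needed.
workers = [LocalISGD(seed=i) for i in range(2)]
stream = [("u1", "a"), ("u2", "a"), ("u1", "b"), ("u2", "c")] * 200
for user, item in stream:
    workers[route(user, len(workers))].update(user, item)
```

Because every worker owns its own dictionaries, an update never touches state held by another worker. In this sketch the cost of that choice is that an item's vector may exist as independent copies on several workers; whether and how the paper reconciles such copies is not stated in this record.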