DISGD: A Distributed Shared-nothing Matrix Factorization for Large Scale Online Recommender Systems

With the web-scale data volumes and high velocity of generation rates, it has become crucial that the training process for recommender systems be a continuous process which is performed on live data, i.e., on data streams. In practice, such systems have to address three main requirements including...

Bibliographic Details
Main Author:	Hazem, Heidy (author)
Other Authors:	Awad, Ahmed (author), Hassan, Ahmed (author), Sakr, Sherif (author)
Published:	2020
Online Access:	https://bspace.buid.ac.ae/handle/1234/2927 https://openproceedings.org/2020/conf/edbt/paper_200.pdf https://doi.org/10.5441/002/edbt.2020.32

_version_	1862980619724652544
author	Hazem, Heidy
author2	Awad, Ahmed Hassan, Ahmed Sakr, Sherif
author2_role	author author author
author_facet	Hazem, Heidy Awad, Ahmed Hassan, Ahmed Sakr, Sherif
author_role	author
dc.creator.none.fl_str_mv	Hazem, Heidy Awad, Ahmed Hassan, Ahmed Sakr, Sherif
dc.date.none.fl_str_mv	2020 2025-05-06T08:42:43Z 2025-05-06T08:42:43Z
dc.identifier.none.fl_str_mv	Hazem H. et al. (2020) “DiSGD: A distributed shared-nothing matrix factorization for large scale online recommender systems,” Advances in Database Technology - EDBT, 2020-March, pp. 359–362. 2367-2005 https://bspace.buid.ac.ae/handle/1234/2927 https://openproceedings.org/2020/conf/edbt/paper_200.pdf https://doi.org/10.5441/002/edbt.2020.32
dc.language.none.fl_str_mv	en
dc.publisher.none.fl_str_mv	Open proceedings org
dc.relation.none.fl_str_mv	23rd International Conference on Extending Database Technology
dc.title.none.fl_str_mv	DISGD: A Distributed Shared-nothing Matrix Factorization for Large Scale Online Recommender Systems
dc.type.none.fl_str_mv	Article
description	With the web-scale data volumes and high velocity of generation rates, it has become crucial that the training process for recommender systems be a continuous process which is performed on live data, i.e., on data streams. In practice, such systems have to address three main requirements: the ability to adapt their trained model with each incoming data element, the ability to handle concept drifts, and the ability to scale with the volume of the data. In principle, matrix factorization is one of the popular approaches to train a recommender model. Stochastic Gradient Descent (SGD) has been a successful optimization approach for matrix factorization. Several approaches have been proposed that handle the first and second requirements. For the third requirement, in the realm of data streams, distributed approaches depend on a shared-memory architecture. This requires obtaining locks before performing updates. In general, the success of mainstream big data processing systems is supported by their shared-nothing architecture. In this paper, we propose DISGD, a distributed shared-nothing variant of an incremental SGD. The proposal is motivated by an observation that with large volumes of data, the overwrite of updates (that is, lock-free updates) does not affect the result with sparse user-item matrices. Compared to the baseline incremental approach, our evaluation on several datasets shows not only improvement in processing time but also improved recall by 55%.
id	budr_c0fe3ce4743dc26e943f331f76587f58
identifier_str_mv	Hazem H. et al. (2020) “DiSGD: A distributed shared-nothing matrix factorization for large scale online recommender systems,” Advances in Database Technology - EDBT, 2020-March, pp. 359–362. 2367-2005
language_invalid_str_mv	en
network_acronym_str	budr
network_name_str	The British University in Dubai repository
oai_identifier_str	oai:bspace.buid.ac.ae:1234/2927
publishDate	2020
publisher.none.fl_str_mv	Open proceedings org
repository.mail.fl_str_mv
repository.name.fl_str_mv
repository_id_str
spellingShingle	DISGD: A Distributed Shared-nothing Matrix Factorization for Large Scale Online Recommender Systems Hazem, Heidy
title	DISGD: A Distributed Shared-nothing Matrix Factorization for Large Scale Online Recommender Systems
title_full	DISGD: A Distributed Shared-nothing Matrix Factorization for Large Scale Online Recommender Systems
title_fullStr	DISGD: A Distributed Shared-nothing Matrix Factorization for Large Scale Online Recommender Systems
title_full_unstemmed	DISGD: A Distributed Shared-nothing Matrix Factorization for Large Scale Online Recommender Systems
title_short	DISGD: A Distributed Shared-nothing Matrix Factorization for Large Scale Online Recommender Systems
title_sort	DISGD: A Distributed Shared-nothing Matrix Factorization for Large Scale Online Recommender Systems
url	https://bspace.buid.ac.ae/handle/1234/2927 https://openproceedings.org/2020/conf/edbt/paper_200.pdf https://doi.org/10.5441/002/edbt.2020.32
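
The description above refers to an incremental SGD for matrix factorization, in which the model is adapted with each incoming data element. The following is a minimal single-process sketch of that general idea, not the authors' DISGD implementation. It assumes positive-only feedback (every observed user-item pair has target 1.0); the class name `IncrementalMF`, its parameters (`k`, `lr`, `reg`, `seed`), and the initialization scheme are hypothetical choices made for illustration only.

```python
import random


class IncrementalMF:
    """Illustrative incremental SGD matrix factorization (hypothetical sketch)."""

    def __init__(self, k=10, lr=0.05, reg=0.01, seed=0):
        self.k = k          # number of latent factors (assumed value)
        self.lr = lr        # learning rate (assumed value)
        self.reg = reg      # L2 regularization weight (assumed value)
        self.rng = random.Random(seed)
        self.users = {}     # user id -> latent vector
        self.items = {}     # item id -> latent vector

    def _vec(self, table, key):
        # Lazily create a small random vector for unseen users or items.
        if key not in table:
            table[key] = [self.rng.gauss(0.0, 0.1) for _ in range(self.k)]
        return table[key]

    def predict(self, user, item):
        p = self._vec(self.users, user)
        q = self._vec(self.items, item)
        return sum(a * b for a, b in zip(p, q))

    def update(self, user, item, target=1.0):
        """Apply one SGD step for a single stream element; return the error before the step."""
        p = self._vec(self.users, user)
        q = self._vec(self.items, item)
        err = target - sum(a * b for a, b in zip(p, q))
        for f in range(self.k):
            pf, qf = p[f], q[f]  # use the pre-update values for both gradients
            p[f] += self.lr * (err * qf - self.reg * pf)
            q[f] += self.lr * (err * pf - self.reg * qf)
        return err


# Usage: feed (user, item) events one at a time, as they arrive on a stream.
model = IncrementalMF()
for user, item in [("u1", "i1"), ("u2", "i1"), ("u1", "i2")]:
    model.update(user, item)
```

Each event touches only one user vector and one item vector, which is why sparse user-item data leaves most of the model untouched by any single update. How DISGD partitions these vectors across shared-nothing workers is described in the paper itself and is not reproduced here.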

DISGD: A Distributed Shared-nothing Matrix Factorization for Large Scale Online Recommender Systems

Similar Items