DISGD: A Distributed Shared-nothing Matrix Factorization for Large Scale Online Recommender Systems
With the web-scale data volumes and high velocity of generation rates, it has become crucial that the training process for recommender systems be a continuous process which is performed on live data, i.e., on data streams. In practice, such systems have to address three main requirements including...
| Main Author: | Hazem, Heidy |
|---|---|
| Other Authors: | Awad, Ahmed; Hassan, Ahmed; Sakr, Sherif |
| Published: | 2020 |
| Online Access: | https://bspace.buid.ac.ae/handle/1234/2927 https://openproceedings.org/2020/conf/edbt/paper_200.pdf https://doi.org/10.5441/002/edbt.2020.32 |
| _version_ | 1862980619724652544 |
|---|---|
| author | Hazem, Heidy |
| author2 | Awad, Ahmed; Hassan, Ahmed; Sakr, Sherif |
| author2_role | author; author; author |
| author_facet | Hazem, Heidy; Awad, Ahmed; Hassan, Ahmed; Sakr, Sherif |
| author_role | author |
| dc.creator.none.fl_str_mv | Hazem, Heidy; Awad, Ahmed; Hassan, Ahmed; Sakr, Sherif |
| dc.date.none.fl_str_mv | 2020; 2025-05-06T08:42:43Z; 2025-05-06T08:42:43Z |
| dc.identifier.none.fl_str_mv | Hazem H. et al. (2020) “DiSGD: A distributed shared-nothing matrix factorization for large scale online recommender systems,” Advances in Database Technology - EDBT, 2020-March, pp. 359–362; 2367-2005; https://bspace.buid.ac.ae/handle/1234/2927; https://openproceedings.org/2020/conf/edbt/paper_200.pdf; https://doi.org/10.5441/002/edbt.2020.32 |
| dc.language.none.fl_str_mv | en |
| dc.publisher.none.fl_str_mv | Open proceedings org |
| dc.relation.none.fl_str_mv | 23rd International Conference on Extending Database Technology |
| dc.title.none.fl_str_mv | DISGD: A Distributed Shared-nothing Matrix Factorization for Large Scale Online Recommender Systems |
| dc.type.none.fl_str_mv | Article |
| description | With the web-scale data volumes and high velocity of generation rates, it has become crucial that the training process for recommender systems be a continuous process which is performed on live data, i.e., on data streams. In practice, such systems have to address three main requirements: the ability to adapt their trained model with each incoming data element, the ability to handle concept drifts, and the ability to scale with the volume of the data. In principle, matrix factorization is one of the popular approaches to train a recommender model. Stochastic Gradient Descent (SGD) has been a successful optimization approach for matrix factorization. Several approaches have been proposed that handle the first and second requirements. For the third requirement, in the realm of data streams, distributed approaches depend on a shared-memory architecture. This requires obtaining locks before performing updates. In general, the success of mainstream big data processing systems is supported by their shared-nothing architecture. In this paper, we propose DISGD, a distributed shared-nothing variant of an incremental SGD. The proposal is motivated by an observation that, with large volumes of data, the overwriting of updates (lock-free updates) does not affect the result with sparse user-item matrices. Compared to the baseline incremental approach, our evaluation on several datasets shows not only improvement in processing time but also improved recall by 55%. |
| id | budr_c0fe3ce4743dc26e943f331f76587f58 |
| identifier_str_mv | Hazem H. et al. (2020) “DiSGD: A distributed shared-nothing matrix factorization for large scale online recommender systems,” Advances in Database Technology - EDBT, 2020-March, pp. 359–362. 2367-2005 |
| language_invalid_str_mv | en |
| network_acronym_str | budr |
| network_name_str | The British University in Dubai repository |
| oai_identifier_str | oai:bspace.buid.ac.ae:1234/2927 |
| publishDate | 2020 |
| publisher.none.fl_str_mv | Open proceedings org |
| repository.mail.fl_str_mv | |
| repository.name.fl_str_mv | |
| repository_id_str | |
| spellingShingle | DISGD: A Distributed Shared-nothing Matrix Factorization for Large Scale Online Recommender Systems Hazem, Heidy |
| title | DISGD: A Distributed Shared-nothing Matrix Factorization for Large Scale Online Recommender Systems |
| title_full | DISGD: A Distributed Shared-nothing Matrix Factorization for Large Scale Online Recommender Systems |
| title_fullStr | DISGD: A Distributed Shared-nothing Matrix Factorization for Large Scale Online Recommender Systems |
| title_full_unstemmed | DISGD: A Distributed Shared-nothing Matrix Factorization for Large Scale Online Recommender Systems |
| title_short | DISGD: A Distributed Shared-nothing Matrix Factorization for Large Scale Online Recommender Systems |
| title_sort | DISGD: A Distributed Shared-nothing Matrix Factorization for Large Scale Online Recommender Systems |
| url | https://bspace.buid.ac.ae/handle/1234/2927 https://openproceedings.org/2020/conf/edbt/paper_200.pdf https://doi.org/10.5441/002/edbt.2020.32 |
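
The description above mentions incremental SGD for matrix factorization and a shared-nothing layout. The following is a minimal sketch of that general idea, not the authors' implementation: each worker keeps private factor tables and updates them once per incoming event, with no locks or shared state. The class `LocalModel`, the function `route`, the hash-by-user routing rule, and all hyperparameter values are assumptions made for illustration; the record does not describe how DISGD actually partitions users and items.

```python
import random
import zlib


class LocalModel:
    """One worker's private factor tables (assumed layout; nothing is shared)."""

    def __init__(self, k=8, lr=0.05, reg=0.01, seed=0):
        self.k, self.lr, self.reg = k, lr, reg
        self.rng = random.Random(seed)
        self.users = {}  # user id -> latent vector
        self.items = {}  # item id -> latent vector

    def _vec(self, table, key):
        # Lazily create a small random vector for unseen users or items.
        if key not in table:
            table[key] = [self.rng.gauss(0.0, 0.1) for _ in range(self.k)]
        return table[key]

    def update(self, user, item, target=1.0):
        """Apply one SGD step for a single event and return the error before the step."""
        p = self._vec(self.users, user)
        q = self._vec(self.items, item)
        err = target - sum(a * b for a, b in zip(p, q))
        for f in range(self.k):
            pf, qf = p[f], q[f]  # use the old values for both updates
            p[f] += self.lr * (err * qf - self.reg * pf)
            q[f] += self.lr * (err * pf - self.reg * qf)
        return err


def route(user, n_workers):
    """Hypothetical routing rule: all events of one user go to one worker."""
    return zlib.crc32(str(user).encode("utf-8")) % n_workers


# Usage: four independent workers consume a small stream of (user, item) events.
workers = [LocalModel(seed=w) for w in range(4)]
stream = [("u1", "i1"), ("u2", "i1"), ("u1", "i2"), ("u1", "i1")]
for user, item in stream:
    workers[route(user, len(workers))].update(user, item)
```

Because each worker only touches its own dictionaries, no update needs a lock. The trade-off in this sketch is that an item seen by several workers ends up with several independent vectors; how such copies are split or reconciled is a design decision that the record above does not specify.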