Shuffled Linear Regression with Erroneous Observations

Linear regression with shuffled labels is the problem of performing a linear regression fit on datasets whose labels are unknowingly shuffled with respect to their inputs. Such a problem relates to different applications such as genome sequence assembly, sampling and reconstruction of spatial fields...

Bibliographic Details
Main Author: Saab, Samer S. (author)
Other Authors: Saab, Khaled Kamal (author), Saab, Samer S. Jr. (author)
Format: conferenceObject
Published: 2019
Subjects:
Online Access: http://hdl.handle.net/10725/11137
http://dx.doi.org/10.1109/CISS.2019.8692838
http://libraries.lau.edu.lb/research/laur/terms-of-use/articles.php
https://ieeexplore.ieee.org/abstract/document/8692838
_version_ 1864513488306569216
author Saab, Samer S.
author2 Saab, Khaled Kamal
Saab, Samer S. Jr.
author2_role author
author
author_facet Saab, Samer S.
Saab, Khaled Kamal
Saab, Samer S. Jr.
author_role author
dc.creator.none.fl_str_mv Saab, Samer S.
Saab, Khaled Kamal
Saab, Samer S. Jr.
dc.date.none.fl_str_mv 2019-07-24T10:48:42Z
2019-07-24T10:48:42Z
2019
2019-07-24
dc.identifier.none.fl_str_mv 9781728111513
http://hdl.handle.net/10725/11137
http://dx.doi.org/10.1109/CISS.2019.8692838
Saab, S. S., & Saab, K. K. (2019, March). Shuffled Linear Regression with Erroneous Observations. In 2019 53rd Annual Conference on Information Sciences and Systems (CISS) (pp. 1-6). IEEE.
http://libraries.lau.edu.lb/research/laur/terms-of-use/articles.php
https://ieeexplore.ieee.org/abstract/document/8692838
dc.language.none.fl_str_mv en
dc.publisher.none.fl_str_mv IEEE
dc.rights.*.fl_str_mv info:eu-repo/semantics/openAccess
dc.subject.none.fl_str_mv Information science -- Congresses
Telecommunication systems -- Congresses
Electrical engineering -- Congresses
Information theory -- Congresses
dc.title.none.fl_str_mv Shuffled Linear Regression with Erroneous Observations
dc.type.none.fl_str_mv Conference Paper / Proceeding
info:eu-repo/semantics/publishedVersion
info:eu-repo/semantics/conferenceObject
description Linear regression with shuffled labels is the problem of fitting a linear regression to datasets whose labels are shuffled by an unknown permutation with respect to their inputs. The problem arises in applications such as genome sequence assembly, sampling and reconstruction of spatial fields, and communication networks. Existing methods are applicable only to data with limited observation errors, work only for partially shuffled data, are sensitive to initialization, and/or work only in small dimensions. This paper tackles the problem in its full generality using stochastic approximation based on a first-order permutation-invariant constraint. We propose an optimal recursive algorithm that updates the estimate from the underdetermined function that is based on that permutation-invariant constraint. The proposed algorithm aims for per-iteration minimization of the mean square estimate error. Although our algorithm is sensitive to initialization errors, to the best of our knowledge, the resulting method is the first working solution for arbitrarily large dimensions and arbitrarily large observation errors, while its computational cost appears insignificant. Numerical simulations show that our method with shuffled datasets can outperform the ordinary least squares method without shuffling. We also consider a batch approach to this problem in which the datasets are independently available. The batch solution we propose is independent of initialization but requires the number of such datasets to be at least equal to the dimension of the unknown vector.
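
The batch idea in the abstract can be illustrated with a minimal sketch. It assumes that the "first-order permutation-invariant constraint" refers to the fact that the sum of the labels does not change under shuffling, so each dataset contributes one scalar equation, sum(y) = sum_rows(X) · w + noise. This is an illustration under that assumption, not the authors' algorithm; the function names `make_shuffled_dataset` and `first_order_batch_estimate` and all parameter values are hypothetical.

```python
import numpy as np


def make_shuffled_dataset(rng, w, n, noise_std):
    """Draw n inputs, compute noisy labels, then shuffle the labels."""
    X = rng.normal(loc=1.0, size=(n, w.size))
    y = X @ w + noise_std * rng.normal(size=n)
    return X, rng.permutation(y)  # the label order is lost here


def first_order_batch_estimate(datasets):
    """Estimate w using only quantities that are invariant to shuffling.

    Assumption: each dataset (X, y_shuffled) yields one equation
    sum(y_shuffled) = X.sum(axis=0) @ w (+ summed noise).
    With K >= d datasets the stacked system can be solved by least squares.
    """
    A = np.array([X.sum(axis=0) for X, _ in datasets])  # shape (K, d)
    b = np.array([y.sum() for _, y in datasets])        # shape (K,)
    w_hat, *_ = np.linalg.lstsq(A, b, rcond=None)
    return w_hat


rng = np.random.default_rng(0)
w_true = np.array([2.0, -1.0, 0.5])  # hypothetical unknown vector, d = 3

# K = 6 independent datasets, noise-free so the recovery is exact
# up to floating-point error.
datasets = [make_shuffled_dataset(rng, w_true, n=50, noise_std=0.0)
            for _ in range(6)]
w_hat = first_order_batch_estimate(datasets)
print(w_hat)
```

Each dataset supplies only one equation, which is why a single dataset leaves the problem underdetermined and why at least as many datasets as unknowns are needed in this sketch. With nonzero `noise_std` the estimate is no longer exact, and its accuracy depends on how well conditioned the stacked column sums are.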
eu_rights_str_mv openAccess
format conferenceObject
id LAURepo_1a0e902b26eaa40317a971fbb65329d5
identifier_str_mv 9781728111513
Saab, S. S., & Saab, K. K. (2019, March). Shuffled Linear Regression with Erroneous Observations. In 2019 53rd Annual Conference on Information Sciences and Systems (CISS) (pp. 1-6). IEEE.
language_invalid_str_mv en
network_acronym_str LAURepo
network_name_str Lebanese American University repository
oai_identifier_str oai:laur.lau.edu.lb:10725/11137
publishDate 2019
publisher.none.fl_str_mv IEEE
repository.mail.fl_str_mv
repository.name.fl_str_mv
repository_id_str
spellingShingle Shuffled Linear Regression with Erroneous Observations
Saab, Samer S.
Information science -- Congresses
Telecommunication systems -- Congresses
Electrical engineering -- Congresses
Information theory -- Congresses
status_str publishedVersion
title Shuffled Linear Regression with Erroneous Observations
title_full Shuffled Linear Regression with Erroneous Observations
title_fullStr Shuffled Linear Regression with Erroneous Observations
title_full_unstemmed Shuffled Linear Regression with Erroneous Observations
title_short Shuffled Linear Regression with Erroneous Observations
title_sort Shuffled Linear Regression with Erroneous Observations
topic Information science -- Congresses
Telecommunication systems -- Congresses
Electrical engineering -- Congresses
Information theory -- Congresses
url http://hdl.handle.net/10725/11137
http://dx.doi.org/10.1109/CISS.2019.8692838
http://libraries.lau.edu.lb/research/laur/terms-of-use/articles.php
https://ieeexplore.ieee.org/abstract/document/8692838