Shuffled Linear Regression with Erroneous Observations
Linear regression with shuffled labels is the problem of performing a linear regression fit on datasets whose labels are unknowingly shuffled with respect to their inputs. Such a problem relates to different applications such as genome sequence assembly, sampling and reconstruction of spatial fields...
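| Main Author: | Saab, Samer S. |
|---|---|
| Other Authors: | Saab, Khaled Kamal; Saab, Samer S. Jr. |
| Format: | conferenceObject |
| Published: | 2019 |
| Subjects: | Information science -- Congresses; Telecommunication systems -- Congresses; Electrical engineering -- Congresses; Information theory -- Congresses |
| Online Access: | http://hdl.handle.net/10725/11137 http://dx.doi.org/10.1109/CISS.2019.8692838 http://libraries.lau.edu.lb/research/laur/terms-of-use/articles.php https://ieeexplore.ieee.org/abstract/document/8692838 |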
| _version_ | 1864513488306569216 |
|---|---|
| author | Saab, Samer S. |
| author2 | Saab, Khaled Kamal; Saab, Samer S. Jr. |
| author2_role | author; author |
| author_facet | Saab, Samer S.; Saab, Khaled Kamal; Saab, Samer S. Jr. |
| author_role | author |
| dc.creator.none.fl_str_mv | Saab, Samer S.; Saab, Khaled Kamal; Saab, Samer S. Jr. |
| dc.date.none.fl_str_mv | 2019-07-24T10:48:42Z; 2019-07-24T10:48:42Z; 2019; 2019-07-24 |
| dc.identifier.none.fl_str_mv | 9781728111513; http://hdl.handle.net/10725/11137; http://dx.doi.org/10.1109/CISS.2019.8692838; Saab, S. S., & Saab, K. K. (2019, March). Shuffled Linear Regression with Erroneous Observations. In 2019 53rd Annual Conference on Information Sciences and Systems (CISS) (pp. 1-6). IEEE.; http://libraries.lau.edu.lb/research/laur/terms-of-use/articles.php; https://ieeexplore.ieee.org/abstract/document/8692838 |
| dc.language.none.fl_str_mv | en |
| dc.publisher.none.fl_str_mv | IEEE |
| dc.rights.*.fl_str_mv | info:eu-repo/semantics/openAccess |
| dc.subject.none.fl_str_mv | Information science -- Congresses; Telecommunication systems -- Congresses; Electrical engineering -- Congresses; Information theory -- Congresses |
| dc.title.none.fl_str_mv | Shuffled Linear Regression with Erroneous Observations |
| dc.type.none.fl_str_mv | Conference Paper / Proceeding; info:eu-repo/semantics/publishedVersion; info:eu-repo/semantics/conferenceObject |
| description | Linear regression with shuffled labels is the problem of performing a linear regression fit on datasets whose labels are unknowingly shuffled with respect to their inputs. Such a problem relates to different applications such as genome sequence assembly, sampling and reconstruction of spatial fields, and communication networks. Existing methods are applicable only to data with limited observation errors, work only for partially shuffled data, are sensitive to initialization, and/or work only with small dimensions. This paper tackles this problem in its full generality using stochastic approximation, which is based on a first-order permutation-invariant constraint. We propose an optimal recursive algorithm that updates the estimate from the underdetermined function that is based on that permutation-invariant constraint. The proposed algorithm aims for per-iteration minimization of the mean square estimate error. Although our algorithm is sensitive to initialization errors, to the best of our knowledge, the resulting method is the first working solution for arbitrarily large dimensions and arbitrarily large observation errors while its computation throughput appears insignificant. Numerical simulations show that our method with shuffled datasets can outperform the ordinary least squares method without shuffling. We also consider a batch process for this problem where the datasets are independently available. The solution we propose is independent of initialization but requires the number of such datasets to be at least equal to the dimension of the unknown vector. |
| eu_rights_str_mv | openAccess |
| format | conferenceObject |
| id | LAURepo_1a0e902b26eaa40317a971fbb65329d5 |
| identifier_str_mv | 9781728111513; Saab, S. S., & Saab, K. K. (2019, March). Shuffled Linear Regression with Erroneous Observations. In 2019 53rd Annual Conference on Information Sciences and Systems (CISS) (pp. 1-6). IEEE. |
| language_invalid_str_mv | en |
| network_acronym_str | LAURepo |
| network_name_str | Lebanese American University repository |
| oai_identifier_str | oai:laur.lau.edu.lb:10725/11137 |
| publishDate | 2019 |
| publisher.none.fl_str_mv | IEEE |
| repository.mail.fl_str_mv | |
| repository.name.fl_str_mv | |
| repository_id_str | |
| spelling | Shuffled Linear Regression with Erroneous Observations; Saab, Samer S.; Saab, Khaled Kamal; Saab, Samer S. Jr.; Information science -- Congresses; Telecommunication systems -- Congresses; Electrical engineering -- Congresses; Information theory -- Congresses; Linear regression with shuffled labels is the problem of performing a linear regression fit on datasets whose labels are unknowingly shuffled with respect to their inputs. Such a problem relates to different applications such as genome sequence assembly, sampling and reconstruction of spatial fields, and communication networks. Existing methods are applicable only to data with limited observation errors, work only for partially shuffled data, are sensitive to initialization, and/or work only with small dimensions. This paper tackles this problem in its full generality using stochastic approximation, which is based on a first-order permutation-invariant constraint. We propose an optimal recursive algorithm that updates the estimate from the underdetermined function that is based on that permutation-invariant constraint. The proposed algorithm aims for per-iteration minimization of the mean square estimate error. Although our algorithm is sensitive to initialization errors, to the best of our knowledge, the resulting method is the first working solution for arbitrarily large dimensions and arbitrarily large observation errors while its computation throughput appears insignificant. Numerical simulations show that our method with shuffled datasets can outperform the ordinary least squares method without shuffling. We also consider a batch process for this problem where the datasets are independently available. The solution we propose is independent of initialization but requires the number of such datasets to be at least equal to the dimension of the unknown vector.; IEEE Information Theory Society; N/A; Includes bibliographical references.; IEEE; 2019-07-24T10:48:42Z; 2019-07-24T10:48:42Z; 2019; 2019-07-24; Conference Paper / Proceeding; info:eu-repo/semantics/publishedVersion; info:eu-repo/semantics/conferenceObject; 9781728111513; http://hdl.handle.net/10725/11137; http://dx.doi.org/10.1109/CISS.2019.8692838; Saab, S. S., & Saab, K. K. (2019, March). Shuffled Linear Regression with Erroneous Observations. In 2019 53rd Annual Conference on Information Sciences and Systems (CISS) (pp. 1-6). IEEE.; http://libraries.lau.edu.lb/research/laur/terms-of-use/articles.php; https://ieeexplore.ieee.org/abstract/document/8692838; en; info:eu-repo/semantics/openAccess; oai:laur.lau.edu.lb:10725/11137; 2021-03-19T10:47:35Z |
| spellingShingle | Shuffled Linear Regression with Erroneous Observations; Saab, Samer S.; Information science -- Congresses; Telecommunication systems -- Congresses; Electrical engineering -- Congresses; Information theory -- Congresses |
| status_str | publishedVersion |
| title | Shuffled Linear Regression with Erroneous Observations |
| title_full | Shuffled Linear Regression with Erroneous Observations |
| title_fullStr | Shuffled Linear Regression with Erroneous Observations |
| title_full_unstemmed | Shuffled Linear Regression with Erroneous Observations |
| title_short | Shuffled Linear Regression with Erroneous Observations |
| title_sort | Shuffled Linear Regression with Erroneous Observations |
| topic | Information science -- Congresses; Telecommunication systems -- Congresses; Electrical engineering -- Congresses; Information theory -- Congresses |
| url | http://hdl.handle.net/10725/11137 http://dx.doi.org/10.1109/CISS.2019.8692838 http://libraries.lau.edu.lb/research/laur/terms-of-use/articles.php https://ieeexplore.ieee.org/abstract/document/8692838 |
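
The batch setting mentioned in the description can be illustrated with a small sketch. It is not the authors' algorithm. It assumes that the "first-order permutation-invariant constraint" is the identity that the sum of the labels equals the summed inputs times the unknown vector, which holds for any shuffling of the labels. The function name `estimate_from_shuffled_batches`, the synthetic data, and the noise level are all hypothetical choices made for illustration.

```python
import numpy as np


def estimate_from_shuffled_batches(datasets):
    """Estimate w from (X, y_shuffled) pairs using only permutation-invariant sums.

    Assumption (not taken from the paper): each dataset satisfies
    sum(y) ~= X.sum(axis=0) @ w, because shuffling y does not change its sum.
    """
    # One linear equation per dataset: column sums of X on the left, sum of y on the right.
    S = np.array([X.sum(axis=0) for X, _ in datasets])  # shape (K, d)
    t = np.array([y.sum() for _, y in datasets])        # shape (K,)
    # Needs K >= d independent datasets for a unique least-squares solution.
    w_hat, *_ = np.linalg.lstsq(S, t, rcond=None)
    return w_hat


# Synthetic example: K independent datasets, each with fully shuffled labels.
rng = np.random.default_rng(0)
d, n, K = 3, 50, 12
w_true = np.array([1.5, -2.0, 0.5])
datasets = []
for _ in range(K):
    X = rng.normal(size=(n, d))
    y = X @ w_true + 0.01 * rng.normal(size=n)  # small observation noise
    datasets.append((X, rng.permutation(y)))    # labels shuffled relative to rows of X

w_hat = estimate_from_shuffled_batches(datasets)
print(w_hat)
```

Each dataset contributes a single scalar equation, which is consistent with the description's requirement that the number of datasets be at least the dimension of the unknown vector.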