Binarization of Degraded Document Images Using Convolutional Neural Networks and Wavelet-Based Multichannel Images
| Main Author: | Younes Akbari |
|---|---|
| Other Authors: | Somaya Al-Maadeed, Kalthoum Adam |
| Published: | 2020 |

| _version_ | 1864513561253904384 |
|---|---|
| author | Younes Akbari (16303286) |
| author2 | Somaya Al-Maadeed (5178131); Kalthoum Adam (16870119) |
| author2_role | author; author |
| author_facet | Younes Akbari (16303286); Somaya Al-Maadeed (5178131); Kalthoum Adam (16870119) |
| author_role | author |
| dc.creator.none.fl_str_mv | Younes Akbari (16303286); Somaya Al-Maadeed (5178131); Kalthoum Adam (16870119) |
| dc.date.none.fl_str_mv | 2020-08-19T00:00:00Z |
| dc.identifier.none.fl_str_mv | 10.1109/access.2020.3017783 |
| dc.relation.none.fl_str_mv | https://figshare.com/articles/journal_contribution/Binarization_of_Degraded_Document_Images_Using_Convolutional_Neural_Networks_and_Wavelet-Based_Multichannel_Images/24016176 |
| dc.rights.none.fl_str_mv | CC BY 4.0; info:eu-repo/semantics/openAccess |
| dc.subject.none.fl_str_mv | Information and computing sciences; Computer vision and multimedia computation; Data management and data science; Machine learning; Wavelet transforms; Image segmentation; Deep learning; Gray-scale; Databases; Text analysis; Semantics; Document image binarization; Wavelet-based multichannel images; Single and multiple CNNs; SegNet; U-net; DeepLabv3+ |
| dc.title.none.fl_str_mv | Binarization of Degraded Document Images Using Convolutional Neural Networks and Wavelet-Based Multichannel Images |
| dc.type.none.fl_str_mv | Text; Journal contribution; info:eu-repo/semantics/publishedVersion; text; contribution to journal |
| description | Convolutional neural networks (CNNs) have been widely used to binarize document images, but these methods struggle with degraded historical documents. This paper proposes using CNNs to identify foreground pixels from novel multichannel images generated from the input. To create these images, the original source image is decomposed into wavelet subbands. The original image is then approximated from each subband separately, and the multichannel image is formed by placing the original source image (grayscale) in the first channel and the approximation from each subband in the remaining channels. Two scenarios are considered, two-channel and four-channel images, and each is fed into two types of CNN architecture: single-stream and multiple-stream. To investigate the effect of the proposed multichannel inputs, the architectures are built on three popular networks: U-net, SegNet, and DeepLabv3+. The experimental results demonstrate that our method outperforms the same three CNNs trained on the original source images and achieves performance competitive with state-of-the-art results on the DIBCO database. Other Information: Published in: IEEE Access. License: https://creativecommons.org/licenses/by/4.0/. See article on publisher's website: https://dx.doi.org/10.1109/access.2020.3017783 |
| eu_rights_str_mv | openAccess |
| id | Manara2_dd34923b375a1ccd230168df98e0a838 |
| identifier_str_mv | 10.1109/access.2020.3017783 |
| network_acronym_str | Manara2 |
| network_name_str | Manara2 |
| oai_identifier_str | oai:figshare.com:article/24016176 |
| publishDate | 2020 |
| rights_invalid_str_mv | CC BY 4.0 |
| status_str | publishedVersion |
| title | Binarization of Degraded Document Images Using Convolutional Neural Networks and Wavelet-Based Multichannel Images |
| topic | Information and computing sciences; Computer vision and multimedia computation; Data management and data science; Machine learning; Wavelet transforms; Image segmentation; Deep learning; Gray-scale; Databases; Text analysis; Semantics; Document image binarization; Wavelet-based multichannel images; Single and multiple CNNs; SegNet; U-net; DeepLabv3+ |
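
The input construction described in the abstract (decompose the image into wavelet subbands, approximate the image from each subband alone, then stack the approximations behind the grayscale original) can be illustrated with a minimal sketch. Everything specific here is an assumption rather than the authors' implementation: the record does not name the wavelet, the decomposition level, or which subbands make up the two-channel and four-channel inputs, so the sketch uses a one-level Haar transform written directly in NumPy, and the function names (`haar_dwt2`, `haar_idwt2`, `approximate_from_subband`, `build_multichannel`) are hypothetical. The subband labels LL, LH, HL, HH follow one common convention; other libraries may swap LH and HL.

```python
import numpy as np

SUBBANDS = ("LL", "LH", "HL", "HH")


def haar_dwt2(img):
    """One-level orthonormal 2-D Haar transform (assumed wavelet, not from the paper)."""
    img = np.asarray(img, dtype=float)
    if img.ndim != 2 or img.shape[0] % 2 or img.shape[1] % 2:
        raise ValueError("expected a 2-D image with even height and width")
    # The four pixels of every non-overlapping 2x2 block.
    a, b = img[0::2, 0::2], img[0::2, 1::2]
    c, d = img[1::2, 0::2], img[1::2, 1::2]
    return {
        "LL": (a + b + c + d) / 2,  # local average
        "LH": (a + b - c - d) / 2,  # difference between top and bottom rows
        "HL": (a - b + c - d) / 2,  # difference between left and right columns
        "HH": (a - b - c + d) / 2,  # diagonal difference
    }


def haar_idwt2(bands):
    """Inverse of haar_dwt2: rebuild the full-size image from four subbands."""
    ll, lh, hl, hh = (bands[name] for name in SUBBANDS)
    out = np.empty((2 * ll.shape[0], 2 * ll.shape[1]), dtype=float)
    out[0::2, 0::2] = (ll + lh + hl + hh) / 2
    out[0::2, 1::2] = (ll + lh - hl - hh) / 2
    out[1::2, 0::2] = (ll - lh + hl - hh) / 2
    out[1::2, 1::2] = (ll - lh - hl + hh) / 2
    return out


def approximate_from_subband(img, name):
    """Approximate the image from one subband by zeroing the other three."""
    if name not in SUBBANDS:
        raise ValueError(f"unknown subband: {name}")
    bands = haar_dwt2(img)
    kept = {k: (v if k == name else np.zeros_like(v)) for k, v in bands.items()}
    return haar_idwt2(kept)


def build_multichannel(img, names):
    """Stack the original image (channel 0) with one approximation per subband."""
    img = np.asarray(img, dtype=float)
    channels = [img] + [approximate_from_subband(img, n) for n in names]
    return np.stack(channels, axis=-1)


if __name__ == "__main__":
    page = np.random.default_rng(0).random((8, 6))  # stand-in for a grayscale page
    two_channel = build_multichannel(page, ["LL"])
    four_channel = build_multichannel(page, ["LH", "HL", "HH"])
    print(two_channel.shape, four_channel.shape)
```

Because the transform is linear, the four single-subband approximations sum back to the original image, which is a convenient sanity check for any implementation of this step. The choice of `["LL"]` and `["LH", "HL", "HH"]` in the usage lines is only one plausible reading of the two-channel and four-channel scenarios.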