Binarization of Degraded Document Images Using Convolutional Neural Networks and Wavelet-Based Multichannel Images

Convolutional neural networks (CNNs) have been widely used to binarize document images, but these methods struggle with degraded historical documents. This paper proposes using CNNs to identify foreground pixels using novel input-generated multichann...

Bibliographic Details
Main Author: Younes Akbari (16303286) (author)
Other Authors: Somaya Al-Maadeed (5178131) (author), Kalthoum Adam (16870119) (author)
Published: 2020
_version_ 1864513561253904384
author Younes Akbari (16303286)
author2 Somaya Al-Maadeed (5178131)
Kalthoum Adam (16870119)
author2_role author
author
author_role author
dc.creator.none.fl_str_mv Younes Akbari (16303286)
Somaya Al-Maadeed (5178131)
Kalthoum Adam (16870119)
dc.date.none.fl_str_mv 2020-08-19T00:00:00Z
dc.identifier.none.fl_str_mv 10.1109/access.2020.3017783
dc.relation.none.fl_str_mv https://figshare.com/articles/journal_contribution/Binarization_of_Degraded_Document_Images_Using_Convolutional_Neural_Networks_and_Wavelet-Based_Multichannel_Images/24016176
dc.rights.none.fl_str_mv CC BY 4.0
info:eu-repo/semantics/openAccess
dc.subject.none.fl_str_mv Information and computing sciences
Computer vision and multimedia computation
Data management and data science
Machine learning
Wavelet transforms
Image segmentation
Deep learning
Gray-scale
Databases
Text analysis
Semantics
Document image binarization
Wavelet-based multichannel images
Single and multiple CNNs
SegNet
U-net
DeepLabv3+
dc.title.none.fl_str_mv Binarization of Degraded Document Images Using Convolutional Neural Networks and Wavelet-Based Multichannel Images
dc.type.none.fl_str_mv Text
Journal contribution
info:eu-repo/semantics/publishedVersion
text
contribution to journal
description <p>Convolutional neural networks (CNNs) have been widely used to binarize document images, but these methods struggle with degraded historical documents. This paper proposes using CNNs to identify foreground pixels from novel multichannel images generated from the input. To create these images, the original source image is decomposed into wavelet subbands, and the original image is then approximated from each subband separately. The multichannel image is formed by placing the original grayscale source image in the first channel and the per-subband approximations in the remaining channels. Two scenarios are considered, two-channel and four-channel images, each fed into two types of CNN architecture: single-stream and multiple-stream. To investigate the effect of the proposed multichannel inputs, three popular networks are used within these architectures: U-net, SegNet, and DeepLabv3+. Experimental results show that the method outperforms the same three CNNs trained on the original source images and is competitive with state-of-the-art results on the DIBCO database.</p><h2>Other Information</h2><p>Published in: IEEE Access<br>License: <a href="https://creativecommons.org/licenses/by/4.0/legalcode" target="_blank">https://creativecommons.org/licenses/by/4.0/</a><br>See article on publisher's website: <a href="https://dx.doi.org/10.1109/access.2020.3017783" target="_blank">https://dx.doi.org/10.1109/access.2020.3017783</a></p>
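The channel construction described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation. The record names neither the wavelet family, the decomposition depth, nor which subbands make up the two-channel and four-channel inputs, so the single-level Haar transform, the subband labels, the chosen subbands, and all function names below are assumptions made for illustration.

```python
import numpy as np

# Subband labels are an assumed naming; conventions for LH/HL differ between libraries.
SUBBANDS = ("LL", "LH", "HL", "HH")


def haar_dwt2(img):
    """Single-level orthonormal 2D Haar transform (image sides must be even)."""
    img = np.asarray(img, dtype=np.float64)
    a, b = img[0::2, 0::2], img[0::2, 1::2]  # top-left, top-right of each 2x2 block
    c, d = img[1::2, 0::2], img[1::2, 1::2]  # bottom-left, bottom-right
    return {
        "LL": (a + b + c + d) / 2.0,
        "LH": (a + b - c - d) / 2.0,
        "HL": (a - b + c - d) / 2.0,
        "HH": (a - b - c + d) / 2.0,
    }


def haar_idwt2(bands):
    """Inverse of haar_dwt2: rebuild the full-size image from four subbands."""
    ll, lh, hl, hh = (bands[k] for k in SUBBANDS)
    h, w = ll.shape
    out = np.empty((2 * h, 2 * w), dtype=np.float64)
    out[0::2, 0::2] = (ll + lh + hl + hh) / 2.0
    out[0::2, 1::2] = (ll + lh - hl - hh) / 2.0
    out[1::2, 0::2] = (ll - lh + hl - hh) / 2.0
    out[1::2, 1::2] = (ll - lh - hl + hh) / 2.0
    return out


def approximate_from_subband(img, name):
    """Approximate the image from one subband by zeroing the other three."""
    bands = haar_dwt2(img)
    kept = {k: (v if k == name else np.zeros_like(v)) for k, v in bands.items()}
    return haar_idwt2(kept)


def multichannel_image(gray, subbands):
    """Stack the grayscale image (channel 0) with one approximation per subband."""
    gray = np.asarray(gray, dtype=np.float64)
    channels = [gray] + [approximate_from_subband(gray, s) for s in subbands]
    return np.stack(channels, axis=-1)


# Synthetic stand-in for a grayscale document image.
rng = np.random.default_rng(0)
gray = rng.random((8, 8))

# Assumed subband choices for the two scenarios named in the abstract.
two_channel = multichannel_image(gray, ["LL"])
four_channel = multichannel_image(gray, ["LL", "LH", "HL"])
```

Because the transform is linear, the four single-subband approximations sum back to the original image, which is a convenient sanity check on any implementation of this step. The resulting arrays have shape (height, width, channels) and would serve as the input tensor for a segmentation network such as those named in the abstract.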
eu_rights_str_mv openAccess
id Manara2_dd34923b375a1ccd230168df98e0a838
identifier_str_mv 10.1109/access.2020.3017783
network_acronym_str Manara2
network_name_str Manara2
oai_identifier_str oai:figshare.com:article/24016176
publishDate 2020
rights_invalid_str_mv CC BY 4.0
status_str publishedVersion
title Binarization of Degraded Document Images Using Convolutional Neural Networks and Wavelet-Based Multichannel Images