Character convolutions for Arabic Named Entity Recognition with Long Short-Term Memory Networks

Named Entity Recognition (NER) is a significant information extraction task since it is an important component of many natural language processing applications, such as Information Retrieval, Question Answering and Speech Recognition. The complexity and morphological richness of the Arabic languag...

Bibliographic Details
Main Author: Khalifa, Muhammad (author)
Other Authors: Shaalan, Khaled (author)
Published: 2019
Subjects: Named Entity Recognition; Arabic; Recurrent Neural Network; LSTM; Convolutional Neural Network
Online Access: https://bspace.buid.ac.ae/handle/1234/2785
https://doi.org/10.1016/j.csl.2019.05.003.
_version_ 1862980614668419072
author Khalifa, Muhammad
author2 Shaalan, Khaled
author2_role author
author_facet Khalifa, Muhammad
Shaalan, Khaled
author_role author
dc.creator.none.fl_str_mv Khalifa, Muhammad
Shaalan, Khaled
dc.date.none.fl_str_mv 2019
2025-02-10T05:32:28Z
2025-02-10T05:32:28Z
dc.identifier.none.fl_str_mv Khalifa, M. and Shaalan, K. (2019) “Character convolutions for Arabic Named Entity Recognition with Long Short-Term Memory Networks,” Computer Speech & Language, 58, pp. 335–346.
0885-2308
https://bspace.buid.ac.ae/handle/1234/2785
https://doi.org/10.1016/j.csl.2019.05.003.
dc.language.none.fl_str_mv en
dc.publisher.none.fl_str_mv Elsevier
dc.relation.none.fl_str_mv Computer Speech & Language v58 (201911): 335-346
dc.subject.none.fl_str_mv Named Entity Recognition; Arabic; Recurrent Neural Network; LSTM; Convolutional Neural Network
dc.title.none.fl_str_mv Character convolutions for Arabic Named Entity Recognition with Long Short-Term Memory Networks
dc.type.none.fl_str_mv Article
description Named Entity Recognition (NER) is a significant information extraction task since it is an important component of many natural language processing applications, such as Information Retrieval, Question Answering and Speech Recognition. The complexity and morphological richness of the Arabic language is the main reason why most existing Arabic NER systems rely strongly on hand-crafted feature engineering. In this paper, we propose to augment the existing LSTM neural tagging model for Arabic NER with a Convolutional Neural Network (CNN) for the extraction of relevant character-level features. By operating on the character-level, the proposed model is able to handle out-of-vocabulary words. Our results show that character CNN is able to outperform the previously used character-level Bi-directional Long Short-Term Memory Networks (BiLSTM) in many settings. Moreover, our observations indicate that CNNs tend to perform better than BiLSTM on relatively longer tokens. In addition, we conduct a comparison of four different pre-trained word vector models for Arabic NER and results show that a Skip-Gram Word2vec model, pre-trained on a subset of the Arabic Gigaword corpus, is generally sufficient to obtain acceptable Arabic NER performance.
id budr_56160f149245f13ae3326fa445c2a5f4
identifier_str_mv Khalifa, M. and Shaalan, K. (2019) “Character convolutions for Arabic Named Entity Recognition with Long Short-Term Memory Networks,” Computer Speech & Language, 58, pp. 335–346.
0885-2308
language_invalid_str_mv en
network_acronym_str budr
network_name_str The British University in Dubai repository
oai_identifier_str oai:bspace.buid.ac.ae:1234/2785
publishDate 2019
publisher.none.fl_str_mv Elsevier
repository.mail.fl_str_mv
repository.name.fl_str_mv
repository_id_str
title Character convolutions for Arabic Named Entity Recognition with Long Short-Term Memory Networks
topic Named Entity Recognition; Arabic; Recurrent Neural Network; LSTM; Convolutional Neural Network
url https://bspace.buid.ac.ae/handle/1234/2785
https://doi.org/10.1016/j.csl.2019.05.003.