Corpus for text classification for domain of knowledge and style of writing.

Bibliographic Details
Main Author: Alexandr Parahonco (20527484) (author)
Published: 2025
Subjects:
_version_ 1852023818994843648
author Alexandr Parahonco (20527484)
author_facet Alexandr Parahonco (20527484)
author_role author
dc.creator.none.fl_str_mv Alexandr Parahonco (20527484)
dc.date.none.fl_str_mv 2025-01-08T13:10:27Z
dc.identifier.none.fl_str_mv 10.6084/m9.figshare.28163537.v1
dc.relation.none.fl_str_mv https://figshare.com/articles/dataset/Corpus_for_text_classification_for_domain_of_knowledge_and_style_of_writing_/28163537
dc.rights.none.fl_str_mv CC BY 4.0
info:eu-repo/semantics/openAccess
dc.subject.none.fl_str_mv Machine learning not elsewhere classified
Corpus linguistics
Recommendation System and Algorithm
NLP
text classification study
quality assessment
method of feature extraction
dc.title.none.fl_str_mv Corpus for text classification for domain of knowledge and style of writing.
dc.type.none.fl_str_mv Dataset
info:eu-repo/semantics/publishedVersion
dataset
description <p dir="ltr">Recommendation systems (RS) are widely used to support a range of activities. These systems assist users by offering personalized recommendations based on their interests and requirements. We have developed a system that generates content using web scraping and automatic text summarization. To select the most valuable text, we need appropriate metrics based on text analysis. This article therefore proposes a system of indicators for recommending texts for further summarization. It begins with a classification of recommendation systems and a general review of the content generation system.</p><p dir="ltr">The <b>purpose</b> of this study is to develop a recommendation system for content generation and to test the first module of the system, “Context of the Sources”. The <b>object of analysis</b> is a set of algorithms, services, or other software products used to determine a particular user's preferences. The <b>subject of the study</b> is natural language processing (NLP) methods combined with supervised learning methods, specifically classification.</p><p dir="ltr">The result of the study is an empirical assessment of the first RS module, “Context of the Sources”, covering academic, security, and non-security domains using a text classification approach. The conclusion discusses the results of the experiment and the prospects for implementing the RS.</p>
eu_rights_str_mv openAccess
id Manara_bbfc0fc589037a67f80fb79942c79bc5
identifier_str_mv 10.6084/m9.figshare.28163537.v1
network_acronym_str Manara
network_name_str ManaraRepo
oai_identifier_str oai:figshare.com:article/28163537
publishDate 2025
repository.mail.fl_str_mv
repository.name.fl_str_mv
repository_id_str
rights_invalid_str_mv CC BY 4.0
status_str publishedVersion
title Corpus for text classification for domain of knowledge and style of writing.