Beyond the surface: stylometric analysis of GPT-4o’s capacity for literary style imitation

This study aims to explore the ability of GPT-4o to imitate the literary style of renowned authors. Ernest Hemingway and Mary Shelley were selected due to their contrasting literary styles and their overall impact on world literature. Using three distinct prompting strat...

Bibliographic Details
Main Author: George Mikros (19197997) (author)
Published: 2025
Subjects:
_version_ 1864513534233149440
author George Mikros (19197997)
author_facet George Mikros (19197997)
author_role author
dc.creator.none.fl_str_mv George Mikros (19197997)
dc.date.none.fl_str_mv 2025-04-23T09:00:00Z
dc.identifier.none.fl_str_mv 10.1093/llc/fqaf035
dc.relation.none.fl_str_mv https://figshare.com/articles/journal_contribution/Beyond_the_surface_stylometric_analysis_of_GPT-4o_s_capacity_for_literary_style_imitation/30405631
dc.rights.none.fl_str_mv CC BY 4.0
info:eu-repo/semantics/openAccess
dc.subject.none.fl_str_mv Information and computing sciences
Artificial intelligence
Machine learning
Language, communication and culture
Linguistics
stylistic imitation
large language models (LLMs)
GPT-4o
authorship attribution
stylometric analysis
in-context learning
dc.title.none.fl_str_mv Beyond the surface: stylometric analysis of GPT-4o’s capacity for literary style imitation
dc.type.none.fl_str_mv Text
Journal contribution
info:eu-repo/semantics/publishedVersion
text
contribution to journal
description This study explores the ability of GPT-4o to imitate the literary style of renowned authors. Ernest Hemingway and Mary Shelley were selected for their contrasting literary styles and their overall impact on world literature. Using three distinct prompting strategies (zero-shot generation, zero-shot imitation, and in-context learning), we generated forty-five stylistic imitations and analyzed them alongside the authors' original texts. To ensure thematic consistency, we constrained the generated texts to shared narrative themes derived from the authors' works. We applied a distance-based approach to authorship attribution, using the 1,000 most frequent words and cosine distance, to explore how the large language model's imitations were positioned in the multidimensional authorship space. We then repeated the authorship attribution task with a random forest classifier to further analyze the authorship distinctiveness of the GPT imitations. The classifier used a combination of Textual Complexity and Readability, Author Multilevel N-gram Profiles, Word Embeddings, and Linguistic Inquiry and Word Count features. We further evaluated the stylistic alignment between original and GPT-generated texts with t-SNE visualizations. The findings reveal that while GPT-4o captures some surface-level stylistic elements of the authors, it struggles to fully replicate the depth and uniqueness of their stylometric signatures. Imitations generated via in-context learning showed improved alignment with the original authors but still exhibited significant overlap with generic GPT outputs.
Other Information
Published in: Digital Scholarship in the Humanities
License: https://creativecommons.org/licenses/by/4.0/
See article on publisher's website: https://dx.doi.org/10.1093/llc/fqaf035
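The distance-based step named in the description (most frequent words plus cosine distance) can be illustrated with a minimal sketch. This is not the author's pipeline: the tokenizer, the use of plain relative frequencies without any standardization, and the helper names `tokenize`, `mfw_vocabulary`, `relative_frequencies`, and `cosine_distance` are all assumptions made for illustration, and the sample strings are placeholders rather than texts from the study.

```python
import math
import re
from collections import Counter


def tokenize(text):
    # Assumed tokenizer: lowercase alphabetic tokens (apostrophes kept).
    return re.findall(r"[a-z']+", text.lower())


def mfw_vocabulary(texts, n=1000):
    # Pool all texts and keep the n most frequent words (MFW).
    counts = Counter()
    for text in texts:
        counts.update(tokenize(text))
    return [word for word, _ in counts.most_common(n)]


def relative_frequencies(text, vocab):
    # Represent one text as relative frequencies over the MFW vocabulary.
    tokens = tokenize(text)
    counts = Counter(tokens)
    total = len(tokens) or 1  # avoid division by zero on empty input
    return [counts[word] / total for word in vocab]


def cosine_distance(a, b):
    # 1 - cosine similarity; defined as 1.0 if either vector is all zeros.
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 1.0
    return 1.0 - dot / (norm_a * norm_b)


if __name__ == "__main__":
    # Placeholder snippets, NOT material from the study's corpus.
    samples = {
        "original_a": "the man sat and the sea was calm and he waited",
        "original_b": "it was with a trembling heart that i beheld the creature",
        "imitation": "the man waited and the water was still and he sat",
    }
    vocab = mfw_vocabulary(samples.values(), n=1000)
    vectors = {k: relative_frequencies(v, vocab) for k, v in samples.items()}
    for name in ("original_a", "original_b"):
        d = cosine_distance(vectors["imitation"], vectors[name])
        print(f"imitation vs {name}: {d:.3f}")
```

A smaller distance places an imitation closer to a given author in the word-frequency space. Stylometric studies often standardize frequencies before measuring distance; that step is omitted here because the description does not specify it.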
eu_rights_str_mv openAccess
id Manara2_16fee8933d9bda8c85e9fea30e7ccde7
identifier_str_mv 10.1093/llc/fqaf035
network_acronym_str Manara2
network_name_str Manara2
oai_identifier_str oai:figshare.com:article/30405631
publishDate 2025
repository.mail.fl_str_mv
repository.name.fl_str_mv
repository_id_str
rights_invalid_str_mv CC BY 4.0
status_str publishedVersion
title Beyond the surface: stylometric analysis of GPT-4o’s capacity for literary style imitation
topic Information and computing sciences
Artificial intelligence
Machine learning
Language, communication and culture
Linguistics
stylistic imitation
large language models (LLMs)
GPT-4o
authorship attribution
stylometric analysis
in-context learning