Beyond the surface: stylometric analysis of GPT-4o’s capacity for literary style imitation
This study aims to explore the ability of GPT-4o to imitate the literary style of renowned authors. Ernest Hemingway and Mary Shelley were selected due to their contrasting literary styles and their overall impact on world literature. Using three distinct prompting strat...
| Main author: | George Mikros |
|---|---|
| Published: | 2025 |
| Subjects: | |
| _version_ | 1864513534233149440 |
|---|---|
| author | George Mikros (19197997) |
| author_facet | George Mikros (19197997) |
| author_role | author |
| dc.creator.none.fl_str_mv | George Mikros (19197997) |
| dc.date.none.fl_str_mv | 2025-04-23T09:00:00Z |
| dc.identifier.none.fl_str_mv | 10.1093/llc/fqaf035 |
| dc.relation.none.fl_str_mv | https://figshare.com/articles/journal_contribution/Beyond_the_surface_stylometric_analysis_of_GPT-4o_s_capacity_for_literary_style_imitation/30405631 |
| dc.rights.none.fl_str_mv | CC BY 4.0 info:eu-repo/semantics/openAccess |
| dc.subject.none.fl_str_mv | Information and computing sciences Artificial intelligence Machine learning Language, communication and culture Linguistics stylistic imitation large language models (LLMs) GPT-4o authorship attribution stylometric analysis in-context learning |
| dc.title.none.fl_str_mv | Beyond the surface: stylometric analysis of GPT-4o’s capacity for literary style imitation |
| dc.type.none.fl_str_mv | Text Journal contribution info:eu-repo/semantics/publishedVersion text contribution to journal |
| description | This study explores the ability of GPT-4o to imitate the literary style of renowned authors. Ernest Hemingway and Mary Shelley were selected for their contrasting literary styles and their overall impact on world literature. Using three distinct prompting strategies (zero-shot generation, zero-shot imitation, and in-context learning), we generated forty-five stylistic imitations and analyzed them alongside the authors’ original texts. To ensure thematic consistency, we constrained the generated texts to shared narrative themes derived from the authors’ works. We applied a distance-based approach to authorship attribution, using the 1,000 most frequent words and cosine distance, to explore how the large language model’s imitations were positioned in the multidimensional authorship space. We then repeated the authorship attribution task with a random forest classifier to analyze the authorship distinctiveness of the GPT imitations further. We used a combination of Textual Complexity and Readability, Author Multilevel N-gram Profiles, Word Embeddings, and Linguistic Inquiry and Word Count features. t-SNE visualizations were used to further evaluate the stylistic alignment between original and GPT-generated texts. The findings reveal that while GPT-4o captures some surface-level stylistic elements of the authors, it struggles to fully replicate the depth and uniqueness of their stylometric signatures. Imitations generated via in-context learning showed improved alignment with the original authors but still exhibited significant overlap with generic GPT outputs. Other Information: Published in: Digital Scholarship in the Humanities. License: https://creativecommons.org/licenses/by/4.0/. See article on publisher's website: https://dx.doi.org/10.1093/llc/fqaf035 |
| eu_rights_str_mv | openAccess |
| id | Manara2_16fee8933d9bda8c85e9fea30e7ccde7 |
| identifier_str_mv | 10.1093/llc/fqaf035 |
| network_acronym_str | Manara2 |
| network_name_str | Manara2 |
| oai_identifier_str | oai:figshare.com:article/30405631 |
| publishDate | 2025 |
| repository.mail.fl_str_mv | |
| repository.name.fl_str_mv | |
| repository_id_str | |
| rights_invalid_str_mv | CC BY 4.0 |
| spellingShingle | Beyond the surface: stylometric analysis of GPT-4o’s capacity for literary style imitation George Mikros (19197997) Information and computing sciences Artificial intelligence Machine learning Language, communication and culture Linguistics stylistic imitation large language models (LLMs) GPT-4o authorship attribution stylometric analysis in-context learning |
| status_str | publishedVersion |
| title | Beyond the surface: stylometric analysis of GPT-4o’s capacity for literary style imitation |
| title_full | Beyond the surface: stylometric analysis of GPT-4o’s capacity for literary style imitation |
| title_fullStr | Beyond the surface: stylometric analysis of GPT-4o’s capacity for literary style imitation |
| title_full_unstemmed | Beyond the surface: stylometric analysis of GPT-4o’s capacity for literary style imitation |
| title_short | Beyond the surface: stylometric analysis of GPT-4o’s capacity for literary style imitation |
| title_sort | Beyond the surface: stylometric analysis of GPT-4o’s capacity for literary style imitation |
| topic | Information and computing sciences Artificial intelligence Machine learning Language, communication and culture Linguistics stylistic imitation large language models (LLMs) GPT-4o authorship attribution stylometric analysis in-context learning |
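
The description field mentions a distance-based approach to authorship attribution built on the most frequent words and cosine distance. The following is only a minimal sketch of that general technique, not the study's actual pipeline: the tokenizer, the function names (`tokenize`, `most_frequent_words`, `profile`, `cosine_distance`), and the toy sentences are all assumptions introduced here for illustration, and the toy vocabulary is far smaller than the 1,000 words named in the abstract.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into simple word tokens (assumed tokenizer)."""
    return re.findall(r"[a-z']+", text.lower())


def most_frequent_words(texts, n=1000):
    """Return up to n of the most frequent words across all texts."""
    counts = Counter()
    for text in texts:
        counts.update(tokenize(text))
    return [word for word, _ in counts.most_common(n)]


def profile(text, vocabulary):
    """Relative frequency of each vocabulary word within one text."""
    tokens = tokenize(text)
    counts = Counter(tokens)
    total = len(tokens) or 1  # avoid division by zero on empty input
    return [counts[word] / total for word in vocabulary]


def cosine_distance(a, b):
    """1 minus the cosine similarity of two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 1.0  # treat an all-zero profile as maximally distant
    return 1.0 - dot / (norm_a * norm_b)


# Toy placeholder texts, invented for this sketch (not taken from the study).
texts = {
    "original": "The sea was calm and the old man was tired and the boat was small.",
    "imitation": "The river was cold and the man was quiet and the night was long.",
    "generic": "In this narrative, a contemplative individual reflects upon existence.",
}

vocabulary = most_frequent_words(texts.values(), n=1000)
profiles = {name: profile(text, vocabulary) for name, text in texts.items()}

for name in ("imitation", "generic"):
    distance = cosine_distance(profiles["original"], profiles[name])
    print(f"original vs {name}: {distance:.3f}")
```

A smaller distance means the two word-frequency profiles point in a more similar direction. Stylometric work often standardizes the frequencies (for example with z-scores) before measuring distance; whether the study did so is not stated in this record.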