Table 4_Multimodal knowledge expansion widget powered by plant protein phosphorylation database and ChatGPT.xlsx

Bibliographic Details
Main Author: Chunhui Xu (139181) (author)
Other Authors: Yang Yu (4292) (author), Govardhan Khadakkar (22434757) (author), Jiacheng Xie (14364468) (author), Dong Xu (21616) (author), Qiuming Yao (5815991) (author)
Published: 2025
_version_ 1852015783075381248
author Chunhui Xu (139181)
author2 Yang Yu (4292)
Govardhan Khadakkar (22434757)
Jiacheng Xie (14364468)
Dong Xu (21616)
Qiuming Yao (5815991)
author2_role author
author
author
author
author
author_facet Chunhui Xu (139181)
Yang Yu (4292)
Govardhan Khadakkar (22434757)
Jiacheng Xie (14364468)
Dong Xu (21616)
Qiuming Yao (5815991)
author_role author
dc.creator.none.fl_str_mv Chunhui Xu (139181)
Yang Yu (4292)
Govardhan Khadakkar (22434757)
Jiacheng Xie (14364468)
Dong Xu (21616)
Qiuming Yao (5815991)
dc.date.none.fl_str_mv 2025-10-15T05:25:01Z
dc.identifier.none.fl_str_mv 10.3389/fbinf.2025.1687687.s004
dc.relation.none.fl_str_mv https://figshare.com/articles/dataset/Table_4_Multimodal_knowledge_expansion_widget_powered_by_plant_protein_phosphorylation_database_and_ChatGPT_xlsx/30361882
dc.rights.none.fl_str_mv CC BY 4.0
info:eu-repo/semantics/openAccess
dc.subject.none.fl_str_mv Bioinformatics
multimodality
large language model
plant protein phosphorylation
information retrieval
pathway identification
dc.title.none.fl_str_mv Table 4_Multimodal knowledge expansion widget powered by plant protein phosphorylation database and ChatGPT.xlsx
dc.type.none.fl_str_mv Dataset
info:eu-repo/semantics/publishedVersion
dataset
description Biological databases are essential for providing curated knowledge, but their rigid data structures and restrictive query formats often limit flexible and exploratory user interactions. In the field of plant phosphorylation, manually curated and reviewed data represent only a small portion of the available knowledge, and users often seek information that goes beyond what is provided in structured databases. While large language models (LLMs) like ChatGPT-4o possess extensive contextual knowledge, integrating this capability into bioinformatics tools remains an open challenge. Here, we present a multimodal question-answering widget that integrates ChatGPT-4o with our Plant Protein Phosphorylation Database (P3DB). This system supports natural language queries and dynamic prompt formulation, enabling users to explore phosphorylation events, kinase-substrate relationships, and protein-protein interactions through a global entry. In another application, the widget leverages ChatGPT’s image interpretation functionality to extract regulatory pathways and phosphorylation markers from complex scientific figures. To build this widget effectively, we have explored multiple prompt strategies, including one-step, two-step, few-shot, and image-cropping techniques, demonstrating their impact on output accuracy and consistency. In addition, recent multimodal LLMs such as ChatGPT-5 and Gemini 1.5 have demonstrated comparable capabilities and adaptability when applied to our test cases and the developed widgets. Together, our application widget and results highlight the development of the ChatGPT-P3DB integration as a system that enhances user accessibility, enables visual extraction, and extends the current utility of biological knowledgebases through a flexible and adaptive framework. Our “ChatGPT-P3DB” is open-source and can be accessed on GitHub (https://github.com/yao-laboratory/p3db-chat). The frontend interface, “P3DB askAI” web module, can be accessed freely through https://www.p3db.org/ask-ai.
eu_rights_str_mv openAccess
id Manara_b85e05ff3cfdb94e4f09f62e65fc059b
identifier_str_mv 10.3389/fbinf.2025.1687687.s004
network_acronym_str Manara
network_name_str ManaraRepo
oai_identifier_str oai:figshare.com:article/30361882
publishDate 2025
rights_invalid_str_mv CC BY 4.0
spellingShingle Table 4_Multimodal knowledge expansion widget powered by plant protein phosphorylation database and ChatGPT.xlsx
Chunhui Xu (139181)
Bioinformatics
multimodality
large language model
plant protein phosphorylation
information retrieval
pathway identification
status_str publishedVersion
title Table 4_Multimodal knowledge expansion widget powered by plant protein phosphorylation database and ChatGPT.xlsx
topic Bioinformatics
multimodality
large language model
plant protein phosphorylation
information retrieval
pathway identification