AutoAdv Results

Large Language Models (LLMs) are susceptible to jailbreaking attacks, where carefully crafted malicious inputs bypass safety guardrails and provoke harmful responses. We introduce AutoAdv, a novel automated framework that generates adversarial prompts and assesses vulner...
Bibliographic Details
Main Author: Aashray Reddy
Other Authors: Andrew Zagula, Nicholas Saban
Published: 2025