AutoAdv Results

Large Language Models (LLMs) are susceptible to jailbreaking attacks, where carefully crafted malicious inputs bypass safety guardrails and provoke harmful responses. We introduce AutoAdv, a novel automated framework that generates adversarial prompts and assesses vulner...
Bibliographic Details
Main Author: Aashray Reddy
Other Authors: Andrew Zagula, Nicholas Saban
Published: 2025