JavaScript Abstract Syntax Trees

Defending Against GCG Jailbreak Attacks with Syntax Trees and Perplexity in LLMs

Abstract: In this paper, we propose a novel classification method that utilizes syntax trees and perplexity to identify jailbreak attacks that use hostile suffixes to make large language models (LLMs) ...
