An attacker crafts inputs to a large language model (LLM) to exploit output integrity controls. Which of the following types of attacks is this an example of?
According to the AAISM framework, prompt injection is the act of deliberately crafting malicious or manipulative inputs to override, bypass, or exploit the model's intended controls. In this case, the attacker is targeting the integrity of the model's outputs by exploiting weaknesses in how it interprets and processes prompts. Jailbreaking is a subtype of prompt injection specifically designed to override safety restrictions, while evasion attacks target classification boundaries in other ML contexts, and remote code execution refers to system-level exploitation outside of the AI inference context. The most accurate classification of this attack is prompt injection.
AAISM Exam Content Outline -- AI Technologies and Controls (Prompt Security and Input Manipulation)
AI Security Management Study Guide -- Threats to Output Integrity
Alberta
10 hours agoIvette
6 days agoAudra
11 days agoKristian
16 days agoTheola
21 days agoDeonna
26 days agoQuentin
1 month agoArlene
1 month agoDiego
1 month agoCeleste
2 months agoYoko
2 months agoViola
2 months agoSharita
2 months agoLeila
2 months agoKattie
2 months agoWenona
3 months agoJudy
3 months agoSalley
3 months agoCeleste
3 months ago