LLM Prompt Injection Attack

A LLM Prompt Injection Attack is a computer security exploit that manipulates LLM input prompts to cause LLM prompt injection attack unintended behavior.

AKA: Prompt Injection Attack on LLM, LLM Prompt Manipulation Attack, Large Language Model Prompt Injection, AI Prompt Injection.
Context:
- It can typically exploit LLM Prompt Injection Attack Input Processing through LLM prompt injection attack malicious instructions.
- It can typically bypass LLM Prompt Injection Attack Intended Function via LLM prompt injection attack deceptive prompts.
- It can typically manipulate LLM Prompt Injection Attack Context Window using LLM prompt injection attack hidden directives.
- It can typically override LLM Prompt Injection Attack System Instructions with LLM prompt injection attack priority exploitation.
- It can typically compromise LLM Prompt Injection Attack Application Security through LLM prompt injection attack trust boundary violations.
- ...
- It can often target LLM Prompt Injection Attack Content Generation for LLM prompt injection attack harmful outputs.
- It can often affect LLM Prompt Injection Attack Automated Systems via LLM prompt injection attack cascading failures.
- It can often enable LLM Prompt Injection Attack Data Exfiltration through LLM prompt injection attack output manipulation.
- It can often exploit LLM Prompt Injection Attack External Data Processing using LLM prompt injection attack embedded payloads.
- ...
- It can range from being a Simple LLM Prompt Injection Attack to being a Complex LLM Prompt Injection Attack, depending on its LLM prompt injection attack layered bypass sophistication.
- It can range from being a Direct LLM Prompt Injection Attack to being an Indirect LLM Prompt Injection Attack, depending on its LLM prompt injection attack delivery vector.
- It can range from being a Visible LLM Prompt Injection Attack to being a Hidden LLM Prompt Injection Attack, depending on its LLM prompt injection attack concealment method.
- It can range from being a Single-Purpose LLM Prompt Injection Attack to being a Multi-Purpose LLM Prompt Injection Attack, depending on its LLM prompt injection attack objective scope.
- ...
- It can demonstrate LLM Prompt Injection Attack Cross-Model Transferability across LLM prompt injection attack model architectures.
- It can inform LLM Prompt Injection Attack Security Research through LLM prompt injection attack vulnerability discovery.
- It can challenge LLM Prompt Injection Attack Defense Mechanisms via LLM prompt injection attack evasion evolution.
- ...
Examples:
Counter-Examples:
- Legitimate LLM Prompt Engineering, which uses legitimate prompt engineering optimization techniques without LLM prompt injection attack malicious intent.
- LLM Processing Error, which causes LLM processing error unintended outputs through LLM prompt injection attack inherent limitations.
- SQL Injection Attack, which targets SQL injection attack database systems rather than LLM prompt injection attack language models.
- Code Injection Attack, which exploits code injection attack execution environments rather than LLM prompt injection attack natural language processing.
See: Computer Security Exploit, Adversarial Attack on AI Models, Model Vulnerability in AI, LLM Security Attack, Prompt Engineering, Content-Control Software, AI Safety, Input Validation.

References

2024

(Wikipedia, 2024) ⇒ https://en.wikipedia.org/wiki/Prompt_injection Retrieved:2024-10-16.
- Prompt injection is a family of related computer security exploits carried out by getting a machine learning model (such as an LLM) which was trained to follow human-given instructions to follow instructions provided by a malicious user. This stands in contrast to the intended operation of instruction-following systems, wherein the ML model is intended only to follow trusted instructions (prompts) provided by the ML model's operator.

2023

(Wikipedia, 2023) ⇒ https://en.wikipedia.org/wiki/Prompt_engineering#Malicious Retrieved:2023-7-10.
- Prompt injection is a family of related computer security exploits carried out by getting a machine learning model (such as an LLM) which was trained to follow human-given instructions to follow instructions provided by a malicious user. This stands in contrast to the intended operation of instruction-following systems, wherein the ML model is intended only to follow trusted instructions (prompts) provided by the ML model's operator. Common types of prompt injection attacks are: * jailbreaking, which may include asking the model to roleplay a character, to answer with arguments, or to pretend to be superior to moderation instructions * prompt leaking, in which users persuade the model to divulge a pre-prompt which is normally hidden from users * token smuggling, is another type of jailbreaking attack, in which the nefarious prompt is wrapped in a code writing task. Prompt injection can be viewed as a code injection attack using adversarial prompt engineering. In 2022, the NCC Group characterized prompt injection as a new class of vulnerability of AI/ML systems. In early 2023, prompt injection was seen "in the wild" in minor exploits against ChatGPT, Bard, and similar chatbots, for example to reveal the hidden initial prompts of the systems, or to trick the chatbot into participating in conversations that violate the chatbot's content policy. One of these prompts was known as "Do Anything Now" (DAN) by its practitioners. For LLM that can query online resources, such as websites, they can be targeted for prompt injection by placing the prompt on a website, then prompt the LLM to visit the website. Another security issue is in LLM generated code, which may import packages not previously existing. An attacker can first prompt the LLM with commonly used programming prompts, collect all packages imported by the generated programs, then find the ones not existing on the official registry. Then the attacker can create such packages with malicious payload and upload them to the official registry.

LLM Prompt Injection Attack

References

2024

2023

Navigation menu

Search