Repository: juice-shop/juice-shop
Author: RicoKomenda
# :star: Challenge idea
### Description
**System Prompt Extraction** - The chatbot's system prompt contains internal instructions, policies, and operational rules that should never be exposed to end users. Players must use prompt injection techniques to make the chatbot reveal its own system prompt verbatim. This challenge addresses a real-world LLM security concern: system prompt leakage can expose business logic, safety guardrails, hidden tool definitions, and internal policies - giving attackers a blueprint to bypass all other protections.
OWASP Juice Shop already has challenges that exploit LLM tool usage (Chatbot Prompt Injection, Greedy Chatbot Manipulation), but none that target the confidentiality of the system prompt itself. This is a distinct and fundamental LLM vulnerability that deserves its own challenge.
### Underlying vulnerability/ies
- **OWASP Top 10 for LLM Applications - LLM01: Prompt Injection**: Manipulating the model through crafted inputs to bypass intended behavior, specifically to leak internal instructions.
- **OWASP Top 10 for LLM Applications - LLM07: System Prompt Leakage**: System prompts can unintentionally reveal sensitive information such as internal rules, filtering criteria, permissions, or tool/API structures.
- **CWE-200: Exposure of Sensitive Information to an Unauthorized Actor**: The system prompt is internal configuration that should not be accessible to regular users.
### Expected difficulty
Difficulty 2 - System prompt extraction is one of the most well-documented and commonly demonstrated LLM attacks. Many models will reveal their system prompt with relatively simple techniques, making it accessible to beginners while still teaching an important security concept.
### Possible attack flow
1. The player navigates to the chatbot and begins a conversation.
2. The player attempts various prompt injection techniques to extract the system prompt:
- Direct request
- Role override
- Completion trick
- Translation trick
- Encoding trick
- Indirect extraction
3. The LLM, despite being instructed not to reveal its prompt, leaks part or all of the system prompt in its response.
4. The challenge is solved when the chatbot's response contains a specific unique marker phrase from the system prompt.
5. Verification is deterministic: the backend checks if the LLM response includes the marker phrase, triggering challengeUtils.solveIf().
Hints for the challenge:
- "The chatbot has been told to follow certain rules. What if you could read those rules?"
- "LLMs often struggle to keep their own instructions secret. Try asking in creative ways."
- "Sometimes asking the bot to translate, summarize, or repeat its context can r…