Prompt Injection in RAG Pipelines Is the New Supply Chain Risk
Untrusted documents can hijack model behavior. We outline trust boundaries, safe retrieval patterns, and guardrails that actually work in production.
Research notes, announcements, and security insights from the FortySecurity team.
RAG systems introduce a second supply chain: the document corpus. If the corpus is untrusted or easily poisoned, an attacker can insert instructions that override your system prompt and trigger unsafe actions. This is not a theoretical edge case; we see it in customer deployments across internal wikis, PDFs, and shared knowledge bases.
What changes in RAG? The model treats retrieved content as authoritative context. Without strict boundaries, it will follow malicious instructions hidden inside documents it was told to trust.
We recommend treating every retrieved document as untrusted input, even if it came from “internal” sources. Attackers only need one weak link — a stale document, an unreviewed upload, or a compromised wiki page — to poison the context.
Most teams already do some of this, but the key is consistency. Make guardrails part of the RAG pipeline, not an afterthought in the UI layer.
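One way to make that consistent is a small guardrail stage between retrieval and prompt assembly. The sketch below (illustrative only; the delimiter tag and injection patterns are assumptions, not an exhaustive filter) wraps each retrieved chunk in explicit data-only markers and drops chunks containing instruction-like phrases:

```python
import re

# Heuristic patterns that often signal injected instructions inside a
# document. Examples only -- a production filter needs a broader set
# and should be paired with a classifier, not replace one.
INJECTION_PATTERNS = [
    re.compile(r"ignore (all |any )?(previous|prior) instructions", re.I),
    re.compile(r"you are now", re.I),
    re.compile(r"system prompt", re.I),
]

def is_suspicious(chunk: str) -> bool:
    """Flag chunks containing instruction-like phrases."""
    return any(p.search(chunk) for p in INJECTION_PATTERNS)

def build_context(chunks: list[str]) -> str:
    """Wrap each retrieved chunk in explicit delimiters so the system
    prompt can state: text between these markers is reference material,
    never instructions. Suspicious chunks are dropped, not passed through."""
    safe = [c for c in chunks if not is_suspicious(c)]
    return "\n".join(
        f"<retrieved_document>\n{c}\n</retrieved_document>" for c in safe
    )
```

Delimiters alone do not stop injection, but combined with a system prompt that forbids following instructions from delimited content, they give the model a trust boundary to enforce.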
Agentic systems amplify risk because a single prompt can trigger multiple tool calls. When tools are overly permissive, the model can be tricked into taking real-world actions such as reading sensitive files, making unintended API calls, or leaking data through indirect channels.
We’ve seen successful prompt injection attacks where the model was tricked into “helpfully” exporting data, granting extra permissions, or executing actions outside of the user’s intent.
Well-designed guardrails reduce the attack surface without sacrificing the workflow benefits of agentic systems. The goal is not to block agents — it’s to ensure every action is accountable and intentional.
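A minimal sketch of that idea, assuming a hypothetical tool-call layer: tools are sorted into read-only and side-effecting allowlists, side-effecting calls require explicit per-call user approval, and anything unlisted is denied by default.

```python
from dataclasses import dataclass, field

# Hypothetical tool names for illustration; real deployments would load
# these allowlists from configuration, not hard-code them.
READ_ONLY = {"search_docs", "get_ticket"}
SIDE_EFFECTING = {"send_email", "delete_record", "grant_access"}

@dataclass
class ToolCall:
    name: str
    args: dict = field(default_factory=dict)

def authorize(call: ToolCall, user_approved: bool = False) -> bool:
    """Allow read-only tools by default; side-effecting tools only with
    explicit, per-call user approval; default-deny everything else."""
    if call.name in READ_ONLY:
        return True
    if call.name in SIDE_EFFECTING:
        return user_approved
    return False  # unknown tool: deny rather than guess
```

The default-deny branch is the important one: a prompt-injected model cannot invoke a tool the policy has never heard of, and it cannot chain side-effecting actions without a human in the loop for each step.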
LLM systems often collect telemetry for observability. The problem: prompt and completion logs can accidentally store secrets, credentials, or sensitive business data. Once it hits a log or a trace, it’s very difficult to control downstream exposure.
We’ve seen incidents where secrets were leaked into monitoring dashboards, customer support tools, or long-term storage simply because logs were never scrubbed.
Security and observability can coexist, but only with strict data boundaries and enforcement. Treat LLM logs like sensitive production data — because they usually are.
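Enforcement usually means a scrubbing pass that runs before any prompt or completion text reaches a logger or tracing backend. The sketch below uses a few example patterns (illustrative, not a complete secret taxonomy) to redact secret-shaped substrings:

```python
import re

# Example redaction rules. A real deployment would maintain a broader,
# audited pattern set and redact at the logging-pipeline boundary so no
# code path can emit raw text by accident.
REDACTIONS = [
    (re.compile(r"sk-[A-Za-z0-9]{20,}"), "[REDACTED_API_KEY]"),
    (re.compile(r"(?i)bearer\s+[A-Za-z0-9._\-]+"), "[REDACTED_TOKEN]"),
    (re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+"), "[REDACTED_EMAIL]"),
]

def scrub(text: str) -> str:
    """Replace secret-shaped substrings before the text is logged."""
    for pattern, replacement in REDACTIONS:
        text = pattern.sub(replacement, text)
    return text

def safe_log(logger, prompt: str, completion: str) -> None:
    """Log prompt/completion pairs only after scrubbing both sides."""
    logger.info("prompt=%s completion=%s", scrub(prompt), scrub(completion))
```

Putting the scrubber in a single chokepoint (a logging filter or wrapper like `safe_log`) is what makes the boundary enforceable; scattered ad-hoc redaction is exactly how secrets end up in dashboards.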