chapter four

4 Securing GenAI

 

This chapter covers

  • Why LLMs cannot tell data from instructions, and what breaks because of it
  • Real attacks at every capability stage: injection, exfiltration, poisoning, credential theft
  • How each new capability (data access, write access, agency) expands the attack surface
  • Matching controls to exposure: policy engines, tiered approval, sandboxing

In traditional software, the boundary between “what to do” and “what to work with” is enforced through technical rules: specific characters, structured formats, or clearly separated fields. When attackers blur that line (as in SQL injection), engineers can fix it by enforcing those boundaries more strictly: validating inputs, using separate channels for commands and data, or blocking dangerous characters. The fix works because the system follows rigid rules about what counts as a command versus what counts as data.
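The database fix described above can be sketched concretely. In the snippet below (illustrative only; the table and values are invented for the example), the concatenated query lets attacker-supplied quote characters cross into the command channel, while the `?` placeholder keeps the same input confined to the data channel:

```python
import sqlite3

# In-memory database with a single illustrative table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT, is_admin INTEGER)")
conn.execute("INSERT INTO users VALUES ('alice', 0)")

malicious = "alice' OR '1'='1"

# Unsafe: string concatenation merges data into the command.
# The quote characters are parsed as SQL, so the WHERE clause
# becomes "name = 'alice' OR '1'='1'" and matches every row.
unsafe_sql = f"SELECT * FROM users WHERE name = '{malicious}'"
unsafe_rows = conn.execute(unsafe_sql).fetchall()
print(unsafe_rows)  # [('alice', 0)] -- the injection worked

# Safe: the ? placeholder sends the input through a separate
# data channel, so its quotes are never treated as SQL syntax.
safe_rows = conn.execute(
    "SELECT * FROM users WHERE name = ?", (malicious,)
).fetchall()
print(safe_rows)  # [] -- no user is literally named "alice' OR '1'='1"
```

The fix works precisely because SQL has a rigid grammar: the driver can enforce, at the protocol level, that a bound parameter is always data. It is this enforceable boundary that LLMs lack.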

LLMs don’t follow those rules. They interpret natural language for meaning, not structure. When you send an LLM the instruction “Summarize this email” followed by the email text, both the instruction and the content arrive as plain text. The model decides which part is a command and which part is data based on context and interpretation, not because one sits in a special field or uses different characters. There’s no reliable technical boundary to reinforce, which is why the injection defenses that work for databases and web forms don’t translate to language models.

4.1 ClaimAssist

4.2 Risk Factors

4.2.1 ClaimAssist and How Risk Factors Stack

4.3 Insecure Environments

4.3.1 Insufficient Logging

4.3.2 Vendor Security Gaps

4.3.3 Supply-Chain Risks

4.3.4 Uncontrolled Usage and Denial of Service

4.3.5 Improper Secret Management

4.3.6 Shadow AI

4.4 Model Risks

4.4.1 Model Weight Risks

4.4.2 Backdoors, Poisoning and Malicious Models

4.4.3 Model Theft

4.4.4 Mitigating Model-Level Risks

4.5 Untrustworthy Input

4.5.1 Beyond Chat-Based Attacks

4.6 Data Security

4.6.1 The Confused Deputy Problem

4.6.2 When Existing Permissions Are Too Broad

4.6.3 Poisoning the Well

4.7 The Ability to Make Changes

4.7.1 Building Controls for AI That Can Act

4.8 Agency

4.8.1 Why Autonomy Changes the Threat Landscape

4.8.2 Agentic Browsers

4.8.3 MCP and the New Tool Ecosystem

4.8.4 Managing What Tools the Agent Can Use

4.8.5 Tool Authorization