chapter one

1 AI Engineering - The Blueprint

This chapter covers

Distinguishing AI Engineering from Prompt Engineering.
Evaluating when simple prompting fails and engineering is needed for scale, integration, or risk management.
Understanding the five architectural layers: routing, RAG, prompts, agents, and infrastructure.
Following a customer query through a production pipeline to observe routing, retrieval, and validation in action.
Diagnosing failure symptoms, such as hallucinations or cost overruns.

In February 2024, a Canadian tribunal ruled against Air Canada after the airline's AI-powered customer service chatbot told a passenger he could request a bereavement fare refund after travel, a policy that did not exist. The tribunal held the airline responsible for the misleading guidance and ordered it to pay a partial refund. The incident was not caused by the model itself but by the absence of disciplined AI engineering practices. Developers had focused on conversational design and prompt tuning, and the chatbot worked smoothly in demonstrations. But production AI requires architecture, not just better prompts. Reliable systems demand automated evaluation pipelines, input validation, hallucination detection, and continuous monitoring. They must also balance competing constraints: response quality, latency, cost per query, and throughput at scale.
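To make the idea of architecture around the model concrete, here is a minimal sketch of a guarded request path. It is not a description of any real airline system: the policy store, the policy ID `REF-001`, the `fake_model` stand-in, and the fallback message are all hypothetical names invented for illustration. The sketch shows three of the practices named above in their simplest form: input validation, an output check that only releases answers citing a known policy, and basic monitoring of latency and fallback counts.

```python
import time
from dataclasses import dataclass, field

# Hypothetical policy store, keyed by an invented policy ID.
POLICIES = {"REF-001": "Refund requests must be submitted through the refunds form."}
MAX_QUERY_CHARS = 500  # assumed limit, chosen only for illustration
FALLBACK = "I can't confirm that. Let me connect you with an agent."


@dataclass
class Metrics:
    """Simple in-memory counters; a real system would export these."""
    latencies_ms: list = field(default_factory=list)
    fallbacks: int = 0


def fake_model(query: str) -> dict:
    """Stand-in for an LLM call: returns an answer and the policy ID it claims to use."""
    if "refund" in query.lower():
        return {"answer": "Submit the refunds form.", "policy_id": "REF-001"}
    # Simulates an unsupported claim: no policy backs this answer.
    return {"answer": "You can claim that after travel.", "policy_id": None}


def answer_query(query: str, metrics: Metrics, model=fake_model) -> str:
    # Input validation: reject empty or oversized queries before calling the model.
    query = query.strip()
    if not query or len(query) > MAX_QUERY_CHARS:
        raise ValueError(f"query must be 1-{MAX_QUERY_CHARS} characters")

    start = time.perf_counter()
    result = model(query)

    # Output check: release the answer only if it cites a policy we actually hold.
    cited = result.get("policy_id")
    if cited in POLICIES:
        text = f"{result['answer']} (source: {cited})"
    else:
        metrics.fallbacks += 1
        text = FALLBACK

    # Monitoring: record latency for every handled query.
    metrics.latencies_ms.append((time.perf_counter() - start) * 1000)
    return text
```

The citation check here is deliberately crude. It confirms only that an answer points at a policy that exists, not that the answer faithfully reflects that policy, which is why the evaluation and detection techniques covered later are still needed.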

1.1 What is AI Engineering?

1.1.1 From Prompts to Production Systems

1.1.2 When You Need AI Engineering vs. Simple Prompts

1.2 Why AI Engineering Delivers Results

1.2.1 Customer Support at Scale

1.2.2 Document Intelligence in Legal Services

1.2.3 Workflow Automation in Operations

1.3 The Blueprint: How Production AI Systems Work

1.3.1 Complete System Architecture

1.3.2 Following the Transaction

1.3.3 The Five Engineering Layers

1.3.4 Diagnosing System Failures

1.4 What You're Building Toward

1.5 Summary