chapter eight

8 AI agents: The rise of autonomous AI systems

This chapter covers

  • Agents and their increasing relevance in AI
  • Training and developing AI agents
  • Present and future applications of AI agents
  • Risks and considerations of using and deploying AI agents

In March 2024, a promotional video from a little-known startup, Cognition AI, went viral on X. It featured the startup’s CEO and “Devin,” described as “the first AI software engineer,” going about its work: planning a solution, browsing the web for API documentation, writing code in a code editor, and executing it from a command line [1]. Cognition AI claimed a 13.86% solve rate on SWE-bench, a software engineering benchmark built from real-world programming tasks, where the previous state of the art had been 4.8%. More memorable than the numbers was the footage of Devin navigating its tools and switching between windows, which resonated with many observers. At a time when major model providers were still positioning their products as pair programmers, viewers felt that they were seeing the future. The implication was that autonomous agents (systems that take actions on their own) would not only assist software engineers but could take on entire coding tasks and, potentially, entire jobs.

What is an AI agent?

How are AI agents being used?

Personal assistants

Enterprise workflows

Research and discovery

Software development

Cybersecurity

Physical environments

Multi-agent systems

Toward agentic collaboration

How are AI agents trained and enabled?

Agent architectures

Retrieval-augmented generation

Model Context Protocol

GUI-native agents

Evaluating agents

Risks and considerations unique to agents

Autonomy and misalignment

Memory and state persistence

Tool access and real-world consequences

Emergent behaviors in multi-agent systems

Security and adversarial risks

Human factors and decision delegation

Evaluation, monitoring, and oversight