8 AI agents: The rise of autonomous AI systems

 

This chapter covers

  • Agents and their increasing relevance in AI
  • Training and developing AI agents
  • Present and future applications of AI agents
  • Risks and considerations of using and deploying AI agents

In March 2024, a promotional video from a little-known startup, Cognition AI, went viral on X. It featured the startup’s CEO and “Devin,” described as “the first AI software engineer,” going about its work: planning a solution, browsing the web for API documentation, writing code in a code editor, and executing it from a command line [1]. Cognition AI claimed a 13.86% solve rate on SWE-bench, a software engineering benchmark built from real-world programming tasks, where the previous state of the art had been 4.8%. Visually more striking was the footage of Devin navigating its tools and switching between windows. At a time when major model providers were still positioning their products as pair programmers, many observers felt they were seeing the future. Autonomous agents, systems that take actions on their own, would not only assist software engineers but could also take on entire coding tasks and, potentially, entire jobs.

8.1 What is an AI agent?

8.2 How are AI agents being used?

8.2.1 Personal assistants

8.2.2 Enterprise workflows

8.2.3 Research and discovery

8.2.4 Software development

8.2.5 Cybersecurity

8.2.6 Physical environments

8.2.7 Multi-agent systems

8.2.8 Toward agentic collaboration

8.3 How are AI agents trained and enabled?

8.3.1 Agent architectures

8.3.2 Retrieval-augmented generation

8.3.3 Model Context Protocol

8.3.4 GUI-native agents

8.3.5 Evaluating agents

8.4 Risks and considerations unique to agents

8.4.1 Autonomy and misalignment

8.4.2 Memory and state persistence

8.4.3 Tool access and real-world consequences

8.4.4 Emergent behaviors in multi-agent systems

8.4.5 Security and adversarial risks

8.4.6 Human factors and decision delegation

8.4.7 Evaluation, monitoring, and oversight

8.4.8 The road ahead

8.5 The future of AI agents

8.6 Conclusion

8.7 Summary