
4 Building Responses and Self-Hosted Clients


This chapter covers

  • Using the Responses API for single and background calls
  • Creating ResponsesClient for OpenAI and Azure OpenAI
  • Running Robby on self-hosted Ollama and ONNX models
  • Handling failures with finish reasons and exceptions

We already introduced Robby as a simple agent. Now we focus on how Robby talks to models in production. The Responses API extends traditional chat interfaces with server-managed state and background processing, making it easier to handle long-running AI tasks without blocking your application. Self-hosted clients give you complete control over the model infrastructure, running locally through Ollama or embedded in your process via ONNX.
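Background processing typically follows a submit-then-poll pattern: create the response with background mode enabled, then periodically retrieve it until it reaches a terminal status. The sketch below shows a minimal polling loop; the `retrieve` callable and the dict-shaped result are illustrative assumptions standing in for whatever client you use, while the status names (`queued`, `in_progress`, `completed`) follow the Responses API lifecycle.

```python
import time

def poll_until_done(retrieve, response_id, interval=2.0, timeout=120.0):
    """Poll a background response until it leaves its pending states.

    `retrieve` is any callable mapping a response id to a dict with a
    "status" key -- a hypothetical stand-in for a real client's
    retrieve call, so the loop works against any backend.
    """
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        resp = retrieve(response_id)
        if resp["status"] not in ("queued", "in_progress"):
            return resp  # terminal: completed, failed, cancelled, ...
        time.sleep(interval)  # back off before asking the service again
    raise TimeoutError(f"response {response_id} still running after {timeout}s")
```

In production, `retrieve` would wrap something like a real client's retrieve-by-id call; keeping it as a parameter makes the loop trivial to unit-test with a fake.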

4.1 What Is the Responses API?

4.1.1 Introduction to ResponsesClient

4.2 Creating and Configuring ResponsesClient

When to use the Responses API: use it for long-running tasks that may exceed typical timeout limits, scenarios where service-managed state simplifies your architecture, background processing with polling patterns, or cases where you want reduced boilerplate for common conversational workflows.

4.2.1 OpenAI Responses Client
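Whatever client wrapper you configure, the call ultimately becomes an HTTP POST to the service's Responses endpoint. The following sketch builds such a request with only the Python standard library, without sending it; the endpoint URL and the `model`/`input`/`background` fields follow the public OpenAI REST API, while the helper name itself is our own.

```python
import json
import urllib.request

OPENAI_RESPONSES_URL = "https://api.openai.com/v1/responses"

def build_responses_request(api_key: str, model: str, user_input: str,
                            background: bool = False) -> urllib.request.Request:
    """Build (but do not send) a POST request for the Responses API."""
    body = {"model": model, "input": user_input}
    if background:
        body["background"] = True  # ask the service to run the call server-side
    return urllib.request.Request(
        OPENAI_RESPONSES_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

Separating request construction from transmission keeps the wire format visible and lets you inspect or log exactly what a configured client would send.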

4.2.2 OpenAI Responses Client with Background Response

4.2.3 Azure OpenAI Responses Client

4.3 Creating and Configuring Self-Hosted Clients

4.3.1 Ollama Client

4.3.2 ONNX Client

4.4 Exception Management and Error Handling

4.4.1 Exception Types and Scenarios

4.4.2 Best Practices for Exception Handling

4.5 Summary