8 The Workflow Service: orchestrating and deploying AI applications

This chapter covers

Building the runtime server that turns decorated functions into HTTP services
Async job storage, polling endpoints, and progress reporting
Deploying with containers for concurrency, automatic retry, and health endpoints
Calling workflows sequentially and in parallel through a single SDK method
The Workflow Service's gRPC contract and the deployment pipeline
Runtime management with Kubernetes

Every platform service we have built so far solves one piece of the AI application puzzle. The Model Service generates responses. The Session Service remembers conversations. The Data Service retrieves organizational knowledge. The Tool Service calls external systems. The Guardrails Service enforces safety policies. The Observability Service tracks costs, latency, and quality. Each service has a gRPC contract, an SDK client, and a clear responsibility, but none of them can turn a developer's code into a running, scalable application that external clients can call. Closing that gap is the job of the Workflow Service, the subject of this chapter.

8.1 Workflow runtime server

8.1.1 The @workflow decorator

8.1.2 From decorated function to HTTP server

8.1.3 Synchronous mode

8.1.4 Streaming mode

8.1.5 Asynchronous mode

8.2 Async job lifecycle

8.2.1 Job storage and the polling endpoint

8.2.2 Progress reporting

8.3 Production readiness: concurrency, reliability, and health

8.3.1 Concurrency inside a single container

8.3.2 Retry at the network layer

8.3.3 Health endpoints

8.4 Workflow composition

8.4.1 Calling other workflows

8.4.2 Parallel workflow calls

8.4.3 Response mode handling

8.5 The Workflow Service contract

8.5.1 Registry operations

8.5.2 Deployment operations

8.6 Deployment pipeline

8.6.1 What genai-platform deploy does

8.6.2 Route registration with the API gateway

8.7 Runtime management

8.8 Summary