https://viewer.diagrams.net/?border=0&lightbox=1&highlight=0000ff&nav=1&title=OnPrem&dark=0#Uhttps%3A%2F%2Fdrive.google.com%2Fuc%3Fid%3D1mjKDG5a0TlUc7b-azFZgBiaeSuuZ2Lxi%26export%3Ddownload
Control plane and data plane separation: xpander Cloud (A) is the control plane. It owns environment metadata, registration, and event logging. The AI engineer uses the AI Agent Workbench in (A) to define agents, configure connectors, and validate tasks. Connectivity from the customer VPC to (A) is outbound only over 443 to the Global Accelerator IPs, which satisfies the “no inbound from Cloud” requirement while keeping runtime data inside the customer VPC.
A user initiates a task from Slack, Teams, or the WebUI in line 1, or from an MCP client in line 1.1. Requests from agent clients are routed through ingress to the Agent Controller (F) in the customer VPC in line 2, while MCP clients connect directly to the MCP Server (G), which in turn creates the task on the Agent Controller (F). The controller handles task CRUD operations, registering and scheduling each task in a queue before the AI Agents process it. Queued work is then consumed by Agent Workers or Serverless Nodes (B), either via the @on_task decorator or automatically in the case of serverless agents.
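The controller's task lifecycle can be sketched as below. This is an illustrative model only: `Task`, `AgentController`, and the `on_task` decorator here are stand-ins, not the actual xpander SDK API.

```python
import queue
import uuid
from dataclasses import dataclass, field

@dataclass
class Task:
    payload: dict
    task_id: str = field(default_factory=lambda: uuid.uuid4().hex)
    status: str = "pending"

class AgentController:
    """Illustrative controller (F): registers tasks and schedules them in a queue."""
    def __init__(self):
        self._queue: "queue.Queue[Task]" = queue.Queue()
        self._tasks: dict[str, Task] = {}

    def create_task(self, payload: dict) -> Task:
        task = Task(payload=payload)
        self._tasks[task.task_id] = task   # task CRUD: create + register
        self._queue.put(task)              # schedule for worker pickup
        return task

    def next_task(self) -> Task:
        task = self._queue.get()           # consumed by a worker/serverless node (B)
        task.status = "running"
        return task

# A worker-side handler in the spirit of the @on_task decorator:
handlers = []
def on_task(fn):
    handlers.append(fn)
    return fn

@on_task
def handle(task: Task) -> str:
    return f"processed {task.task_id}"
```

The point of the sketch is the separation of duties: the controller only registers and schedules; the worker registered via the decorator is what actually executes.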
During execution, the runtime retrieves persistent agent memory from Postgres (4) and performs fast lookups and writes through Redis (C), with cache operations highlighted in line 4.1. Using the task payload and tool schemas, the agent consults the LLM (H) to generate the next conversational turn. Each turn is appended to a messages array and persisted in Postgres (4) for monitoring and observability.
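The load-append-persist cycle for the messages array can be sketched as follows, with an in-memory dict standing in for Postgres (4); the class and function names are illustrative, not the actual runtime's API.

```python
import json

class MemoryStore:
    """In-memory stand-in for the Postgres (4) table holding agent memory."""
    def __init__(self):
        self._rows: dict[str, str] = {}

    def save(self, task_id: str, messages: list[dict]) -> None:
        self._rows[task_id] = json.dumps(messages)  # one JSON document per task

    def load(self, task_id: str) -> list[dict]:
        return json.loads(self._rows.get(task_id, "[]"))

def append_turn(store: MemoryStore, task_id: str, role: str, content: str) -> list[dict]:
    messages = store.load(task_id)                  # retrieve persistent agent memory
    messages.append({"role": role, "content": content})
    store.save(task_id, messages)                   # persist each turn for observability
    return messages
```

Because every turn is written back immediately, the stored messages array is always a complete transcript that monitoring can read without touching the running agent.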
When the LLM (H) issues a function call through its function-calling schema, the payload is handed to the AI Gateway (E) in line 7 as “run function X with payload Y.” The gateway enforces permissions, injects secrets, and augments the call with context from Postgres (4) or Redis (C) before execution. Downstream targets such as GitHub, Grafana, or PowerBI (D) are accessed exclusively through the AI Gateway (E) in line 8, never directly from the agents themselves.
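The gateway's mediation role can be sketched like this; the permission model, secret map, and method names are assumptions chosen for illustration, not the gateway's real interface.

```python
class AIGateway:
    """Illustrative gateway (E): the only component that calls downstream tools (D)."""
    def __init__(self, permissions: dict[str, set[str]], secrets: dict[str, str]):
        self._permissions = permissions  # agent_id -> allowed function names
        self._secrets = secrets          # tool -> credential (a Kubernetes secret in practice)

    def run_function(self, agent_id: str, function: str, payload: dict) -> dict:
        # Enforce permissions before anything leaves the gateway.
        if function not in self._permissions.get(agent_id, set()):
            raise PermissionError(f"{agent_id} may not call {function}")
        tool = function.split(".")[0]
        # Inject the secret here, so the agent never sees the credential.
        enriched = {**payload, "auth": self._secrets[tool]}
        return self._call_downstream(function, enriched)

    def _call_downstream(self, function: str, payload: dict) -> dict:
        # A real implementation would issue the HTTP call to GitHub/Grafana/PowerBI (D).
        return {"function": function, "ok": True}
```

The design consequence is that credentials and downstream reachability live only in (E): an agent that is compromised or misbehaving can request calls but cannot make them.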
The Agent Controller (F) performs outbound synchronization with xpander Cloud (A) in line 9, ensuring that metadata, environment heartbeats, connector schema updates, and event logs stay continuously aligned. All agent and connector definitions are represented as JSON documents and JSON-based metadata.
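A minimal sketch of such a JSON-based agent definition is below; every field name here is an assumption for illustration, not xpander's actual schema.

```python
import json

# Hypothetical agent definition; field names are illustrative, not xpander's schema.
agent_definition = {
    "agent_id": "ops-assistant",
    "environment_id": "env-onprem-01",
    "model": {"provider": "openai", "name": "gpt-4o"},
    "connectors": [
        {"name": "github", "gateway": "ai-gateway.{domain}"},
    ],
    "memory": {"backend": "postgres"},
}

# The serialized document is the kind of metadata that sync on line 9 would carry.
document = json.dumps(agent_definition, indent=2)
```

Representing definitions as plain JSON is what lets the control plane (A) store and version them without ever needing access to runtime data in the VPC.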
Agent definitions and connector configs are authored in the Workbench in (A). When saved, the metadata is registered in (A) and the environment shows as connected. During deployment, Helm installs the self-hosted stack that exposes agent-controller.{domain}, ai-gateway.{domain}, and mcp.{domain}. The components are Agent Controller (F), AI Gateway (E), MCP Server (G), Agent Workers and Serverless Nodes (B), Postgres (4) for agent memory, and Redis (C) for cache. Secrets for model providers are stored as Kubernetes secrets and only used inside the VPC. After deployment, the control plane in (A) remains the source of truth for registrations and events, but all execution moves to the customer VPC.
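A hedged sketch of what the Helm values for such an install might look like; the chart's real values schema is defined by xpander's Helm chart, so every key name below is an assumption.

```yaml
# Hypothetical values.yaml sketch; actual key names come from xpander's Helm chart.
domain: example.internal            # yields agent-controller.example.internal, etc.
ingress:
  hosts:
    - agent-controller.example.internal
    - ai-gateway.example.internal
    - mcp.example.internal
postgres:
  enabled: true                     # agent memory (4)
redis:
  enabled: true                     # cache (C)
secrets:
  existingSecret: model-provider-keys   # Kubernetes secret, used only inside the VPC
```

The pattern worth noting is `existingSecret`: model-provider keys are created as Kubernetes secrets out of band and only referenced by the chart, so they never appear in values files or the control plane.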
Traffic from the cluster to xpander Cloud (A) is egress only on 443, via Global Accelerator or PrivateLink. The Global Accelerator IPs are 15.197.85.80 and 166.117.85.46. The PrivateLink service name is com.amazonaws.vpce.us-west-2.vpce-svc-0101884b32f655197.
No inbound from (A) into the VPC. Data at rest lives in Postgres (4) and Redis (C) in the customer VPC. Connector credentials and model API keys are Kubernetes secrets used only by (E) and (B). Control plane telemetry and registrations flow on line 9; user payloads do not.
xpander's Deployment Manager is a multi-tenant microservice that handles all external communication from self-deployed environments to xpander Cloud. The service authenticates requests using an API key scoped by the tuple organization_id + environment_id + api_key. Every environment reports heartbeats and polls for “commands for execution,” such as connector details, agent details, and MCP details.
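The environment side of this heartbeat/poll protocol can be sketched as below; the endpoint path, header names, and body fields are assumptions for illustration, not xpander's actual wire format.

```python
import json
from urllib import request

class DeploymentManagerClient:
    """Illustrative client for the heartbeat/command-poll loop; paths and
    header names are hypothetical, not xpander's real protocol."""
    def __init__(self, base_url: str, organization_id: str,
                 environment_id: str, api_key: str):
        self.base_url = base_url
        # The auth tuple organization_id + environment_id + api_key, as headers.
        self.headers = {
            "x-organization-id": organization_id,
            "x-environment-id": environment_id,
            "x-api-key": api_key,
            "Content-Type": "application/json",
        }

    def heartbeat_request(self) -> request.Request:
        # Outbound-only: the environment always initiates; (A) never dials in.
        body = json.dumps({"status": "healthy"}).encode()
        return request.Request(self.base_url + "/heartbeat", data=body,
                               headers=self.headers, method="POST")

    def handle_commands(self, commands: list[dict]) -> list[str]:
        # Commands ask the environment to fetch connector/agent/MCP details.
        return [f"executed {c['type']}" for c in commands]
```

Because commands are fetched on the heartbeat rather than pushed, the protocol stays consistent with the egress-only rule: the control plane can only answer, never initiate.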