Kubernetes Agent

k8s-autopilot is a multi-agent framework that automates Kubernetes operations end to end. Built on LangChain and LangGraph, it provides a single platform for Helm chart generation, active cluster management, and ArgoCD onboarding, with human-in-the-loop approval gates for safety.

Status

Aspect	Details
Status	✅ Ready
Version	v0.3.0
Maintained By	TalkOps Team
MCP Servers	Helm MCP, ArgoCD MCP
Protocol	Google A2A Protocol
Framework	LangGraph Multi-Agent
GitHub	talkops-ai/k8s-autopilot

Key Features

Feature	Description
📋 Planning	Analyzes requirements, validates completeness, and designs Kubernetes architecture
⚙️ Generation	Generates Helm templates, values files, and documentation
✅ Validation	Validates charts, performs security scanning, and ensures production readiness
🎮 Active Management	Installs, upgrades, and rolls back Helm releases on active clusters
🧭 ArgoCD Operations	Manages ArgoCD projects, repositories, and applications with previews and approvals
🔄 Self-Healing	Automatically fixes common errors (YAML indentation, deprecated APIs)
👤 Human-in-the-Loop	Requests approvals at critical workflow points for safety
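
The Self-Healing row names two classes of fix: YAML indentation and deprecated APIs. The sketch below shows roughly what such a pass could look like on a rendered, single-document manifest. It is not taken from k8s-autopilot's code: the function name `heal_manifest` and the `API_UPGRADES` table are hypothetical. The migrations listed are real Kubernetes API removals, but the table is only a small subset.

```python
import re

# (kind, deprecated apiVersion) -> current apiVersion.
# Illustrative subset; a real fixer would also adapt fields whose schema
# changed (for example, apps/v1 Deployments require spec.selector).
API_UPGRADES = {
    ("Deployment", "extensions/v1beta1"): "apps/v1",
    ("Deployment", "apps/v1beta1"): "apps/v1",
    ("Ingress", "extensions/v1beta1"): "networking.k8s.io/v1",
}


def heal_manifest(text: str) -> tuple[str, list[str]]:
    """Return the repaired manifest text and a list of the fixes applied."""
    fixes: list[str] = []

    # YAML forbids tabs for indentation: turn each leading tab into two spaces.
    lines = []
    had_tabs = False
    for line in text.splitlines():
        stripped = line.lstrip("\t")
        tabs = len(line) - len(stripped)
        if tabs:
            had_tabs = True
            line = "  " * tabs + stripped
        lines.append(line)
    if had_tabs:
        fixes.append("replaced tab indentation with spaces")
    text = "\n".join(lines)

    # Upgrade a deprecated top-level apiVersion when the kind is known.
    kind = re.search(r"^kind:\s*(\S+)", text, re.MULTILINE)
    api = re.search(r"^apiVersion:\s*(\S+)", text, re.MULTILINE)
    if kind and api:
        new_api = API_UPGRADES.get((kind.group(1), api.group(1)))
        if new_api:
            text = text[: api.start(1)] + new_api + text[api.end(1):]
            fixes.append(f"apiVersion {api.group(1)} -> {new_api}")

    return text, fixes
```

Returning the list of fixes alongside the text keeps the repair auditable, which fits the approval-driven workflow described in this document.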

Architecture

The Kubernetes Agent uses a Supervisor-Worker architecture with specialized deep agent swarms.

Agent Components

Agent	Role
Supervisor Agent	Central orchestrator, manages state flow, HITL gates, and delegation
Planner Agent	Requirement analysis, gap detection, and architecture planning
Template Coordinator	Generates Helm chart templates, values.yaml, and READMEs
Generator (Validator)	Validates charts, self-heals errors, and ensures production readiness
Helm Management Agent	Specialized agent for active cluster operations via MCP
ArgoCD Onboarding Orchestrator	Routes ArgoCD workflows and coordinates plan/approval gates
Project Agent	ArgoCD project CRUD
Repository Agent	ArgoCD repository list/get/onboard/delete
Application Agent	ArgoCD app lifecycle and sync operations
Debug Agent	ArgoCD logs/events collection
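
To make the delegation idea concrete, here is a minimal sketch of a supervisor choosing a worker from the table above. It assumes a simple keyword match purely for illustration; the real Supervisor presumably decides through LLM tool calls, and the `Route`, `ROUTES`, and `delegate` names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Route:
    agent: str                  # agent name as listed in the components table
    keywords: tuple[str, ...]   # lowercase phrases that select this agent


# Checked in order; the first matching route wins.
ROUTES = (
    Route("ArgoCD Onboarding Orchestrator", ("argocd", "sync app", "onboard")),
    Route("Helm Management Agent", ("install", "upgrade", "rollback", "release")),
    Route("Planner Agent", ("generate", "create chart", "helm chart")),
)


def delegate(request: str) -> str:
    """Pick the agent that should handle a natural-language request."""
    text = request.lower()
    for route in ROUTES:
        if any(keyword in text for keyword in route.keywords):
            return route.agent
    # Nothing matched: keep the request with the supervisor so it can
    # ask the user for clarification.
    return "Supervisor Agent"
```

For example, `delegate("List ArgoCD apps in project P")` selects the ArgoCD Onboarding Orchestrator, which would then hand off to the Application Agent.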

Workflows

The agent supports three primary workflows managed by the Supervisor.

1. Helm Chart Generation

2. Helm Management (Active Cluster)

3. ArgoCD Onboarding Operations

Use Cases

Use Case	Description
🏗️ Chart Generation	Create production-ready Helm charts from natural language descriptions
🚀 App Deployment	Deploy complex applications to Kubernetes with best-practice configurations
🔄 Lifecycle Management	Upgrade and roll back Helm releases safely with state awareness
🧭 ArgoCD Onboarding	Onboard projects, repos, and apps with approvals and previews
🔍 Cluster Discovery	Query and inspect existing releases and cluster resources
🧪 ArgoCD Troubleshooting	Fetch ArgoCD app logs and events when debugging

Key Benefits

Benefit	Details
Modular Swarm Design	Independently deployable and scalable agent swarms
Stateful Orchestration	LangGraph-based state management with checkpoints
Tool-Based Delegation	Dynamic routing to specialized agents
Human-Centric Safety	Strict HITL gates for all state-changing operations
Self-Healing Capabilities	Autonomous fixing of common YAML and configuration errors
Fresh-State Decisions	Reads live cluster/ArgoCD state via MCP before acting

Prerequisites

Requirement	Details
Python	3.12+
Helm CLI	Required for local validation
Kubernetes	Active cluster (for management features)
ArgoCD MCP Server	Required for ArgoCD operations (credentials managed server-side)
LLM API Key	OpenAI, Anthropic, Google, or another supported provider
TalkOps Client	For interaction

Configuration

LLM Provider (.env)

LLM_PROVIDER="openai"
LLM_MODEL="gpt-4o"
OPENAI_API_KEY="sk-..."
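
A minimal sketch of how a process might read and validate these three variables at startup. It is an assumption about usage, not k8s-autopilot's actual loader: `LLMSettings`, `load_llm_settings`, and the `KEY_VARS` mapping are hypothetical, and only `OPENAI_API_KEY` is confirmed by the example above.

```python
import os
from dataclasses import dataclass

# Assumed mapping from provider name to the variable holding its API key.
# Only OPENAI_API_KEY appears in this document; ANTHROPIC_API_KEY is a guess.
KEY_VARS = {"openai": "OPENAI_API_KEY", "anthropic": "ANTHROPIC_API_KEY"}


@dataclass(frozen=True)
class LLMSettings:
    provider: str
    model: str
    api_key: str


def load_llm_settings(env=None) -> LLMSettings:
    """Read LLM settings from a mapping (defaults to the process environment)."""
    env = os.environ if env is None else env
    provider = env.get("LLM_PROVIDER", "").strip().lower()
    model = env.get("LLM_MODEL", "").strip()
    if not provider or not model:
        raise ValueError("LLM_PROVIDER and LLM_MODEL must both be set")
    key_var = KEY_VARS.get(provider)
    if key_var is None:
        raise ValueError(f"no key variable known for provider {provider!r}")
    api_key = env.get(key_var, "")
    if not api_key:
        raise ValueError(f"{key_var} is not set")
    return LLMSettings(provider, model, api_key)
```

Failing fast on a missing key gives a clearer error than a rejected API call later in a workflow.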

MCP Servers

Helm MCP: used for chart management and Helm operations.
ArgoCD MCP: used for ArgoCD projects, repositories, applications, and debug actions.
No secrets in chat: repository credentials and tokens are configured on the MCP server side.

Quick Start

Docker (Recommended)

# Pull image
docker pull sandeep2014/k8s-autopilot:latest

# Run agent
docker run -d -p 10102:10102 \
  -e OPENAI_API_KEY=your_key \
  -v ~/.kube/config:/root/.kube/config \
  --name k8s-autopilot \
  sandeep2014/k8s-autopilot:latest

Standalone Installation

# Clone repository
git clone https://github.com/talkops-ai/k8s-autopilot.git
cd k8s-autopilot

# Install with uv
uv venv --python=3.12
source .venv/bin/activate
uv pip install -e .

# Run
uv run --active k8s-autopilot \
  --host 0.0.0.0 \
  --port 10102 \
  --agent-card k8s_autopilot/card/k8s_autopilot.json
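
After either start method, a quick way to confirm the agent is accepting connections is a plain TCP check against port 10102. This sketch only verifies that something is listening; it does not speak the A2A protocol. The helper name `is_listening` is hypothetical.

```python
import socket


def is_listening(host: str = "127.0.0.1", port: int = 10102, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


if __name__ == "__main__":
    print("agent reachable" if is_listening() else "agent not reachable")
```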

LLM Provider Support

k8s-autopilot supports a wide range of providers via a provider-agnostic abstraction layer.

Provider	Status
OpenAI	✅ Supported (GPT-4o, o1-mini)
Anthropic	✅ Supported (Claude 3.5 Sonnet)
Google	✅ Supported (Gemini 1.5 Pro)
Azure/AWS	✅ Supported

📖 Advanced Config: You can mix providers (e.g., o1-mini for Supervisor, gemini-flash for sub-agents). See LLM Provider Onboarding for detailed configuration instructions.
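
As a rough illustration of mixing providers, the sketch below resolves a model per agent role with a shared default. The structure and the names `DEFAULT_MODEL`, `OVERRIDES`, and `model_for` are assumptions for illustration; the actual configuration format is described in LLM Provider Onboarding.

```python
# (provider, model) pairs. The model names are copied from the note above
# and are not verified identifiers.
DEFAULT_MODEL = ("google", "gemini-flash")           # used by sub-agents
OVERRIDES = {"supervisor": ("openai", "o1-mini")}    # per-role exceptions


def model_for(agent: str) -> tuple[str, str]:
    """Return the (provider, model) pair for an agent role."""
    return OVERRIDES.get(agent.lower(), DEFAULT_MODEL)
```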

Example Requests

Request	What You’ll See
"Onboard app from repo X at path Y"	Plan preview → approval → ArgoCD create/update
"List ArgoCD apps in project P"	Read-only fetch with fresh state
"Delete app hello-world"	Plan preview → exact-name confirmation → delete
"Sync app checkout-api"	Diff/preview → tool-level approval → sync

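The requests above pass through different gates depending on how risky they are. A minimal sketch of that policy, assuming a simple action-to-gate lookup; the `Gate` enum, the `GATES` table, and `may_proceed` are hypothetical names, not k8s-autopilot's API.

```python
from enum import Enum


class Gate(Enum):
    NONE = "none"                           # read-only, no approval needed
    PLAN_APPROVAL = "plan approval"
    TOOL_APPROVAL = "tool-level approval"
    EXACT_NAME = "exact-name confirmation"


# Gate per action, following the Example Requests table.
GATES = {
    "list": Gate.NONE,
    "onboard": Gate.PLAN_APPROVAL,
    "sync": Gate.TOOL_APPROVAL,
    "delete": Gate.EXACT_NAME,
}


def may_proceed(action: str, target: str, reply: str) -> bool:
    """Decide whether an action may run given the user's reply to its gate."""
    # Unknown actions are treated as state-changing and get a gate by default.
    gate = GATES.get(action, Gate.PLAN_APPROVAL)
    if gate is Gate.NONE:
        return True
    if gate is Gate.EXACT_NAME:
        # Destructive: the user must type the resource name exactly.
        return reply == target
    return reply.strip().lower() in {"yes", "approve"}
```

With this policy, `may_proceed("delete", "hello-world", "yes")` is rejected because a generic "yes" does not match the application name.
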
Links

Resource	URL
Docker Hub	sandeep2014/k8s-autopilot
GitHub	talkops-ai/k8s-autopilot
Discord	Join Community
