
Kubernetes Agent

k8s-autopilot is a multi-agent framework that automates Kubernetes operations end to end. Built on LangChain and LangGraph, it provides a single platform for Helm chart generation, active cluster management, and ArgoCD onboarding, with human-in-the-loop approval gates for safety.


Status

| Aspect | Details |
| --- | --- |
| Status | ✅ Ready |
| Version | v0.3.0 |
| Maintained By | TalkOps Team |
| MCP Server | Helm MCP, ArgoCD MCP |
| Protocol | Google A2A Protocol |
| Framework | LangGraph Multi-Agent |
| GitHub | talkops-ai/k8s-autopilot |

Key Features

| Feature | Description |
| --- | --- |
| 📋 Planning | Analyzes requirements, validates completeness, and designs Kubernetes architecture |
| ⚙️ Generation | Generates Helm templates, values files, and documentation |
| ✅ Validation | Validates charts, performs security scanning, and ensures production readiness |
| 🎮 Active Management | Installs, upgrades, and rolls back Helm releases on active clusters |
| 🧭 ArgoCD Operations | Manages ArgoCD projects, repositories, and applications with previews and approvals |
| 🔄 Self-Healing | Automatically fixes common errors (YAML indentation, deprecated APIs) |
| 👤 Human-in-the-Loop | Requests approvals at critical workflow points for safety |

Architecture

The Kubernetes Agent uses a Supervisor-Worker architecture with specialized deep agent swarms.

Agent Components

| Agent | Role |
| --- | --- |
| Supervisor Agent | Central orchestrator, manages state flow, HITL gates, and delegation |
| Planner Agent | Requirement analysis, gap detection, and architecture planning |
| Template Coordinator | Generates Helm chart templates, values.yaml, and READMEs |
| Generator (Validator) | Validates charts, self-heals errors, and ensures production readiness |
| Helm Management Agent | Specialized agent for active cluster operations via MCP |
| ArgoCD Onboarding Orchestrator | Routes ArgoCD workflows and coordinates plan/approval gates |
| Project Agent | ArgoCD project CRUD |
| Repository Agent | ArgoCD repository list/get/onboard/delete |
| Application Agent | ArgoCD app lifecycle and sync operations |
| Debug Agent | ArgoCD logs/events collection |
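The approval gates mentioned for the Supervisor and the ArgoCD Onboarding Orchestrator can be pictured with a small sketch. This is not the project's implementation: the `Plan` type, the `READ_ONLY_ACTIONS` set, and the accepted reply words are hypothetical names chosen for illustration. The only behavior taken from this page is that read-only requests run without approval, state-changing requests need approval, and deletes need the exact resource name typed back.

```python
from dataclasses import dataclass
from typing import Callable

# Assumption: which actions count as read-only is illustrative, not the project's list.
READ_ONLY_ACTIONS = {"list", "get"}


@dataclass(frozen=True)
class Plan:
    action: str   # e.g. "list", "sync", "delete"
    target: str   # e.g. the ArgoCD application name
    preview: str  # human-readable summary shown before approval


def is_approved(plan: Plan, reply: str) -> bool:
    """Decide whether the human reply authorizes the plan."""
    reply = reply.strip()
    if plan.action in READ_ONLY_ACTIONS:
        return True  # nothing changes, so no gate is needed
    if plan.action == "delete":
        return reply == plan.target  # exact-name confirmation
    return reply.lower() in {"yes", "approve"}


def run_with_gate(plan: Plan, reply: str, execute: Callable[[Plan], str]) -> str:
    """Run the plan only if the gate passes; otherwise report the blocked preview."""
    if not is_approved(plan, reply):
        return f"blocked: {plan.preview}"
    return execute(plan)


def fake_execute(plan: Plan) -> str:
    # Stand-in for a real MCP tool call.
    return f"done: {plan.action} {plan.target}"


delete_plan = Plan("delete", "hello-world", "Delete ArgoCD app hello-world")
print(run_with_gate(delete_plan, "yes", fake_execute))          # blocked: Delete ArgoCD app hello-world
print(run_with_gate(delete_plan, "hello-world", fake_execute))  # done: delete hello-world
```

Keeping the approval decision in a pure function separate from execution makes the gate easy to test without touching a cluster.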

Workflows

The agent supports three primary workflows managed by the Supervisor.
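As a rough mental model of how a Supervisor might choose between the three workflows, the sketch below routes a request to one of three worker functions. Everything in it is an assumption made for illustration: the real agent delegates through LangGraph and an LLM rather than keyword matching, and the names `State`, `route`, and `WORKERS` do not come from the project.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class State:
    request: str
    history: list[str] = field(default_factory=list)


# Hypothetical stand-ins for the three workflows.
def helm_generation(state: State) -> State:
    state.history.append("helm_generation: plan, generate, validate chart")
    return state


def helm_management(state: State) -> State:
    state.history.append("helm_management: inspect release, request approval, apply")
    return state


def argocd_onboarding(state: State) -> State:
    state.history.append("argocd_onboarding: preview plan, request approval, apply")
    return state


WORKERS: dict[str, Callable[[State], State]] = {
    "helm_generation": helm_generation,
    "helm_management": helm_management,
    "argocd_onboarding": argocd_onboarding,
}


def route(request: str) -> str:
    """Keyword routing, a placeholder for LLM-driven delegation."""
    text = request.lower()
    if "argocd" in text:
        return "argocd_onboarding"
    if any(word in text for word in ("install", "upgrade", "rollback", "release")):
        return "helm_management"
    return "helm_generation"


def supervisor(request: str) -> State:
    """Pick a worker, record the delegation, and hand over the shared state."""
    state = State(request)
    worker = route(request)
    state.history.append(f"supervisor: delegating to {worker}")
    return WORKERS[worker](state)


result = supervisor("Rollback release payments to the previous revision")
print(result.history)
```

The shared `State` object stands in for the checkpointed graph state: each worker appends to it instead of returning free-form text, so the Supervisor can see what happened.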

1. Helm Chart Generation

2. Helm Management (Active Cluster)

3. ArgoCD Onboarding Operations


Use Cases

| Use Case | Description |
| --- | --- |
| 🏗️ Chart Generation | Create production-ready Helm charts from natural language descriptions |
| 🚀 App Deployment | Deploy complex applications to Kubernetes with best-practice configurations |
| 🔄 Lifecycle Management | Upgrade and roll back Helm releases safely with state awareness |
| 🧭 ArgoCD Onboarding | Onboard projects, repos, and apps with approvals and previews |
| 🔍 Cluster Discovery | Query and inspect existing releases and cluster resources |
| 🧪 ArgoCD Troubleshooting | Fetch ArgoCD app logs and events when debugging |

Key Benefits

| Benefit | Details |
| --- | --- |
| Modular Swarm Design | Independently deployable and scalable agent swarms |
| Stateful Orchestration | LangGraph-based state management with checkpoints |
| Tool-Based Delegation | Dynamic routing to specialized agents |
| Human-Centric Safety | Strict HITL gates for all state-changing operations |
| Self-Healing Capabilities | Autonomous fixing of common YAML and configuration errors |
| Fresh-State Decisions | Reads live cluster/ArgoCD state via MCP before acting |
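To make the self-healing idea concrete, here is a minimal sketch of one kind of fix: rewriting an `apiVersion` that Kubernetes has removed. It is not the agent's actual logic. The `REPLACEMENTS` table and `heal_api_version` function are hypothetical, and the table is deliberately partial. A version bump alone is also not always sufficient (for example, `apps/v1` Deployments require `spec.selector`), so a real fixer would need to validate the result.

```python
from copy import deepcopy

# (old apiVersion, kind) -> replacement apiVersion.
# Partial table for illustration, based on removals in Kubernetes 1.16 and 1.25.
REPLACEMENTS = {
    ("extensions/v1beta1", "Deployment"): "apps/v1",
    ("apps/v1beta1", "Deployment"): "apps/v1",
    ("apps/v1beta2", "Deployment"): "apps/v1",
    ("batch/v1beta1", "CronJob"): "batch/v1",
    ("policy/v1beta1", "PodDisruptionBudget"): "policy/v1",
}


def heal_api_version(manifest: dict) -> tuple[dict, list[str]]:
    """Return a patched copy of the manifest plus a log of what changed."""
    patched = deepcopy(manifest)  # never mutate the caller's manifest
    key = (patched.get("apiVersion"), patched.get("kind"))
    changes: list[str] = []
    if key in REPLACEMENTS:
        patched["apiVersion"] = REPLACEMENTS[key]
        changes.append(f"{key[1]}: {key[0]} -> {REPLACEMENTS[key]}")
    return patched, changes


old = {
    "apiVersion": "extensions/v1beta1",
    "kind": "Deployment",
    "metadata": {"name": "web"},
}
new, log = heal_api_version(old)
print(new["apiVersion"])  # apps/v1
print(log)                # ['Deployment: extensions/v1beta1 -> apps/v1']
```

Returning a change log alongside the patched manifest lets the fix be shown to a human before it is applied, in line with the approval gates described on this page.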

Prerequisites

| Requirement | Details |
| --- | --- |
| Python | 3.12+ |
| Helm CLI | Required for local validation |
| Kubernetes | Active cluster (for management features) |
| ArgoCD MCP Server | Required for ArgoCD operations (credentials managed server-side) |
| LLM API Key | OpenAI, Anthropic, or others |
| TalkOps Client | For interaction |

Configuration

LLM Provider (.env)

LLM_PROVIDER="openai"
LLM_MODEL="gpt-4o"
OPENAI_API_KEY="sk-..."
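Before starting the agent it can help to check that these variables are actually set. The sketch below is a hypothetical pre-flight check, not part of k8s-autopilot. It assumes the API key variable follows the `<PROVIDER>_API_KEY` pattern seen in `OPENAI_API_KEY`, which may not hold for every provider.

```python
import os
from collections.abc import Mapping
from dataclasses import dataclass


@dataclass(frozen=True)
class LLMSettings:
    provider: str
    model: str
    api_key: str


def load_llm_settings(env: Mapping[str, str] = os.environ) -> LLMSettings:
    """Read LLM_PROVIDER, LLM_MODEL, and the provider's API key; fail early if missing."""
    provider = env.get("LLM_PROVIDER", "").strip().lower()
    if not provider:
        raise ValueError("missing settings: LLM_PROVIDER")

    model = env.get("LLM_MODEL", "").strip()
    # Assumption: the key variable is named <PROVIDER>_API_KEY, as in OPENAI_API_KEY.
    key_name = f"{provider.upper()}_API_KEY"
    api_key = env.get(key_name, "").strip()

    missing = [name for name, value in (("LLM_MODEL", model), (key_name, api_key)) if not value]
    if missing:
        raise ValueError(f"missing settings: {', '.join(missing)}")
    return LLMSettings(provider, model, api_key)


# Example with an explicit mapping instead of the real environment.
settings = load_llm_settings(
    {"LLM_PROVIDER": "openai", "LLM_MODEL": "gpt-4o", "OPENAI_API_KEY": "sk-test"}
)
print(settings.provider, settings.model)  # openai gpt-4o
```

Calling `load_llm_settings()` with no argument reads the process environment, so it works after the `.env` file has been exported or loaded by your tooling.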

MCP Servers

  • Helm MCP: used for chart management and Helm operations.
  • ArgoCD MCP: used for ArgoCD projects, repositories, applications, and debug actions.
  • No secrets in chat: repository credentials and tokens are configured on the MCP server side.

Quick Start

# Pull image
docker pull sandeep2014/k8s-autopilot:latest

# Run agent
docker run -d -p 10102:10102 \
  -e OPENAI_API_KEY=your_key \
  -v ~/.kube/config:/root/.kube/config \
  --name k8s-autopilot \
  sandeep2014/k8s-autopilot:latest
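Once the container is running, a quick way to confirm that something is listening on the published port is a plain TCP check. The helper below is a generic sketch using only the Python standard library. `wait_for_port` is a hypothetical name, and the check only proves that the port accepts connections, not that the agent is healthy.

```python
import socket
import time


def wait_for_port(host: str, port: int, timeout: float = 30.0, interval: float = 1.0) -> bool:
    """Return True once a TCP connection to host:port succeeds, False on timeout."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            # Connection refused or timed out: retry until the deadline passes.
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval)


# 10102 is the port published by the docker run command above.
ready = wait_for_port("localhost", 10102, timeout=2.0, interval=0.5)
print("agent port open" if ready else "agent port not reachable")
```

For a container that has just started, a longer `timeout` such as the 30-second default is usually more appropriate than the short value used in this example.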

Standalone Installation

# Clone repository
git clone https://github.com/talkops-ai/k8s-autopilot.git
cd k8s-autopilot

# Install with uv
uv venv --python=3.12
source .venv/bin/activate
uv pip install -e .

# Run
uv run --active k8s-autopilot \
  --host 0.0.0.0 \
  --port 10102 \
  --agent-card k8s_autopilot/card/k8s_autopilot.json

LLM Provider Support

k8s-autopilot supports a wide range of providers via a provider-agnostic abstraction layer.

| Provider | Status |
| --- | --- |
| OpenAI | ✅ Supported (GPT-4o, o1-mini) |
| Anthropic | ✅ Supported (Claude 3.5 Sonnet) |
| Google | ✅ Supported (Gemini 1.5 Pro) |
| Azure/AWS | ✅ Supported |

📖 Advanced Config: You can mix providers (e.g., o1-mini for the Supervisor, gemini-flash for sub-agents). See LLM Provider Onboarding for detailed configuration instructions.


Example Requests

| Request | What You'll See |
| --- | --- |
| "Onboard app from repo X at path Y" | Plan preview → approval → ArgoCD create/update |
| "List ArgoCD apps in project P" | Read-only fetch with fresh state |
| "Delete app hello-world" | Plan preview → exact-name confirmation → delete |
| "Sync app checkout-api" | Diff/preview → tool-level approval → sync |

| Resource | URL |
| --- | --- |
| Docker Hub | sandeep2014/k8s-autopilot |
| GitHub | talkops-ai/k8s-autopilot |
| Discord | Join Community |