Agent Components

The Kubernetes Agent is composed of multiple specialized autonomous agents, coordinated by a central supervisor. As of v0.3.0, it includes a dedicated ArgoCD onboarding orchestrator and sub-agents.


1. 🎯 Supervisor Agent

The Supervisor Agent is the central orchestrator that manages the entire lifecycle of Helm chart generation and cluster operations. It coordinates specialized swarms, manages state, and enforces human-in-the-loop (HITL) safety gates.

Key Responsibilities

  • Orchestration: Manages workflow phases (Planning → Generation → Validation).
  • Delegation: Routes tasks to specialized swarms via tool-based delegation.
  • State Management: Maintains the global state and transforms it for specific swarms.
  • Safety: Enforces mandatory HITL approval gates at critical transition points.

Architecture

The Supervisor follows a Tool-Based Delegation pattern: it uses LangChain's create_agent to route each task to the appropriate swarm based on the current workflow state.
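To make the delegation pattern concrete, here is a minimal sketch that uses plain Python only. It assumes each swarm is exposed as a callable "tool" and that the workflow state is a simple string; the function names, the state values ("new", "planned", "generated"), and the registry are hypothetical. In the real system an LLM-driven agent chooses the tool, whereas a dictionary lookup stands in for that decision here.

```python
from typing import Callable

State = dict  # simplified stand-in for the Supervisor's global state


def run_planning_swarm(state: State) -> State:
    # Hypothetical swarm entry point: marks planning as done.
    return {**state, "workflow_state": "planned"}


def run_generation_swarm(state: State) -> State:
    return {**state, "workflow_state": "generated"}


def run_validation_swarm(state: State) -> State:
    return {**state, "workflow_state": "validated"}


# Hypothetical registry: each entry plays the role of one delegation tool.
DELEGATION_TOOLS: dict[str, Callable[[State], State]] = {
    "new": run_planning_swarm,
    "planned": run_generation_swarm,
    "generated": run_validation_swarm,
}


def delegate(state: State) -> State:
    """Route to the swarm that matches the current workflow state."""
    tool = DELEGATION_TOOLS.get(state["workflow_state"])
    # No matching tool means the workflow is finished; return state unchanged.
    return tool(state) if tool else state
```

Calling delegate repeatedly walks a state from "new" through "planned" and "generated" to "validated", after which it is returned unchanged.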

Human-in-the-Loop Gates

The Supervisor enforces strict approval gates:

| Gate | Trigger | Purpose |
| --- | --- | --- |
| Planning Review | After planning completion | Review architecture and requirements analysis. |
| Generation Review | After template generation | Review generated artifacts and specify workspace path. |
| Execution Approval | Before cluster changes | (Helm Mgmt) Confirm installation/upgrade plans. |
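The gating behaviour can be sketched as a small state machine. This is an illustration under stated assumptions, not the Supervisor's actual code: the WorkflowState class, the advance function, and the lowercase phase names are hypothetical, and only the two gates that sit between the phases named earlier are modelled.

```python
from dataclasses import dataclass, field

# Phase order taken from the text; gate names mirror the table above.
PHASES = ["planning", "generation", "validation"]
GATES = {"planning": "Planning Review", "generation": "Generation Review"}


@dataclass
class WorkflowState:
    active_phase: str = "planning"
    approvals: set[str] = field(default_factory=set)


def advance(state: WorkflowState) -> str:
    """Move to the next phase, or report the gate that blocks the move."""
    gate = GATES.get(state.active_phase)
    if gate and gate not in state.approvals:
        # A human has not approved this transition yet: hard stop.
        return f"blocked: {gate}"
    idx = PHASES.index(state.active_phase)
    if idx + 1 < len(PHASES):
        state.active_phase = PHASES[idx + 1]
    return f"now in: {state.active_phase}"
```

The point of the sketch is that the transition itself checks for approval, so a phase cannot be left without the matching review being recorded.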

2. 📋 Planner Agent

The Planner Agent is a specialized "Deep Agent" responsible for transforming natural language requirements into a rigorous technical plan. It uses a swarm of sub-agents to analyze requirements, detect gaps, and orchestrate production-ready architectures.

Architecture (Deep Agent)

The Planner operates as its own supervisor, managing two sub-agents.

Sub-Agents

| Sub-Agent | Role | Key Tools |
| --- | --- | --- |
| Requirements Analyzer | Extract & Validate | parse_requirements, classify_complexity, validate_requirements |
| Architecture Planner | Design & Size | design_k8s_architecture, estimate_resources, check_dependencies |

Key Logic: 5-Step Workflow

  1. Extract: Parse 12 critical fields (App Type, Framework, Image, Exposure, etc.).
  2. Gap Detection: If critical information (e.g., Image, Port) is missing, pause and ask the user via request_human_input.
  3. Analysis: Classify complexity (Simple/Medium/Complex) and validate completeness.
  4. Planning: Design K8s resources (Deployment vs StatefulSet, HPA, PDB) and estimate CPU/Memory.
  5. Compile: Output a structured ChartPlan JSON for the generator.

💡 Smart Clarification: The agent prioritizes questions. It won't ask for optional details if critical ones (like "What is the docker image?") are missing.
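The prioritization rule can be sketched in a few lines. The critical fields below are the two examples named in the workflow (Image, Port); the optional fields and the next_questions helper are hypothetical placeholders, since the full list of 12 fields is not given here.

```python
# Critical fields named as examples in the text; the real list has more.
CRITICAL_FIELDS = ("image", "port")
# Hypothetical optional fields, for illustration only.
OPTIONAL_FIELDS = ("replicas", "ingress_host")


def next_questions(requirements: dict) -> list[str]:
    """Return the fields to ask about next, critical gaps first."""
    missing_critical = [f for f in CRITICAL_FIELDS if requirements.get(f) is None]
    if missing_critical:
        # Do not ask about optional details while critical ones are missing.
        return missing_critical
    return [f for f in OPTIONAL_FIELDS if requirements.get(f) is None]
```

With only an application type supplied, the function asks for the image and port; once both are present, it moves on to the optional fields.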


3. ⚙️ Template Coordinator

The Template Coordinator is a LangGraph-based agent that orchestrates the execution of 13 specialized tools to generate production-ready Helm chart templates.

Architecture (Coordinator Pattern)

Instead of a simple chain, it uses a Coordinator Node to manage dependencies and execution order dynamically.

Execution Phases & Tools

The coordinator executes tools in 4 strict phases to respect dependencies:

| Phase | Description | Key Tools |
| --- | --- | --- |
| 1. Core Templates | Essential resources required for any chart. | generate_helpers_tpl, generate_deployment, generate_service |
| 2. Conditional | Optional features based on planner output. | generate_hpa, generate_pdb, generate_network_policy, generate_ingress |
| 3. Documentation | Needs all templates to be finished first. | generate_readme (scans all templates) |
| 4. Aggregation | Assembles final file structure. | aggregate_chart |

Key Features

  • Dependency Management: Knows that Ingress requires Service, and Service requires Deployment.
  • Smart Retries: If a tool fails (e.g., LLM error), the Error Handler node retries it up to 3 times.
  • Values Aggregation: The generate_values_yaml tool runs last, collecting all variables used across all templates to ensure nothing is undefined.
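Dependency ordering and retries can both be sketched with the standard library. The dependency map encodes only the two relationships stated above (Ingress requires Service, Service requires Deployment). The helper names are hypothetical, and "up to 3 times" is read here as one initial attempt plus at most three retries, which is an interpretation rather than something the text confirms.

```python
from graphlib import TopologicalSorter
from typing import Callable, TypeVar

T = TypeVar("T")

# Each tool maps to the set of tools that must finish before it runs.
DEPENDS_ON: dict[str, set[str]] = {
    "generate_deployment": set(),
    "generate_service": {"generate_deployment"},
    "generate_ingress": {"generate_service"},
}


def execution_order(deps: dict[str, set[str]]) -> list[str]:
    """Return a tool order in which every dependency runs first."""
    return list(TopologicalSorter(deps).static_order())


def run_with_retries(tool: Callable[[], T], max_retries: int = 3) -> T:
    """Call a tool, retrying on any exception up to max_retries times."""
    last_error = None
    for _ in range(1 + max_retries):
        try:
            return tool()
        except Exception as exc:  # e.g. a transient LLM error
            last_error = exc
    raise RuntimeError(f"tool failed after {max_retries} retries") from last_error
```

A topological sort keeps the ordering logic declarative: adding a new template tool means adding one entry to the map rather than editing a hard-coded sequence.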

4. ✅ Generator (Validator) Agent

The Generator Agent (also known as the Validator Deep Agent) focuses on Quality Assurance. It uses a ReAct pattern (Reasoning → Action → Observation) to autonomously validate and fix charts.

Tool Stack

It combines direct filesystem access with specialized Helm validators:

| Category | Tools | Purpose |
| --- | --- | --- |
| File System | ls, read_file, write_file, edit_file | Inspect structure and apply fixes. |
| Validation | helm_lint, helm_template, helm_dry_run | Validate syntax, rendering, and cluster compatibility. |
| Escalation | ask_human | Request help for complex issues. |

Validation Pipeline

The agent runs validations sequentially, each step stricter than the last: syntax (helm_lint), then rendering (helm_template), then cluster compatibility (helm_dry_run).

🩹 Self-Healing Mechanism

The agent attempts to fix errors autonomously before escalating to the user.

  1. Analyze Error: Detects issues like bad indentation, missing fields, or deprecated APIs.
  2. Apply Fix: Uses edit_file to modify the YAML directly.
  3. Verify: Re-runs the validation tool.
  4. Escalate: If a fix fails twice in a row, the agent triggers ask_human for manual intervention.
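The four steps above can be sketched as a loop. The validate, apply_fix, and ask_human callables are hypothetical stand-ins for the validation tools, edit_file, and the escalation tool; the sketch assumes validate returns None on success and an error message otherwise.

```python
from typing import Callable, Optional


def heal(
    validate: Callable[[], Optional[str]],
    apply_fix: Callable[[str], None],
    ask_human: Callable[[str], None],
    max_failed_fixes: int = 2,
) -> str:
    """Fix-and-verify loop that escalates after repeated failed fixes."""
    failures = 0
    error = validate()
    while error is not None:
        if failures >= max_failed_fixes:
            # Two fixes in a row did not help: hand over to a human.
            ask_human(error)
            return "escalated"
        apply_fix(error)    # step 2: modify the YAML
        error = validate()  # step 3: re-run the validation tool
        if error is not None:
            failures += 1
    return "valid"
```

Because the loop exits on the first successful validation, the failure counter only ever counts consecutive failed fixes.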

5. 🛡️ Helm Management Agent

The Helm Management Deep Agent is the operational arm ensuring "Safety at Speed". It employs a Dual-Path Architecture to handle both quick queries and high-stakes cluster modifications securely.

Dual-Path Architecture

The agent routes requests based on intent classification ("Risk Profile").

The 5-Phase "Safe-Track" Pipeline

Used for install, upgrade, rollback, and uninstall operations.

| Phase | Activity | HITL Gate? |
| --- | --- | --- |
| 1. Discovery | Detects if release exists (Upgrade vs Install). Fetches chart info. | No |
| 2. Confirmation | Values Confirmation: Shows "Proposed Changes" vs "Current". | YES |
| 3. Planning | Runs helm_validate_values, checks prerequisites, generates diffs. | No |
| 4. Approval | Plan Approval: The "Nuclear Button". Final specific sign-off. | YES |
| 5. Execution | Performs operation (helm_install) & verifies pod health. | No |

Safety Middleware

  • HelmApprovalHITLMiddleware: The failsafe. Even if the LLM tries to skip approval, this code-level interceptor forces a hard stop before any write operation.
  • ErrorRecoveryMiddleware: Automatically retries flaky read-operations (up to 3 times) to handle network blips.
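A code-level approval interceptor can be sketched as a wrapper around every tool call. This is not the HelmApprovalHITLMiddleware implementation: the guarded_call function and the ApprovalRequired exception are hypothetical, and of the tool names below only helm_install appears in the text (the others are assumed by analogy with the listed operations).

```python
from typing import Callable

# "helm_install" appears in the text; the other names are hypothetical.
WRITE_TOOLS = {"helm_install", "helm_upgrade", "helm_rollback", "helm_uninstall"}


class ApprovalRequired(Exception):
    """Raised when a write tool is called without recorded approval."""


def guarded_call(tool_name: str, tool_fn: Callable[[], str], approved: bool) -> str:
    """Run a tool, refusing write operations that lack human approval."""
    if tool_name in WRITE_TOOLS and not approved:
        # Hard stop in code, independent of what the LLM decided.
        raise ApprovalRequired(tool_name)
    return tool_fn()
```

The check lives outside the model's reasoning, which is what makes it a failsafe: a prompt that talks the LLM into skipping approval still cannot reach the write operation.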

6. 🧭 ArgoCD Onboarding Orchestrator

The ArgoCD Onboarding Orchestrator is the control plane for GitOps workflows. It interprets user intent, validates prerequisites, and coordinates the ArgoCD sub-agents with explicit human approvals.

Key Responsibilities

  • Intent Classification: Determine read-only query vs. workflow (create/update/delete/sync).
  • Prerequisite Checks: Ensure project/repo/app state is known via MCP before acting.
  • Plan Preview: Present a human-friendly plan with what, where, and why.
  • Approval Gates: Require HITL approval for risky operations.

Workflow Phases

  1. Understand: Parse request and required targets.
  2. Validate: Fetch current state (project/repo/app).
  3. Plan: Present a preview and request approval.
  4. Execute: Run MCP tool calls with tool-level approvals.
  5. Verify: Confirm success and summarize changes.
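The first phase, intent classification, can be illustrated with a keyword check. The orchestrator presumably relies on an LLM for this step; the keyword matching, the classify_intent and requires_approval helpers, and the "read_only"/"workflow" labels are simplifying assumptions made for the sketch.

```python
# Workflow actions listed under Key Responsibilities.
WORKFLOW_ACTIONS = ("create", "update", "delete", "sync")


def classify_intent(request: str) -> str:
    """Label a request as a read-only query or a state-changing workflow."""
    words = request.lower().split()
    if any(word in WORKFLOW_ACTIONS for word in words):
        return "workflow"
    return "read_only"


def requires_approval(request: str) -> bool:
    """Only workflows go through the plan preview and HITL approval."""
    return classify_intent(request) == "workflow"
```

For example, "Sync the payments app" is classified as a workflow and goes through the Plan and approval phases, while "list all projects" is answered directly.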

7. 📦 Project Agent

Handles ArgoCD project CRUD operations (create/get/list/update/delete), including checks for existing project constraints and permissions.


8. 🗄️ Repository Agent

Manages ArgoCD repositories (list/get/onboard/delete) and performs repository connectivity diagnostics.


9. 🚀 Application Agent

Handles ArgoCD applications: create/update/delete, sync operations, diff previews, and health checks.


10. 🧪 Debug Agent

Fetches ArgoCD application logs and events to assist with troubleshooting workflows.


State Management

The Kubernetes Agent uses a state management system designed for resumability, isolation, and type safety. It lets the Supervisor delegate tasks to sub-agents while maintaining a coherent global history.

🧩 Specialized State Schemas

Each agent swarm operates on its own dedicated state schema, optimized for its specific task.

| State Schema | Used By | Key Fields |
| --- | --- | --- |
| MainSupervisorState | Supervisor | user_query, workflow_state, active_phase, helm_chart_artifacts |
| PlanningSwarmState | Planner | requirements, chart_plan (JSON), gaps_detected |
| GenerationSwarmState | Template | generated_templates, completed_tools, pending_dependencies |
| ValidationSwarmState | Validator | blocking_issues, validation_results, retry_counts |
| HelmAgentState | Helm Mgmt | chart_metadata, current_release (Live State), execution_plan |
| ArgoCDOnboardingState | ArgoCD Onboarding | project_info, repository_info, application_info, approval_checkpoints |

🔄 State Transformers

Data is not shared blindly. A StateTransformer middleware explicitly converts data when moving between the Supervisor and Sub-Agents (including the ArgoCD onboarding workflow). This ensures "Context Isolation": sub-agents see only what they need, which reduces the risk of hallucination from irrelevant history.
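A state transformer can be sketched with typed dictionaries. The schema and field names come from the table above, but the field types, the two helper functions, the raw_query key, and the choice to store the plan under helm_chart_artifacts are assumptions made for illustration.

```python
from typing import TypedDict


class MainSupervisorState(TypedDict):
    user_query: str
    workflow_state: str
    active_phase: str
    helm_chart_artifacts: dict


class PlanningSwarmState(TypedDict):
    requirements: dict
    chart_plan: dict
    gaps_detected: list


def to_planning_state(main: MainSupervisorState) -> PlanningSwarmState:
    """Pass the Planner only what it needs: the user's request."""
    return PlanningSwarmState(
        requirements={"raw_query": main["user_query"]},  # hypothetical key
        chart_plan={},
        gaps_detected=[],
    )


def merge_planning_result(
    main: MainSupervisorState, planning: PlanningSwarmState
) -> MainSupervisorState:
    """Fold the Planner's output back into the global state (assumed mapping)."""
    artifacts = {**main["helm_chart_artifacts"], "chart_plan": planning["chart_plan"]}
    return MainSupervisorState(
        user_query=main["user_query"],
        workflow_state="planned",
        active_phase=main["active_phase"],
        helm_chart_artifacts=artifacts,
    )
```

Keeping the conversion in explicit functions means a new Supervisor field never leaks into a sub-agent unless someone deliberately adds it to the transformer.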

💾 Persistence & Handoffs

  • Checkpointer: All states are persisted to PostgreSQL. This enables long-running workflows where the user might take hours to approve a plan.
  • Interrupts: When a HITL gate is triggered (e.g., in Helm Mgmt Phase 2), the state is saved, execution stops, and the system waits. Upon approval, it resumes exactly where it left off, hydrating the state from the database.
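The save, stop, and resume cycle can be sketched with a tiny checkpointer. To keep the example self-contained it uses an in-memory SQLite database in place of PostgreSQL; the Checkpointer class, the table layout, and the pause_at_gate and resume helpers are hypothetical and do not reflect the actual schema.

```python
import json
import sqlite3


class Checkpointer:
    """Stores one JSON-encoded state per workflow thread."""

    def __init__(self, conn: sqlite3.Connection) -> None:
        self.conn = conn
        conn.execute(
            "CREATE TABLE IF NOT EXISTS checkpoints "
            "(thread_id TEXT PRIMARY KEY, state TEXT)"
        )

    def save(self, thread_id: str, state: dict) -> None:
        self.conn.execute(
            "INSERT OR REPLACE INTO checkpoints VALUES (?, ?)",
            (thread_id, json.dumps(state)),
        )
        self.conn.commit()

    def load(self, thread_id: str):
        row = self.conn.execute(
            "SELECT state FROM checkpoints WHERE thread_id = ?", (thread_id,)
        ).fetchone()
        return json.loads(row[0]) if row else None


def pause_at_gate(cp: Checkpointer, thread_id: str, state: dict, gate: str) -> None:
    """Persist the state with the pending gate, then stop executing."""
    cp.save(thread_id, {**state, "pending_gate": gate})


def resume(cp: Checkpointer, thread_id: str, approved: bool) -> dict:
    """Hydrate the saved state and record the human's decision."""
    state = cp.load(thread_id)
    if state is None:
        raise KeyError(f"no checkpoint for thread {thread_id}")
    gate = state.pop("pending_gate", None)
    state["last_decision"] = {"gate": gate, "approved": approved}
    return state
```

Because everything needed to continue lives in the database row, the process that resumes the workflow does not have to be the one that paused it, which is what allows an approval to arrive hours later.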