AWS Orchestrator Troubleshooting

Common issues and solutions for the AWS Orchestrator Agent.


Installation Issues

Python Version

Error: Python 3.12+ required

Solution:

# Check Python version
python --version

# Install Python 3.12+ using pyenv
pyenv install 3.12.0
pyenv local 3.12.0
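
The same check can be run from inside the interpreter that will actually launch the agent, which helps when `python` on your PATH is not the one in your virtual environment. This is a minimal sketch; the helper name `meets_minimum` is illustrative and not part of the project.

```python
import sys

MIN_VERSION = (3, 12)  # minimum version named in the error message above


def meets_minimum(version_info=None, minimum=MIN_VERSION):
    """Return True if the (major, minor, ...) version is at least `minimum`."""
    if version_info is None:
        version_info = sys.version_info
    return tuple(version_info[:2]) >= tuple(minimum)


if __name__ == "__main__":
    status = "OK" if meets_minimum() else "too old"
    print(f"Python {sys.version_info.major}.{sys.version_info.minor}: {status}")
```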

Dependency Installation

Error: Failed to install dependencies

Solution:

# Using uv (recommended)
uv venv --python=3.12
source .venv/bin/activate
uv pip install -e .

# Alternative: pip
pip install -e .

API Key Issues

Missing API Key

Error: OPENAI_API_KEY not set

Solution:

  1. Create a .env file in the project root:
OPENAI_API_KEY=sk-your-key-here
  2. Or export it as an environment variable:
export OPENAI_API_KEY=sk-your-key-here
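
To confirm which of the two sources actually supplies the key, a quick check like the following may help. It is a sketch under assumptions: it only understands simple `KEY=value` lines, the helper names are hypothetical, and the agent's own loader may behave differently.

```python
import os
from pathlib import Path


def parse_dotenv(text):
    """Parse simple KEY=value lines; blank lines and # comments are skipped."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def find_api_key(environ, dotenv_text="", name="OPENAI_API_KEY"):
    """Return where the key was found: 'environment', '.env', or None."""
    if environ.get(name):
        return "environment"
    if parse_dotenv(dotenv_text).get(name):
        return ".env"
    return None


if __name__ == "__main__":
    env_file = Path(".env")
    text = env_file.read_text() if env_file.exists() else ""
    print(find_api_key(os.environ, text) or "OPENAI_API_KEY not found")
```

Run it from the project root so that the relative `.env` path resolves to the file created in step 1.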

Invalid API Key

Error: Authentication failed: Invalid API key

Solution:

  • Verify the key is correct and has not expired
  • Check that the key has sufficient credits/quota
  • Ensure the key has access to the required models

A2A Server Issues

Port Already in Use

Error: Address already in use: port 10102

Solution:

# Find process using port
lsof -i :10102

# Kill process
kill -9 <PID>

# Or use different port
aws-orchestrator-agent --port 10103
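
If `lsof` is not available, the port can also be probed from Python. The sketch below simply tries to bind a TCP socket; the function names are hypothetical, and the default of 10102 is taken from the error message above.

```python
import socket


def port_is_free(port, host="127.0.0.1"):
    """Return True if a TCP socket can bind to (host, port), False otherwise."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        try:
            sock.bind((host, port))
        except OSError:
            return False
    return True


def first_free_port(start=10102, attempts=10, host="127.0.0.1"):
    """Return the first free port in [start, start + attempts), or None."""
    for port in range(start, start + attempts):
        if port_is_free(port, host):
            return port
    return None
```

The value returned by `first_free_port()` can then be passed to `--port`.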

Connection Refused

Error: Connection refused to localhost:10102

Checklist:

  • Server is running
  • Correct host/port in client
  • Firewall allows connection
  • Docker container networking (if using Docker)

Docker networking fix:

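# Use host network mode (works as described on Linux; on Docker Desktop, publishing the port with -p 10102:10102 is the usual alternative)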
docker run --network host \
-e OPENAI_API_KEY=your_key \
sandeep2014/aws-orchestrator-agent:latest

Client Issues

Client Import Errors

Error: ModuleNotFoundError: No module named 'httpx'

Solution:

pip install httpx colorama

Client Can't Connect

Error: Connection refused or Failed to connect to agent

Checklist:

  • Server is running (docker logs aws-orchestrator)
  • Using correct URL (http://localhost:10102)
  • Port is not blocked by firewall
  • For Docker: container is running (docker ps)

Test that the server is reachable:

curl http://localhost:10102
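
Where `curl` is not installed, a plain TCP connection attempt gives a similar signal. Note that this sketch only checks that something is listening on the port; unlike `curl`, it does not send an HTTP request. The function name is illustrative.

```python
import socket


def can_connect(host="localhost", port=10102, timeout=3.0):
    """Return True if a TCP connection to (host, port) succeeds within `timeout`."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and name resolution failures.
        return False
```

A `False` result with the server running usually points back to the host, port, firewall, or Docker networking items in the checklist.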

Session Issues

Symptom: Client shows old/stale session data

Solution:

# Start fresh session
python aws_orchestrator_client/client.py --session $(date +%s)

LLM Issues

Model Not Available

Error: Model gpt-4o not found

Solution: Update .env to use a model that is available to your account:

LLM_MODEL=gpt-4o-mini

Rate Limiting

Error: Rate limit exceeded

Solution:

  • Wait and retry
  • Reduce the number of concurrent workflows
  • Use a model with higher rate limits
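
For scripts that call the agent or the LLM provider directly, "wait and retry" is commonly implemented as exponential backoff with jitter. The following is a generic sketch, not the agent's own retry logic; `retry_with_backoff` and its parameters are hypothetical, and the caller decides which exception types count as a rate-limit error.

```python
import random
import time


def retry_with_backoff(call, retry_on=(Exception,), max_attempts=5,
                       base_delay=1.0, max_delay=60.0, sleep=time.sleep):
    """Call `call()` until it succeeds, sleeping between failed attempts.

    The delay doubles after each failure (capped at `max_delay`) and gets up
    to 10% random jitter so parallel clients do not retry in lockstep.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return call()
        except retry_on:
            if attempt == max_attempts:
                raise  # out of attempts: surface the last error
            delay = min(max_delay, base_delay * 2 ** (attempt - 1))
            sleep(delay + random.uniform(0, delay * 0.1))
```

Passing a narrow `retry_on` tuple keeps unrelated failures, such as an invalid API key, from being retried.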

Timeout

Error: Request timeout after 300s

Solution:

SUPERVISOR_TIMEOUT_SECONDS=600
LLM_MAX_TOKENS=10000

Generation Issues

Incomplete Output

Symptom: Module files are incomplete

Possible Causes:

  • Token limit reached
  • LLM timeout
  • Agent handoff failure

Solutions:

  1. Increase the token limits:
LLM_MAX_TOKENS=25000
LLM_REACT_AGENT_MAX_TOKENS=30000
  2. Check the agent logs for errors:
tail -f aws_orchestrator_agent.log

HCL Syntax Errors

Error: Invalid HCL syntax in generated file

Solution:

  • The Writer Agent validates syntax before writing, so this error should be rare
  • Check the logs for validation errors
  • Report the issue if it persists

Common Misunderstandings

"Agent is Stuck"

Symptom: No output for several minutes

Reality: This is normal! The agent performs deep research and takes 20-25 minutes for enterprise-grade modules.

What's happening:

  • 0-7 min: Planner phase (requirements analysis, execution planning)
  • 7-19 min: Generator phase (7 agents generating Terraform code)
  • 19-25 min: Writer phase (validation and file writing)
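
As a rough aid when watching a long run, the timeline above can be turned into a lookup from elapsed time to expected phase. This is only a sketch of the published estimates, not something the agent exposes; the function name is hypothetical, and assigning each boundary minute to the later phase is an assumption.

```python
# Phase boundaries in minutes, taken from the timeline above.
PHASES = [
    (0, 7, "Planner"),
    (7, 19, "Generator"),
    (19, 25, "Writer"),
]


def expected_phase(elapsed_minutes):
    """Return the phase the timeline suggests for a given elapsed time."""
    for start, end, name in PHASES:
        if start <= elapsed_minutes < end:
            return name
    if elapsed_minutes >= 25:
        return "beyond the typical window"
    return "not started"
```

A run that is well past 25 minutes with no log activity is a better candidate for investigation than one that is quiet at minute 10.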

Tip: Enable debug logging to see progress:

LOG_LEVEL=DEBUG
LOG_TO_CONSOLE=True

"Only One Module Generated"

Symptom: Asked for multiple services but only got one module

Reality: The agent generates one service module at a time. If you request multiple services, only the first service mentioned is processed.

Solution: Submit separate requests for each service:

# Request 1
"Create an S3 bucket module"

# Request 2 (after first completes)
"Create an RDS PostgreSQL module"

Docker Issues

Container Exits Immediately

Symptom: Container starts and stops

Solution:

# Check logs
docker logs aws-orchestrator

# Run in foreground to see errors
docker run -it \
-e OPENAI_API_KEY=your_key \
sandeep2014/aws-orchestrator-agent:latest

Volume Permission Issues

Error: Permission denied writing to /app/modules

Solution:

# Run as your host user so the container can write to the mounted directory
docker run \
-u "$(id -u):$(id -g)" \
-v "$(pwd)/modules:/app/modules" \
sandeep2014/aws-orchestrator-agent:latest

Debug Mode

Enable Debug Logging

LOG_LEVEL=DEBUG
LOG_TO_CONSOLE=True
LOG_STRUCTURED_JSON=False

View Agent State

The supervisor tracks workflow state:

  • Current phase (Planner, Generator, Writer)
  • Agent handoffs
  • Error recovery attempts

Check Workflow Status

# View recent logs
tail -100 aws_orchestrator_agent.log

# Filter by component
grep "SupervisorAgent" aws_orchestrator_agent.log
grep "GeneratorSwarm" aws_orchestrator_agent.log
grep "WriterAgent" aws_orchestrator_agent.log
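
To see at a glance which component has been active, the three `grep` filters can be combined into one pass over the log. This sketch uses plain substring matching, as `grep` does, and makes no assumption about the log line format; the function names are illustrative.

```python
COMPONENTS = ("SupervisorAgent", "GeneratorSwarm", "WriterAgent")


def count_component_lines(lines, components=COMPONENTS):
    """Count how many lines mention each component (substring match)."""
    counts = dict.fromkeys(components, 0)
    for line in lines:
        for name in components:
            if name in line:
                counts[name] += 1
    return counts


def summarize_log(path="aws_orchestrator_agent.log"):
    """Read the log file and return per-component line counts."""
    with open(path, encoding="utf-8", errors="replace") as handle:
        return count_component_lines(handle)
```

A count of zero for `WriterAgent` late in a run, for example, suggests the workflow has not yet reached the Writer phase.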

Getting Help

Community Support

Information to Include

When reporting issues, include:

  • Error message (full stack trace)
  • Configuration (sanitized, no API keys)
  • Steps to reproduce
  • Python version
  • Docker version (if applicable)
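
Part of this checklist can be collected automatically. The sketch below gathers the Python and Docker versions and masks secret-looking values in a `KEY=value` configuration. All function names are hypothetical, and the rule for what counts as a secret (names ending in KEY, TOKEN, SECRET, or PASSWORD) is an assumption, so review the output before posting it.

```python
import platform
import re
import shutil
import subprocess

# Names ending in these words are treated as secrets (an assumption).
SECRET_NAME = re.compile(r"(KEY|TOKEN|SECRET|PASSWORD)$", re.IGNORECASE)


def sanitize_config(text):
    """Mask the value of any KEY=value line whose name looks like a secret."""
    cleaned = []
    for line in text.splitlines():
        name, sep, _value = line.partition("=")
        if sep and SECRET_NAME.search(name.strip()):
            cleaned.append(f"{name}=<redacted>")
        else:
            cleaned.append(line)
    return "\n".join(cleaned)


def docker_version():
    """Return the output of `docker --version`, or None if unavailable."""
    if shutil.which("docker") is None:
        return None
    try:
        result = subprocess.run(["docker", "--version"],
                                capture_output=True, text=True, timeout=10)
    except (OSError, subprocess.SubprocessError):
        return None
    return result.stdout.strip() or None


def build_report(config_text=""):
    """Assemble the environment details listed above into one text block."""
    return "\n".join([
        f"Python version: {platform.python_version()}",
        f"Docker version: {docker_version() or 'not found'}",
        "Configuration:",
        sanitize_config(config_text),
    ])
```

The error message and the steps to reproduce still need to be added by hand.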