AWS Orchestrator Troubleshooting

Common issues and solutions for the AWS Orchestrator Agent.

Installation Issues

Python Version

Error: Python 3.12+ required

Solution:

# Check Python version (use python3 if python is not on your PATH)
python --version

# Install Python 3.12+ using pyenv
pyenv install 3.12.0
pyenv local 3.12.0
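
If several interpreters are installed, it can help to confirm which one is actually running. The following is a minimal sketch, not part of the agent: the helper name `meets_requirement` is hypothetical, and the only fact taken from this page is the 3.12 minimum.

```python
import sys

# Minimum version taken from the error message above.
REQUIRED = (3, 12)


def meets_requirement(version_info=sys.version_info, required=REQUIRED) -> bool:
    """Return True if the (major, minor) version is at least `required`."""
    return tuple(version_info[:2]) >= tuple(required)


if __name__ == "__main__":
    current = ".".join(str(part) for part in sys.version_info[:3])
    if meets_requirement():
        print(f"Python {current} is OK")
    else:
        print(f"Python {current} is too old; 3.12+ is required")
    # Shows which interpreter was picked up, useful when pyenv shims are involved.
    print(f"Interpreter in use: {sys.executable}")
```

Run it with the same `python` command you use to start the agent, so the check covers the interpreter that matters.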

Dependency Installation

Error: Failed to install dependencies

Solution:

# Using uv (recommended)
uv venv --python=3.12
source .venv/bin/activate
uv pip install -e .

# Alternative: pip
pip install -e .

API Key Issues

Missing API Key

Error: OPENAI_API_KEY not set

Solution:

Create .env file in project root:

OPENAI_API_KEY=sk-your-key-here

Or export environment variable:

export OPENAI_API_KEY=sk-your-key-here

Invalid API Key

Error: Authentication failed: Invalid API key

Solution:

Verify key is correct and not expired
Check key has sufficient credits/quota
Ensure key has access to required models

A2A Server Issues

Port Already in Use

Error: Address already in use: port 10102

Solution:

# Find process using port
lsof -i :10102

# Stop the process (add -9 only if it ignores the default signal)
kill <PID>

# Or use different port
aws-orchestrator-agent --port 10103

Connection Refused

Error: Connection refused to localhost:10102

Checklist:

Server is running
Correct host/port in client
Firewall allows connection
Docker container networking (if using Docker)

Docker networking fix:

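# Use host network mode (works out of the box on Linux; elsewhere, publishing the port with -p 10102:10102 is an alternative)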
docker run --network host \
  -e OPENAI_API_KEY=your_key \
  sandeep2014/aws-orchestrator-agent:latest

Client Issues

Client Import Errors

Error: ModuleNotFoundError: No module named 'httpx'

Solution:

pip install httpx colorama

Client Can't Connect

Error: Connection refused or Failed to connect to agent

Checklist:

Server is running (docker logs aws-orchestrator)
Using correct URL (http://localhost:10102)
Port is not blocked by firewall
For Docker: container is running (docker ps)

Test server is reachable:

curl http://localhost:10102
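
A container can take a moment to start listening, so a single failed `curl` is not always conclusive. The sketch below polls the port until it accepts a TCP connection or a deadline passes. The function names and default timings are assumptions, and a successful TCP connection only shows that something is listening, not that the agent is healthy.

```python
import socket
import time


def can_connect(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within `timeout`."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def wait_for_server(host: str = "localhost", port: int = 10102,
                    total_timeout: float = 30.0, interval: float = 1.0) -> bool:
    """Poll until the port accepts connections or `total_timeout` seconds elapse."""
    deadline = time.monotonic() + total_timeout
    while True:
        if can_connect(host, port):
            return True
        # Stop if another sleep would take us past the deadline.
        if time.monotonic() + interval > deadline:
            return False
        time.sleep(interval)


if __name__ == "__main__":
    reachable = wait_for_server()
    print("Server is reachable" if reachable else "Server did not become reachable")
```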

Session Issues

Symptom: Client shows old/stale session data

Solution:

# Start fresh session
python aws_orchestrator_client/client.py --session $(date +%s)

LLM Issues

Model Not Available

Error: Model gpt-4o not found

Solution: Update .env to use available model:

LLM_MODEL=gpt-4o-mini

Rate Limiting

Error: Rate limit exceeded

Solution:

Wait and retry
Reduce concurrent workflows
Use model with higher rate limits
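
"Wait and retry" is commonly implemented as exponential backoff with jitter. The following is a generic sketch of that pattern, not the agent's own retry logic: `retry_with_backoff`, its parameters, and the way a rate-limit error is recognized are all assumptions for illustration.

```python
import random
import time


def retry_with_backoff(call, is_rate_limit, max_attempts=5,
                       base_delay=1.0, max_delay=60.0, sleep=time.sleep):
    """Call `call()`; on rate-limit errors, wait with exponential backoff and retry.

    `is_rate_limit(exc)` decides whether an exception is worth retrying.
    Other exceptions, and the final failed attempt, are re-raised unchanged.
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(max_attempts):
        try:
            return call()
        except Exception as exc:
            if not is_rate_limit(exc) or attempt == max_attempts - 1:
                raise
            # Delay doubles each attempt (1s, 2s, 4s, ...), capped at max_delay.
            delay = min(max_delay, base_delay * 2 ** attempt)
            # Jitter of up to 50% spreads out retries from concurrent workflows.
            sleep(delay + random.uniform(0, delay / 2))


if __name__ == "__main__":
    def always_limited():
        raise RuntimeError("Rate limit exceeded")

    try:
        retry_with_backoff(always_limited,
                           lambda exc: "rate limit" in str(exc).lower(),
                           max_attempts=3, base_delay=0.1)
    except RuntimeError as exc:
        print(f"Gave up after 3 attempts: {exc}")
```

Passing `sleep` as a parameter keeps the function easy to test without real waiting.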

Timeout

Error: Request timeout after 300s

Solution: Adjust the timeout and token settings in .env:

SUPERVISOR_TIMEOUT_SECONDS=600
LLM_MAX_TOKENS=10000

Generation Issues

Incomplete Output

Symptom: Module files are incomplete

Possible Causes:

Token limit reached
LLM timeout
Agent handoff failure

Solutions:

Increase token limits:

LLM_MAX_TOKENS=25000
LLM_REACT_AGENT_MAX_TOKENS=30000

Check agent logs for errors:

tail -f aws_orchestrator_agent.log

HCL Syntax Errors

Error: Invalid HCL syntax in generated file

Solution:

The Writer Agent validates syntax before writing
Check logs for validation errors
Report the issue if it persists

Common Misunderstandings

"Agent is Stuck"

Symptom: No output for several minutes

Reality: This is normal! The agent performs deep research and takes 20-25 minutes for enterprise-grade modules.

What's happening:

0-7 min: Planner phase (requirements analysis, execution planning)
7-19 min: Generator phase (7 agents generating Terraform code)
19-25 min: Writer phase (validation and file writing)
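
The timeline above can be turned into a rough "where should it be by now" lookup. This is only a sketch: the function name is hypothetical, and the boundaries are the approximate figures listed above, not values read from the agent.

```python
# Approximate phase windows in minutes, copied from the timeline above.
PHASES = [
    (0, 7, "Planner"),
    (7, 19, "Generator"),
    (19, 25, "Writer"),
]


def expected_phase(elapsed_minutes: float) -> str:
    """Return the phase a run is typically in after `elapsed_minutes`."""
    if elapsed_minutes < 0:
        raise ValueError("elapsed_minutes must not be negative")
    for start, end, name in PHASES:
        if start <= elapsed_minutes < end:
            return name
    return "past the typical 25-minute window"


if __name__ == "__main__":
    for minutes in (3, 12, 22, 40):
        print(f"{minutes:>2} min: {expected_phase(minutes)}")
```

If a run is well past the last window with no log activity, that is a better signal of a real problem than silence during the first 20 minutes.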

Tip: Enable debug logging to see progress:

LOG_LEVEL=DEBUG
LOG_TO_CONSOLE=True

"Only One Module Generated"

Symptom: Asked for multiple services but only got one module

Reality: The agent generates one service module at a time. If you request multiple services, only the first service mentioned is processed.

Solution: Submit separate requests for each service:

# Request 1
"Create an S3 bucket module"

# Request 2 (after first completes)
"Create an RDS PostgreSQL module"

Docker Issues

Container Exits Immediately

Symptom: Container starts and stops

Solution:

# Check logs
docker logs aws-orchestrator

# Run in foreground to see errors
docker run -it \
  -e OPENAI_API_KEY=your_key \
  sandeep2014/aws-orchestrator-agent:latest

Volume Permission Issues

Error: Permission denied writing to /app/modules

Solution:

# Run the container as your host user so it can write to the mounted volume
docker run \
  -u "$(id -u):$(id -g)" \
  -v "$(pwd)/modules:/app/modules" \
  sandeep2014/aws-orchestrator-agent:latest

Debug Mode

Enable Debug Logging

LOG_LEVEL=DEBUG
LOG_TO_CONSOLE=True
LOG_STRUCTURED_JSON=False

View Agent State

The supervisor tracks workflow state:

Current phase (Planner, Generator, Writer)
Agent handoffs
Error recovery attempts

Check Workflow Status

# View recent logs
tail -100 aws_orchestrator_agent.log

# Filter by component
grep "SupervisorAgent" aws_orchestrator_agent.log
grep "GeneratorSwarm" aws_orchestrator_agent.log
grep "WriterAgent" aws_orchestrator_agent.log
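
For a quick overview instead of three separate `grep` runs, the same substring filter can be applied in one pass. The sketch below assumes a plain-text log in which component names appear literally, as the `grep` commands above imply. The function names and the `ERROR` level marker are assumptions and may need adjusting to the real log format.

```python
from collections import Counter
from pathlib import Path

# Component names taken from the grep commands above.
COMPONENTS = ("SupervisorAgent", "GeneratorSwarm", "WriterAgent")


def count_by_component(lines, components=COMPONENTS) -> Counter:
    """Count how many log lines mention each component name."""
    counts = Counter({name: 0 for name in components})
    for line in lines:
        for name in components:
            if name in line:
                counts[name] += 1
    return counts


def error_lines(lines, components=COMPONENTS, marker="ERROR"):
    """Return lines that contain `marker` and mention any known component."""
    return [
        line.rstrip("\n")
        for line in lines
        if marker in line and any(name in line for name in components)
    ]


if __name__ == "__main__":
    log_path = Path("aws_orchestrator_agent.log")
    if log_path.is_file():
        log_lines = log_path.read_text(encoding="utf-8", errors="replace").splitlines()
        for name, count in count_by_component(log_lines).items():
            print(f"{name}: {count} lines")
        for line in error_lines(log_lines):
            print(line)
    else:
        print(f"{log_path} not found in the current directory")
```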

Getting Help

Community Support

Discord: Join Community
GitHub Issues: Report bugs and feature requests

Information to Include

When reporting issues, include:

Error message (full stack trace)
Configuration (sanitized, no API keys)
Steps to reproduce
Python version
Docker version (if applicable)
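
Gathering these details can be scripted. The sketch below collects version information and redacts secret-looking values from `.env` text before you paste it into a report. The function names and the redaction rule (keys ending in KEY, TOKEN, SECRET, or PASSWORD) are assumptions, so review the output by hand before sharing it.

```python
import platform
import re
import shutil
import subprocess

# Keys whose names end with one of these words are treated as secrets.
# Anchoring at the end keeps settings such as LLM_MAX_TOKENS visible.
SENSITIVE = re.compile(r"(KEY|TOKEN|SECRET|PASSWORD)$", re.IGNORECASE)


def sanitize_env_text(text: str) -> str:
    """Replace the values of secret-looking KEY=VALUE lines with <redacted>."""
    cleaned = []
    for line in text.splitlines():
        key, sep, _value = line.partition("=")
        if sep and SENSITIVE.search(key.strip()):
            cleaned.append(f"{key}=<redacted>")
        else:
            cleaned.append(line)
    return "\n".join(cleaned)


def docker_version() -> str:
    """Return the output of `docker --version`, or a note if Docker is absent."""
    if shutil.which("docker") is None:
        return "docker not found"
    result = subprocess.run(["docker", "--version"],
                            capture_output=True, text=True, check=False)
    return result.stdout.strip() or result.stderr.strip()


def environment_summary() -> dict[str, str]:
    """Collect the version details requested in the list above."""
    return {
        "python": platform.python_version(),
        "platform": platform.platform(),
        "docker": docker_version(),
    }


if __name__ == "__main__":
    for name, value in environment_summary().items():
        print(f"{name}: {value}")
```

Commented-out lines are redacted as well, since an old key left in a comment is still a key.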
