AWS Orchestrator Troubleshooting
Common issues and solutions for the AWS Orchestrator Agent.
Installation Issues
Python Version
Error: Python 3.12+ required
Solution:
# Check Python version
python --version
# Install Python 3.12+ using pyenv
pyenv install 3.12.0
pyenv local 3.12.0
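The same check can be scripted, which helps when several interpreters are installed and it is unclear which one the virtual environment uses. A minimal sketch, assuming only the 3.12 minimum stated in the error message; the function name meets_minimum is illustrative and not part of the agent:

```python
import sys

# Minimum version taken from the error message above.
MIN_VERSION = (3, 12)


def meets_minimum(version_info=sys.version_info, minimum=MIN_VERSION):
    """Return True if the (major, minor) version is at least `minimum`."""
    return tuple(version_info[:2]) >= minimum


# Report on the interpreter that is running this script.
status = "OK" if meets_minimum() else "too old, 3.12+ required"
print(f"Python {sys.version_info.major}.{sys.version_info.minor}: {status}")
```

Run it with the same python executable you use to start the agent, since that is the interpreter that matters.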
Dependency Installation
Error: Failed to install dependencies
Solution:
# Using uv (recommended)
uv venv --python=3.12
source .venv/bin/activate
uv pip install -e .
# Alternative: pip
pip install -e .
API Key Issues
Missing API Key
Error: OPENAI_API_KEY not set
Solution:
- Create a .env file in the project root:
OPENAI_API_KEY=sk-your-key-here
- Or export environment variable:
export OPENAI_API_KEY=sk-your-key-here
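To see which of the two locations actually supplies the key, a small check like the following may help. It is a sketch under assumptions: the .env parsing here is deliberately simplified (plain NAME=value lines only) and is not necessarily how the agent itself loads the file, and find_api_key is a hypothetical helper name.

```python
import os
from pathlib import Path


def find_api_key(env=os.environ, dotenv_path=".env", name="OPENAI_API_KEY"):
    """Return 'environment', 'dotenv', or None depending on where the key is set."""
    # An exported, non-empty environment variable is checked first.
    if env.get(name):
        return "environment"
    path = Path(dotenv_path)
    if path.is_file():
        for line in path.read_text().splitlines():
            line = line.strip()
            # Skip comments and lines that are not NAME=value pairs.
            if line.startswith("#") or "=" not in line:
                continue
            key, _, value = line.partition("=")
            if key.strip() == name and value.strip():
                return "dotenv"
    return None
```

Calling find_api_key() from the project root returns None when neither source provides a non-empty value, which corresponds to the error above.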
Invalid API Key
Error: Authentication failed: Invalid API key
Solution:
- Verify key is correct and not expired
- Check key has sufficient credits/quota
- Ensure key has access to required models
A2A Server Issues
Port Already in Use
Error: Address already in use: port 10102
Solution:
# Find process using port
lsof -i :10102
# Stop the process (add -9 only if it ignores the default signal)
kill <PID>
# Or use different port
aws-orchestrator-agent --port 10103
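Where lsof is not available, a port can be probed by trying to bind to it. The sketch below assumes the server listens on the local loopback address; port_is_free and first_free_port are illustrative helpers, not part of the agent.

```python
import socket


def port_is_free(port, host="127.0.0.1"):
    """Return True if nothing is currently bound to host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        try:
            sock.bind((host, port))
        except OSError:
            # Typically "Address already in use".
            return False
        return True


def first_free_port(start=10102, attempts=10):
    """Return the first free port in [start, start + attempts), or None."""
    for port in range(start, start + attempts):
        if port_is_free(port):
            return port
    return None
```

The value returned by first_free_port() can then be passed to the --port option shown above.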
Connection Refused
Error: Connection refused to localhost:10102
Checklist:
- Server is running
- Correct host/port in client
- Firewall allows connection
- Docker container networking (if using Docker)
Docker networking fix:
# Use host network mode
docker run --network host \
-e OPENAI_API_KEY=your_key \
sandeep2014/aws-orchestrator-agent:latest
Client Issues
Client Import Errors
Error: ModuleNotFoundError: No module named 'httpx'
Solution:
pip install httpx colorama
Client Can't Connect
Error: Connection refused or Failed to connect to agent
Checklist:
- Server is running (docker logs aws-orchestrator)
- Using correct URL (http://localhost:10102)
- Port is not blocked by firewall
- For Docker: container is running (docker ps)
Test server is reachable:
curl http://localhost:10102
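A server that was just started may need a few seconds before it accepts connections, so a single curl can fail even when nothing is wrong. The following sketch polls the TCP port until it opens or a deadline passes. It only checks that the port accepts connections, not that the agent responds correctly; wait_for_port is a hypothetical helper.

```python
import socket
import time


def wait_for_port(host="localhost", port=10102, timeout=30.0, interval=1.0):
    """Return True once host:port accepts a TCP connection, False on timeout."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            # A successful connect means something is listening on the port.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval)
```

If this returns False after the full timeout, work through the checklist above before looking at the client.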
Session Issues
Symptom: Client shows old/stale session data
Solution:
# Start fresh session
python aws_orchestrator_client/client.py --session $(date +%s)
LLM Issues
Model Not Available
Error: Model gpt-4o not found
Solution:
Update .env to use an available model:
LLM_MODEL=gpt-4o-mini
Rate Limiting
Error: Rate limit exceeded
Solution:
- Wait and retry
- Reduce concurrent workflows
- Use model with higher rate limits
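"Wait and retry" is usually done with exponential backoff, so that repeated attempts do not hit the limit again immediately. The sketch below is a generic pattern, not the agent's own retry logic; retry_with_backoff, its parameters, and the is_rate_limited predicate are all assumptions chosen for illustration.

```python
import random
import time


def retry_with_backoff(call, is_rate_limited, max_attempts=5,
                       base_delay=1.0, max_delay=60.0, sleep=time.sleep):
    """Call `call()` and retry with exponential backoff on rate-limit errors."""
    for attempt in range(max_attempts):
        try:
            return call()
        except Exception as exc:
            # Re-raise anything that is not a rate limit, or the final failure.
            if not is_rate_limited(exc) or attempt == max_attempts - 1:
                raise
            # Delay doubles each attempt (1s, 2s, 4s, ...) up to max_delay.
            delay = min(max_delay, base_delay * 2 ** attempt)
            # Up to 10% jitter keeps parallel clients from retrying in lockstep.
            sleep(delay + random.uniform(0, delay * 0.1))
```

The caller supplies is_rate_limited, for example a function that checks whether the exception message contains "Rate limit exceeded".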
Timeout
Error: Request timeout after 300s
Solution: Raise the supervisor timeout and adjust the token limit in .env:
SUPERVISOR_TIMEOUT_SECONDS=600
LLM_MAX_TOKENS=10000
Generation Issues
Incomplete Output
Symptom: Module files are incomplete
Possible Causes:
- Token limit reached
- LLM timeout
- Agent handoff failure
Solutions:
- Increase token limits:
LLM_MAX_TOKENS=25000
LLM_REACT_AGENT_MAX_TOKENS=30000
- Check agent logs for errors:
tail -f aws_orchestrator_agent.log
HCL Syntax Errors
Error: Invalid HCL syntax in generated file
Solution:
- The Writer Agent validates syntax before writing
- Check logs for validation errors
- Report issue if persistent
Common Misunderstandings
"Agent is Stuck"
Symptom: No output for several minutes
Reality: This is normal! The agent performs deep research and takes 20-25 minutes for enterprise-grade modules.
What's happening:
- 0-7 min: Planner phase (requirements analysis, execution planning)
- 7-19 min: Generator phase (7 agents generating Terraform code)
- 19-25 min: Writer phase (validation and file writing)
Tip: Enable debug logging to see progress:
LOG_LEVEL=DEBUG
LOG_TO_CONSOLE=True
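The phase timeline above can be turned into a rough lookup for deciding whether a quiet run is still within the expected window. The boundaries are the approximate figures from this page, not values read from the agent, and expected_phase is a hypothetical helper.

```python
# (start_minute, end_minute, phase) taken from the approximate timeline above.
PHASES = [
    (0, 7, "Planner"),
    (7, 19, "Generator"),
    (19, 25, "Writer"),
]


def expected_phase(elapsed_minutes):
    """Return the phase expected at `elapsed_minutes`, or None outside 0-25 min."""
    for start, end, name in PHASES:
        if start <= elapsed_minutes < end:
            return name
    # Past the typical window: worth checking the logs for errors.
    return None
```

A result of None after 25 minutes does not prove a failure, but it is a reasonable point to inspect the logs.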
"Only One Module Generated"
Symptom: Asked for multiple services but only got one module
Reality: The agent generates one service module at a time. If you request multiple services, only the first service mentioned is processed.
Solution: Submit separate requests for each service:
# Request 1
"Create an S3 bucket module"
# Request 2 (after first completes)
"Create an RDS PostgreSQL module"
Docker Issues
Container Exits Immediately
Symptom: Container starts and stops
Solution:
# Check logs
docker logs aws-orchestrator
# Run in foreground to see errors
docker run -it \
-e OPENAI_API_KEY=your_key \
sandeep2014/aws-orchestrator-agent:latest
Volume Permission Issues
Error: Permission denied writing to /app/modules
Solution:
# Run the container as your host user so it can write to the mounted directory
docker run \
-u "$(id -u):$(id -g)" \
-v "$(pwd)/modules:/app/modules" \
sandeep2014/aws-orchestrator-agent:latest
Debug Mode
Enable Debug Logging
LOG_LEVEL=DEBUG
LOG_TO_CONSOLE=True
LOG_STRUCTURED_JSON=False
View Agent State
The supervisor tracks workflow state:
- Current phase (Planner, Generator, Writer)
- Agent handoffs
- Error recovery attempts
Check Workflow Status
# View recent logs
tail -100 aws_orchestrator_agent.log
# Filter by component
grep "SupervisorAgent" aws_orchestrator_agent.log
grep "GeneratorSwarm" aws_orchestrator_agent.log
grep "WriterAgent" aws_orchestrator_agent.log
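For a quick overview of which component has been active, the three grep filters can be combined into one pass over the log. The sketch matches component names as plain substrings, as grep does, and makes no assumption about the log line format; count_by_component is an illustrative helper.

```python
from collections import Counter

# Component names used in the grep commands above.
COMPONENTS = ("SupervisorAgent", "GeneratorSwarm", "WriterAgent")


def count_by_component(lines, components=COMPONENTS):
    """Count how many lines mention each component name."""
    counts = Counter({name: 0 for name in components})
    for line in lines:
        for name in components:
            if name in line:
                counts[name] += 1
    return dict(counts)
```

Usage: open aws_orchestrator_agent.log and pass the file object to count_by_component. A component with a count of zero has not logged anything yet, which can indicate where a workflow stopped.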
Getting Help
Community Support
- Discord: Join Community
- GitHub Issues: Report bugs and feature requests
Information to Include
When reporting issues, include:
- Error message (full stack trace)
- Configuration (sanitized, no API keys)
- Steps to reproduce
- Python version
- Docker version (if applicable)
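Sanitizing configuration by hand is error-prone, so a small script can mask secret values before a .env file is pasted into a report. This is a sketch under an assumption: any variable whose name ends in _KEY, _TOKEN, _SECRET, or PASSWORD is treated as a secret. Review the output before sharing it, since other variables may also be sensitive.

```python
import re

# Assumption: variable names with these suffixes hold secrets.
SECRET_SUFFIX = re.compile(r"(_KEY|_TOKEN|_SECRET|PASSWORD)$", re.IGNORECASE)


def sanitize_env_text(text):
    """Return `text` with the values of secret-looking variables masked."""
    sanitized = []
    for line in text.splitlines():
        name, sep, _value = line.partition("=")
        # Only NAME=value lines with a secret-looking name are rewritten.
        if sep and SECRET_SUFFIX.search(name.strip()):
            sanitized.append(f"{name}=<redacted>")
        else:
            sanitized.append(line)
    return "\n".join(sanitized)
```

Settings such as LLM_MAX_TOKENS are left untouched because the name does not end in one of the listed suffixes.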