CUGA AGENT

ConfigUrable Generalist Agent (CUGA)

An autonomous agent capable of performing web actions with intelligent planning and task decomposition

What is CUGA?

CUGA is an autonomous agent that combines large language models with task planning and execution. It can drive a browser and call APIs, which makes it suited to automating complex, multi-step workflows.

Key Features

  • Autonomous Operation: Self-planning and task decomposition
  • Web Automation: Browser-based task execution
  • API Integration: Seamless API interaction capabilities
  • Intelligent Planning: LLM-powered decision making
  • Workflow Persistence: Save and reuse successful workflows
  • Experiment Tracking: Comprehensive monitoring and analytics

Use Cases

CUGA excels in scenarios requiring:

  • Complex Workflow Automation: Multi-step processes that require decision making
  • API Orchestration: Coordinating multiple API calls with intelligent error handling
  • Web Scraping & Automation: Browser-based data collection and form filling
  • Business Process Automation: Repetitive tasks that benefit from AI-powered optimization
  • Research & Development: Experimental workflows that require adaptive planning

Technology Stack

  • Python 3.12: Core runtime environment
  • UV: Modern Python package management
  • FastAPI: High-performance web framework
  • Selenium/Playwright: Browser automation capabilities
  • OpenAI/LiteLLM: LLM integration for intelligent decision making
  • Docker: Containerized deployment and evaluation

Quick Start

Get started with CUGA in minutes:

# Clone & Setup
git clone git@github.com:cuga-project/cuga-agent.git
cd cuga-agent
uv venv --python=3.12 && source .venv/bin/activate
uv sync

# Configure
cp .env.example .env
# Add your OPENAI_API_KEY to .env

# Test Code Sandbox
uv run test_sandbox

# Run Demo
cuga start demo
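
Before running the demo, it can help to confirm that the key actually made it into .env. The following is a minimal Python sketch and not part of CUGA: the helper names read_env_file and has_api_key are hypothetical, and it assumes the plain KEY=VALUE layout that .env files conventionally use.

```python
from pathlib import Path


def read_env_file(path):
    """Parse simple KEY=VALUE lines; skip blank lines and # comments."""
    values = {}
    for raw in Path(path).read_text(encoding="utf-8").splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        # Tolerate shell-style "export KEY=VALUE" lines.
        if line.startswith("export "):
            line = line[len("export "):]
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip("'\"")
    return values


def has_api_key(path=".env", name="OPENAI_API_KEY"):
    """Return True when the file exists and defines a non-empty value for name."""
    env_path = Path(path)
    if not env_path.is_file():
        return False
    return bool(read_env_file(env_path).get(name))


if __name__ == "__main__":
    print("OPENAI_API_KEY set:", has_api_key())
```

Run it from the repository root so that the default path resolves to the .env file copied from .env.example. It only checks that a value is present, not that the key is valid.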

Documentation Structure

This documentation is organized into four main sections:

1. Getting Started

2. Usage

3. Evaluation

4. Development

Prerequisites

Before getting started, ensure you have:

Tool             Purpose                  Installation
UV               Python project manager   Install Guide
Rancher Desktop  Container management     Download
OpenAI API Key   LLM access               Add to .env file

CUGA Team: Use the ETE LiteLLM API key

Demo Mode

CUGA comes with a pre-configured demo featuring the Digital Sales API:

# Start the demo
cuga start demo

# Try this example query:
"get my top account by revenue from digital sales"

Demo Pages

We've created demo HTML pages for testing different CUGA modes:

Download and open these pages in your browser to test CUGA's capabilities in each mode.

Experiment Tracking

Monitor your CUGA experiments with built-in analytics:

# View experiment dashboard
cuga exp

# Start dashboard for specific experiment
# Click "Start Dashboard" in the interface

Execution Modes

CUGA offers multiple execution modes optimized for different use cases:

Mode          Speed + Accuracy  Use Case
Fast          ⭐⭐⭐⭐⭐⭐⭐⭐          Development, testing
Accurate      ⭐⭐⭐⭐⭐⭐⭐⭐          Production, critical tasks
Custom        ⭐⭐⭐⭐⭐⭐⭐⭐          Tailored workflows
Save & Reuse  ⭐⭐⭐⭐⭐⭐⭐⭐⭐         Repeated workflows

Evaluation Benchmarks

Test CUGA with industry-standard benchmarks:

  • AppWorld: Real-world web application testing
  • WebArena: Web automation and navigation testing
  • WxO Tools: Watson Orchestrate integration testing

Configuration

Model Providers

CUGA supports multiple LLM providers:

  • ETE LiteLLM (Recommended for IBM teams) - Free, high-performance
  • OpenAI - Popular choice with excellent model quality
  • IBM WatsonX - Enterprise-grade AI platform
  • Azure OpenAI - Microsoft's managed service

Environment Setup

# Switch between providers
export AGENT_SETTING_CONFIG="settings.litellm.toml"  # ETE (default)
export AGENT_SETTING_CONFIG="settings.openai.toml"   # OpenAI
export AGENT_SETTING_CONFIG="settings.watsonx.toml"  # WatsonX
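
When CUGA is launched from a Python script rather than a shell, the same switch can be made by setting the variable in the process environment. This is a small sketch under stated assumptions: the function select_provider and the short provider keys are hypothetical, and only the three settings files named above are included (no Azure OpenAI file name appears in this section, so it is left out).

```python
import os

# File names are the ones listed in this section; the short keys are illustrative.
SETTINGS_FILES = {
    "litellm": "settings.litellm.toml",
    "openai": "settings.openai.toml",
    "watsonx": "settings.watsonx.toml",
}


def select_provider(provider, environ=os.environ):
    """Set AGENT_SETTING_CONFIG for a provider and return the chosen file name."""
    try:
        settings_file = SETTINGS_FILES[provider.lower()]
    except KeyError:
        known = ", ".join(sorted(SETTINGS_FILES))
        raise ValueError(
            f"unknown provider {provider!r}; expected one of: {known}"
        ) from None
    environ["AGENT_SETTING_CONFIG"] = settings_file
    return settings_file
```

A value set this way is visible only to the current Python process and the child processes it starts; it does not change the parent shell, so the export form is still needed for interactive use.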

Troubleshooting

Common Issues

Issue             Solution
Port conflicts    Check if ports 8000, 8005, 8080 are free
Docker errors     Ensure Rancher Desktop is running
API key errors    Verify .env file has correct OPENAI_API_KEY
Module not found  Run uv sync to install dependencies
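
The port check can be scripted with the Python standard library. The sketch below is illustrative and not a CUGA command: port_is_free and busy_ports are hypothetical helper names, and it assumes the services listen on the local machine (127.0.0.1).

```python
import socket

CUGA_PORTS = (8000, 8005, 8080)  # ports named in the troubleshooting table


def port_is_free(port, host="127.0.0.1"):
    """Return True if nothing accepts TCP connections on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(0.5)
        # connect_ex returns 0 only when a listener accepted the connection.
        return sock.connect_ex((host, port)) != 0


def busy_ports(ports=CUGA_PORTS, host="127.0.0.1"):
    """List the ports that already have a listener."""
    return [p for p in ports if not port_is_free(p, host)]


if __name__ == "__main__":
    taken = busy_ports()
    print("busy ports:", taken if taken else "none")
```

Probing with a connection attempt does not disturb a running service, unlike trying to bind the port. Any port it reports as busy has to be released before starting CUGA.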

Debug Commands

# Check service status
cuga status

# View logs
cuga logs --tail

# Check configuration
cuga config show

# Run diagnostics
cuga diagnose

Testing

Run the comprehensive test suite:

# Run all tests
uv run pytest -v

# Run specific test categories
uv run pytest tests/unit/           # Unit tests
uv run pytest tests/integration/    # Integration tests
uv run pytest tests/system/         # System tests

# Run with coverage
uv run pytest --cov=cuga --cov-report=html

Performance Monitoring

Monitor CUGA performance and health:

# Real-time monitoring
cuga monitor

# Performance metrics
cuga stats

# Resource usage
cuga stats --resources

Continuous Integration

Automate testing and evaluation:

# Daily evaluation
0 2 * * * cd /path/to/cuga && uv run appworld_eval --eval_key daily_test

# Weekly comprehensive evaluation
0 3 * * 0 cd /path/to/cuga && uv run appworld_eval --eval_key weekly_full
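
These entries use the standard crontab layout of five schedule fields followed by a command: the first runs every day at 02:00, the second every Sunday at 03:00. As a reading aid, the following sketch splits such a line into named fields; split_cron_entry is a hypothetical helper, not something CUGA or cron provides.

```python
CRON_FIELDS = ("minute", "hour", "day_of_month", "month", "day_of_week")


def split_cron_entry(entry):
    """Split a crontab line into its five schedule fields and the command."""
    # Split on whitespace at most five times so the command stays intact.
    parts = entry.split(None, 5)
    if len(parts) < 6:
        raise ValueError("expected five schedule fields followed by a command")
    schedule = dict(zip(CRON_FIELDS, parts[:5]))
    return schedule, parts[5]


if __name__ == "__main__":
    line = "0 3 * * 0 cd /path/to/cuga && uv run appworld_eval --eval_key weekly_full"
    schedule, command = split_cron_entry(line)
    print(schedule)
    print(command)
```

In the weekly entry, the day_of_week value 0 is what restricts the run to Sundays; the daily entry leaves that field as *.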

Contributing

We welcome contributions from the IBM community:

  1. Fork the repository
  2. Create a feature branch
  3. Make your changes
  4. Add tests for new functionality
  5. Submit a pull request

License

This project is proprietary to IBM Corporation. All rights reserved.

