An autonomous agent capable of performing web actions with intelligent planning and task decomposition

ConfigUrable Generalist Agent (CUGA)

What is CUGA?

CUGA is an autonomous agent that pairs large language models with task planning and execution. It can drive a browser and call APIs, which makes it suited to automating complex, multi-step workflows.

Key Features

Autonomous Operation: Self-planning and task decomposition
Web Automation: Browser-based task execution
API Integration: Seamless API interaction capabilities
Intelligent Planning: LLM-powered decision making
Workflow Persistence: Save and reuse successful workflows
Experiment Tracking: Comprehensive monitoring and analytics

🔗 Important Links

cuga-website - Official website
Enablement-session - Training materials

Use Cases

CUGA excels in scenarios requiring:

Complex Workflow Automation: Multi-step processes that require decision making
API Orchestration: Coordinating multiple API calls with intelligent error handling
Web Scraping & Automation: Browser-based data collection and form filling
Business Process Automation: Repetitive tasks that benefit from AI-powered optimization
Research & Development: Experimental workflows that require adaptive planning

Technology Stack

Python 3.12: Core runtime environment
UV: Modern Python package management
FastAPI: High-performance web framework
Selenium/Playwright: Browser automation capabilities
OpenAI/LiteLLM: LLM integration for intelligent decision making
Docker: Containerized deployment and evaluation

Quick Start

Get started with CUGA in minutes:

# Clone & Setup
git clone git@github.com:cuga-project/cuga-agent.git
cd cuga-agent
uv venv --python=3.12 && source .venv/bin/activate
uv sync

# Configure
cp .env.example .env
# Add your OPENAI_API_KEY to .env

# Test Code Sandbox
uv run test_sandbox

# Run Demo
cuga start demo
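
If the demo fails to start, a common cause is a missing or empty key in `.env`. The following is a minimal sketch, not part of CUGA, for checking that the file defines `OPENAI_API_KEY`. The helper names `read_env_file` and `has_api_key` are hypothetical, and the parser assumes plain `KEY=VALUE` lines only.

```python
from pathlib import Path


def read_env_file(path: str) -> dict[str, str]:
    """Parse simple KEY=VALUE lines, skipping blanks and # comments."""
    values: dict[str, str] = {}
    for raw in Path(path).read_text(encoding="utf-8").splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop optional surrounding quotes around the value.
        values[key.strip()] = value.strip().strip("'\"")
    return values


def has_api_key(path: str = ".env", name: str = "OPENAI_API_KEY") -> bool:
    """Return True if the file exists and defines a non-empty value for name."""
    if not Path(path).is_file():
        return False
    return bool(read_env_file(path).get(name))


if __name__ == "__main__":
    print("OPENAI_API_KEY set:", has_api_key())
```

The check only confirms that a value is present; it does not contact the provider to confirm that the key is valid.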

Documentation Structure

This documentation is organized into four main sections:

1. Getting Started

Introduction - Overview and key concepts
Quick Start - Get up and running quickly
Installation - Complete setup instructions
Configuration - Model setup, environment, and settings

2. Usage

Demo Mode - Learn CUGA with pre-configured demo
Control Commands - Master CUGA CLI
API Integration - Add your own APIs and tools
Save & Reuse - Workflow persistence and optimization
Execution Modes - Fast, accurate, and custom modes

3. Evaluation

AppWorld Evaluation - Test with AppWorld benchmark
WebArena Evaluation - Test with WebArena benchmark
WxO Tools Evaluation - Integrate with Watson Orchestrate
Docker Parallel Evaluation - Scale evaluation with containers
Experiment Tracking - Monitor and analyze results

4. Development

Testing - Comprehensive testing strategies
Troubleshooting - Debug common issues
Debugging - Advanced debugging techniques
API Reference - Complete API documentation

Prerequisites

Before getting started, ensure you have:

Tool	Purpose	Installation
UV	Python project manager	Install Guide
Rancher Desktop	Container management	Download
OpenAI API Key	LLM access	Add to `.env` file

CUGA Team: Use the ETE LiteLLM API key
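
The prerequisites above can be checked before installation with a short script. This is a sketch under stated assumptions: `missing_tools` is a hypothetical helper, and looking for a `docker` executable assumes that Rancher Desktop is configured to expose the Docker CLI on your PATH, which may not match your setup.

```python
import shutil


def missing_tools(tools: list[str]) -> list[str]:
    """Return the tools from the list that are not found on PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


if __name__ == "__main__":
    # "uv" comes from the table above; "docker" is an assumption about
    # how Rancher Desktop is configured on this machine.
    for tool in missing_tools(["uv", "docker"]):
        print(f"missing: {tool}")
```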

Demo Mode

CUGA comes with a pre-configured demo featuring the Digital Sales API:

# Start the demo
cuga start demo

# Try this example query:
"get my top account by revenue from digital sales"

Demo Pages

We've created demo HTML pages for testing different CUGA modes:

Hybrid Demo: Test browser + API integration
Browser Demo: Test pure web automation
API Demo: Test API-only operations

Download and open these pages in your browser to test CUGA's capabilities in each mode.

Experiment Tracking

Monitor your CUGA experiments with built-in analytics:

# View experiment dashboard
cuga exp

# Start dashboard for specific experiment
# Click "Start Dashboard" in the interface

Execution Modes

CUGA offers multiple execution modes optimized for different use cases:

Mode	Speed	Accuracy	Use Case
Fast	⭐⭐⭐⭐⭐	⭐⭐⭐	Development, testing
Accurate	⭐⭐⭐	⭐⭐⭐⭐⭐	Production, critical tasks
Custom	⭐⭐⭐⭐	⭐⭐⭐⭐	Tailored workflows
Save & Reuse	⭐⭐⭐⭐⭐	⭐⭐⭐⭐	Repeated workflows
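
One way to read the table is as a trade-off: pick the fastest mode that still meets the accuracy you need. The sketch below only encodes the star ratings from the table and is not how CUGA selects a mode; `Mode` and `fastest_mode` are hypothetical names used for illustration.

```python
from typing import NamedTuple


class Mode(NamedTuple):
    name: str
    speed: int      # stars in the Speed column
    accuracy: int   # stars in the Accuracy column


# Ratings copied from the execution modes table.
MODES = [
    Mode("Fast", 5, 3),
    Mode("Accurate", 3, 5),
    Mode("Custom", 4, 4),
    Mode("Save & Reuse", 5, 4),
]


def fastest_mode(min_accuracy: int, modes: list[Mode] = MODES) -> Mode:
    """Pick the fastest mode whose accuracy rating meets the threshold."""
    eligible = [m for m in modes if m.accuracy >= min_accuracy]
    if not eligible:
        raise ValueError(f"no mode has accuracy >= {min_accuracy}")
    # Prefer higher speed; break ties by higher accuracy.
    return max(eligible, key=lambda m: (m.speed, m.accuracy))
```

Note that Save & Reuse rates well on both axes but, per the table, applies to repeated workflows, so a real choice also depends on whether a saved workflow exists.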

Evaluation Benchmarks

Test CUGA with industry-standard benchmarks:

AppWorld: API-driven tasks across simulated everyday apps
WebArena: Web automation and navigation testing
WxO Tools: Watson Orchestrate integration testing

Configuration

Model Providers

CUGA supports multiple LLM providers:

ETE LiteLLM (Recommended for IBM teams) - Free, high-performance
OpenAI - Popular choice with excellent model quality
IBM WatsonX - Enterprise-grade AI platform
Azure OpenAI - Microsoft's managed service

Environment Setup

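# Select a provider by pointing AGENT_SETTING_CONFIG at its settings file.
# Run only one of these lines; a later export overrides an earlier one.
export AGENT_SETTING_CONFIG="settings.litellm.toml"  # ETE (default)
export AGENT_SETTING_CONFIG="settings.openai.toml"   # OpenAI
export AGENT_SETTING_CONFIG="settings.watsonx.toml"  # WatsonX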
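
The same selection can be made from Python, for example in a launcher script. This is a minimal sketch: `select_provider` and `PROVIDER_SETTINGS` are hypothetical names, the file names are the ones shown in the export lines, and how CUGA reads the variable is not covered here.

```python
import os

# File names are taken from the export lines above. Azure OpenAI is omitted
# because the source does not show a settings file for it.
PROVIDER_SETTINGS = {
    "litellm": "settings.litellm.toml",
    "openai": "settings.openai.toml",
    "watsonx": "settings.watsonx.toml",
}


def select_provider(name: str) -> str:
    """Set AGENT_SETTING_CONFIG for this process and return the file name."""
    key = name.strip().lower()
    if key not in PROVIDER_SETTINGS:
        known = ", ".join(sorted(PROVIDER_SETTINGS))
        raise ValueError(f"unknown provider {name!r}; expected one of: {known}")
    os.environ["AGENT_SETTING_CONFIG"] = PROVIDER_SETTINGS[key]
    return PROVIDER_SETTINGS[key]
```

Setting `os.environ` affects only the current process and any child processes it starts, so it does not change the shell that launched the script.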

Troubleshooting

Common Issues

Issue	Solution
Port conflicts	Check if ports 8000, 8005, 8080 are free
Docker errors	Ensure Rancher Desktop is running
API key errors	Verify `.env` file has correct `OPENAI_API_KEY`
Module not found	Run `uv sync` to install dependencies
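
For the port conflict case, the check can be scripted. The sketch below is not a CUGA command; `port_is_free` and `busy_ports` are hypothetical helpers that try to bind a TCP socket on the loopback address, so a service bound only to a different network interface may go undetected.

```python
import socket


def port_is_free(port: int, host: str = "127.0.0.1") -> bool:
    """Return True if a TCP socket can bind to host:port right now."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        try:
            sock.bind((host, port))
        except OSError:
            return False
        return True


def busy_ports(ports: list[int]) -> list[int]:
    """Return the ports from the list that are already in use."""
    return [port for port in ports if not port_is_free(port)]


if __name__ == "__main__":
    # Port numbers are the ones listed in the table above.
    print("ports in use:", busy_ports([8000, 8005, 8080]))
```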

Debug Commands

# Check service status
cuga status

# View logs
cuga logs --tail

# Check configuration
cuga config show

# Run diagnostics
cuga diagnose

Testing

Run the comprehensive test suite:

# Run all tests
uv run pytest -v

# Run specific test categories
uv run pytest tests/unit/           # Unit tests
uv run pytest tests/integration/    # Integration tests
uv run pytest tests/system/         # System tests

# Run with coverage
uv run pytest --cov=cuga --cov-report=html

Performance Monitoring

Monitor CUGA performance and health:

# Real-time monitoring
cuga monitor

# Performance metrics
cuga stats

# Resource usage
cuga stats --resources

Continuous Integration

Schedule recurring evaluations with crontab entries such as:

# Daily evaluation
0 2 * * * cd /path/to/cuga && uv run appworld_eval --eval_key daily_test

# Weekly comprehensive evaluation
0 3 * * 0 cd /path/to/cuga && uv run appworld_eval --eval_key weekly_full
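
Each line above follows the standard crontab layout: five schedule fields (minute, hour, day of month, month, day of week) followed by the command. The first entry therefore runs at 02:00 every day, and the second at 03:00 on Sundays, since day of week `0` is Sunday. The sketch below splits a line into those fields for inspection; `CronEntry` and `parse_cron_line` are hypothetical names, and the parser does not validate field values.

```python
from typing import NamedTuple


class CronEntry(NamedTuple):
    minute: str
    hour: str
    day_of_month: str
    month: str
    day_of_week: str
    command: str


def parse_cron_line(line: str) -> CronEntry:
    """Split a crontab line into its five schedule fields and the command."""
    parts = line.split(None, 5)
    if len(parts) != 6:
        raise ValueError("expected five schedule fields followed by a command")
    return CronEntry(*parts)
```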

Resources

Contributing

We welcome contributions from the IBM community:

Fork the repository
Create a feature branch
Make your changes
Add tests for new functionality
Submit a pull request

License
