Agent-OS gives AI agents a real browser: persistent, stealthy, and self-hosted. It ships 203 tools for navigation, form filling, data extraction, CAPTCHA bypass, adaptive scraping, and more. Works with Claude, GPT-4, Codex, OpenClaw, and any agent that can send an HTTP request.
One command to install. One config to connect. Zero API keys needed.
curl -sSL https://raw.githubusercontent.com/factspark23-hash/Agent-OS/main/install.sh | bash

| Problem | Agent-OS Solution |
|---|---|
| AI agents can't interact with websites | Real Chromium browser with 203 tools |
| Bot detection blocks automation | 26+ anti-detection vectors, Cloudflare bypass |
| Website changes break selectors | Adaptive scraper: learns element fingerprints, auto-relocates |
| Manual login required | Login handoff: pause AI, human logs in, resume |
| Single IP gets blocked | Proxy rotation with 4 strategies + health tracking |
| LLM token waste on browser output | SmartCompressor: 87% token savings |
| Need multiple AI platforms | 7 connectors: MCP, OpenAI, Claude, CLI, REST, OpenClaw |
curl -sSL https://raw.githubusercontent.com/factspark23-hash/Agent-OS/main/install.sh | bash
# With options
curl -sSL .../install.sh | bash -s -- --token my-secret-token
curl -sSL .../install.sh | bash -s -- --headed # Show browser
curl -sSL .../install.sh | bash -s -- --port 9000 # Custom port

git clone https://github.com/factspark23-hash/Agent-OS.git
cd Agent-OS
python3 -m venv venv && source venv/bin/activate
pip install -r requirements.txt
python3 -m patchright install chromium
export JWT_SECRET_KEY=$(python3 -c 'import secrets; print(secrets.token_urlsafe(48))')
python3 main.py --agent-token "your-token"

git clone https://github.com/factspark23-hash/Agent-OS.git
cd Agent-OS
export POSTGRES_PASSWORD="strong-password"
docker compose up -d

# Health check
curl http://localhost:8001/health
# Navigate
curl -X POST http://localhost:8001/command \
-H "Content-Type: application/json" \
-d '{"token":"your-token","command":"navigate","url":"https://github.com"}'
# Screenshot
curl -X POST http://localhost:8001/command \
-H "Content-Type: application/json" \
-d '{"token":"your-token","command":"screenshot"}'
# Click by text (no CSS selector needed)
curl -X POST http://localhost:8001/command \
-H "Content-Type: application/json" \
-d '{"token":"your-token","command":"smart-click","text":"Sign in"}'

Every connector exposes the full tool set (203 tools; the CLI and HTTP REST connectors expose 202):
| Connector | Tools | Use With | API Key? |
|---|---|---|---|
| MCP Passthrough | 203 | Claude Desktop, Claude Code, Codex | No |
| MCP Server | 203 | Claude Desktop, Claude Code, Codex | Optional |
| OpenAI | 203 | GPT-4, GPT-4o, any OpenAI-compatible | Yes |
| Claude API | 203 | Claude API (tool-use format) | Yes |
| OpenClaw | 203 | OpenClaw agent framework | Optional |
| CLI (Bash) | 202 | Any language (Python, Node, Go...) | Token |
| HTTP REST | 202 | Direct API calls | Token |
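The HTTP REST connector can also be driven from a script instead of curl. The following is a minimal sketch that mirrors the request shape of the curl examples above (a JSON body with `token`, `command`, and any command-specific fields, posted to `/command`). The helper names `build_command` and `send_command` are hypothetical, and the assumption that the server replies with JSON is not confirmed by this README.

```python
import json
import urllib.request

BASE_URL = "http://localhost:8001"  # port used in the curl examples


def build_command(token: str, command: str, **params) -> bytes:
    """Encode a command body in the same shape as the curl examples."""
    payload = {"token": token, "command": command, **params}
    return json.dumps(payload).encode("utf-8")


def send_command(token: str, command: str, **params) -> dict:
    """POST a command to /command and decode the reply (assumed to be JSON)."""
    request = urllib.request.Request(
        f"{BASE_URL}/command",
        data=build_command(token, command, **params),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.loads(response.read().decode("utf-8"))


if __name__ == "__main__":
    # Requires a running Agent-OS server started with the same token.
    print(send_command("your-token", "navigate", url="https://github.com"))
    print(send_command("your-token", "smart-click", text="Sign in"))
```

Keeping the body builder separate from the network call makes it easy to check the payload without a running server.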
./run_mcp.sh --token "my-secret-token"

Claude Desktop config:
{
  "mcpServers": {
    "agent-os": {
      "command": "python3",
      "args": ["/path/to/Agent-OS/connectors/mcp_passthrough.py"],
      "env": {
        "AGENT_OS_URL": "http://localhost:8001",
        "AGENT_OS_TOKEN": "my-secret-token",
        "AGENT_OS_COMPRESS": "aggressive"
      }
    }
  }
}

Agent-OS defeats bot detection with a 4-layer defense system:
| Layer | Techniques |
|---|---|
| 1: Network | Chrome TLS fingerprint (JA3/JA4) via curl_cffi; HTTP/2 matching; bot scripts blocked at network level |
| 2: CDP (Chrome DevTools Protocol) | Page.addScriptToEvaluateOnNewDocument injection; User-Agent metadata spoofing; timezone override |
| 3: JavaScript (19 injection modules) | navigator.webdriver removal; CDP property filtering; WebGL/Canvas/Audio fingerprint spoofing; WebRTC IP leak prevention; Function toString masking |
| 4: Behavior | Bezier-curve mouse movements; realistic typing rhythms; word pause simulation; typo + correction (3% rate) |
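To illustrate what "Bezier-curve mouse movements" in Layer 4 can mean, here is a minimal sketch that generates a curved cursor path with a cubic Bezier curve. It is not taken from the Agent-OS source: the function name `bezier_path`, the `jitter` parameter, and the way control points are chosen are all assumptions made for illustration.

```python
import random


def bezier_path(start, end, steps=20, jitter=80.0, rng=None):
    """Return steps + 1 points along a cubic Bezier curve from start to end.

    The two control points are nudged randomly so the path bends like a
    hand-driven cursor instead of following a straight line.
    """
    rng = rng or random.Random()
    (x0, y0), (x3, y3) = start, end
    # Control points sit one third and two thirds of the way along the
    # straight line, each offset by up to `jitter` pixels (an assumption).
    x1 = x0 + (x3 - x0) / 3 + rng.uniform(-jitter, jitter)
    y1 = y0 + (y3 - y0) / 3 + rng.uniform(-jitter, jitter)
    x2 = x0 + 2 * (x3 - x0) / 3 + rng.uniform(-jitter, jitter)
    y2 = y0 + 2 * (y3 - y0) / 3 + rng.uniform(-jitter, jitter)

    points = []
    for i in range(steps + 1):
        t = i / steps
        u = 1.0 - t
        # Standard cubic Bezier polynomial in Bernstein form.
        x = u**3 * x0 + 3 * u**2 * t * x1 + 3 * u * t**2 * x2 + t**3 * x3
        y = u**3 * y0 + 3 * u**2 * t * y1 + 3 * u * t**2 * y2 + t**3 * y3
        points.append((x, y))
    return points


if __name__ == "__main__":
    for point in bezier_path((0, 0), (300, 200), steps=5):
        print(point)
```

In a Playwright-style script, each point could be fed to `page.mouse.move(x, y)` with a short delay between steps; whether Agent-OS does it this way is not stated here.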
Blocked vendors: DataDome, PerimeterX, Imperva, Akamai, Cloudflare Bot Management, Turnstile, Kasada, Shape Security, F5, Arkose Labs, ThreatMetrix, hCaptcha, reCAPTCHA
When a website changes its DOM structure, traditional selectors break. Agent-OS remembers element fingerprints and relocates them automatically:
1. Find element with CSS selector → found → save fingerprint (tag, attrs, text, path, parent)
2. Website redesigns, selector breaks → not found
3. Load stored fingerprint → scan all page elements → score similarity
4. Best match above 40% threshold → element relocated
Fingerprint components:
| Component | Weight | What it captures |
|---|---|---|
| Tag name | 30% | div, span, a, etc. |
| Attributes | 30% | class, id, name, href |
| Text content | 20% | Inner text (survives minor changes) |
| DOM path | 10% | Tag chain from root |
| Parent context | 10% | Parent tag + attributes |
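The weights above and the 40% threshold can be combined into a single similarity score. The sketch below shows one way to do that; it is an illustration, not the Agent-OS or Scrapling implementation. The per-component measures (exact match for tags, Jaccard overlap for attributes, `difflib` ratios for text and DOM path), the simplified string form of the parent context, and the names `similarity` and `relocate` are assumptions.

```python
from difflib import SequenceMatcher

# Weights and threshold taken from the table and the steps above.
WEIGHTS = {"tag": 0.30, "attrs": 0.30, "text": 0.20, "path": 0.10, "parent": 0.10}
THRESHOLD = 0.40


def jaccard(a: dict, b: dict) -> float:
    """Share of attribute key/value pairs that both elements have."""
    left, right = set(a.items()), set(b.items())
    if not left and not right:
        return 1.0
    return len(left & right) / len(left | right)


def ratio(a, b) -> float:
    """Similarity in [0, 1] for two strings or two lists of tags."""
    return SequenceMatcher(None, a, b).ratio()


def similarity(saved: dict, candidate: dict) -> float:
    """Weighted similarity between a stored fingerprint and a candidate."""
    parts = {
        "tag": 1.0 if saved["tag"] == candidate["tag"] else 0.0,
        "attrs": jaccard(saved["attrs"], candidate["attrs"]),
        "text": ratio(saved["text"], candidate["text"]),
        "path": ratio(saved["path"], candidate["path"]),
        # Simplification: parent context reduced to a single string.
        "parent": 1.0 if saved["parent"] == candidate["parent"] else 0.0,
    }
    return sum(WEIGHTS[name] * score for name, score in parts.items())


def relocate(saved: dict, candidates: list):
    """Return the best candidate scoring above the threshold, else None."""
    best = max(candidates, key=lambda c: similarity(saved, c), default=None)
    if best is not None and similarity(saved, best) > THRESHOLD:
        return best
    return None


if __name__ == "__main__":
    saved = {"tag": "h1", "attrs": {"class": "product-title"}, "text": "Blue Widget",
             "path": ["html", "body", "div", "h1"], "parent": "div"}
    moved = {"tag": "h1", "attrs": {"class": "title-v2"}, "text": "Blue Widget",
             "path": ["html", "body", "main", "div", "h1"], "parent": "div"}
    print(round(similarity(saved, moved), 3))
```

In this example the class name changed, so the attribute component scores zero, yet the tag, text, path, and parent still carry the element well past the threshold.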
Commands:
# Find element adaptively
{"command": "adaptive-find", "selector": ".product-title", "identifier": "product-name"}
# Save element fingerprint manually
{"command": "adaptive-save", "selector": "#login-btn", "identifier": "login-button"}
# View stored fingerprints
{"command": "adaptive-stats"}
# Clean old fingerprints
{"command": "adaptive-cleanup", "max_age_days": 30}

Thread-safe proxy rotator with 4 strategies:
| Strategy | How it works | Best for |
|---|---|---|
| Cyclic | Sequential round-robin | General scraping |
| Weighted | Higher weight = more requests | Premium vs budget proxies |
| Random | Random selection | Anti-pattern detection |
| Sticky | Same proxy per domain | Session-based scraping |
Health tracking: Success rate, latency, consecutive failures. Unhealthy proxies auto-skipped with failover.
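To make the cyclic and sticky strategies and the failover behavior concrete, here is a toy rotator. It is a sketch for illustration only, not the `ProxyRotator` class shipped with Agent-OS: the class name `MiniRotator`, the `max_failures` parameter, and the rule that a proxy is unhealthy after a fixed number of consecutive failures are assumptions. It also skips locking, so unlike the real rotator it is not thread-safe.

```python
import itertools


class MiniRotator:
    """Toy rotator: cyclic selection, sticky per domain, unhealthy proxies skipped."""

    def __init__(self, proxies, max_failures=3):
        self.proxies = list(proxies)
        self.max_failures = max_failures
        self.failures = {p: 0 for p in self.proxies}  # consecutive failures
        self.sticky = {}  # domain -> proxy bound to it
        self._cycle = itertools.cycle(self.proxies)

    def healthy(self, proxy) -> bool:
        return self.failures[proxy] < self.max_failures

    def get_proxy(self, domain=None):
        # Sticky: reuse the proxy bound to this domain while it stays healthy.
        if domain in self.sticky and self.healthy(self.sticky[domain]):
            return self.sticky[domain]
        # Cyclic: walk the pool at most once, skipping unhealthy proxies.
        for _ in range(len(self.proxies)):
            proxy = next(self._cycle)
            if self.healthy(proxy):
                if domain is not None:
                    self.sticky[domain] = proxy
                return proxy
        raise RuntimeError("no healthy proxies available")

    def record_result(self, proxy, success: bool) -> None:
        # A success resets the failure streak; a failure extends it.
        self.failures[proxy] = 0 if success else self.failures[proxy] + 1


if __name__ == "__main__":
    rotator = MiniRotator(["http://proxy1:8080", "http://proxy2:8080"])
    print(rotator.get_proxy())
    print(rotator.get_proxy(domain="example.com"))
```

A weighted or random strategy would only change how the next candidate is picked; the health check and failover loop stay the same.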
from src.tools.proxy_rotator import ProxyRotator

rotator = ProxyRotator(
    proxies=["http://proxy1:8080", "http://proxy2:8080", "http://proxy3:8080"],
    strategy="weighted",
)
proxy = rotator.get_proxy()  # Get next proxy
proxy = rotator.get_proxy(domain="google.com")  # Sticky per domain
proxy = rotator.get_proxy(country="US")  # Geo-targeted
# Report the outcome so health tracking stays current.
# proxy_id identifies the proxy that served the request.
rotator.record_result(proxy_id, success=True, latency_ms=120)

203 tools, grouped by category:
| Category | Tools | Highlights |
|---|---|---|
| Navigation | 6 | navigate, smart-navigate (auto HTTP/browser) |
| Interaction | 17 | click, fill-form, drag-drop, scroll |
| Smart Finder | 4 | Find by visible text, no CSS selectors |
| Content | 9 | screenshot, get-dom, evaluate-js |
| Page Analysis | 9 | page-seo, page-emails, page-accessibility |
| Network | 8 | Capture XHR, export HAR |
| Security | 3 | scan-xss, scan-sqli, scan-sensitive |
| Workflows | 6 | Multi-step automation with variables |
| Sessions | 8 | Save/restore cookies, auto-login |
| Proxy | 18 | Pool management, health checks, rotation |
| Adaptive | 4 | Element fingerprinting + relocation |
| Smart Wait | 7 | 7 wait strategies |
| Auto-Heal | 10 | Self-healing selectors |
| Auto-Retry | 10 | Circuit breaker + exponential backoff |
| Recording | 18 | Record, replay, export workflows |
| Multi-Agent | 19 | Shared sessions, task queues, locks |
| Login Handoff | 8 | Pause AI → human logs in → resume |
| LLM | 7 | Built-in llm-complete, llm-summarize |
| AI Content | 6 | Structured extraction, schema.org |
| CAPTCHA | 6 | Preempt, solve, monitor |
| TLS HTTP | 4 | Chrome TLS fingerprint without browser |
3-layer auth system:
# Layer 1: JWT (recommended)
curl -X POST http://localhost:8001/auth/register \
-H "Content-Type: application/json" \
-d '{"email":"you@example.com","username":"admin","password":"StrongPass123!"}'
curl -X POST http://localhost:8001/auth/login \
-H "Content-Type: application/json" \
-d '{"username":"admin","password":"StrongPass123!"}'
# Layer 2: API Keys
curl -X POST http://localhost:8001/auth/api-keys \
-H "Authorization: Bearer YOUR_JWT" \
-H "Content-Type: application/json" \
-d '{"name":"my-key","scopes":["browser"]}'
# Layer 3: Legacy Tokens (dev only)
python3 main.py --agent-token "dev-token"

Requests flow from external clients through the connectors to the agent server, which routes commands to the browser, the tools layer, and the infrastructure services:
| Tier | Components |
|---|---|
| External Clients | Claude Desktop, GPT-4, Codex, CLI, HTTP/WS |
| Connectors (203 tools each) | MCP, OpenAI, Claude, OpenClaw, CLI, REST + WebSocket |
| Agent Server (aiohttp) | Auth, Rate Limiter, Validator, Command Router |
| Browser | Patchright + Stealth, 26+ vectors |
| Tools Layer | Adaptive, Auto-Heal, Workflows, LLM Provider |
| Infrastructure | PostgreSQL, Redis, JWT Auth, Logging |
# 1. Set JWT secret
export JWT_SECRET_KEY=$(python3 -c 'import secrets; print(secrets.token_urlsafe(48))')
# 2. Start with production flags
python3 main.py \
  --agent-token "strong-random-token" \
  --port 8000 \
  --max-ram 500 \
  --json-logs
# 3. Verify (use the port passed to --port)
curl http://localhost:8000/health

export POSTGRES_PASSWORD="strong-db-password"
docker compose --profile with-nginx up -d

| Config | Concurrent Users | Memory |
|---|---|---|
| 1 instance × 50 contexts | 50 | ~800 MB |
| 3 instances × 50 contexts | 150 | ~2.4 GB |
| 5 instances × 50 contexts | 250 | ~4 GB |
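The table scales linearly: each instance holds 50 contexts and uses roughly 800 MB. A rough planning helper under that reading is sketched below. The function name `plan_capacity` is hypothetical, and it assumes one browser context per concurrent user and no shared overhead between instances, which the table suggests but does not state.

```python
import math

CONTEXTS_PER_INSTANCE = 50  # from the table
MB_PER_INSTANCE = 800       # approximate, from the table


def plan_capacity(concurrent_users: int) -> tuple[int, int]:
    """Return (instances needed, approximate memory in MB)."""
    instances = max(1, math.ceil(concurrent_users / CONTEXTS_PER_INSTANCE))
    return instances, instances * MB_PER_INSTANCE


if __name__ == "__main__":
    for users in (50, 120, 250):
        instances, memory_mb = plan_capacity(users)
        print(f"{users} users: {instances} instance(s), ~{memory_mb} MB")
```

Treat the result as a starting estimate; real memory use depends on the pages being loaded.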
Agent-OS/
├── main.py                        # Entry point
├── install.sh                     # One-command installer
├── docker-compose.yml             # Full Docker stack
├── requirements.txt               # Python dependencies
│
├── src/
│   ├── core/                      # Browser engine
│   │   ├── browser.py             # Main browser (Patchright/Chromium)
│   │   ├── stealth.py             # Anti-detection JS (1264 lines)
│   │   ├── cdp_stealth.py         # CDP-level stealth
│   │   ├── stealth_god.py         # GOD MODE (26+ vectors)
│   │   ├── llm_provider.py        # 12 LLM providers
│   │   └── config.py              # YAML configuration
│   │
│   ├── tools/                     # Feature engines
│   │   ├── adaptive_scraper.py    # Adaptive element relocation
│   │   ├── proxy_rotator.py       # 4-strategy proxy rotation
│   │   ├── auto_heal.py           # Self-healing selectors
│   │   ├── workflow.py            # Multi-step workflows
│   │   ├── session_recording.py   # Record & replay
│   │   └── ...                    # 15+ more tools
│   │
│   ├── security/                  # Stealth & evasion
│   │   ├── evasion_engine.py      # Fingerprint generation
│   │   ├── captcha_solver.py      # CAPTCHA solving
│   │   └── cloudflare_bypass.py   # Cloudflare bypass
│   │
│   └── agents/
│       └── server.py              # WebSocket + HTTP (202 commands)
│
├── connectors/                    # AI Platform Connectors
│   ├── _tool_registry.py          # 203 tool definitions
│   ├── mcp_server.py              # MCP (Claude/Codex)
│   └── openai_connector.py        # OpenAI function-calling
│
└── tests/                         # Test suite

| Component | Technology |
|---|---|
| Browser | Patchright (stealth Playwright) + Chromium |
| HTTP Client | curl_cffi (Chrome TLS fingerprint) |
| Database | PostgreSQL (SQLAlchemy async) |
| Cache | Redis (with in-memory fallback) |
| Auth | JWT (HS256) + API keys |
| Validation | Pydantic v2 |
| Logging | structlog |
| Runtime | Python 3.10+ / asyncio |
git clone https://github.com/factspark23-hash/Agent-OS.git
cd Agent-OS
python3 -m venv venv && source venv/bin/activate
pip install -r requirements.txt
python3 -m patchright install chromium
# Run tests
python3 -m pytest tests/ -v
# Start dev server
python3 main.py --headed --debug --agent-token "dev-token"

| Problem | Solution |
|---|---|
| Port in use | python3 main.py --port 9000 |
| Chromium not found | python3 -m patchright install chromium |
| JWT warning | export JWT_SECRET_KEY=$(python3 -c 'import secrets; print(secrets.token_urlsafe(48))') |
| Site detects bot | Try --device iphone_14 or add --proxy |
| High RAM | python3 main.py --max-ram 500 |
MIT License: free for commercial and personal use.
- Scrapling by Karim Shoair: adaptive scraping algorithm and proxy rotation engine. Used under BSD 3-Clause License.