FabianWesth/longevity-agents

Antibody-AI (ABAI)

AI-powered multi-agent system for selecting optimal gene markers for spatial proteomics antibody panels from RNA-seq differential expression and pathway activity data.

⚠️ IMPORTANT: FOR RESEARCH USE ONLY
This software is intended solely for research purposes and does not provide medical advice, diagnosis, or treatment recommendations. Results must be validated by qualified researchers before any experimental or clinical application.

Overview

ABAI transforms bulk RNA-seq data into actionable protein marker recommendations for spatial proteomics experiments. It uses a coordinated multi-agent AI workflow to:

  1. Select promising genes from differentially expressed pathways
  2. Evaluate each gene's biological relevance, antibody availability, and technical feasibility
  3. Rank genes comparatively across pathways
  4. Generate a final deduplicated marker panel with detailed rationale

The system integrates multiple biological knowledge sources (Human Protein Atlas, Reactome, Europe PMC) to provide evidence-based marker recommendations.
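
The final step (step 4) can be pictured with a small sketch. This is not ABAI's actual implementation: the `RankedGene` type, the `score` field, and the `build_panel` helper are hypothetical names, and the scores and the second pathway assignment for IL1B are toy values chosen only to show a gene that appears under two pathways being kept once.

```python
from typing import NamedTuple


class RankedGene(NamedTuple):
    pathway: str
    gene: str
    score: float  # hypothetical per-gene score; higher is better


def build_panel(ranked: list[RankedGene], top_n: int) -> list[RankedGene]:
    """Keep each gene once (its best-scoring entry), then take the top_n."""
    best: dict[str, RankedGene] = {}
    for entry in ranked:
        current = best.get(entry.gene)
        if current is None or entry.score > current.score:
            best[entry.gene] = entry
    return sorted(best.values(), key=lambda e: e.score, reverse=True)[:top_n]


# Toy input: IL1B is listed under two pathways with different scores.
ranked = [
    RankedGene("TNFA_SIGNALING_VIA_NFKB", "IL1B", 0.91),
    RankedGene("TNFA_SIGNALING_VIA_NFKB", "CXCL8", 0.74),
    RankedGene("G2M_CHECKPOINT", "IL1B", 0.62),
]
panel = build_panel(ranked, top_n=5)
print([entry.gene for entry in panel])
```

Keeping the best-scoring entry per gene is one plausible deduplication rule; the real workflow delegates this decision to the final selection agent.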

Requirements

  • Python: 3.12 or higher
  • API Key: OpenAI API key (or Groq for alternative models)
  • System: macOS, Linux, or WSL2 on Windows

Installation

# Clone the repository
git clone https://github.com/FabianWesth/longevity-agents.git
cd longevity-agents

# Install dependencies with uv (recommended)
uv sync

# Or with pip
pip install -e .

Configuration

1. Set up API keys

Create a .env file in the project root:

# Required if using OpenAI models (recommended)
OPENAI_API_KEY=your-openai-api-key-here

# Optional: Required only if using Groq models
GROQ_API_KEY=your-groq-api-key-here

2. Adjust workflow settings (optional)

Edit config.yaml to customize:

workflow:
  num_pathways: 3              # Number of top pathways to analyze
  markers_per_pathway: 3       # Genes to evaluate per pathway
  top_final_markers: 5         # Final marker count in output

models:
  orchestrator: "openai:gpt-4.1-mini"
  marker_evaluation: "openai:gpt-5-nano"
  ranking: "openai:gpt-4.1-mini"
  final_selection: "openai:gpt-4.1-mini"
  summarizer: "openai:gpt-5-nano"

Usage

Option 1: Command Line

Run the complete workflow:

# Using uv
uv run python main.py

# Or if installed as package
longevity-agents

Option 2: Web Interface

Start the API and web UI:

# Start both services
./scripts/start_all.sh

# Or start separately:
./scripts/start_api.sh      # API at http://localhost:8000
./scripts/start_frontend.sh # UI at http://localhost:8501

Access the web interface at http://localhost:8501 to:

  • Upload custom input files
  • Configure workflow parameters
  • Monitor execution progress
  • View and export results

Option 3: Python API

import asyncio
from pathlib import Path

from longevity_agents import SpatialProteomicsWorkflow
from longevity_agents.config import Settings


async def main() -> None:
    # Use default settings from config.yaml and .env
    workflow = SpatialProteomicsWorkflow()
    await workflow.run()

    # Or with custom settings loaded from another YAML file
    settings = Settings.from_yaml(Path("custom_config.yaml"))
    workflow = SpatialProteomicsWorkflow(settings)
    await workflow.run()


# workflow.run() is awaited, so a script needs an event loop to drive it.
# In a notebook or async REPL, call "await main()" instead.
asyncio.run(main())

Input Data

Place three CSV files in the input/ directory:

| File | Description | Required Columns |
|------|-------------|------------------|
| pathway_activity_results.csv | Pathway activity scores from PROGENy, GSEA, or similar | pathway, score, p_value |
| pathway_gene_map.csv | Pathway-gene associations with weights | pathway, gene, weight |
| deg.csv | Differential expression results (DESeq2, edgeR, etc.) | gene, log2FoldChange, pvalue, padj |

Example formats:

# pathway_activity_results.csv
pathway,score,p_value
TNFA_SIGNALING_VIA_NFKB,5.23,0.001
G2M_CHECKPOINT,3.87,0.005

# pathway_gene_map.csv
pathway,gene,weight
TNFA_SIGNALING_VIA_NFKB,IL1B,0.89
TNFA_SIGNALING_VIA_NFKB,CXCL8,0.76

# deg.csv
gene,log2FoldChange,pvalue,padj,baseMean
IL1B,2.34,0.0001,0.001,1234.5
CXCL8,1.87,0.0005,0.003,987.2
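
Since a missing or misnamed column is a common cause of empty output, it can help to check the headers before starting a run. The following is a minimal sketch using only the standard library; `check_inputs` is a hypothetical helper, not part of the longevity_agents package, and it assumes the files are comma-separated with a header row as in the examples above. Extra columns such as baseMean are allowed.

```python
import csv
from pathlib import Path

# Required columns per input file, as listed in the table above.
REQUIRED_COLUMNS = {
    "pathway_activity_results.csv": {"pathway", "score", "p_value"},
    "pathway_gene_map.csv": {"pathway", "gene", "weight"},
    "deg.csv": {"gene", "log2FoldChange", "pvalue", "padj"},
}


def check_inputs(input_dir: Path) -> dict[str, list[str]]:
    """Return a mapping of file name -> list of problems (empty list if OK)."""
    problems: dict[str, list[str]] = {}
    for name, required in REQUIRED_COLUMNS.items():
        path = input_dir / name
        issues: list[str] = []
        if not path.is_file():
            issues.append("file not found")
        else:
            # Only the header row is needed to verify column names.
            with path.open(newline="") as handle:
                header = next(csv.reader(handle), [])
            missing = required - {column.strip() for column in header}
            if missing:
                issues.append(f"missing columns: {sorted(missing)}")
        problems[name] = issues
    return problems


for name, issues in check_inputs(Path("input")).items():
    print(f"{name}: {'OK' if not issues else '; '.join(issues)}")
```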

Output

Results are saved to output/YYYYMMDD_HHMMSS/:

output/20251116_103301/
├── config.json                        # Run configuration
├── workflow_summary.txt               # High-level summary
├── final_marker_selection.json        # Final ranked markers (JSON)
├── final_marker_selection.csv         # Final ranked markers (CSV)
└── {PATHWAY_NAME}/
    ├── {GENE}_report.json            # Detailed gene evaluation
    └── {PATHWAY}_ranking_report.txt  # Pathway-level rankings

Key output files:

  • final_marker_selection.json: Top-ranked markers with scores, rationale, and technical assessments
  • {GENE}_report.json: Comprehensive evaluation including:
    • Literature evidence
    • Protein expression patterns
    • Antibody availability
    • Phosphorylation potential
    • Technical feasibility score

Project Structure

longevity-agents/
├── src/longevity_agents/     # Core package
│   ├── agents/               # AI agent definitions
│   ├── config/               # Settings and instructions
│   ├── data/                 # Data loading and processing
│   ├── models/               # Data models (Pydantic)
│   ├── tools/                # Agent tools for data access
│   ├── utils/                # Logging, rate limiting, exceptions
│   └── workflow/             # Main pipeline orchestration
├── api/                      # FastAPI backend
├── frontend/                 # Streamlit web interface
├── input/                    # Input CSV files
├── output/                   # Generated results
├── config.yaml               # Application configuration
├── .env                      # API keys (not tracked)
└── main.py                   # CLI entry point

Configuration Options

Workflow Parameters

| Parameter | Default | Description |
|-----------|---------|-------------|
| num_pathways | 3 | Top pathways to analyze |
| markers_per_pathway | 3 | Genes to evaluate per pathway |
| max_gene_selection | 20 | Max genes if pathway exceeds markers_per_pathway |
| top_final_markers | 5 | Final deduplicated markers |
| yap_level | 4 | Agent verbosity (1-10) |

Model Configuration

Each agent can use a different LLM model. Format: provider:model-name

Supported providers:

  • openai:gpt-4.1, openai:gpt-4.1-mini, openai:gpt-5-nano
  • groq:llama-3.1-70b-versatile, groq:mixtral-8x7b-32768

Rate Limiting

rate_limiting:
  delay: 1.0              # Seconds between API requests
  max_retries: 5          # Max retry attempts
  backoff_factor: 2.0     # Exponential backoff multiplier
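
To illustrate how these three settings interact, here is a generic retry loop with exponential backoff. It is a sketch, not the package's rate limiter: `call_with_backoff` is a hypothetical helper, and it assumes that delay doubles as the initial wait and that max_retries counts retries after the first attempt. With the defaults above, the waits before each retry would be 1, 2, 4, 8, and 16 seconds.

```python
import time
from collections.abc import Callable
from typing import TypeVar

T = TypeVar("T")


def call_with_backoff(
    func: Callable[[], T],
    delay: float = 1.0,
    max_retries: int = 5,
    backoff_factor: float = 2.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call func, retrying on failure with exponentially growing waits."""
    wait = delay
    for attempt in range(max_retries + 1):
        try:
            return func()
        except Exception:
            if attempt == max_retries:
                raise  # out of retries: surface the last error
            sleep(wait)
            wait *= backoff_factor
    raise AssertionError("unreachable")


# Demo: a function that fails twice, then succeeds. The waits are recorded
# in a list instead of actually sleeping.
attempts = {"count": 0}


def flaky() -> str:
    attempts["count"] += 1
    if attempts["count"] < 3:
        raise RuntimeError("simulated rate limit")
    return "ok"


waits: list[float] = []
result = call_with_backoff(flaky, sleep=waits.append)
print(result, waits)
```

A production limiter would normally retry only on rate-limit or transient network errors rather than on every exception.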

Cost Estimation

Approximate costs per run (using the default OpenAI models, gpt-4.1-mini and gpt-5-nano):

  • 3 pathways, 3 markers each: ~$0.50-1.00
  • 5 pathways, 5 markers each: ~$1.50-3.00

Actual costs depend on:

  • Number of pathways and genes
  • Literature search depth
  • Model selection
  • Token usage per evaluation

Troubleshooting

Issue: ModuleNotFoundError: No module named 'longevity_agents'
Solution: Run uv pip install -e . from the project root, or launch with python main.py, which adds src to the import path

Issue: API rate limit errors
Solution: Increase rate_limiting.delay in config.yaml

Issue: Web UI not connecting to API
Solution: Ensure API is running at http://localhost:8000 before starting frontend

Issue: Empty or missing output files
Solution: Check logs for errors; verify input files have correct format and columns

Logging and Observability

Logs are written to console with configurable levels:

logging:
  level: "INFO"              # DEBUG, INFO, WARNING, ERROR, CRITICAL
  logfire_enabled: true      # Enable Logfire observability (requires account)

Disclaimer

⚠️ RESEARCH USE ONLY - NOT FOR MEDICAL USE

This software is provided strictly for research purposes and has not been validated for clinical, diagnostic, or therapeutic use.

NOT MEDICAL ADVICE: This tool does not provide medical advice, diagnosis, treatment recommendations, or clinical guidance of any kind. It is a research tool for identifying potential protein markers in spatial proteomics experiments.

Required validations before use:

  1. Expert Review: All AI-generated recommendations must be validated by qualified domain experts
  2. Literature Verification: Results should be reviewed against current peer-reviewed scientific literature
  3. Experimental Validation: Markers must be tested through appropriate validation experiments
  4. Antibody Validation: Always validate antibody specificity and performance in your experimental system before committing to large-scale spatial proteomics experiments

Limitations and Liability:

  • ❌ NOT validated for clinical or diagnostic use
  • ❌ NOT a substitute for professional scientific or medical judgment
  • ⚠️ No warranties regarding accuracy or fitness for any purpose
  • ⚠️ Authors and contributors are not liable for any decisions made based on tool output
  • ⚠️ Automated literature analysis may miss recent publications or misinterpret context
  • ✅ Independent verification of all results is required

By using this software, you acknowledge that you understand these limitations and will not use it for medical decision-making or patient care.

License

Academic Software License (ASL) v1.0

Copyright © 2025 Malte Kuehl and Fabian Westhaeusser

This software is licensed under the Academic Software License v1.0 for non-commercial academic research and educational purposes only.

Summary:

  • ✅ Free for academic research and educational use
  • ✅ Modification and redistribution permitted (under same license)
  • ✅ Publication of research results (with proper citation)
  • ❌ Commercial use requires separate licensing agreement
  • ⚠️ Provided without warranty

See the LICENSE file for complete terms and conditions.

For commercial licensing inquiries, please contact the authors.

Citation

If you use this tool in your research, please cite:

Antibody-AI (ABAI): AI-Powered Spatial Proteomics Marker Selection
https://github.com/FabianWesth/longevity-agents

Support

For issues, questions, or feature requests, please open a GitHub issue.


Built with: Pydantic AI, FastMCP, OpenAI, Streamlit
