AI-powered multi-agent system for selecting optimal gene markers for spatial proteomics antibody panels from RNA-seq differential expression and pathway activity data.
⚠️ IMPORTANT: FOR RESEARCH USE ONLY
This software is intended solely for research purposes and does not provide medical advice, diagnosis, or treatment recommendations. Results must be validated by qualified researchers before any experimental or clinical application.
ABAI (Antibody-AI) transforms bulk RNA-seq data into actionable protein marker recommendations for spatial proteomics experiments. It uses a coordinated multi-agent AI workflow to:
- Select promising genes from differentially expressed pathways
- Evaluate each gene's biological relevance, antibody availability, and technical feasibility
- Rank genes comparatively across pathways
- Generate a final deduplicated marker panel with detailed rationale
The system integrates multiple biological knowledge sources (Human Protein Atlas, Reactome, Europe PMC) to provide evidence-based marker recommendations.
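The last step of the workflow above, merging per-pathway rankings into one deduplicated panel, can be pictured with a minimal sketch. The `RankedGene` structure, the `build_final_panel` helper, the scores, and the second pathway name are all hypothetical and chosen for illustration; the project's own data models and selection logic may differ.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RankedGene:
    """A gene scored within one pathway (hypothetical structure)."""
    gene: str
    pathway: str
    score: float


def build_final_panel(ranked: list[RankedGene], top_n: int) -> list[RankedGene]:
    """Keep the best-scoring entry per gene, then return the top_n overall."""
    best: dict[str, RankedGene] = {}
    for entry in ranked:
        current = best.get(entry.gene)
        if current is None or entry.score > current.score:
            best[entry.gene] = entry
    return sorted(best.values(), key=lambda e: e.score, reverse=True)[:top_n]


# Illustrative values only: IL1B appears in two pathways and is kept once.
candidates = [
    RankedGene("IL1B", "TNFA_SIGNALING_VIA_NFKB", 0.91),
    RankedGene("CXCL8", "TNFA_SIGNALING_VIA_NFKB", 0.78),
    RankedGene("IL1B", "INFLAMMATORY_RESPONSE", 0.85),
]
panel = build_final_panel(candidates, top_n=5)
print([entry.gene for entry in panel])  # ['IL1B', 'CXCL8']
```

Keeping the highest-scoring occurrence is one plausible deduplication rule; an agent-driven final selection could weigh other evidence as well.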
- Python: 3.12 or higher
- API Key: OpenAI API key (or Groq for alternative models)
- System: macOS, Linux, or WSL2 on Windows
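A quick preflight check for the first two requirements might look like the sketch below. The `preflight` helper is hypothetical and not part of the package; it assumes the keys are exported as the environment variables `OPENAI_API_KEY` or `GROQ_API_KEY`, and it does not read a `.env` file itself.

```python
import os
import sys


def preflight(version: tuple[int, int], env: dict[str, str]) -> list[str]:
    """Return human-readable problems; an empty list means the checks passed."""
    problems = []
    if version < (3, 12):
        problems.append(f"Python 3.12+ required, found {version[0]}.{version[1]}")
    # Either provider's key is accepted here; which one you need depends on your models.
    if not (env.get("OPENAI_API_KEY") or env.get("GROQ_API_KEY")):
        problems.append("Set OPENAI_API_KEY or GROQ_API_KEY")
    return problems


for problem in preflight(sys.version_info[:2], dict(os.environ)):
    print(f"preflight: {problem}")
```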
# Clone the repository
git clone https://github.com/FabianWesth/longevity-agents.git
cd longevity-agents
# Install dependencies with uv (recommended)
uv sync
# Or with pip
pip install -e .

Create a .env file in the project root:
# Required if using OpenAI models (recommended)
OPENAI_API_KEY=your-openai-api-key-here
# Optional: Required only if using Groq models
GROQ_API_KEY=your-groq-api-key-here

Edit config.yaml to customize:
workflow:
  num_pathways: 3 # Number of top pathways to analyze
  markers_per_pathway: 3 # Genes to evaluate per pathway
  top_final_markers: 5 # Final marker count in output
models:
  orchestrator: "openai:gpt-4.1-mini"
  marker_evaluation: "openai:gpt-5-nano"
  ranking: "openai:gpt-4.1-mini"
  final_selection: "openai:gpt-4.1-mini"
  summarizer: "openai:gpt-5-nano"

Run the complete workflow:
# Using uv
uv run python main.py
# Or if installed as package
longevity-agents

Start the API and web UI:
# Start both services
./scripts/start_all.sh
# Or start separately:
./scripts/start_api.sh # API at http://localhost:8000
./scripts/start_frontend.sh # UI at http://localhost:8501

Access the web interface at http://localhost:8501 to:
- Upload custom input files
- Configure workflow parameters
- Monitor execution progress
- View and export results
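If the web interface does not load, it can help to confirm that both services are actually listening. The sketch below uses only the standard library; `is_listening` is a hypothetical helper, and the ports are the ones given above.

```python
import socket


def is_listening(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unresolved hosts.
        return False


for name, port in [("API", 8000), ("UI", 8501)]:
    status = "up" if is_listening("localhost", port) else "not reachable"
    print(f"{name} on port {port}: {status}")
```

A successful TCP connection only shows that something is bound to the port, not that the service behind it is healthy.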
import asyncio
from pathlib import Path

from longevity_agents import SpatialProteomicsWorkflow
from longevity_agents.config import Settings


async def main() -> None:
    # Use default settings from config.yaml and .env
    workflow = SpatialProteomicsWorkflow()
    await workflow.run()

    # Or with custom settings
    settings = Settings.from_yaml(Path("custom_config.yaml"))
    workflow = SpatialProteomicsWorkflow(settings)
    await workflow.run()


# workflow.run() is a coroutine, so it needs an event loop when run as a script
asyncio.run(main())

Place three CSV files in the input/ directory:
| File | Description | Required Columns |
|---|---|---|
| pathway_activity_results.csv | Pathway activity scores from PROGENy, GSEA, or similar | pathway, score, p_value |
| pathway_gene_map.csv | Pathway-gene associations with weights | pathway, gene, weight |
| deg.csv | Differential expression results (DESeq2, edgeR, etc.) | gene, log2FoldChange, pvalue, padj |
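Before starting a run, the required columns in the table above can be checked with a few lines of standard-library Python. This is a sketch, not the package's own validation: `REQUIRED_COLUMNS` and `missing_columns` are hypothetical names, and the check looks only at the header row.

```python
import csv
import io
from typing import TextIO

# Required columns per input file, copied from the table above.
REQUIRED_COLUMNS = {
    "pathway_activity_results.csv": {"pathway", "score", "p_value"},
    "pathway_gene_map.csv": {"pathway", "gene", "weight"},
    "deg.csv": {"gene", "log2FoldChange", "pvalue", "padj"},
}


def missing_columns(handle: TextIO, required: set[str]) -> set[str]:
    """Return the required column names absent from the CSV header row."""
    header = next(csv.reader(handle), [])
    return required - {name.strip() for name in header}


# In-memory example: a deg.csv header that lacks the padj column.
sample = io.StringIO("gene,log2FoldChange,pvalue\nIL1B,2.34,0.0001\n")
print(missing_columns(sample, REQUIRED_COLUMNS["deg.csv"]))  # {'padj'}
```

For real files, pass an open handle instead, for example `open("input/deg.csv", newline="")`. Extra columns such as baseMean are ignored by this check.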
Example formats:
# pathway_activity_results.csv
pathway,score,p_value
TNFA_SIGNALING_VIA_NFKB,5.23,0.001
G2M_CHECKPOINT,3.87,0.005
# pathway_gene_map.csv
pathway,gene,weight
TNFA_SIGNALING_VIA_NFKB,IL1B,0.89
TNFA_SIGNALING_VIA_NFKB,CXCL8,0.76
# deg.csv
gene,log2FoldChange,pvalue,padj,baseMean
IL1B,2.34,0.0001,0.001,1234.5
CXCL8,1.87,0.0005,0.003,987.2

Results are saved to output/YYYYMMDD_HHMMSS/:
output/20251116_103301/
├── config.json # Run configuration
├── workflow_summary.txt # High-level summary
├── final_marker_selection.json # Final ranked markers (JSON)
├── final_marker_selection.csv # Final ranked markers (CSV)
└── {PATHWAY_NAME}/
    ├── {GENE}_report.json # Detailed gene evaluation
    └── {PATHWAY}_ranking_report.txt # Pathway-level rankings
Key output files:
- final_marker_selection.json: Top-ranked markers with scores, rationale, and technical assessments
- {GENE}_report.json: Comprehensive evaluation including:
  - Literature evidence
  - Protein expression patterns
  - Antibody availability
  - Phosphorylation potential
  - Technical feasibility score
longevity-agents/
├── src/longevity_agents/ # Core package
│ ├── agents/ # AI agent definitions
│ ├── config/ # Settings and instructions
│ ├── data/ # Data loading and processing
│ ├── models/ # Data models (Pydantic)
│ ├── tools/ # Agent tools for data access
│ ├── utils/ # Logging, rate limiting, exceptions
│ └── workflow/ # Main pipeline orchestration
├── api/ # FastAPI backend
├── frontend/ # Streamlit web interface
├── input/ # Input CSV files
├── output/ # Generated results
├── config.yaml # Application configuration
├── .env # API keys (not tracked)
└── main.py # CLI entry point
| Parameter | Default | Description |
|---|---|---|
| num_pathways | 3 | Top pathways to analyze |
| markers_per_pathway | 3 | Genes to evaluate per pathway |
| max_gene_selection | 20 | Max genes if pathway exceeds markers_per_pathway |
| top_final_markers | 5 | Final deduplicated markers |
| yap_level | 4 | Agent verbosity (1-10) |
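To make the parameters and their defaults concrete, here is a minimal sketch of a settings object that mirrors the table. `WorkflowParams` is hypothetical and is not the package's `Settings` class; the "at least 1" bounds are an assumption, while the 1-10 range for yap_level comes from the table.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class WorkflowParams:
    """Hypothetical mirror of the workflow parameters and their documented defaults."""
    num_pathways: int = 3
    markers_per_pathway: int = 3
    max_gene_selection: int = 20
    top_final_markers: int = 5
    yap_level: int = 4

    def __post_init__(self) -> None:
        # Assumption: every count must be a positive integer.
        for name in ("num_pathways", "markers_per_pathway",
                     "max_gene_selection", "top_final_markers"):
            if getattr(self, name) < 1:
                raise ValueError(f"{name} must be at least 1")
        if not 1 <= self.yap_level <= 10:
            raise ValueError("yap_level must be between 1 and 10")

    def max_evaluations(self) -> int:
        """Gene evaluations per run: pathways times genes per pathway."""
        return self.num_pathways * self.markers_per_pathway


print(WorkflowParams().max_evaluations())  # 9
```

With the defaults, a run evaluates up to 9 genes and reports 5 final markers, so num_pathways and markers_per_pathway are the main levers on runtime and API cost.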
Each agent can use a different LLM model. Format: provider:model-name
Supported providers:
- OpenAI: openai:gpt-4.1, openai:gpt-4.1-mini, openai:gpt-5-nano
- Groq: groq:llama-3.1-70b-versatile, groq:mixtral-8x7b-32768
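The provider:model-name format splits on the first colon. The sketch below shows one way to parse it and to look up which API key a given model string would need; `parse_model_spec`, `required_env_var`, and the `PROVIDER_ENV_VARS` mapping are hypothetical helpers built from the variable names in the .env example, not functions exported by the package.

```python
# Environment variable expected for each provider, per the .env example.
PROVIDER_ENV_VARS = {"openai": "OPENAI_API_KEY", "groq": "GROQ_API_KEY"}


def parse_model_spec(spec: str) -> tuple[str, str]:
    """Split 'provider:model-name' into (provider, model_name)."""
    provider, separator, model = spec.partition(":")
    if not separator or not provider or not model:
        raise ValueError(f"expected 'provider:model-name', got {spec!r}")
    return provider, model


def required_env_var(spec: str) -> str:
    """Return the API key variable needed for a model string."""
    provider, _ = parse_model_spec(spec)
    if provider not in PROVIDER_ENV_VARS:
        raise ValueError(f"unsupported provider: {provider!r}")
    return PROVIDER_ENV_VARS[provider]


print(parse_model_spec("openai:gpt-4.1-mini"))  # ('openai', 'gpt-4.1-mini')
print(required_env_var("groq:mixtral-8x7b-32768"))  # GROQ_API_KEY
```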
rate_limiting:
  delay: 1.0 # Seconds between API requests
  max_retries: 5 # Max retry attempts
  backoff_factor: 2.0 # Exponential backoff multiplier

Approximate costs per run (using OpenAI GPT-4.1-mini/5-nano):
- 3 pathways, 3 markers each: ~$0.50-1.00
- 5 pathways, 5 markers each: ~$1.50-3.00
Actual costs depend on:
- Number of pathways and genes
- Literature search depth
- Model selection
- Token usage per evaluation
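The two estimates above can be turned into a rough per-gene figure with simple arithmetic. This assumes that cost scales with the number of gene evaluations (pathways times markers per pathway), which is a simplification: the factors listed above mean real runs will deviate. `cost_per_evaluation` is a hypothetical helper.

```python
def cost_per_evaluation(total_low: float, total_high: float,
                        pathways: int, markers: int) -> tuple[float, float]:
    """Divide a total cost range by the number of gene evaluations."""
    evaluations = pathways * markers
    return total_low / evaluations, total_high / evaluations


# Cost ranges in USD, taken from the two estimates above.
for pathways, markers, low, high in [(3, 3, 0.50, 1.00), (5, 5, 1.50, 3.00)]:
    per_low, per_high = cost_per_evaluation(low, high, pathways, markers)
    print(f"{pathways}x{markers}: ${per_low:.3f}-${per_high:.3f} per gene evaluation")
```

Both configurations work out to roughly $0.06 to $0.12 per evaluated gene, which gives a starting point for budgeting larger panels.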
Issue: ModuleNotFoundError: No module named 'longevity_agents'
Solution: Run uv pip install -e ., or launch with python main.py, which adds src to the import path
Issue: API rate limit errors
Solution: Increase rate_limiting.delay in config.yaml
Issue: Web UI not connecting to API
Solution: Ensure API is running at http://localhost:8000 before starting frontend
Issue: Empty or missing output files
Solution: Check logs for errors; verify input files have correct format and columns
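For the rate limit issue above, it helps to see how the three rate_limiting settings (delay, max_retries, backoff_factor) could interact. The sketch below shows one common interpretation of exponential backoff; the project's actual retry logic is not shown in this README and may differ, and both function names are hypothetical.

```python
import time
from collections.abc import Callable
from typing import TypeVar

T = TypeVar("T")


def backoff_delays(delay: float = 1.0, max_retries: int = 5,
                   backoff_factor: float = 2.0) -> list[float]:
    """Wait times before each retry: delay, delay*factor, delay*factor**2, ..."""
    return [delay * backoff_factor**attempt for attempt in range(max_retries)]


def call_with_retries(func: Callable[[], T], delays: list[float],
                      sleep: Callable[[float], None] = time.sleep) -> T:
    """Call func; after each failure wait the next delay, then try again."""
    for wait in delays:
        try:
            return func()
        except Exception:
            sleep(wait)
    return func()  # Final attempt: let any exception propagate.


print(backoff_delays())  # [1.0, 2.0, 4.0, 8.0, 16.0]
```

Under this interpretation the default settings wait up to 31 seconds in total across five retries, so raising delay to 2.0 doubles every wait in the sequence.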
Logs are written to the console with configurable levels:
logging:
  level: "INFO" # DEBUG, INFO, WARNING, ERROR, CRITICAL
  logfire_enabled: true # Enable Logfire observability (requires account)

This software is provided strictly for research purposes and has not been validated for clinical, diagnostic, or therapeutic use.
NOT MEDICAL ADVICE: This tool does not provide medical advice, diagnosis, treatment recommendations, or clinical guidance of any kind. It is a research tool for identifying potential protein markers in spatial proteomics experiments.
Required validations before use:
- Expert Review: All AI-generated recommendations must be validated by qualified domain experts
- Literature Verification: Results should be reviewed against current peer-reviewed scientific literature
- Experimental Validation: Markers must be tested through appropriate validation experiments
- Antibody Validation: Always validate antibody specificity and performance in your experimental system before committing to large-scale spatial proteomics experiments
Limitations and Liability:
- ❌ NOT validated for clinical or diagnostic use
- ❌ NOT a substitute for professional scientific or medical judgment
- ⚠️ No warranties regarding accuracy or fitness for any purpose
- ⚠️ Authors and contributors are not liable for any decisions made based on tool output
- ⚠️ Automated literature analysis may miss recent publications or misinterpret context
- ✅ Independent verification of all results is required
By using this software, you acknowledge that you understand these limitations and will not use it for medical decision-making or patient care.
Academic Software License (ASL) v1.0
Copyright © 2025 Malte Kuehl and Fabian Westhaeusser
This software is licensed under the Academic Software License v1.0 for non-commercial academic research and educational purposes only.
Summary:
- ✅ Free for academic research and educational use
- ✅ Modification and redistribution permitted (under same license)
- ✅ Publication of research results (with proper citation)
- ❌ Commercial use requires separate licensing agreement
- ⚠️ Provided without warranty
See the LICENSE file for complete terms and conditions.
For commercial licensing inquiries, please contact the authors.
If you use this tool in your research, please cite:
Antibody-AI (ABAI): AI-Powered Spatial Proteomics Marker Selection
https://github.com/FabianWesth/longevity-agents
For issues, questions, or feature requests, please open a GitHub issue.
Built with: Pydantic AI, FastMCP, OpenAI, Streamlit