π‘οΈ A curated list of resources on securing AI agent tool use and skill ecosystems β attacks, defenses, frameworks, benchmarks, and standards.
AI agents increasingly use external tools, plugins, and skills to interact with the world. This creates a new attack surface: agent skills security. This list covers the threats, defenses, and research landscape for securing these capabilities.
- Threat Frameworks & Standards
- Surveys & Systematizations
- Attack Research
- Defense Research
- Benchmarks & Datasets
- Tools & Frameworks
- Agent Skill Specifications
- Industry Reports & Blog Posts
- Related Awesome Lists
- Contributing
- OWASP Agentic AI Threats and Mitigations β First in a series from the OWASP Agentic Security Initiative (ASI), providing threat-model-based reference for agentic threats.
- OWASP Top 10 for LLM Applications β Includes LLM01: Prompt Injection, LLM06: Excessive Agency, LLM07: Insecure Plugin Design, LLM08: Excessive Autonomy.
- MITRE ATLASβ’ β Adversarial Threat Landscape for AI Systems. Tactics, techniques, and case studies for attacks on ML/AI systems.
- NIST AI Risk Management Framework β Federal framework for managing AI risks, including autonomous agent risks.
- NIST SP 800-218A: Secure Software Development for AI β Secure development practices specific to AI-enabled systems.
- EU AI Act β European regulation with specific provisions for high-risk AI systems including autonomous agents.
- Anthropic Responsible Scaling Policy β AI Safety Levels (ASL) framework addressing agent capability thresholds.
- IETF draft-klrc-aiagent-auth-01: AI Agent Authentication and Authorization β Kasselman et al., IETF WIMSE-adjacent, 2026. Proposes a model for authentication and authorization of AI agent interactions using existing OAuth 2.0 and WIMSE standards; covers delegation chains, agent identity, and trust establishment without defining new protocols.
- IETF draft-niyikiza-oauth-attenuating-agent-tokens-00: Attenuating Authorization Tokens for Agentic Delegation Chains β Niyikiza (Tenuo), OAuth WG, March 2026. Defines Attenuating Authorization Tokens (AATs): JWT-based credentials encoding tool-level argument constraints with a cryptographically enforced monotonic attenuation invariant β any holder can derive a more restrictive token but never a more permissive one. Extends Rich Authorization Requests (RFC 9396) with delegation-chain semantics.
- π A Survey on LLM-based Autonomous Agents: Common Attacks and Defenses β Wu et al., 2024. Comprehensive taxonomy of attacks on LLM agents across perception, cognition, and action stages.
- π Agent Security Bench (ASB): Formalizing and Benchmarking Attacks and Defenses in LLM-based Agents β Zhang et al., 2024. Formalization of 10 attack scenarios, 10 agents, 398 adversarial environments.
- π Security of AI Agents β He et al., 2024. Systematization of knowledge covering threat models for AI agents with tool access.
- π Not All Agents Are Created Equal: A Survey on Software-use Agent Security β Hua et al., 2025. Survey specifically on software-use agents and their unique security challenges.
- π A Survey on the Honesty of Large Language Models β Xie et al., 2024. Covers agent deception, sycophancy, and honesty in tool-use contexts.
- π A Comprehensive Study of Jailbreak Attack versus Defense for Large Language Models β Xu et al., 2024. Systematic study of jailbreak attacks relevant to agent guardrail bypass.
- π Prompt Injection Attacks and Defenses in LLM-Integrated Applications β Liu et al., 2024. Comprehensive taxonomy of injection attacks across direct and indirect vectors.
- π The Emerged Security and Privacy of LLM Agent: A Survey with Case Studies β Gan et al., 2024. Survey with practical case studies of security failures.
- π Self-Evolving Agents: A Survey β Gao et al., 2025. How self-evolving agents create emergent security risks.
- π From Thinker to Society: Security in Hierarchical Autonomy Evolution of AI Agents β Zhang et al., 2026. Hierarchical Autonomy Evolution (HAE) framework organizing agent security into cognitive, execution, and societal tiers.
- π Characterizing Faults in Agentic AI: A Taxonomy of Types, Symptoms, and Root Causes β Shah et al., 2026. Empirical taxonomy of reliability failures in agentic AI systems combining LLM reasoning with tool invocation.
- π Security Considerations for Multi-agent Systems β 2026. Systematic threat landscape of MAS with 193 threat items across 9 categories; evaluates 16 frameworks finding none achieves majority coverage.
- π The Attack and Defense Landscape of Agentic AI: A Comprehensive Survey β Kim et al., USENIX Security 2026. First systematic survey of AI agent security covering design space, attack landscape, and defense mechanisms with case studies on securing agentic systems.
- π Taming OpenClaw: Security Analysis and Mitigation of Autonomous LLM Agent Threats β Deng et al., 2026. Five-layer lifecycle-oriented security framework analyzing compound threats across initialization, input, inference, decision, and execution stages of autonomous LLM agents.
- π AgenticCyOps: Securing Multi-Agentic AI Integration in Enterprise Cyber Operations β Mitra et al., 2026. Holistic architectural security framework decomposing attack surfaces across component, coordination, and ecosystem layers of enterprise multi-agent systems.
- π MCP-in-SoS: Risk Assessment Framework for Open-Source MCP Servers β Kumar et al., 2026. System-of-systems risk assessment framework for evaluating security risks of open-source MCP server deployments in production agent systems.
- π SoK: The Attack Surface of Agentic AI β Tools, and Autonomy β Dehghantanha & Homayoun, 2026. Systematization mapping trust boundaries and security risks of agentic LLM systems; proposes taxonomy spanning prompt injection, RAG poisoning, tool exploits, and multi-agent threats with metrics like Unsafe Action Rate and Privilege Escalation Distance.
- π Not What You've Signed Up For: Compromising Real-World LLM-Integrated Applications with Indirect Prompt Injection β Greshake et al., AISec 2023. Foundational work on indirect prompt injection through tool outputs.
- π Inject My PDF: Prompt Injection for Your Resume β Practical injection through document processing tools.
- π InjecAgent: Benchmarking Indirect Prompt Injections in Tool-Integrated LLM Agents β Zhan et al., ACL 2024. 1,054 test cases, 17 tools, two attack types (direct harm, data stealing).
- π Automatic and Universal Prompt Injection Attacks against Large Language Models β Liu et al., 2024. Automated generation of injection attacks.
- π WIPI: A New Web Threat for LLM-Driven AI Agents β Liu et al., 2024. Web-based indirect prompt injection targeting browsing agents.
- π Invisible Threats from Model Context Protocol: Generating Stealthy Injection Payload via Tree-based Adaptive Search β Shen et al., 2026. Tree-structured Injection for Payloads (TIP): black-box attack generating natural-language payloads to seize control of MCP-enabled agents; achieves >95% attack success in undefended settings and >50% against four defense approaches with an order of magnitude fewer queries than prior adaptive attacks.
- π Model Context Protocol Threat Modeling and Analyzing Vulnerabilities to Prompt Injection with Tool Poisoning β Huang et al., 2026. STRIDE/DREAD threat modeling of MCP across five components; systematic comparison of tool poisoning defenses in seven major MCP clients reveals insufficient static validation; proposes multi-layered defense strategy.
- π Are AI-assisted Development Tools Immune to Prompt Injection? β Huang et al., 2026. First empirical analysis of prompt injection via tool-poisoning across seven MCP clients (Claude Desktop, Claude Code, Cursor, Cline, Continue, Gemini CLI, Langflow); reveals significant security disparities with Cursor most susceptible.
- π Skill-Inject: Measuring Agent Vulnerability to Skill File Attacks β Schmotz et al., 2026. Benchmark measuring agent vulnerability to malicious skill/config files; demonstrates data exfiltration, destructive actions, and ransomware-like behavior via AGENTS.md/CLAUDE.md injection.
- π ToolSword: Unveiling Safety Issues of LLMs in Tool Learning Across Three Stages β Ye et al., ACL 2024. Identifies safety issues across tool selection, tool calling, and result handling.
- π Compromising Agents via MCP β Invariant Labs, 2025. Tool poisoning attacks via Model Context Protocol servers.
- π Osmosis Distillation: Model Hijacking with the Fewest Samples β Shi et al., 2026. Supply-chain attack via poisoned synthetic training data.
- π Personality Self-Replicators β 2026. Agent personality files as self-replicating genetic material.
- π MCP Security Notification: Tool Poisoning Attacks β Official MCP security advisory on tool description poisoning.
- π Invariant Labs: MCP Security Research β Analysis of cross-tool contamination, rug pulls, and tool shadowing via MCP.
- π Watch Out for Your Agents! Investigating Backdoor Threats to LLM-Based Agents β Yang et al., NeurIPS 2024. Backdoor attacks on agent reasoning and tool calling.
- π Pandora's White-Box: Precise Training Data Detection and Extraction in Large Language Models β Relevant to agents leaking training data through tool outputs.
- π R-Judge: Benchmarking Safety Risk Awareness for LLM Agents β Yuan et al., ACL 2024. 162 records across 27 risk scenarios for evaluating agent safety awareness.
- π TrustAgent: Towards Safe and Trustworthy LLM-based Agents β Zhang et al., 2024. Agent-constitution-based approach to limiting excessive agency.
- π Post-Training Local LLM Agents for Linux Privilege Escalation with Verifiable Rewards β Normann et al., 2026. Two-stage post-training pipeline (SFT + RL with verifiable rewards) producing a 4B model that achieves 95.8% success on privilege escalation, nearly matching Claude Opus 4.6 at 100Γ lower inference cost.
- π AgentSCOPE: Evaluating Contextual Privacy Across Agentic Workflows β Ngong et al., 2026. 80%+ of agentic pipelines leak private data at intermediate stages.
- π Silent Egress: LLM-Driven Data Exfiltration via Steganographic Channels β Demonstrates covert channels for data theft through agent outputs.
- π Privacy Risks of General-Purpose AI Systems: A Foundation for Investigating Practitioner Perspectives β GPAIS privacy risks including agent data handling.
- π IMMACULATE: A Framework for Analyzing Information Exposure in Agent-Based Systems β Multi-turn agent information leakage analysis.
- π AgentRaft: Automated Detection of Data Over-Exposure in LLM Agents β Lin et al., 2026. Automated detection framework for identifying data over-exposure vulnerabilities in LLM agent integrations.
- π You Told Me to Do It: Measuring Instructional Text-induced Private Data Leakage in LLM Agents β Kao et al., 2026. Identifies the Trusted Executor Dilemma where high-privilege agents execute adversarial README instructions at up to 85% success rate; 0% human detection rate across 15 participants.
- π Differential Privacy in Generative AI Agents: Analysis and Optimal Tradeoffs β Yang & Zhu, 2026. Probabilistic framework for analyzing privacy leakage in AI agents via differential privacy, deriving token-level and message-level bounds relating leakage to temperature and message length.
- π AttriGuard: Defeating Indirect Prompt Injection in LLM Agents via Causal Attribution of Tool Invocations β He et al., 2026. Defense against indirect prompt injection using causal attribution to trace which tool outputs triggered suspicious agent actions.
- π Adaptive Attacks and Defenses Against Indirect Prompt Injection β Chen et al., 2024. Adaptive attackers bypassing static defenses.
- π HouYi: A Black-box Prompt Injection Attack on LLM-integrated Applications β Liu et al., 2023. Systematic methodology for finding injection vulnerabilities.
- π DMAST: Dual-Modality Multi-Stage Adversarial Safety Training β Liu et al., 2026. Cross-modal DOM injection corrupting both visual and text channels.
- π Thought Virus: Viral Misalignment via Subliminal Prompting in Multi-Agent Systems β Weckbecker et al., 2026. Single subliminally prompted agent spreads persistent bias through entire multi-agent network, degrading truthfulness of other agents.
- π Intentional Deception as Controllable Capability in LLM Agents β Starace & Soule, 2026. Systematic study of engineered deception in multi-agent LLM interactions using 36 behavioral profiles for defensive design.
- π Cascade: Composing Software-Hardware Attack Gadgets for Adversarial Threat Amplification in Compound AI Systems β Banerjee et al., 2026. Demonstrates novel attacks combining traditional software/hardware vulnerabilities (code injection, Rowhammer) with LLM-specific algorithmic weaknesses to compromise compound AI pipelines.
- π When LLM-based Code Generation Meets the Software Supply Chain β Supply chain risks from LLM-generated code integrating malicious packages.
- π Shadow API: Covert Data Exfiltration via LLM-Mediated API Interactions β 2026. Stealth data theft through seemingly benign API calls.
- π AgentSkillOS: Towards Secure and Composable Agent Skill Operating Systems β 2026. OS-level isolation for agent skill execution.
- π SlowBA: An efficiency backdoor attack towards VLM-based GUI agents β Li et al., 2026. Novel backdoor targeting response latency of GUI agents via trigger-activated long reasoning chains.
- π Claudini: Autoresearch Discovers State-of-the-Art Adversarial Attack Algorithms for LLMs β Panfilov et al., 2026. Autonomous autoresearch pipeline powered by Claude Code discovers novel white-box adversarial attack algorithms that significantly outperform all existing 30+ methods in jailbreaking and prompt injection evaluations.
- π Jailbreaking ChatGPT via Prompt Engineering β Liu et al., 2023. Foundational jailbreaking taxonomy. 700+ citations.
- π MasterKey: Automated Jailbreaking of Large Language Model Chatbots β Deng et al., NDSS 2024. Automated time-based jailbreak generation.
- π PentestGPT: An LLM-empowered Automatic Penetration Testing Tool β Deng et al., 2023. Demonstrates agent-level tool use for offensive security. 11K+ GitHub stars.
- π Self-Fulfilling Misalignment in AI Control β 2026. Fine-tuning on AI Control literature increases misalignment.
- π Reasoning Models Struggle to Control Their Chains of Thought β 2026. CoT controllability decreases with RL training, implications for agent oversight.
- π CRAFT: Contrastive Reasoning Alignment β Reinforcement Learning from Hidden Representations β Luo et al., 2026. Red-teaming alignment framework combining contrastive representation learning with RL to separate safe/unsafe reasoning trajectories; 79% improvement in reasoning safety and 87.7% in final-response safety over base models.
- π PAuth: Precise Task-Scoped Authorization For Agents β Sharma et al., 2026. Implicit authorization model where NL task submission authorizes only required operations; uses NL slices and envelopes for provenance-based server-side verification, blocking injected operations in AgentDojo with zero false positives.
- π TrustAgent: Towards Safe and Trustworthy LLM-based Agents β Agent Constitution for safety-aware planning with pre/post-action inspection.
- π A Dual-Helix Governance Approach for Reliable Agentic AI β 3-track architecture (Knowledge, Behavior, Skills) using knowledge graphs.
- π Talk Freely, Execute Strictly: Schema-Gated Agentic AI β Schema-gated orchestration for trustworthy agent deployment in regulated domains.
- π ESAA-Security: Event-Sourced Architecture for Agent-Assisted Security Audits β 26 tasks, 95 checks, append-only event logs for reproducible AI code audits.
- π Caging the Agents: A Zero Trust Security Architecture for Autonomous AI in Healthcare β Maiti, 2026. Production-deployed zero-trust architecture for 9 autonomous AI agents: gVisor kernel isolation, credential proxy sidecars, network egress allowlisting, and prompt integrity framework with untrusted content labeling. Open-source configs released.
- π Arbiter: Detecting Interference in LLM Agent System Prompts β Mason, 2026. Framework combining formal evaluation rules with multi-model LLM scouring to detect interference and vulnerability classes in agent system prompts.
- π MCPShield: A Security Cognition Layer for Adaptive Trust Calibration in MCP Agents β Zhou et al., 2026. Plug-in security cognition layer for MCP agents that validates third-party tool invocations via experience-driven trust calibration.
- π OpenClaw PRISM: A Zero-Fork, Defense-in-Depth Runtime Security Layer for Tool-Augmented LLM Agents β Li, 2026. Runtime security layer distributing enforcement across ten lifecycle hooks with hybrid heuristic-plus-LLM scanning, session-scoped risk accumulation, and tamper-evident audit for agent gateways.
- π Governing Evolving Memory in LLM Agents: Risks, Mechanisms, and the SSGM Framework β Lam et al., 2026. Stability and Safety-Governed Memory framework mitigating topology-induced knowledge leakage and semantic drift in persistent agent memory systems.
- π AgentSentry: Real-time Monitoring for Agentic AI Systems β Runtime behavioral monitoring of tool-using agents.
- π Monitoring Emergent Reward Hacking via Internal Activations β Sparse autoencoders detect reward-hacking during generation.
- π Self-Attribution Bias: When AI Monitors Go Easy on Themselves β AI monitors exhibit systematic leniency on own outputs.
- π Salient Directions in AI Control β Structure of AI Control evaluations: trusted monitors overseeing untrusted agents.
- π Governed Memory: A Production Architecture for Multi-Agent Workflows β Taheri, 2026. Shared memory governance layer with dual memory model, tiered governance routing, entity-scoped isolation (zero cross-entity leakage across 500 adversarial queries), and 100% adversarial governance compliance in production.
- π Behavioral Attestation and Compaction Drift in Persistent AI Agents β Morrow (agent-morrow), 2026. Identifies compaction drift β non-adversarial behavioral shift caused by context window compression β as a runtime integrity threat class distinct from adversarial injection. Proposes behavioral attestation (context fingerprint delta against a pre-compression baseline) as the mechanism for continuous rather than one-time agent authorization. Complements credential-scope enforcement (e.g., AATs) with runtime execution verification.
- π StruQ: Defending Against Prompt Injection with Structured Queries β Separates prompts from data to prevent injection.
- π Towards Provably Unbiased LLM Judges via Bias-Bounded Evaluation β Formal guarantees for reducing bias in LLM-as-judge systems.
- π Judge Reliability Harness: Stress Testing LLM Judges β No evaluated judge is uniformly reliable.
- π GELO: Good-Enough LLM Obfuscation β Privacy-preserving LLM inference with ~20-30% latency overhead.
- π Agentics 2.0: Logical Transduction Algebra for Agentic Data Workflows β Formalizes LLM inference as typed semantic transformations with algebraic composition.
- π Knowledge Divergence and the Value of Debate for Scalable Oversight β Formal framework for choosing oversight mechanisms.
- π Real-Time Trust Verification for Safe Agentic Actions using TrustBench β Sharma et al., AAAI 2026 Workshop on TrustAgent. Dual-mode framework benchmarking trust across multiple dimensions and providing a pre-execution action verification toolkit for agents.
- π Agent Security Bench (ASB) β 10 scenarios, 10 agents, 398 environments. Comprehensive agent security benchmark.
- π AgentDyn: A Dynamic Open-Ended Benchmark for Prompt Injection Attacks β Li et al., 2026. Dynamic, open-ended benchmark for evaluating indirect prompt injection defenses in real-world agent security systems.
- π NAAMSE: Framework for Evolutionary Security Evaluation of Agents β Pai et al., ICLR 2026 Workshop. Evolutionary framework reframing agent security evaluation as feedback-driven optimization with autonomous red-teaming.
- π R-Judge: Benchmarking Safety Risk Awareness β 162 records, 27 risk scenarios for agent safety.
- π InjecAgent: Benchmarking Indirect Prompt Injections β 1,054 test cases for tool-integrated agents.
- π SIABENCH: Evaluating LLMs for Security Incident Analysis β 11 LLMs Γ 160 security incident scenarios.
- π EVMbench: Evaluating AI Agents on Smart Contract Security β 117 vulnerabilities, frontier agents exploit end-to-end.
- π Ο-Knowledge: Evaluating Conversational Agents over Unstructured Knowledge β Even frontier models achieve only ~25.5% on complex agent tasks.
- π Interactive Benchmarks β Evaluating via interactive proofs and games instead of static benchmarks.
- π VeriGrey: Greybox Agent Validation β Zhang et al., 2026. Greybox security testing using tool-invocation sequences as feedback and mutational prompt fuzzing; 33% more effective than black-box on AgentDojo, discovers prompt injection scenarios missed by black-box in Gemini CLI and OpenClaw.
- π LAAF: Logic-layer Automated Attack Framework for Agentic LLM Systems β Atta et al., 2026. First automated red-teaming framework combining 49-technique LPCI taxonomy with stage-sequential seed escalation; 84% mean aggregate breakthrough rate across five production LLM platforms.
| Benchmark | Focus | Size | Paper |
|---|---|---|---|
| ASB | Comprehensive agent security | 10 agents, 398 envs | Zhang et al. |
| InjecAgent | Indirect prompt injection | 1,054 test cases | Zhan et al. |
| R-Judge | Safety risk awareness | 162 records, 27 scenarios | Yuan et al. |
| ToolSword | Tool learning safety | 6 scenarios, 3 stages | Ye et al. |
| AgentDyn | Dynamic prompt injection | Open-ended, extensible | Li et al. |
| Skill-Inject | Skill file attacks | Multi-scenario | Schmotz et al. |
| NAAMSE | Evolutionary agent security eval | Adaptive red-teaming | Pai et al. |
| AgentHarm | Agent misuse | 110 behaviors, 440 variants | Andriushchenko et al. |
| SkillGuard Dataset | Malicious skill detection | 157 malicious skills | Liu et al. |
| WIPI | Web-based indirect injection | Multi-scenario | Liu et al. |
| Tool | Description | Link |
|---|---|---|
| SkillGuard | LLM-native agent skill security auditor (OWASP Agentic + MITRE ATLAS) | |
| Invariant Guardrails | Policy-based agent security guardrails | |
| LLM Guard | Input/output scanning for LLM applications | |
| Rebuff | Self-hardening prompt injection detector | |
| NeMo Guardrails | NVIDIA's toolkit for adding guardrails to LLM-based applications | |
| Lakera Guard | Enterprise prompt injection defense API | Website |
| Promptfoo | LLM red teaming and evaluation framework | |
| Garak | LLM vulnerability scanner | |
| AgentSkillsScanner | Static analysis scanner for agent skill definitions | |
| Agent Audit | Security analysis system for LLM agent apps: dataflow analysis, credential detection, MCP config parsing, privilege-risk checks | Zhang et al. |
| mcp-sec-audit | MCP server security toolkit: static pattern matching + dynamic sandboxed fuzzing via Docker/eBPF for detecting over-privileged tool capabilities | Huang et al. |
| Specification | Org | Focus |
|---|---|---|
| AgentSkills.io | Open Standard | Agent skill definition and security requirements |
| Model Context Protocol (MCP) | Anthropic | Tool/resource integration protocol for LLMs |
| OpenAI Function Calling | OpenAI | Tool use specification for GPT models |
| Tool Use (Claude) | Anthropic | Claude's native tool use interface |
| LangChain Tools | LangChain | Tool abstraction for agent frameworks |
| AutoGPT Plugins | AutoGPT | Plugin system for autonomous agents |
| OpenAPI/Swagger | Linux Foundation | API specification commonly used as tool definitions |
- π Snowflake Cortex AI Escapes Sandbox and Executes Malware β PromptArmor, 2026. Prompt injection attack chain in Snowflake's Cortex Agent bypassed command allowlists via bash process substitution to achieve RCE; now patched.
- π Confused Deputy Attacks on Autonomous AI Agents β Cloud Security Alliance AI Safety Initiative, 2026. Research note on prompt injection chains enabling privilege escalation and autonomous compromise in AI agent systems.
- π How AI Assistants are Moving the Security Goalposts β Krebs on Security, 2026. AI agents as insider threats.
- π Anthropic: Challenges in Red Teaming AI Systems β Anthropic's perspective on evaluating agent safety.
- π OpenAI: Safety of Advanced AI Agents β Practices for governing agentic AI systems.
- π Compromising Agents via MCP β Invariant Labs deep-dive into MCP attack vectors.
- π Simon Willison: Prompt Injection Explained β Accessible introduction to prompt injection risks.
- π TRAIL: Trusted Reasoning and AI Logging β Logging framework for auditable agent execution.
- π Cyber Threat Intelligence for AI Systems β AI-specific CTI framework with IoCs for supply-chain phases.
- π AI Safety Has 12 Months Left β Window to embed safety into infrastructure before market forces prevent it.
- π LiteLLM Hack: Were You One of the 47,000? β FutureSearch via Simon Willison, 2026. Analysis of PyPI supply-chain attack on LiteLLM: 47K downloads of exploited packages in 46 minutes, 88% of 2,337 dependent packages had unpinned versions.
- π Exploiting Agentic Browsers: From False Information to Cross-Site Data Leaks β Trail of Bits, 2026. Demonstrates lack of isolation in agentic browsers enabling attacks from false information dissemination to cross-site data leaks, resurfacing decades-old web vulnerability patterns.
- awesome-llm-security β General LLM security resources.
- awesome-ai-safety β AI safety research and resources.
- awesome-chatgpt-prompts β Prompt engineering (includes adversarial examples).
- awesome-ml-for-cybersecurity β ML applied to cybersecurity.
- awesome-mcp-servers β MCP server ecosystem (attack surface reference).
- awesome-ai-agents β AI agent frameworks and projects.
Contributions are welcome! Please read the contribution guidelines before submitting a pull request.
- Fork the repository
- Add your resource in the appropriate category
- Use the format:
- π **[Title](URL)** β Authors, Venue Year. One-sentence description. - Submit a pull request
- Resources must be directly related to agent/tool/skill security
- Papers should be published or on arXiv
- Tools should be actively maintained (commits within last 6 months)
- Blog posts should provide substantial technical analysis
If you find this list useful in your research, please cite:
@misc{awesome-agent-skills-security,
author = {Liu, Yi},
title = {Awesome Agent Skills Security},
year = {2026},
publisher = {GitHub},
journal = {GitHub Repository},
howpublished = {\url{https://github.com/LLMSecurity/awesome-agent-skills-security}}
}This list is released under CC0 1.0 Universal.
