diff --git a/README.md b/README.md
index 300bf46..4da7072 100644
--- a/README.md
+++ b/README.md
@@ -137,6 +137,7 @@ An awesome & curated list of the best LLMOps tools for developers.
 | [langchain-serve](https://github.com/jina-ai/langchain-serve) | Serverless LLM apps on Production with Jina AI Cloud *(Archived)* | ![GitHub Badge](https://img.shields.io/github/stars/jina-ai/langchain-serve.svg?style=flat-square) |
 | [lanarky](https://github.com/ajndkr/lanarky) | FastAPI framework to build production-grade LLM applications | ![GitHub Badge](https://img.shields.io/github/stars/ajndkr/lanarky.svg?style=flat-square) |
 | [ray-llm](https://github.com/ray-project/ray-llm) | LLMs on Ray - RayLLM *(Archived)* | ![GitHub Badge](https://img.shields.io/github/stars/ray-project/ray-llm.svg?style=flat-square) |
+| [SBproxy](https://github.com/soapbucket/sbproxy) | Single-binary AI gateway and reverse proxy with 103+ LLM providers, cost-based routing, rate limiting, guardrails, and semantic caching. `Apache-2.0` | ![GitHub Badge](https://img.shields.io/github/stars/soapbucket/sbproxy.svg?style=flat-square) |
 | [Xinference](https://github.com/xorbitsai/inference) | Replace OpenAI GPT with another LLM in your app by changing a single line of code. Xinference gives you the freedom to use any LLM you need. With Xinference, you're empowered to run inference with any open-source language models, speech recognition models, and multimodal models, whether in the cloud, on-premises, or even on your laptop. | ![GitHub Badge](https://img.shields.io/github/stars/xorbitsai/inference.svg?style=flat-square) |
 | [KubeAI](https://github.com/substratusai/kubeai) | Deploy and scale machine learning models on Kubernetes. Built for LLMs, embeddings, and speech-to-text. | ![GitHub Badge](https://img.shields.io/github/stars/substratusai/kubeai.svg?style=flat-square) |
 | [Kaito](https://github.com/kaito-project/kaito) | A Kubernetes operator that simplifies serving and tuning large AI models (e.g. Falcon or phi-3) using container images and GPU auto-provisioning. Includes an OpenAI-compatible server for inference and preset configurations for popular runtimes such as vLLM and transformers. | ![GitHub Badge](https://img.shields.io/github/stars/kaito-project/kaito.svg?style=flat-square) |