litellm
LiteLLM is a Python SDK and proxy server that unifies 100+ LLM APIs behind the OpenAI call format. It's a great tool for abstracting away the choice of LLM provider, with cost tracking, load balancing, and guardrails built in.
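A minimal sketch of what "calls in OpenAI format" means in practice: the same `completion()` call shape works across providers, with only the model string changing. The model names below are illustrative, and this assumes `litellm` is installed and the relevant provider API keys are set in the environment.

```python
# Sketch: one OpenAI-style call shape, many providers (assumes `litellm`
# is installed and API keys like OPENAI_API_KEY are exported).
import os

messages = [{"role": "user", "content": "Say hello in one word."}]

def ask(model: str) -> str:
    from litellm import completion  # unified entry point across providers
    resp = completion(model=model, messages=messages)
    return resp.choices[0].message.content

# Same call, different backends (model names here are illustrative):
if os.environ.get("OPENAI_API_KEY"):
    print(ask("gpt-4o"))
if os.environ.get("ANTHROPIC_API_KEY"):
    print(ask("claude-3-5-sonnet-20240620"))
```

The point is that swapping providers is a one-string change rather than a client-library rewrite.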
llama.cpp
This is my go-to C/C++ library for efficiently running LLaMA and other LLMs locally on consumer hardware. It's especially useful on Macs, where it shines on Apple Silicon, and it has done a lot to democratize local AI inference.