litellm
LiteLLM is a Python SDK and proxy server that unifies 100+ LLM APIs behind the OpenAI call format. It is a key tool for abstracting away LLM providers, with features like cost tracking, load balancing, and guardrails.
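A minimal sketch of the unified call shape, assuming `pip install litellm` and a provider API key (e.g. `OPENAI_API_KEY`) are in place before the commented calls are actually run; the executable part just builds the OpenAI-format message list that LiteLLM normalizes every provider to:

```python
# The OpenAI chat format LiteLLM uses for every provider:
messages = [
    {"role": "system", "content": "You are a concise assistant."},
    {"role": "user", "content": "Summarize LiteLLM in one sentence."},
]

# With LiteLLM installed, the same completion() call works across
# providers; only the model string changes, with an optional prefix
# selecting the backend:
#
#   from litellm import completion
#   resp = completion(model="gpt-4o-mini", messages=messages)
#   resp = completion(model="anthropic/claude-3-5-sonnet-20240620",
#                     messages=messages)
#   print(resp.choices[0].message.content)
```

Because every provider is mapped onto this one request/response shape, swapping backends is a one-string change rather than a rewrite.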
OpenInference provides OpenTelemetry instrumentation for AI observability. It traces AI applications, covering LLM calls and the components around them, and exports those traces to any OpenTelemetry-compatible backend for inspection of runtime behavior.
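A hedged setup sketch: the instrumentation itself is commented out because it assumes `pip install opentelemetry-sdk openinference-instrumentation-openai`; the executable part only sets the standard OTLP endpoint variable (the localhost collector address is a placeholder assumption):

```python
import os

# Point any OpenTelemetry exporter at a compatible backend
# (assumption: a collector listening on localhost:4317):
os.environ.setdefault("OTEL_EXPORTER_OTLP_ENDPOINT", "http://localhost:4317")

# With the packages installed, instrumenting OpenAI-client calls
# is a few lines:
#
#   from opentelemetry.sdk.trace import TracerProvider
#   from opentelemetry.sdk.trace.export import (SimpleSpanProcessor,
#                                               ConsoleSpanExporter)
#   from openinference.instrumentation.openai import OpenAIInstrumentor
#
#   provider = TracerProvider()
#   provider.add_span_processor(SimpleSpanProcessor(ConsoleSpanExporter()))
#   OpenAIInstrumentor().instrument(tracer_provider=provider)
```

After `instrument()` is called, subsequent LLM calls emit spans automatically; no per-call changes are needed.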
llama.cpp is my go-to C/C++ library for efficiently running LLaMA and other LLMs locally on consumer hardware, especially for Mac users, where it shines on Apple Silicon. It really democratizes local AI inference.
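A sketch of a typical local invocation, assuming a llama.cpp build that provides the `llama-cli` binary and a locally downloaded GGUF model (both paths below are placeholders); the command is only assembled and printed here, not executed:

```python
import shlex

# Placeholder path to a quantized GGUF model file (assumption):
model = "./models/llama-3-8b-instruct.Q4_K_M.gguf"

# llama.cpp's CLI: -m selects the model, -p gives the prompt,
# -n caps the number of tokens to generate.
cmd = ["./llama-cli", "-m", model,
       "-p", "Explain quantization in one sentence.",
       "-n", "128"]
print(shlex.join(cmd))

# To actually run it: subprocess.run(cmd, check=True)
```

On Apple Silicon, builds enable the Metal backend by default, so this same command offloads to the GPU with no extra flags.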
vLLM is a hosting framework designed for fast, efficient LLM inference and serving. Features like continuous batching and automatic prefix caching significantly improve throughput for both online and offline workloads.
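A hedged sketch of why automatic prefix caching helps: the vLLM calls are commented out because running them requires `pip install vllm` and a GPU, and the "ACME" prompt text is a made-up placeholder. The executable part just builds requests that share a long prefix, whose KV cache vLLM can reuse instead of recomputing:

```python
# Requests that share a long system/instruction prefix (placeholder text):
shared_prefix = ("You are a support assistant for ACME Corp. "
                 "Answer briefly and politely.\n\nQuestion: ")
prompts = [shared_prefix + q for q in (
    "How do I reset my password?",
    "Where can I download my invoice?",
)]

# With vLLM installed, the shared prefix is computed once and cached:
#
#   from vllm import LLM, SamplingParams
#   llm = LLM(model="meta-llama/Meta-Llama-3-8B-Instruct",
#             enable_prefix_caching=True)
#   outputs = llm.generate(prompts, SamplingParams(max_tokens=64))
#   for out in outputs:
#       print(out.outputs[0].text)
```

Continuous batching then keeps the GPU busy by admitting new requests as running ones finish, instead of waiting for a whole batch to drain.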