unsubbed.co

AI Inference Platforms for Self-Hosted Tools

Run AI models locally or in the cloud. These platforms power self-hosted AI tools like chatbots, coding assistants, and document search.

13 local / self-hosted tools · 15 cloud providers

Local AI / Self-Hosted

Run AI models on your own hardware. Full data privacy, no API costs, no internet required.

Ollama

Run large language models locally with a single binary -- the easiest way to get started

Open Source 20 tools →
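As a quick illustration of the "single binary" workflow: once the Ollama daemon is running (`ollama serve`, or `ollama run <model>` to pull a model and chat), it listens on localhost:11434 by default. A minimal sketch of calling its REST API from Python, using only the standard library; the model name is just an example and must already be pulled:

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/generate"  # Ollama's default local endpoint

def build_request(model: str, prompt: str) -> dict:
    # /api/generate takes a JSON body; stream=False returns a single JSON object
    # instead of a stream of partial responses
    return {"model": model, "prompt": prompt, "stream": False}

def ask(model: str, prompt: str) -> str:
    body = json.dumps(build_request(model, prompt)).encode()
    req = urllib.request.Request(
        OLLAMA_URL, data=body, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req) as resp:
        # the generated text lives in the "response" field
        return json.loads(resp.read())["response"]

# Example (requires a running Ollama daemon and a pulled model):
# print(ask("llama3.2", "Why is the sky blue?"))
```

Because everything stays on localhost, no data leaves the machine and there is no per-token bill.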

Open WebUI

Self-hosted ChatGPT-like interface for Ollama and OpenAI-compatible APIs

Open Source 20 tools →

llama.cpp

Highly optimized C/C++ inference engine for LLMs -- the foundation most local AI tools build on

Open Source 20 tools →

GPT4All

Desktop application for running LLMs locally with document chat and a friendly GUI

Open Source 20 tools →

vLLM

High-throughput LLM serving engine -- the performance king for production deployments

Open Source 20 tools →

AnythingLLM

All-in-one AI chatbot framework for building local AI agents that interact with your data

Open Source 20 tools →

text-generation-webui

Gradio-based web UI for running large language models with advanced parameter tuning

Open Source 20 tools →

LocalAI

Universal API hub that routes requests to multiple AI backends through OpenAI-compatible endpoints

Open Source 20 tools →
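Several of the tools in this list (LocalAI, Ollama, LM Studio, vLLM) expose OpenAI-compatible endpoints, which is what makes them interchangeable: one client works against all of them. A minimal sketch using only the standard library; the base URL, port, and model name are placeholders for whatever backend you actually run:

```python
import json
import urllib.request

# Point this at your backend: LocalAI defaults to port 8080, Ollama to 11434,
# LM Studio to 1234 -- adjust to your setup.
BASE_URL = "http://localhost:8080/v1"

def chat_payload(model: str, user_message: str) -> dict:
    # the standard OpenAI chat-completions request shape
    return {"model": model, "messages": [{"role": "user", "content": user_message}]}

def chat(model: str, user_message: str, api_key: str = "not-needed-locally") -> str:
    req = urllib.request.Request(
        f"{BASE_URL}/chat/completions",
        data=json.dumps(chat_payload(model, user_message)).encode(),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",  # local servers usually ignore this
        },
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["choices"][0]["message"]["content"]
```

Swapping backends then means changing only `BASE_URL` and the model name, not the client code.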

Jan

Open-source desktop alternative to ChatGPT with agentic workflows and project workspaces

Open Source 20 tools →

LibreChat

Polished ChatGPT-style web UI that unifies multiple AI backends with enterprise features

Open Source 20 tools →

KoboldCpp

Lightweight standalone app for running GGUF models locally, especially popular for roleplay/fiction

Open Source 20 tools →

LM Studio

Desktop app for discovering, downloading, and running local LLMs with a polished interface

Free 20 tools →

LobeChat

Self-hosted ChatGPT alternative with a notably polished UI and a plugin ecosystem

Open Source 20 tools →

Cloud AI Inference

Access AI models via API without managing GPU infrastructure. Pay per use or per hour.
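Pay-per-use pricing is usually quoted per million tokens, with separate rates for input (prompt) and output (completion) tokens. A tiny sketch for estimating a request's cost; the prices in the example are hypothetical, not any listed provider's actual rates:

```python
def inference_cost(prompt_tokens: int, completion_tokens: int,
                   input_price_per_m: float, output_price_per_m: float) -> float:
    """Cost in dollars for one request; prices are dollars per 1M tokens
    (check each provider's pricing page for real rates)."""
    return (prompt_tokens * input_price_per_m
            + completion_tokens * output_price_per_m) / 1_000_000

# Hypothetical rates: $0.50/M input, $1.50/M output
# inference_cost(1200, 300, 0.5, 1.5) -> 0.00105
```

Output tokens are typically priced several times higher than input tokens, so long completions dominate the bill.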

Baseten

Model serving platform that exposes models as HTTP endpoints with built-in autoscaling

Paid 20 tools →

Cerebras

AI inference on custom wafer-scale chips delivering 2000+ tokens per second

Paid 20 tools →

DeepInfra

Pay-per-use AI inference with OpenAI-compatible API for easy migration

Paid 20 tools →

fal.ai

Generative AI platform specializing in fast image and video model inference

Freemium 20 tools →

Fireworks AI

Production-grade inference platform with proprietary FireAttention engine for speed and scale

Paid 20 tools →

Groq

Ultra-fast AI inference using custom LPU hardware -- among the highest tokens-per-second in the industry

Freemium 20 tools →

Hugging Face Inference

Inference API for 400K+ models on Hugging Face Hub with serverless and dedicated endpoints

Freemium 20 tools →

Lambda Cloud

GPU cloud built for deep learning with on-demand and reserved NVIDIA GPU instances

Paid 20 tools →

Modal

Serverless cloud for AI/ML with Python-native interface and dynamic GPU scaling

Freemium 20 tools →

Replicate

Cloud platform to run ML models via API without managing infrastructure

Paid 20 tools →

RunPod

GPU cloud for AI inference and training with serverless and dedicated pod options

Paid 20 tools →

SambaNova

Enterprise AI platform with custom RDU chips optimized for high-performance inference

Paid 20 tools →

SiliconFlow

Low-cost AI inference platform with claimed 2.3x speedups and transparent pay-per-use pricing

Paid 20 tools →

Together AI

Cloud platform for running hundreds of open-source AI models with community-driven approach

Paid 20 tools →

Vast.ai

Decentralized GPU marketplace for renting idle compute at the lowest prices

Paid 20 tools →

Need help setting up local AI?

We deploy self-hosted AI infrastructure for businesses. From Ollama on a single server to vLLM clusters with GPU autoscaling.

Visit upready.dev →