Local Deep Research
Self-hosted AI & machine learning tool for AI-powered deep research.
Self-hosted AI deep research, honestly reviewed. No marketing fluff, just what you get when you run it on your own server.
TL;DR
- What it is: Self-hosted AI research assistant that does iterative, multi-source research — think Perplexity Deep Research, but the LLM runs on your hardware and none of your queries leave your network [README].
- Who it’s for: Researchers, analysts, and privacy-conscious founders who need real cited answers from arXiv, PubMed, the web, and their own private document collections — without paying per query or exposing sensitive research topics to cloud APIs [README].
- Cost savings: Perplexity Pro runs $20/mo for research features; OpenAI’s ChatGPT Pro with deep research is $200/mo. Local Deep Research runs on a VPS with an open-source LLM for roughly $6–15/mo in infrastructure, or free on hardware you already own [README][3].
- Key strength: Genuinely broad source coverage — web search via SearXNG, academic databases (arXiv, PubMed), and private document ingestion in one pipeline, with encrypted local storage via SQLCipher [README].
- Key weakness: Setup requires three separate Docker containers (Ollama, SearXNG, LDR itself), and the flagship ~95% SimpleQA accuracy benchmark was tested with GPT-4.1-mini — a cloud model, not a local one. Accuracy drops materially when you swap in a smaller local LLM [README].
What is Local Deep Research
Local Deep Research is an AI-powered research agent you run yourself. You give it a question; it breaks that question into sub-queries, searches multiple sources in parallel, synthesizes what it finds, and returns a report with citations. The whole pipeline — query planning, search, synthesis — is driven by whatever LLM you point it at, whether that’s a local Ollama model or a cloud API like Anthropic or Google [README].
The project describes itself as: “AI research assistant you control. Run locally for privacy, use any LLM and build your own searchable knowledge base. You own your data and see exactly how it works.” That’s a more honest summary than most tools manage. The pitch isn’t “magic AI” — it’s explicit control over every piece of the stack.
What separates it from a simple “chat with your documents” tool is the iterative search loop. It doesn’t just retrieve; it refines. A query about drug interactions, for example, would generate sub-queries hitting PubMed for clinical data, arXiv for preprints, and the web for context — then synthesize across all three with citations [README]. The benchmark claim of ~95% accuracy on the SimpleQA dataset positions it against Perplexity and Gemini in research quality, though that number requires important caveats covered below [README].
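The loop is easier to see as code. Here is a minimal sketch of the plan / search / synthesize cycle; the function and class names are illustrative, not Local Deep Research's actual internals:

```python
from typing import Protocol

class LLM(Protocol):
    def plan_queries(self, question: str, findings: list[str]) -> list[str]: ...
    def synthesize(self, question: str, findings: list[str]) -> str: ...

class Source(Protocol):  # SearXNG web search, arXiv, PubMed, private documents
    def search(self, query: str) -> list[str]: ...

def deep_research(question: str, llm: LLM, sources: list[Source],
                  max_iterations: int = 3) -> str:
    """Iteratively plan sub-queries, search every source, then synthesize."""
    findings: list[str] = []
    for _ in range(max_iterations):
        # Query planning: decompose the question, informed by prior findings
        sub_queries = llm.plan_queries(question, findings)
        if not sub_queries:
            break  # the model judges the current findings sufficient
        for query in sub_queries:
            for source in sources:
                findings.extend(source.search(query))
    # Final pass: turn accumulated findings into a citation-backed report
    return llm.synthesize(question, findings)
```

The refinement step is the point: each iteration's sub-queries are conditioned on what earlier searches returned, which is what separates this from single-shot retrieval.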
As of this review: 4,164 GitHub stars, MIT license, active Discord at discord.gg/ttcqQeFcJ3, a subreddit at r/LocalDeepResearch, and a YouTube channel with setup walkthroughs [README].
Why people choose it
The third-party review landscape for this specific tool is thin — it’s a younger project at 4K stars vs. the category leaders. What’s clear from the GitHub community and general self-hosted AI momentum is that the draw follows the same logic that applies to any privacy-critical tooling: your research questions are often more sensitive than your research answers [3].
When you run Perplexity queries about competitor strategy, patient case studies, M&A targets, or novel research hypotheses, those queries leave your network and are subject to the provider's retention and training policies. With Local Deep Research, the query stays local. The encrypted SQLCipher database means even your research history is protected at rest — something no SaaS research tool offers by default [README].
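To make "encrypted at rest" concrete: a SQLCipher database is ciphertext on disk and unreadable without its key. A minimal illustration using the sqlcipher3 Python bindings; Local Deep Research manages its own database and key handling internally, so this is not its code:

```python
# pip install sqlcipher3-binary
from sqlcipher3 import dbapi2 as sqlcipher

conn = sqlcipher.connect("research.db")
conn.execute("PRAGMA key = 'your-passphrase'")  # must come before any query
conn.execute("CREATE TABLE IF NOT EXISTS history (query TEXT)")
conn.execute("INSERT INTO history VALUES ('sensitive research question')")
conn.commit()

# Opening the same file with plain sqlite3 (or with the wrong key) fails
# with "file is not a database": the bytes on disk are ciphertext.
```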
The second driver is academic source coverage. Perplexity and ChatGPT search the web. Local Deep Research adds arXiv and PubMed as first-class sources, which matters enormously for anyone doing literature reviews, scientific due diligence, or research-heavy content work [README]. No equivalent SaaS tool gives you structured PubMed querying bundled into a general research workflow at this price point.
The third driver is flexibility. You can run it with a local Ollama model for full air-gap privacy, or plug in your Anthropic/Google API key for higher accuracy when the question warrants it [README]. The same interface handles both; switching is a config change. This hybrid posture — local by default, cloud when needed — is something no pure SaaS tool can offer [3].
Features: what it actually does
Core research engine:
- Iterative multi-step research: breaks a question into sub-queries, searches, synthesizes, repeats [README]
- 10+ source integrations: web (via SearXNG), arXiv, PubMed, Wikipedia, and private document collections [README]
- Citation-backed reports — outputs aren’t “the AI thinks” but sourced claims [README]
- Private document search: ingest your own PDFs, notes, or internal docs into a searchable knowledge base [README]
LLM support:
- Local: Ollama (any model it supports — Llama, Mistral, Qwen, etc.) [README]
- Cloud: Google Gemini, Anthropic Claude, OpenAI [README]
- Mix-and-match: use a fast local model for sub-queries, a cloud model for final synthesis [README]
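Inside LDR this split is a configuration choice, but the underlying pattern is worth seeing. A sketch of the hybrid flow, assuming a default Ollama install on port 11434 and the official Anthropic SDK; the model names and the routing itself are illustrative:

```python
import requests
from anthropic import Anthropic  # pip install anthropic

def plan_locally(prompt: str, model: str = "qwen2.5:7b") -> str:
    """Cheap, private sub-query planning via Ollama's local REST API."""
    r = requests.post("http://localhost:11434/api/generate",
                      json={"model": model, "prompt": prompt, "stream": False})
    r.raise_for_status()
    return r.json()["response"]

def synthesize_in_cloud(prompt: str) -> str:
    """Higher-accuracy final synthesis, only when the question warrants it."""
    client = Anthropic()  # reads ANTHROPIC_API_KEY from the environment
    msg = client.messages.create(
        model="claude-sonnet-4-5", max_tokens=2048,
        messages=[{"role": "user", "content": prompt}])
    return msg.content[0].text
```

Sensitive queries never have to touch the cloud path; the synthesis call is opt-in per question.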
Infrastructure and security:
- SQLCipher encrypted database — research history is encrypted at rest, not plain SQLite [README]
- REST API for programmatic access (usage sketch after this list) [README]
- WebSocket support for real-time progress updates [README]
- Rate limiting built in [README]
- Metrics and dashboard for monitoring [README]
- Unraid community app available [README]
- OpenSSF Scorecard, CodeQL, and Semgrep security scanning in CI — more security rigor than most open-source projects this size [README]
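For pipeline work, the REST API is the integration point. The route and port below are placeholders, not documented endpoints; check the project's API docs for the real paths and payload shapes:

```python
import requests

BASE = "http://localhost:5000"  # assumption; match your deployment

# Hypothetical route for illustration only; consult the API docs.
resp = requests.post(f"{BASE}/api/research",
                     json={"query": "CRISPR off-target effects, 2023-2024"})
resp.raise_for_status()
print(resp.json())

# Long-running research jobs could then be followed over the project's
# WebSocket channel for real-time progress, rather than polling.
```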
Deployment:
- Docker (single container), Docker Compose (recommended), pip install [README]
- GPU acceleration via docker-compose.gpu.override.yml [README]
- Persistent volume for data at /data [README]
Pricing: SaaS vs self-hosted math
Local Deep Research:
- Software: $0 (MIT license) [README]
- Infrastructure: $0 if you run on existing hardware; $6–15/mo on a cloud VPS
- LLM: $0 if using Ollama local models; API costs if using cloud LLMs
Perplexity (primary SaaS comparison):
- Free: basic search, limited deep research
- Pro: $20/mo — includes access to Perplexity’s research mode with cited sources
- Enterprise: custom pricing
OpenAI ChatGPT with research:
- Plus: $20/mo (includes some research features)
- Pro: $200/mo — full Deep Research access with extended reasoning
Concrete math for a research-heavy user:
If you’re a founder or analyst running 50–100 deep research queries per month — competitive analysis, technical due diligence, literature synthesis — and you’re on Perplexity Pro at $20/mo, the annual cost is $240. Self-hosting Local Deep Research on a $10/mo VPS costs $120/year, with the caveat that a VPS at that price only has the memory for small (7B-class) Ollama models; running something like Qwen2.5 32B means hardware you own or a pricier GPU instance. With existing hardware, it’s $0/year after setup time.
The real savings case is research volume. SaaS tools either rate-limit deep research or charge per query beyond thresholds. Local Deep Research has no such ceiling — you can run 500 queries in a weekend for the same infrastructure cost as running 5 [README].
Caveat: if you use cloud APIs (Anthropic, Google) for synthesis, those costs apply on top. A heavy user running GPT-4.1-mini for synthesis at scale could spend $5–20/mo in API credits. Still cheaper than Perplexity Pro for volume users; not relevant for pure-local setups.
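Spelled out as arithmetic (figures from above; the API line is the only variable, and it goes to zero in a pure-local setup):

```python
perplexity_pro = 20 * 12      # $240/yr
chatgpt_pro    = 200 * 12     # $2,400/yr

vps          = 10 * 12        # $120/yr infrastructure
api_overhead = 8 * 12         # e.g. ~$8/mo in cloud-LLM credits (hybrid mode)

ldr_local  = vps                  # $120/yr, no per-query ceiling
ldr_hybrid = vps + api_overhead   # $216/yr, still under Perplexity Pro
ldr_own_hw = 0                    # existing hardware; electricity aside
```

The raw dollar gap narrows in hybrid mode, but the structural advantage holds: there is no rate limit on deep research runs.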
Deployment reality check
This is where Local Deep Research is more demanding than most tools in its weight class. The recommended production setup requires three containers running simultaneously [README]:
- Ollama — serves local LLM inference
- SearXNG — provides privacy-respecting web search
- Local Deep Research — the application itself
The Docker Compose path handles all three, and the quick start is genuinely quick if you know Docker. One command pulls and starts everything [README]. A rough sketch of that path (file and container names may differ; the README is authoritative):
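```bash
# Sketch of the Docker Compose path; consult the README for the current
# compose file names and any required environment variables.
git clone https://github.com/learningcircuit/local-deep-research
cd local-deep-research
docker compose up -d   # brings up LDR, Ollama, and SearXNG together

# Pull a model before the first query (container name assumed to be
# "ollama"; check `docker ps`). The README's examples use gpt-oss:20b:
docker exec -it ollama ollama pull gpt-oss:20b
```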
What you need:
- A machine with enough RAM to run a useful local model. Anything below 8GB RAM + 8GB VRAM will force you onto smaller models (7B range) with noticeably lower research quality
- Docker and docker-compose
- An Ollama model pulled — the README examples use gpt-oss:20b, which requires pulling before first use [README]
- A domain + reverse proxy (Caddy/nginx) if you want HTTPS access from outside localhost
- Optional: cloud API keys if you want hybrid local/cloud mode
What can go sideways:
- The 95% SimpleQA benchmark is explicitly tested with GPT-4.1-mini (a cloud API model, not a local one) [README]. Running with a 7B local model will give substantially lower accuracy on complex research tasks. The benchmark headline is real; just don’t expect it to transfer unchanged to a local-model setup.
- SearXNG configuration can be finicky — the default search results depend on which SearXNG instance you run and how it’s configured. A misconfigured SearXNG will quietly degrade research quality.
- “Private documents” search requires separate setup — the README mentions it as a feature, but ingesting documents isn’t zero-configuration.
- No official documentation site is listed. The GitHub README and community Discord are the primary support channels [README].
The pip install option exists for users who don’t want Docker, but that path is less battle-tested for production use [README].
Realistic time estimate: 1–2 hours for a Docker-comfortable user on a fresh machine. A full afternoon including LLM model selection, SearXNG tuning, and getting research quality to an acceptable level. The YouTube channel has walkthroughs that compress this significantly [README].
Pros and cons
Pros
- Genuinely private research. Queries, results, and history stay on your infrastructure. SQLCipher encrypts everything at rest. For sensitive research this isn’t a nice-to-have — it’s the whole point [README].
- Academic source coverage. ArXiv and PubMed as first-class search targets alongside web search. No SaaS research tool at this price gives you structured literature search [README].
- MIT licensed. Clean license, no usage restrictions, no “fair-code” caveats. You can embed it in products, modify it, redistribute it [README].
- Hybrid LLM flexibility. Same tool, same interface — swap between local Ollama and cloud APIs based on query sensitivity or required accuracy [README].
- Serious security posture. OpenSSF Scorecard, CodeQL, Semgrep scanning in CI is a level of rigor you rarely see in a sub-5K-star project [README].
- REST API and WebSocket. Programmatic access for integrating research into pipelines or custom UIs [README].
- Benchmarked accuracy. ~95% on SimpleQA with GPT-4.1-mini is a concrete, reproducible number — not marketing copy [README]. Community benchmark results are in the repository.
Cons
- Three-container setup. Ollama + SearXNG + LDR is a meaningful operational overhead vs. opening a browser tab [README]. For non-technical users, this is a genuine barrier.
- Benchmark asterisk. The headline accuracy number uses a cloud model. Local model accuracy is not benchmarked in the README and will be meaningfully lower depending on model choice [README].
- No SaaS fallback. No managed cloud offering, no hosted tier. If self-hosting is off the table, this tool is off the table.
- Young project. 4,164 stars is growing but not yet in the “this project won’t disappear” confidence zone. No dedicated commercial entity backing it (based on available data).
- Thin third-party documentation. Most setup help lives in Discord and the YouTube channel, not formal docs. When something breaks, you’re debugging with community support.
- Private doc ingestion isn’t turnkey. It’s listed as a feature but requires more setup than the core research workflow [README].
- Resource-hungry for quality. Useful local models (20B+ parameters) need real hardware — 16GB+ RAM, dedicated VRAM. Budget deployments running 7B models will produce noticeably weaker research [README][3].
Who should use this / who shouldn’t
Use Local Deep Research if:
- You do research on sensitive topics where query privacy matters — competitor intelligence, medical research, legal analysis, proprietary technical due diligence.
- You’re a researcher or analyst regularly pulling from academic databases and want arXiv/PubMed integrated into your research workflow without per-query cost.
- You have the hardware (16GB+ RAM recommended) and comfort with Docker to deploy and maintain it.
- You’re already running Ollama for local LLM inference and want a research layer on top.
- You want to build research into a pipeline via REST API.
Skip it if:
- You’re a non-technical founder who just wants answers fast. Perplexity Pro at $20/mo costs less than the afternoon you’ll spend on setup.
- Your research questions aren’t sensitive. If privacy isn’t a factor, the SaaS tools are faster and require zero maintenance.
- You’re on constrained hardware. A 4GB RAM VPS will produce research quality that disappoints.
- You need a collaborative research tool for a team — no multi-user access management is described in available documentation.
Alternatives worth considering
- Perplexity Pro ($20/mo) — the obvious SaaS comparison. Better out-of-the-box experience, web-focused, no private documents, no air-gap option. Right answer if setup friction matters more than privacy.
- OpenAI ChatGPT Pro with Deep Research ($200/mo) — highest quality research synthesis available, but expensive and fully cloud-dependent. Overkill for most use cases.
- GPT Researcher (open source) — similar concept, Python-based, more developer-facing. Less polished UI but well-established in the open-source research agent space.
- Storm (Stanford OVAL) — open-source tool for generating Wikipedia-style long-form research articles. More focused on article generation than iterative research.
- Ollama + Open WebUI — if you want local LLM chat with document upload but don’t need the structured multi-source research pipeline. Simpler setup, less research-specific.
- AnythingLLM — local RAG over private documents with web search. Broader tool, less research-optimized.
The practical choice for most users is Local Deep Research vs. Perplexity Pro. Pick LDR if privacy and academic sources matter. Pick Perplexity if you want results in 30 seconds without a terminal.
Bottom line
Local Deep Research occupies a real gap: between “cloud AI that sees your queries” and “local chatbot with document upload.” What it actually delivers is a structured research pipeline — multi-source, citation-backed, configurable between local and cloud inference — that the SaaS tools won’t touch because their business model is the cloud. The tradeoffs are honest: you’re managing three Docker containers, you need real hardware for real quality, and the stellar accuracy benchmarks use a cloud model that isn’t free. For a privacy-conscious researcher who already runs Ollama or who works with sensitive materials that can’t touch cloud APIs, this is one of the more compelling self-hosted projects in the AI category right now. For everyone else, Perplexity at $20/mo is a better use of an afternoon.
If the Docker setup is what’s blocking you, that’s exactly the kind of deployment that upready.dev handles for clients — one-time, you own it.
Sources
Primary sources:
- GitHub repository and README: https://github.com/learningcircuit/local-deep-research (4,164 stars, MIT license)
- Docker Hub: https://hub.docker.com/r/localdeepresearch/local-deep-research
- Community: https://discord.gg/ttcqQeFcJ3 · https://www.reddit.com/r/LocalDeepResearch/
Referenced articles:
- [3] Tim Trott, LoneWolfOnline — “Complete Guide to Self-Hosting AI Models Locally on Your Hardware” (October 14, 2024). https://lonewolfonline.net/diy-projects/self-hosting-ai-models-locally/