I Moved My Entire AI Stack Off the Cloud Today. Here’s Why — and How.

ByDevin April 4, 2026April 19, 2026

There is a quiet shift happening in how serious AI builders operate. Not the hype cycles. Not the press releases. The actual builders — the ones running automations, building agents, shipping personal tools — are moving their stacks off the cloud and onto their own machines.

Today I did it too. Here is what changed, why it matters, and exactly how I did it.

The problem with cloud-only AI

I Moved My Entire AI Stack Off the Cloud Today. Here’s Why — and How.

I have been running BruBot — my personal AI agent — on a Hetzner VPS for months. 339 Python scripts, 20 cron jobs, Telegram interface, morning briefs, portfolio tracking, outreach automation. Powerful. Never sleeps.

But it has a fundamental problem: every single thing it does goes through someone else server.

$0.04 → $0.00 | ~1.5s
per morning brief (before → after) · local response time

Step 1: Install Ollama (2 minutes)

Ollama is the simplest way to run open-source LLMs locally. It runs a local API server on localhost:11434 with the same REST interface as OpenAI — meaning any tool built for OpenAI works with Ollama, zero code changes.

# macOS install
brew install ollama

# Start the server
ollama serve

Step 2: Pull Gemma 3

Google Gemma 3 is the sleeper pick right now. Fast, sharp, genuinely good at reasoning — and it runs on a MacBook Pro without breaking a sweat.

ollama pull gemma3:4b
ollama pull gemma3:12b
ollama run gemma3:4b "What should I focus on if I lead an SDR team?"

Response time: under 2 seconds. No API key. No billing. No rate limits.

Step 3: Wire it into your existing stack

# Before
client = openai.OpenAI(api_key=OPENAI_KEY)

# After
client = openai.OpenAI(base_url="http://localhost:11434/v1", api_key="ollama")

What I actually use Gemma for

Task	Model	Why
Morning brief coaching	Gemma 3:4b	Fast, free, local
Complex reasoning	Claude Sonnet	Worth the API cost

The full local stack

~/.openclaw/workspace/   ← 339 scripts
localhost:11434          ← Ollama + Gemma 3
localhost:3030           ← ScreenPipe OCR

Why this matters

⚡ Speed: ~1.5s response vs 4–8s for GPT-4o

🔒 Privacy: Health, portfolio, calendar — none routes through OpenAI

🟢 Reliability: No rate limits. No outages.

💰 Cost: /bin/zsh per morning brief vs /bin/zsh.04

The surprise

Gemma 3 is genuinely good. Not good for a local model. Just good. The coaching insights are sharp. The SDR strategy prompts are solid. I expected to notice the quality drop. I did not.

10-minute quickstart

brew install ollama
ollama serve
ollama pull gemma3:4b
Swap base_url to http://localhost:11434/v1

Ollama is installed. Gemma is running. My morning brief costs exactly $0 in AI API calls. That feels good.

Uncategorized

5.4 and Tired: The Day Faith Held But Everything Else Broke
ByDevin May 28, 2026June 6, 2026

5.4 and Tired: The Day Faith Held But Everything Else Broke Thursday, May 28 · Tel Aviv · A systems failure disguised as a normal day What the Numbers Actually Said The Life OS dashboard came back with 5.4/10 this morning, and the mood tag said “tired.” I read it and didn’t argue. Some days…

Read More 5.4 and Tired: The Day Faith Held But Everything Else Broke
Uncategorized

4.4 and Fog: When the Morning Systems Fail, Everything Else Collapses
ByDevin May 21, 2026May 21, 2026

The Numbers Don’t Lie (Even When You Wish They Would) 4.4 out of 10. That’s not a day that’s bad enough to be a disaster story, and not good enough to feel like I made progress. It’s the score of a day that felt like operating underwater—everything took twice as long, decisions felt heavier, and…

Read More 4.4 and Fog: When the Morning Systems Fail, Everything Else Collapses
Uncategorized

7.8 and the Sunday Paradox: When Your Habits Score Perfect but Your Systems Are Broken
ByDevin May 24, 2026June 6, 2026

The Numbers First 7.8/10. Focused. That’s what the dashboard says about today, and it’s technically not wrong—but it’s also incomplete in a way that honest scoring should never be. Here’s what the Life OS actually captured: habits=10 (perfect sweep, 12/12 completion), health=9 (HIIT session, 40 minutes, peak HR 159), faith=8 (Tefillin done, the spiritual routine…

Read More 7.8 and the Sunday Paradox: When Your Habits Score Perfect but Your Systems Are Broken
OpenClaw (Bru Bot)

BruBot Daily Report — Tuesday, March 10, 2026
ByDevin March 10, 2026April 19, 2026

🔧 Infrastructure Achievements ✅ Google Workspace Unified OAuth Consolidated 5 separate OAuth integrations (Gmail, Drive, Sheets, Calendar, Photos) into single Application Default Credentials (ADC) with permanent refresh token. All 7 Google services verified working via live API calls. Planned 80% complexity reduction across all Google integrations. ✅ Comprehensive Skill Health Audit — LIVE Built live…

Read More BruBot Daily Report — Tuesday, March 10, 2026
Uncategorized

How I Connected with 301 People at One Company in an Afternoon (And What It Means for Outbound)
ByDevin March 12, 2026March 16, 2026

A congratulations connection request is not cold outreach. Here is how loading 301 profiles into Meet Alfred after a funding announcement changed the way I think about outbound forever.

Read More How I Connected with 301 People at One Company in an Afternoon (And What It Means for Outbound)
Uncategorized

When Banking Apps Steal Your Productivity
ByDevin April 7, 2026April 19, 2026

Today’s screen time tells a story: one frozen card, multiple messaging platforms, and zero actual coding despite opening Claude three times. Sometimes the best productivity hack is knowing when you’re not being productive at all. The Banking App Rabbit Hole I spent 128 frames wrestling with Comet, Isracard’s mobile banking app. A card freeze that…

Read More When Banking Apps Steal Your Productivity