Self-hosted AI vs. cloud vs. dedicated GPU: a real cost comparison for small businesses

Every week, a client asks me the same question: “Should we run AI on our own hardware or just use ChatGPT?”

The answer is always “it depends,” which is the most unhelpful sentence in consulting. So here’s my attempt to make it actually helpful — with real hardware prices, real API bills, and a framework you can use to decide in 15 minutes.

The three options

There are really only three ways to run AI for your business. Everything else is a variation.

Option 1: Cloud AI providers. You send your data to OpenAI, Anthropic, or Google. They process it on their servers and send back a response. Cheapest to start. Easiest to set up. Your data leaves your building.

Option 2: A Mac Mini (or similar) in your office. You run an open-source AI model on a machine you physically own. Your data never leaves. More limited in capability, but for most business tasks, it’s plenty.

Option 3: A dedicated GPU server in a data center. You rent serious hardware from a European hosting provider like Hetzner. More powerful than a Mac Mini, your data stays in the EU, but it costs more and needs someone to maintain it.

Each one makes sense for different businesses. None of them is universally best.

Option 1: Cloud AI — the numbers

Cloud AI pricing is per token (a token is roughly three-quarters of an English word), billed separately for input and output. Here’s what the major providers charge, in US dollars, as of March 2026:

Anthropic (Claude):

Haiku 4.5 (fast, good for simple tasks): $1 / $5 per million tokens in/out
Sonnet 4.6 (balanced): $3 / $15 per million tokens
Opus 4.6 (most capable): $5 / $25 per million tokens

OpenAI (GPT):

GPT-5.2 (workhorse): $1.75 / $7 per million tokens
GPT-5.4 (latest): $2.50 / $15 per million tokens
GPT-4o Mini (budget): $0.15 / $0.60 per million tokens

What does that mean in real money? For a typical small business running AI agents — email triage, document summaries, CRM updates, maybe some customer communication — you’re looking at EUR 20-80/month in API costs. Heavy users (lots of long documents, frequent queries) might hit EUR 150.
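To see how per-token prices turn into a monthly bill, here is a rough sketch. The prices are the ones listed above, in USD. The workload (200 tasks per working day, 3,000 tokens in and 500 tokens out per task) is a made-up assumption for illustration, not a measurement, and the dictionary keys are just labels I picked, not official API model IDs. No USD-to-EUR conversion is applied.

```python
# USD per million tokens as (input, output), copied from the price list above.
# The keys are informal labels, not official API model identifiers.
PRICES = {
    "haiku-4.5": (1.00, 5.00),
    "sonnet-4.6": (3.00, 15.00),
    "opus-4.6": (5.00, 25.00),
    "gpt-5.2": (1.75, 7.00),
    "gpt-5.4": (2.50, 15.00),
    "gpt-4o-mini": (0.15, 0.60),
}


def monthly_api_cost(model, input_tokens, output_tokens):
    """Return the USD cost of one month's tokens on the given model."""
    price_in, price_out = PRICES[model]
    return (input_tokens / 1_000_000) * price_in + (output_tokens / 1_000_000) * price_out


# Hypothetical workload: 200 tasks per working day, 22 working days a month,
# about 3,000 input tokens and 500 output tokens per task.
tasks_per_month = 200 * 22
tokens_in = tasks_per_month * 3_000
tokens_out = tasks_per_month * 500

for model in PRICES:
    print(f"{model}: ${monthly_api_cost(model, tokens_in, tokens_out):.2f}")
```

Swap in your own task counts and token sizes. Output tokens cost several times more than input tokens, so long generated answers move the bill more than long inputs do.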

The cost has dropped dramatically. A year ago, the same workload would’ve cost 3-5x more. The trend is clear and it’s not reversing.

What you give up: Your data travels to US servers. Every client email, every invoice, every document you feed the AI — it’s processed on someone else’s infrastructure. I wrote about what that means for GDPR in a separate post, but the short version: you need a Data Processing Agreement, a transfer impact assessment, and documentation. Most small businesses have none of these.

What you get: Access to the best models available. Claude Opus and GPT-5.4 are genuinely better at complex reasoning, nuanced writing, and creative problem-solving than anything you can run locally. Zero hardware investment. You can start this afternoon.

Option 2: Mac Mini in your office — the numbers

Apple’s Mac Mini M4 Pro is the most popular choice for local AI, and for good reason. It’s small, silent, energy-efficient, and surprisingly powerful for inference.

Hardware cost: EUR 1,470-1,700 for the M4 Pro (24GB unified memory). That’s your only upfront cost. If you need to handle larger models, the 48GB configuration runs about EUR 2,200.

Running cost: Around EUR 8-12/month in electricity. The M4 Pro draws 40-65 watts under AI workloads. In most of Europe, that’s nearly nothing on your power bill.
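If you want to check the electricity figure against your own tariff, the arithmetic is short. This sketch assumes the machine runs at the quoted 40-65 watts around the clock and uses EUR 0.25 per kWh, which is my assumption, not a number from any specific utility.

```python
def monthly_electricity_cost(watts, eur_per_kwh, hours_per_day=24.0, days=30):
    """Convert a steady power draw into a monthly electricity cost in EUR."""
    kwh = watts / 1000 * hours_per_day * days
    return kwh * eur_per_kwh


TARIFF_EUR_PER_KWH = 0.25  # assumption: replace with the rate on your own bill

low = monthly_electricity_cost(40, TARIFF_EUR_PER_KWH)
high = monthly_electricity_cost(65, TARIFF_EUR_PER_KWH)
print(f"EUR {low:.2f} to EUR {high:.2f} per month")  # EUR 7.20 to EUR 11.70 per month
```

In practice the machine idles well below that draw for much of the day, so treat this as an upper estimate for a given tariff.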

What models can you run? With 24GB of memory, you can comfortably run Qwen 3.5 (7-14B parameter versions), Llama 3.3, or DeepSeek V3 in smaller configurations. These models handle email processing, document summaries, data extraction, and basic customer communication well. They won’t write poetry as well as Claude Opus, but they’ll sort your invoices just fine.

With 48GB, you can run larger models — Qwen 3.5 72B quantized, for example — that start approaching cloud model quality for most business tasks.
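A quick way to sanity-check whether a model fits is a back-of-the-envelope memory estimate. The sketch below uses a common rule of thumb (parameters times bits per weight, plus a margin for context cache and runtime). The 20% margin is my assumption, and the operating system also needs memory of its own, so read a narrow "fits" as "tight".

```python
def model_memory_gb(params_billion, bits_per_weight=4, overhead=0.2):
    """Rough memory need in GB: quantized weights plus a margin for cache and runtime."""
    weights_gb = params_billion * bits_per_weight / 8
    return weights_gb * (1 + overhead)


def fits(params_billion, memory_gb, bits_per_weight=4):
    """True if the rough estimate is within the given memory budget."""
    return model_memory_gb(params_billion, bits_per_weight) <= memory_gb


for params, memory in [(14, 24), (72, 24), (72, 48)]:
    need = model_memory_gb(params)
    verdict = "fits" if fits(params, memory) else "does not fit"
    print(f"{params}B at 4-bit needs about {need:.1f} GB: {verdict} in {memory} GB")
```

By this estimate a 14B model leaves plenty of headroom in 24GB, while a 4-bit 72B model only squeezes into the 48GB configuration.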

What you give up: The best models won’t fit. You’re running smaller, quantized versions of open-source models. For 90% of business automation — email triage, invoice processing, scheduling, data extraction — they’re good enough. For tasks that require serious reasoning or working with very long documents, they’ll stumble where cloud models won’t.

You also need someone to set it up and occasionally maintain it. It’s not plug-and-play. But once configured, a Mac Mini running OpenClaw with a local model is remarkably stable.

What you get: Total data sovereignty. Your client data, financial records, employee information — none of it ever touches the internet. For legal firms, healthcare practices, and accounting firms handling sensitive client data, this isn’t a nice-to-have. It’s the difference between sleeping well and hoping nobody audits your GDPR compliance.

Break-even vs. cloud: Taking EUR 1,500 as the hardware price, a cloud API bill of EUR 50/month means the Mac Mini pays for itself in about 30 months. At EUR 100/month in cloud costs, you break even in 15 months. Factor in the roughly EUR 10/month for electricity and those numbers stretch to about 38 and 17 months. After that, you’re running at essentially EUR 10/month in electricity. The math gets better every month.
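Here is the break-even arithmetic as a small sketch you can rerun with your own numbers. It assumes EUR 1,500 for the hardware and EUR 10/month for electricity, and it ignores setup time and maintenance, which are real costs too.

```python
import math


def break_even_months(hardware_eur, cloud_eur_per_month, electricity_eur_per_month=0.0):
    """Months until the hardware is paid off by the cloud bill it replaces.

    Returns None if running the hardware costs as much as the cloud bill,
    in which case it never pays for itself.
    """
    monthly_saving = cloud_eur_per_month - electricity_eur_per_month
    if monthly_saving <= 0:
        return None
    return math.ceil(hardware_eur / monthly_saving)


for cloud_bill in (50, 100):
    simple = break_even_months(1500, cloud_bill)
    with_power = break_even_months(1500, cloud_bill, electricity_eur_per_month=10)
    print(f"EUR {cloud_bill}/month cloud bill: {simple} months, {with_power} with electricity")
```

At a EUR 20/month cloud bill the payback stretches past twelve years, which is why the privacy argument, not the cost argument, usually decides it for low-volume users.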

Option 3: Dedicated GPU server — the numbers

If you need more power than a Mac Mini but don’t want data leaving the EU, a dedicated GPU server is the middle ground.

Hetzner GEX44 (Germany, NVIDIA RTX 4000 SFF Ada, 20GB VRAM, 64GB RAM):

EUR 213/month starting April 2026 (up from EUR 182, a price hike of about 17%)
EUR 79 one-time setup fee
Data centers in Germany and Finland

That’s the entry point. For heavier AI workloads — larger models, higher throughput, multiple concurrent users — you’re looking at EUR 300-500/month for servers with beefier GPUs.

What you can run: Larger models than a 24GB Mac Mini can hold. With 20GB of VRAM plus 64GB of system RAM, you can load quantized 70B+ parameter models by splitting them between GPU and CPU, though whatever spills out of VRAM runs noticeably slower than a model that fits entirely on the GPU. That puts you in territory where the quality gap between local and cloud models shrinks considerably. You can also handle more concurrent requests, which is useful if multiple team members are using the AI simultaneously.

What you give up: It’s the most expensive option monthly. EUR 213/month is more than most small businesses spend on cloud AI APIs. And unlike the Mac Mini, you don’t own the hardware — if you cancel, you’ve got nothing to show for the payments.

You also need more technical knowledge to manage it. Server administration, model updates, security patching. This isn’t something a non-technical business owner should manage directly.
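Renting versus owning is easier to judge as a running total. This sketch adds up each option over one, two, and three years. The Mac Mini and Hetzner figures come from the sections above. The EUR 50/month cloud bill is an assumption taken from the middle of the EUR 20-80 range, and none of the totals include the cost of someone's time for setup and maintenance.

```python
def cumulative_cost(upfront_eur, monthly_eur, months):
    """Total spend in EUR after the given number of months."""
    return upfront_eur + monthly_eur * months


# (upfront EUR, monthly EUR) per option.
OPTIONS = {
    "cloud": (0, 50),         # assumption: mid-range API bill
    "mac_mini": (1500, 10),   # hardware plus electricity
    "gpu_server": (79, 213),  # GEX44 setup fee plus monthly rent
}

for months in (12, 24, 36):
    totals = {
        name: cumulative_cost(upfront, monthly, months)
        for name, (upfront, monthly) in OPTIONS.items()
    }
    print(months, "months:", totals)
```

With these inputs the Mac Mini and a EUR 50/month cloud bill end up close to each other after three years, while the rented server costs roughly four times as much over the same period.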

What you get: EU-hosted, powerful AI that you control. Hetzner’s German data centers mean your data stays in the EU by default. No international transfers, no DPA gymnastics with US companies. And enough horsepower to run models that rival cloud quality for business tasks.

Who it’s for: Businesses that need privacy AND capability. If you’re processing hundreds of documents daily, serving multiple locations, or running complex AI workflows that a Mac Mini can’t handle — but you can’t accept the GDPR exposure of cloud providers. I see this most with property management companies handling tenant data across multiple buildings, and multi-location dental or medical practices.

The decision framework

Here’s how I walk clients through this. Takes about 15 minutes.

Question 1: What data are you feeding the AI?

If it’s internal documents, marketing copy, public information — cloud is fine. Cheapest, easiest, best models. Done.

If it’s client financial records, medical data, legal documents, personal employee data — you need local or EU-hosted. The GDPR exposure of sending that to US servers isn’t worth the savings.

If you’re not sure what data will flow through the system (most businesses aren’t, at first) — start with cloud for non-sensitive tasks, and add local processing later for sensitive ones. Hybrid setups are more common than pure anything.

Question 2: How much volume?

Less than EUR 100/month in cloud API costs? Stick with cloud or get a Mac Mini if privacy matters.

EUR 100-300/month in equivalent workload? A Mac Mini starts making financial sense even without the privacy angle.

EUR 300+/month? A dedicated GPU server might be cheaper than cloud APIs and gives you EU hosting as a bonus.

Question 3: Who’s maintaining this?

If nobody on your team is technical and you don’t want to pay a consultant for ongoing support — cloud AI is the lowest-maintenance option by a wide margin. The provider handles everything.

A Mac Mini with OpenClaw needs occasional attention — model updates, troubleshooting, making sure the thing is still running after a power outage. Maybe 1-2 hours per month if things are stable.

A dedicated GPU server needs real administration. If you don’t have an IT person or a consultant on retainer, this isn’t for you.

Question 4: What’s your timeline?

Need AI running by next week? Cloud. You can sign up for an API key and have an agent running the same day.

Can wait 1-2 weeks? Mac Mini. Order the hardware, configure it, test it.

Can plan for a month? Dedicated server. Provision, set up, migrate workloads gradually.
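The four questions can be sketched as a simple elimination: start with all three options and let each answer strike out the ones that no longer fit. This is my own simplification of the walkthrough above, not a formal rule set. The thresholds are the ones from the questions, and the `support` labels ("none", "occasional", "dedicated") are names I made up for the three maintenance situations.

```python
def shortlist(sensitive_data, monthly_volume_eur, support, weeks_available):
    """Return the options that survive all four questions, sorted by name.

    support: "none" (nobody technical, no consultant),
             "occasional" (someone can spare an hour or two a month),
             "dedicated" (IT person or consultant on retainer).
    """
    options = {"cloud", "mac_mini", "gpu_server"}

    # Question 1: sensitive data rules out US cloud providers.
    if sensitive_data:
        options.discard("cloud")

    # Question 2: below EUR 300/month the rented server is hard to justify.
    if monthly_volume_eur < 300:
        options.discard("gpu_server")

    # Question 3: match the option to the maintenance you can provide.
    if support == "none":
        options &= {"cloud"}
    elif support == "occasional":
        options.discard("gpu_server")

    # Question 4: less than a week means cloud, less than a month means no server.
    if weeks_available < 1:
        options &= {"cloud"}
    elif weeks_available < 4:
        options.discard("gpu_server")

    return sorted(options)


print(shortlist(False, 50, "none", 0.5))       # ['cloud']
print(shortlist(True, 80, "occasional", 2))    # ['mac_mini']
print(shortlist(True, 400, "dedicated", 6))    # ['gpu_server', 'mac_mini']
```

An empty list means the answers conflict, for example sensitive data with nobody to maintain anything. That is the case where a conversation is worth more than a flowchart.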

What most of my clients actually choose

About 60% go with cloud AI. It’s simple, it’s cheap, and their data isn’t sensitive enough to justify the complexity of self-hosting. They get a DPA from their AI provider, document the data flows, and move on.

About 30% go with a Mac Mini. These are almost always businesses handling sensitive client data — accountants, lawyers, healthcare providers. They want to look their clients in the eye and say “your data never left our office.” That’s worth EUR 1,500 in hardware.

About 10% need the dedicated server. Multiple locations, high volume, or both. They need the power AND the privacy, and they have the budget for it.

Nobody is wrong. The right answer is the one that matches your data sensitivity, your budget, and your team’s technical comfort.

One more thing

These three options aren’t mutually exclusive. The setup I recommend most often is actually a hybrid: cloud AI for general tasks (drafting emails, research, scheduling), local AI on a Mac Mini for anything touching sensitive client data.

You get the best models for everyday work and total privacy for the stuff that matters. Total cost: about EUR 1,500 upfront for the Mac Mini, plus EUR 30-80/month in cloud API fees and roughly EUR 10/month in electricity. That’s less than one part-time employee and covers automation that would otherwise take 20+ hours a week.
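In a hybrid setup, the piece that matters most is the rule deciding which request goes where. A minimal sketch of that rule, assuming each task is tagged with a data category before it reaches a model (the category names are hypothetical labels based on the examples in Question 1):

```python
# Categories treated as safe for cloud processing. Hypothetical labels.
CLOUD_OK = {"internal_document", "marketing_copy", "public_information"}


def choose_backend(data_category):
    """Send only known non-sensitive categories to the cloud, everything else local."""
    return "cloud" if data_category in CLOUD_OK else "local"


for category in ("marketing_copy", "client_financial_record", "something_new"):
    print(category, "->", choose_backend(category))
```

The design choice here is that anything unrecognized stays on the local machine. A mislabeled marketing draft costs you a slightly worse answer, while a mislabeled client file sent to the cloud costs you a GDPR problem.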

If you’re trying to figure out which setup fits your business, get in touch. I’ll walk you through the decision in a 30-minute call — and if cloud AI with no self-hosting is the right answer, I’ll tell you that and save you the consulting fee.