What happens to your data when you use ChatGPT (and why you should care)
Last month, a client told me their team had been pasting client contracts into ChatGPT to get summaries. For six months. Nobody had thought about where that data was going.
They’re not unusual. A 2025 Cyberhaven report found that 11% of the data employees paste into ChatGPT is confidential. At companies with more than 500 employees, that number jumps to 20%. Small businesses are probably worse; they just don’t have the monitoring tools to know it.
What actually happens to your data
When you type something into ChatGPT’s web interface, your text travels from your computer to OpenAI’s servers. Those servers are in the US. Your data is processed there, and a response is generated.
Here’s what OpenAI’s terms say (simplified):
- They may use your conversations to improve their models, unless you opt out
- They retain your data for up to 30 days, even after you delete your account
- They share data with “service providers” and can disclose it in response to legal requests
- Their API product has different (better) terms than the consumer product
Google’s Gemini, Anthropic’s Claude, and Microsoft’s Copilot have similar setups with their own variations. The details differ but the structure is the same: your data goes to their servers, they process it, and they decide what happens to it after that.
Most people don’t read these terms. I don’t blame them — they’re long and deliberately vague in key places.
The GDPR problem nobody talks about
If you’re a European business, every time you paste clients’ personal data into a US-based AI service, you’re making an international data transfer under GDPR. That requires:
- A valid legal basis for the transfer
- A Data Processing Agreement with the AI provider
- A Transfer Impact Assessment
- Documentation in your records of processing
How many SMBs do this? Essentially none that I’ve seen. The GDPR was adopted in 2016 and has been enforceable since May 2018, and most small businesses still treat cloud AI tools like they’re just another website. They’re not. The providers behind them are data processors acting on your behalf.
GDPR fines hit EUR 5.65 billion through 2025 and enforcement is accelerating. The DPAs across Europe are paying more attention to AI-related data flows specifically. Italy temporarily banned ChatGPT in 2023 over GDPR concerns. That was a warning shot.
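If you want to keep track of this per tool, the four requirements listed above fit into a very small data structure. This is a rough sketch, not legal advice: the field names and the `AiToolRecord` class are my own invention for illustration, and the tool name in the example is a placeholder.

```python
from dataclasses import dataclass

# The four requirements from the list above, as field names (hypothetical naming).
REQUIREMENTS = (
    "transfer_legal_basis",
    "data_processing_agreement",
    "transfer_impact_assessment",
    "record_of_processing",
)


@dataclass
class AiToolRecord:
    """One AI service your team uses, and which paperwork is in place."""
    name: str
    transfer_legal_basis: bool = False
    data_processing_agreement: bool = False
    transfer_impact_assessment: bool = False
    record_of_processing: bool = False


def missing_items(tool: AiToolRecord) -> list[str]:
    """Return the requirements that are not yet documented for this tool."""
    return [req for req in REQUIREMENTS if not getattr(tool, req)]


if __name__ == "__main__":
    # Placeholder entry: a DPA is signed, nothing else is documented yet.
    tool = AiToolRecord(name="example-cloud-ai", data_processing_agreement=True)
    print(tool.name, "is missing:", missing_items(tool))
```

Anything the function returns is a gap to close before personal data goes into that tool.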
Beyond legal compliance: the practical risks
Legal risk aside, there are practical reasons to worry.
Model training. If your data gets used to train a model, fragments of it could theoretically surface in responses to other users. This has happened — researchers have extracted training data from language models. The probability is low for any specific piece of data, but it’s not zero.
Security breaches. OpenAI disclosed a data breach in March 2023 where some users could see other users’ chat titles and payment information. Cloud services get breached. It’s not a question of if, but when.
Policy changes. Companies change their terms of service regularly. A training opt-out that exists today might be narrowed or removed tomorrow. You’re trusting a company to maintain a privacy posture indefinitely. That’s a bet, not a guarantee.
Employee behavior. Even if you have a policy against pasting sensitive data into AI tools, people do it anyway. The Cyberhaven numbers above tell that story clearly.
What self-hosted AI changes
When an AI model runs on hardware you control — whether that’s a physical machine in your office or a European VPS — the entire picture changes.
With a fully local deployment, your data doesn’t leave your infrastructure. There’s no third-party processor, no international transfer, no terms of service to parse, no policy changes to track. The model runs locally, processes your request locally, and returns the result locally.
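To make "processes your request locally" concrete, here is a minimal sketch of a summarization call that never leaves the machine. It assumes you run an Ollama server on its default port (11434) and have already pulled a model; the model name below is a placeholder, and the local-host check is my own addition, not something the server requires.

```python
import json
import urllib.request
from urllib.parse import urlparse

# Assumption: an Ollama server listening on this machine's default port.
LOCAL_ENDPOINT = "http://localhost:11434/api/generate"
MODEL_NAME = "llama3"  # placeholder; use whichever model you have pulled


def build_request(prompt: str) -> urllib.request.Request:
    """Build the POST request, refusing any endpoint that is not this machine."""
    host = urlparse(LOCAL_ENDPOINT).hostname
    if host not in ("localhost", "127.0.0.1"):
        raise ValueError(f"refusing to send data to non-local host: {host}")
    payload = json.dumps({"model": MODEL_NAME, "prompt": prompt, "stream": False})
    return urllib.request.Request(
        LOCAL_ENDPOINT,
        data=payload.encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def summarize(text: str) -> str:
    """Send the text to the local model and return its reply."""
    request = build_request(f"Summarize this document:\n\n{text}")
    with urllib.request.urlopen(request, timeout=120) as response:
        return json.loads(response.read())["response"]
```

The point of the host check is that a misconfigured endpoint fails loudly instead of quietly shipping a contract to someone else's server.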
Not every workflow needs that level of isolation. If you’re summarizing marketing briefs or drafting internal memos, a cloud AI provider is cheaper and perfectly fine. The key is having the option — and knowing which of your data flows actually need local processing.
For healthcare practices handling patient records or legal firms managing privileged communications, local deployment isn’t a nice-to-have — it’s arguably a professional responsibility. For a restaurant automating reservation confirmations, cloud is probably fine. Match the setup to the sensitivity.
The trade-offs (because there are always trade-offs)
Self-hosted AI isn’t magic. Here’s what you give up:
Setup time. Cloud AI is instant. Self-hosted requires configuration — typically 2-5 days with professional help. I walk through how the setup process works on my site.
Model capability. GPT-4o and Claude are still better at complex reasoning and creative tasks. For daily business automation — email, documents, scheduling, data extraction — open-source models are plenty good. But they won’t write your next marketing campaign as well as the top cloud models.
Maintenance. Someone needs to keep the system running, update models, and troubleshoot issues. With cloud AI, that’s the provider’s job. With self-hosted, it’s yours (or your consultant’s).
Those are real costs. For many businesses, the right answer is a hybrid approach: self-hosted AI for sensitive operations, cloud AI for tasks where privacy doesn’t matter.
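In practice, a hybrid setup comes down to one routing decision per task. A minimal sketch, assuming you have already labeled your workflows by category: the category names below are hypothetical examples taken from this article, and you would replace them with the results of your own audit.

```python
from enum import Enum


class Route(Enum):
    LOCAL = "self-hosted"
    CLOUD = "cloud"


# Hypothetical allowlist: only categories you have explicitly cleared go to cloud.
CLOUD_OK = {"marketing_brief", "internal_memo", "reservation_confirmation"}


def choose_route(category: str) -> Route:
    """Route cleared categories to cloud; everything else stays self-hosted."""
    return Route.CLOUD if category in CLOUD_OK else Route.LOCAL


if __name__ == "__main__":
    for category in ("marketing_brief", "patient_record", "something_new"):
        print(category, "->", choose_route(category).value)
```

The design choice that matters is the default. Anything not on the allowlist, including categories nobody has thought about yet, stays on hardware you control.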
What I’d recommend
Stop and think about what data flows through your AI tools. Seriously. Open your ChatGPT history right now and scroll through it. Is there client data in there? Financial information? Internal documents?
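If scrolling by hand sounds tedious, a crude script can do a first pass. This sketch assumes you have copied the prompts you want to check into plain text. The patterns are deliberately simple illustrations of my own choosing and will produce both false positives and misses, so treat hits as a reason to look closer, not as a verdict.

```python
import re

# Illustrative patterns only; tune them to the data your business handles.
PATTERNS = {
    "email": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "iban": re.compile(r"\b[A-Z]{2}\d{2}(?: ?[A-Z0-9]{4}){3,7}(?: ?[A-Z0-9]{1,3})?\b"),
    "keyword": re.compile(r"\b(?:confidential|contract|invoice|salary)\b", re.IGNORECASE),
}


def flag_sensitive(text: str) -> dict[str, int]:
    """Count matches per pattern; only patterns with at least one hit are returned."""
    counts = {name: len(pattern.findall(text)) for name, pattern in PATTERNS.items()}
    return {name: n for name, n in counts.items() if n}


if __name__ == "__main__":
    sample = "Contract for jane@example.com, IBAN DE89 3704 0044 0532 0130 00"
    print(flag_sensitive(sample))
```

Run it over a few weeks of prompts and you will have a rough picture of what has been leaving the building.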
If the answer is yes — and it probably is — you have two options. Either lock down your cloud AI usage with proper DPAs, opt-outs, and employee training, or move your sensitive workflows to a self-hosted solution.
Option one is cheaper upfront but requires ongoing vigilance. Option two costs more initially but eliminates the problem at the root.
I help businesses set up self-hosted AI agents. But even if you never work with me, please do the audit. Know where your data is going. Your clients are trusting you with their information — make sure that trust is justified.
Book a free call. I'll tell you exactly what I'd automate first, what hardware you need, and what the whole thing costs. No surprises.