What I Learned Trying to Fine-Tune Gemma:2B on My Text History

I spent the weekend trying to turn my own messaging history into a bespoke assistant using Ollama’s Modelfile-based build flow and the Gemma:2B base model. The idea felt sensible enough: export the chat logs I legally own, clean them into a tidy instruction/response dataset, and layer a lightweight adapter on top of Gemma so it can imitate my tone when it helps me draft replies.

Preparing the data

Most of the effort went into wrangling the texts. I filtered obvious personal identifiers, normalised spelling, and standardised timestamps before converting everything into the chat-style JSON that Ollama expects. That alone taught me how inconsistent my own writing can be—apparently I oscillate between verbose paragraphs and one-word replies with equal enthusiasm.
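
For the curious, here is a minimal sketch of that cleaning pass. Everything about the input is an assumption (the sender/body/ts field names, millisecond timestamps), and the output is a generic prompt/response JSONL rather than any one tool’s exact format:

```python
import json
import re
from datetime import datetime, timezone

# Illustrative only: the raw field names (sender, body, ts) and the output
# schema are stand-ins for whatever your messaging export actually produces.
PHONE_RE = re.compile(r"\+?\d[\d\s().-]{7,}\d")
EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")
SPELLING = {"u": "you", "thx": "thanks"}  # a couple of the normalisations

def clean(text: str) -> str:
    """Mask obvious identifiers, then normalise spelling word by word."""
    text = PHONE_RE.sub("[phone]", text)
    text = EMAIL_RE.sub("[email]", text)
    return " ".join(SPELLING.get(w.lower(), w) for w in text.split())

def to_pairs(messages, me="me"):
    """Pair each incoming message with the reply I sent to it."""
    pairs = []
    for prev, curr in zip(messages, messages[1:]):
        if prev["sender"] != me and curr["sender"] == me:
            pairs.append({
                "prompt": clean(prev["body"]),
                "response": clean(curr["body"]),
                # Standardise millisecond epoch timestamps to ISO 8601 UTC.
                "ts": datetime.fromtimestamp(curr["ts"] / 1000,
                                             tz=timezone.utc).isoformat(),
            })
    return pairs

with open("raw_export.json") as f:
    messages = json.load(f)

with open("dataset.jsonl", "w") as out:
    for row in to_pairs(messages):
        out.write(json.dumps(row, ensure_ascii=False) + "\n")
```

The identifier masks and spelling table here are deliberately tiny; the real lists grew with every pass, which foreshadows the versioning lesson further down.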

Once the dataset was ready, running ollama create with a custom Modelfile was surprisingly smooth. I kept the changes small, leaning on the base model’s capabilities and only nudging the style with my prompts and examples.
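
For reference, the Modelfile was along these lines. FROM, ADAPTER, SYSTEM, and MESSAGE are all standard Modelfile instructions; the adapter path, prompt wording, and example exchange below are placeholders rather than my real build:

```
# Modelfile - placeholders throughout: adapter path, system prompt, and
# example messages are illustrative.
FROM gemma:2b

# Layer the lightweight style adapter over the base weights.
ADAPTER ./style-adapter

# Nudge tone through the system prompt rather than heavy retraining.
SYSTEM "You draft replies in my voice: dry humour, mild overthinking, no filler."

# Bake a couple of style examples into the context.
MESSAGE user "are we still on for friday?"
MESSAGE assistant "Allegedly. I have rehearsed cancelling twice, but yes."
```

Building and tagging it is then a one-liner, ollama create my-voice -f Modelfile, where my-voice is whatever name you want the served model to carry.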

Local success

On my laptop the model behaved brilliantly. It replied with the exact mix of dry humour and overthinking that my friends lovingly complain about. The responses were quick, the tone was spot on, and even my partner admitted the output sounded uncannily like me. At this point I was convinced it could become the backbone of the assistant at ai.carznari.com.

Deployment faceplant

That confidence evaporated when I tried to publish it. The container that serves the site simply refused to load the custom model. Ollama logs showed the adapter initialising, but the instance timed out when the web app requested completions. After a few retries the service rolled back to the stock Gemma image, which meant the live site never saw my changes. Locally? Flawless. Remotely? The model might as well not exist.
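
In hindsight, the first thing I should have done was ask the remote Ollama instance what models it actually had. Something like this against Ollama’s REST API (port 11434 is its default; my-voice is the placeholder name from the build step) would have separated “model missing” from “model too slow to load”:

```python
import requests

OLLAMA_URL = "http://localhost:11434"  # Ollama's default port; adjust for the container
MODEL = "my-voice"                     # placeholder name from the build step

# /api/tags lists every model the running instance actually has.
tags = requests.get(f"{OLLAMA_URL}/api/tags", timeout=10).json()
names = [m["name"] for m in tags.get("models", [])]

if not any(n.startswith(MODEL) for n in names):
    raise SystemExit(f"{MODEL} never made it here; instance only has: {names}")
print(f"{MODEL} is present, so the failure is load/latency, not a missing model.")
```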

Takeaways for next time

  • Environment parity matters. My local machine has heaps of free disk and memory; the production host does not. Gemma:2B’s quantized weights alone run to roughly 1.5-2 GB, and with an extra adapter and runtime overhead on top it needs more breathing room than I budgeted.
  • Deployment scripts need guardrails. I should have added a health-check stage before promoting the new model to production; a sketch of what I mean follows this list. Instead, I pushed it straight to ai.carznari.com and only noticed the failure once the site was already serving fallbacks.
  • Data pipelines deserve versioning. Every time I regenerate the texting dataset I tweak the cleaning rules. Next pass will live in a proper repo so I can track those adjustments.
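
Here is roughly the health check I have in mind, again against the standard Ollama API with placeholder names: force one real completion with a hard timeout before the deploy script is allowed to flip traffic.

```python
import sys
import time
import requests

OLLAMA_URL = "http://localhost:11434"
MODEL = "my-voice"   # placeholder model name
TIMEOUT_S = 60       # generous budget for the first load; tune to the host

# Force one real completion: the first request is what loads the model and
# adapter into memory, which is exactly where my deployment kept dying.
start = time.time()
try:
    resp = requests.post(
        f"{OLLAMA_URL}/api/generate",
        json={"model": MODEL, "prompt": "ping", "stream": False},
        timeout=TIMEOUT_S,
    )
    resp.raise_for_status()
except requests.RequestException as exc:
    sys.exit(f"Health check failed, refusing to promote: {exc}")

print(f"Completion in {time.time() - start:.1f}s, safe to promote.")
```

Wiring that in as a blocking step means a bad build fails loudly before promotion instead of quietly serving fallbacks in production.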

For now the local build stays on my laptop, but the experiment taught me a lot about packaging custom models. Once I increase the server resources and harden the deployment pipeline, I’ll give the Gemma:2B remix another shot; honestly, it’s too much fun hearing my own AI clone roast me back.