Running local AI on the Raspberry Pi 5 taught me why cloud models continue to win

With the exception of Google’s AI responses to searches, I’d never used AI before, not once. Most people I know, when they hear this, say, “You haven’t used AI? But it’s so useful!” I didn’t agree; I’d seen the results Google’s AI gives, and honestly, it seemed overrated. Then I thought about it and realized I couldn’t really judge it without testing it. But as someone passionate about technology, I didn’t want to risk my privacy, knowing the impact that could have, so I decided to run the models locally.

However, finding a device to run them on seemed a bit tricky. I have a PC, but it’s very old and doesn’t have a very good GPU, and while I could have used it, its energy efficiency is terrible. Then I remembered I had a Raspberry Pi 5 lying around, which would be perfect. So I looked into how to install the AI there, and it turns out it was pretty easy. Here’s how I did it, along with the pros and cons of running AI on the Raspberry Pi 5.

Installing Ollama was the easy part

Ollama was the easiest AI platform to install

Based on my research, Ollama would be the easiest AI platform to install on my Raspberry Pi 5, so that’s the option I chose. The installation was quite simple. I had already installed the Pi-Apps app store, so all I had to do was install Ollama from its menus. The install itself was quick and painless! There was one problem, though: my SD card was 32GB, and about 16GB was already taken up by Raspberry Pi OS, which meant that Ollama’s GUI installer couldn’t install the bundled models.
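If you’d rather skip Pi-Apps, Ollama also publishes a one-line install script that does the same job. As a rough sketch, here is what the manual route looks like, along with the disk-space check I wish I’d done first (your free space will obviously vary):

    # See how much of the SD card is actually free; Raspberry Pi OS
    # had already eaten about 16 GB of my 32 GB card
    df -h /

    # Ollama's official install script (Pi-Apps automates the same thing)
    curl -fsSL https://ollama.com/install.sh | sh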

This wasn’t a big problem, though, as installing models with Ollama is relatively simple. I opened a terminal and ran “ollama pull” followed by a model name from the Ollama website. It then took about 3-4 hours to download the three models I chose, since I downloaded them all at the same time, and I was ready to go! I didn’t know it at the time, but I could also have used the GUI to install a model. The three models were all ones I had already heard of: Llama, Gemma, and DeepSeek. I chose the smallest version of each because I only had 32 GB of space on the SD card.
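For reference, the terminal side of that looked roughly like the sketch below. The exact tags are my assumption for the smallest variants of each model; check the library pages on the Ollama website for the current names and download sizes:

    # Pull the smallest variant of each model (tags assumed; verify
    # them on the Ollama website before running)
    ollama pull llama3.2:1b
    ollama pull gemma3:1b
    ollama pull deepseek-r1:1.5b

    # List what's installed and how much space each model takes
    ollama list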

Using LLMs on the SBC felt responsive

The models were fast and didn’t take long to start up

Starting with DeepSeek-R1, it took about a minute to start responding and then about 30 seconds to finish. Llama 3.2 took about 5-10 seconds to start and 30 seconds to finish. Gemma 3, the fastest, took about 2-3 seconds to start writing and about 30 seconds to finish. That’s very quick, even compared to the time it would take me to do a Google search and sift through the answers. And this is coming from someone who usually gets a decent answer from the first unsponsored result on the first try.
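If you’d rather measure these timings than eyeball them, ollama run accepts a --verbose flag that prints load duration and token throughput after each answer. A minimal example, using one of the tags assumed earlier:

    # --verbose appends stats such as load duration, prompt eval rate,
    # and eval rate (tokens per second) after the response
    ollama run gemma3:1b --verbose "What does CISSP stand for?"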

It was shocking to see a result come back so quickly that I initially assumed it was fake. And that’s the catch with fast response times: speed means nothing if the answers are wrong. Some of the answers were correct: all the models correctly expanded the abbreviations “IT” and “CISSP.” That genuinely impressed me; if they could answer that correctly, surely they could answer other basic questions correctly too.

But don’t try to push them too hard

The models were wrong more often than not, even about basic things

However, the more I scrutinized the answers, the worse things looked. When I asked, “What does the fox say?”, none of the models realized I was referencing Ylvis’ 2013 meme song. Even worse, when I asked about Bill & Ted’s Excellent Adventure, none of the models except Llama knew what I was referring to, and even Llama mixed up the actors from Bill & Ted, among other details.

Worse yet, when asked who the 60th President of the United States was, they all hallucinated that it was Joe Biden, except for DeepSeek, which guessed William McKinley. That doesn’t make sense either: it placed the 60th president around 1841, a far cry from 2025, when the 47th and current president took office. Even if every president had served two non-consecutive terms, the 60th president still wouldn’t arrive until 2029 at the earliest, well after 1841.

And only Llama recognized that “The old man the boat” is a grammatically valid sentence when I asked it to find the (nonexistent) grammatical error. It looks wrong, but it parses: “the old” (elderly people) “man” (crew) the boat. DeepSeek assumed the old man and the boat were two separate objects, while Gemma, more confusingly, assumed the old man was the boat. The sentence is genuinely confusing, but I don’t think either of those interpretations would occur to a human.
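If you want to run the same trick question past every model side by side, a small shell loop does the job (again assuming the small-model tags from earlier):

    # Ask each installed model the same garden-path question in turn
    for model in llama3.2:1b gemma3:1b deepseek-r1:1.5b; do
        echo "=== $model ==="
        ollama run "$model" 'Find the grammatical error in: "The old man the boat."'
    done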

A render of the Raspberry Pi 5

Processor: Arm Cortex-A76 (quad-core, 2.4 GHz)
Memory: Up to 16 GB LPDDR4X SDRAM
Operating system: Raspberry Pi OS (official)
Ports: 2 × USB 3.0, 2 × USB 2.0, Gigabit Ethernet, 2 × micro HDMI, 2 × 4-lane MIPI transceivers, PCIe 2.0 interface, USB-C, 40-pin GPIO header
GPU: VideoCore VII
Initial price: $60

The Raspberry Pi 5 is an inexpensive SBC that can handle basic AI workloads. It’s built around a cost-effective mobile SoC, Broadcom’s BCM2712, and comes with 2 GB to 16 GB of RAM.


The AI on the Raspberry Pi 5 is decent, if you know what you’re doing

And that’s the real problem: I was expecting human-level quality from models with only about a billion parameters. Yes, I think the future of AI lies in embedded devices running locally on your network. Yes, I think that’s how AI will survive once the current cloud-driven hype runs out of steam. But I don’t think it’s fair to treat local AI as the ideal today. Right now, AI is designed to run in the cloud; running it locally is cool, but it’s not all the way there yet. Once the technology improves, though, I could really see local execution taking off. And I’m sure that’s where we’re headed.

I think with more experience I’d be able to tell which models are worth using in local setups, as I can already see differences in the areas where certain models are strongest. But I can’t expect AI to be ready in every area yet; it’s still in its infancy. The companies behind these models are taking people’s intellectual property without permission, and they’re losing money at their current scale. But once things stabilize, I picture something like the Rabbit R1, except with a local model running on your phone or a small network-connected box in your bedroom. It would be available without a subscription; instead, you pay for it once and own it. If the hardware breaks down, you can replace it, repair it yourself, or contact the seller. I think that’s what the future holds, but for now, AI isn’t ready for me. Still, if you have a Pi 5 lying around and want to try AI on it, I highly recommend it; it was an eye-opening experience.