
Programming by Stealth

A blog and podcast series by Bart Busschots & Allison Sheridan.

PBS Tidbit 10 of Y: Run LLMs Locally with Ollama

19 Dec 2024

Steve Mattan joins Allison in a conversation about how he runs large language models locally on his Mac using a variety of open source tools. These shownotes are not a step-by-step guide, but combined with Steve's explanation and the links he provides, they give you a roadmap for doing it yourself.

Matching Podcast Episode

You can also Download the MP3

Read an unedited, auto-generated transcript with chapter marks: PBS_2024_12_21

INTRODUCTION

Steve Mattan is a listener of Programming By Stealth who has contributed several fixes and valuable suggestions to the project through GitHub. He also participates in our Slack at podfeet.com/slack in the Programming By Stealth channel.

Steve has a Bachelor’s degree in Physics and a Master’s degree in Computer Science, but as soon as he finished his Computer Science degree, the company that paid for it put him into management. That hasn’t kept him from playing around with code for fun. He made a post on Slack that was the genesis for our conversation.

Steve’s Original Post in Slack

I’ve recently started learning Python, using the book Python Crash Course, 3e by Eric Matthes. The book recommends using VSCode as an editor and since I wanted to try it out that’s what I’m using. And like most people interested in tech, I’m intrigued by LLMs. Alas, as a hobby programmer, I cannot justify a GitHub Copilot subscription.

Unrelated to my learning Python, but related to my interest in LLMs, I had recently started playing with Ollama, a Free and Open Source (FOSS) tool for running LLMs locally. My main Mac is an M1 Max Studio with 64 GB RAM, more than enough to do so. Ollama is a command line tool, but there are GUI front ends available. I’m using Enchanted as a macOS native front end, and use Keyboard Maestro to launch Enchanted when I launch Ollama.

Related to my using VSCode to learn Python and to my use of Ollama is another FOSS tool called Continue. Continue is an extension for VSCode that allows one to use LLMs as a code assistant. Continue works with Ollama. So now Keyboard Maestro launches Ollama (which launches Enchanted) when I launch VSCode.

And I’m having a blast. It is somewhat spooky that Continue is suggesting answers to the exercises in the Python Crash Course via autocomplete as if it is reading the text along with me.

Note that Continue can be used with JetBrains IDEs in addition to VSCode, and with non-local LLMs, including Claude 3.5 Sonnet, Llama 3.1, GPT-4o, and Gemini 1.5.
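Under the hood, Ollama runs as a local server that GUI front ends like Enchanted and extensions like Continue talk to over an HTTP API. As a rough sketch of what those tools are doing behind the scenes (assuming Ollama's default port of 11434 and that the llama3.2 model has already been pulled), you can query the server directly from the shell:

```zsh
# Ask the local Ollama server for a completion via its HTTP API.
# Assumes Ollama is running and llama3.2 has been pulled;
# 11434 is the default port.
curl http://localhost:11434/api/generate -d '{
  "model": "llama3.2",
  "prompt": "Explain a Python list comprehension in one sentence.",
  "stream": false
}'
```

Setting `stream` to `false` returns a single JSON object instead of a stream of tokens, which is easier to read at the command line.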

Suggestions from Steve on interacting at the shell:

If you type `ollama` at the zsh prompt in your terminal app of choice, you'll get:

```zsh
% ollama
Usage:
  ollama [flags]
  ollama [command]

Available Commands:
  serve       Start ollama
  create      Create a model from a Modelfile
  show        Show information for a model
  run         Run a model
  stop        Stop a running model
  pull        Pull a model from a registry
  push        Push a model to a registry
  list        List models
  ps          List running models
  cp          Copy a model
  rm          Remove a model
  help        Help about any command

Flags:
  -h, --help      help for ollama
  -v, --version   Show version information

Use "ollama [command] --help" for more information about a command.

The `rm` command removes a downloaded model:

```zsh
% ollama rm llama3.2
deleted 'llama3.2'
```

A useful command not shown in that list is `/bye`, which ends an interactive `ollama run` session and returns you to the shell prompt.
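For example (the model's reply is trimmed, and `>>>` is the prompt Ollama shows in an interactive session):

```zsh
% ollama run llama3.2
>>> Why is the sky blue?
The sky appears blue because shorter (blue) wavelengths of sunlight
are scattered more strongly by the atmosphere... [reply trimmed]
>>> /bye
%
```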

With respect to model sizes, if you navigate to the main Ollama page at ollama.com and click the "Models" link at the top right, you can search for models. You'll see that llama3.2 comes in two sizes, 1B and 3B (1 billion and 3 billion parameters, respectively). Click on a model name, and it will take you to a page where you can copy the `ollama run` command for your selected size. (Note that "latest" does not mean last updated, but rather last downloaded.)
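On the command line, the size is selected with a tag after the model name. For example, using the tag names from the llama3.2 page on ollama.com:

```zsh
% ollama run llama3.2:1b    # the 1-billion-parameter version
% ollama run llama3.2:3b    # the 3-billion-parameter version
% ollama run llama3.2       # no tag is the same as llama3.2:latest
```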

Join the Community

Find us in the PBS channel on the Podfeet Slack.

Podfeet Slack