Local GPTs: Off the Grid, On Your Machine

 ·  ☕ 4 min read  ·  ✍️ Nishant

Are you tired of the tech giants having access to your personal data? Are you concerned about snooping and telemetry? Then it’s time to take matters into your own hands by running a fully offline GPT model on your own local machine! By doing so, you can have privacy and security, as all processing will occur locally without any third-party involvement. Say goodbye to companies profiting from your personal information and take back control with a completely offline GPT model!

First things first: this post uses the configuration I have access to (i.e. my laptop), but it should work for anyone else too (with a few tweaks). Below are my machine's specs:

  • MacBook Pro, M2 Pro Apple Silicon
  • 16 GB RAM
  • 512 GB SSD
  • macOS Sonoma 14.1.2

Now, let us dive into setting up an offline, private and local GPT like ChatGPT but using an open source model.

Oh Lama 🦙: Setup Ollama

Note: The GitHub project for Ollama can be found here

Ollama is a frontend built so you can easily get up and running with large language models on your local machine. You can run pre-trained models like Llama 2 and Code Llama, as well as customize and create your own models offline in a private and local environment. This allows you to have full control over your data and models, ensuring that they remain secure and confidential.

UPDATE: There is now an easier way to set up everything, i.e. Ollama (App + CLI) and Docker Desktop. Simply run:

brew install --cask ollama docker
If you do this, you can skip the setup instructions for these applications below.

To install it, first download the Ollama CLI using Homebrew:

brew install ollama

Next, download and install the Ollama application.

Once downloaded, open the application

When prompted, just select “Open”

It will run a wizard to set everything up, including running Ollama in the background. You can see Ollama running as a llama icon in your menu bar 😄

After installing Ollama, verify that Ollama is running by accessing the following link in your web browser: http://127.0.0.1:11434/
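
If you prefer to stay in Terminal, the same check can be scripted. Below is a minimal sketch, assuming curl is available (it ships with macOS) and that the server answers on its root page with the text "Ollama is running". The function name ollama_is_up is made up for this example.

```shell
# Hypothetical helper (not part of Ollama): succeeds when the local server answers.
ollama_is_up() {
  # Default address taken from this post; pass another URL as $1 if yours differs.
  url="${1:-http://127.0.0.1:11434/}"
  # -s silences progress output, -f turns HTTP errors into a non-zero exit code.
  reply="$(curl -sf "$url")" || return 1
  # Assumption: the root page contains this greeting.
  case "$reply" in
    *"Ollama is running"*) return 0 ;;
    *) return 1 ;;
  esac
}
# Usage: ollama_is_up && echo "Ollama is ready"
```

If the function fails, make sure the Ollama application is open and its icon is visible in the menu bar.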

Pull in a Large Language Model (LLM)

Ollama is just the front end for talking to LLMs. You will need to download a model to work with.

Note: These models are usually quite big, ranging from about 3 GB to upwards of 60 GB depending on how many parameters the model has. Make sure you have enough free disk space before downloading one.
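
A quick way to avoid a failed download is to check your free disk space first. The sketch below is only an illustration: has_space_for_model is a hypothetical helper, the size you pass in is whatever the model's page lists (rounded up to whole GB), and checking your home directory by default assumes that is the disk the model will land on.

```shell
# Hypothetical helper: succeeds if the filesystem holding <dir> has at least <gb> GB free.
# Usage: has_space_for_model <gb> [dir]   (whole numbers only)
has_space_for_model() {
  need_gb="$1"
  # Assumption: models end up on the same disk as your home directory.
  dir="${2:-$HOME}"
  # df -Pk prints POSIX-format output in 1024-byte blocks;
  # column 4 of the second line is the available space.
  avail_kb="$(df -Pk "$dir" | awk 'NR == 2 { print $4 }')"
  # Convert the requirement from GB to KB and compare as integers.
  [ "$avail_kb" -ge $((need_gb * 1024 * 1024)) ]
}
# Usage: has_space_for_model 4 && ollama pull llama2
```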

Download one of the models from the Ollama Model Library by running the pull command in Terminal:

ollama pull <name_of_model>

where <name_of_model> is replaced with a valid model name picked from the Ollama Model Library.

For demonstration purposes, let us download the llama2 model.

ollama pull llama2
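
To confirm the download actually landed, you can ask Ollama for its list of local models. Here is a small sketch built on ollama list. The helper name model_is_installed is made up, and it assumes the first column of the output is the model name with its tag (a model pulled without a tag shows up as :latest).

```shell
# Hypothetical helper: succeeds if `ollama list` shows the given model.
model_is_installed() {
  # Add the default ":latest" tag when the caller gave none.
  case "$1" in
    *:*) want="$1" ;;
    *)   want="$1:latest" ;;
  esac
  # Skip the header row and compare the first column exactly.
  ollama list | awk -v want="$want" 'NR > 1 && $1 == want { found = 1 } END { exit !found }'
}
# Usage: model_is_installed llama2 && echo "llama2 is ready"
```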

You can use Ollama right away with the model you have downloaded, chat-style within the terminal, by using the run command:

ollama run llama2

However, it would be much nicer if you could use a UI like ChatGPT and not have to work with Terminal, right?

That is what you will set up next 🤓

Setup Docker Desktop

Next, download and install the Docker Desktop application.

Once downloaded, open the application and it will run a wizard to set everything up, including running Docker in the background. You can see Docker running as an icon in your menu bar 😄

Note: You do not need to sign in or sign up for anything for this post, unless you wish to.

Setup Ollama Web UI

Ollama Web UI is a ChatGPT-style web interface for Ollama. It makes it easier to talk to your local GPT the way you are used to with ChatGPT.

To start it up, open the Terminal app and run the command below:

docker run -d -p 3000:8080 --add-host=host.docker.internal:host-gateway -v ollama-webui:/app/backend/data --name ollama-webui --restart always ghcr.io/ollama-webui/ollama-webui:main

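This command starts the Docker container in the background (-d) and maps port 3000 of localhost to port 8080 inside the container (-p 3000:8080), which makes the Ollama Web UI available at port 3000 of localhost.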

If you want to verify this, open the Docker Desktop dashboard and you will see the ollama-webui container listed as running.
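
The container's status can also be checked without leaving Terminal. A minimal sketch, assuming the container name ollama-webui from the docker run command; webui_is_running is a hypothetical helper name.

```shell
# Hypothetical helper: succeeds if a running container has exactly the given name.
webui_is_running() {
  name="${1:-ollama-webui}"
  # `docker ps` lists running containers only; the name filter matches substrings,
  # so grep -x is used to insist on an exact, whole-line match.
  docker ps --filter "name=$name" --format '{{.Names}}' | grep -qx "$name"
}
# Usage: webui_is_running && echo "Web UI container is up"
```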

At this point, you are done with the setup 😎

You can now access the Ollama Web UI Chat interface at http://localhost:3000.

Bonus: Adding more models

The Ollama Model Library provides more than one variation of each model. You can find the other variations under the Tags tab on the model's page. Also note the size listed for each variation, to assess whether the model is too big for your machine's storage space.

ollama pull llama2:13b

Then go back to the Chat interface at http://localhost:3000 and select the model from the drop-down.
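
If you want to grab a few variations in one go, a tiny loop around the pull command does the job. This is just a sketch; pull_models is a made-up helper name.

```shell
# Hypothetical helper: pulls every model name passed as an argument, in order.
pull_models() {
  for model in "$@"; do
    echo "Pulling $model ..."
    # Stop at the first failure so a typo in a model name is noticed immediately.
    ollama pull "$model" || return 1
  done
}
# Usage: pull_models llama2 llama2:13b
```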

How to Use - Walkthrough

Nishant Srivastava
WRITTEN BY
Nishant
👨‍💻 Android Engineer/🧢 Opensource enthusiast