Are you tired of the tech giants having access to your personal data? Are you concerned about snooping and telemetry? Then it’s time to take matters into your own hands by running a fully offline GPT model on your own local machine! By doing so, you can have privacy and security, as all processing will occur locally without any third-party involvement. Say goodbye to companies profiting from your personal information and take back control with a completely offline GPT model!
First things first: this post uses the configuration I have access to (i.e. my laptop), but it should work for anyone else too, with a few tweaks. Here are my machine's specs:
- MacBook Pro, M2 Pro Apple Silicon
- 16 GB RAM
- 512 GB SSD
- macOS Sonoma 14.1.2
Now, let us dive into setting up an offline, private and local GPT like ChatGPT but using an open source model.
Oh Lama 🦙: Setup Ollama
Note: The GitHub project for Ollama can be found here
Ollama is a frontend built so you can easily get up and running with large language models on your local machine. You can run pre-trained models like Llama 2 and Code Llama, as well as customize and create your own models offline in a private and local environment. This allows you to have full control over your data and models, ensuring that they remain secure and confidential.
UPDATE: There is now an easier way to set up everything, i.e. Ollama (app + CLI) and Docker Desktop. Simply run:
brew install --cask ollama docker
You can then skip the setup instructions for these applications below.
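If you want to confirm that the Homebrew install put both tools on your PATH, a small check like the one below can help. This is only a sketch: `check_tool` is a helper name made up for this post, and the `docker` CLI may only show up after Docker Desktop has been opened once.

```shell
# check_tool NAME: report whether a command called NAME is on the PATH.
# (check_tool is a hypothetical helper, not part of Ollama or Docker.)
check_tool() {
  if command -v "$1" >/dev/null 2>&1; then
    echo "$1: found at $(command -v "$1")"
  else
    echo "$1: not found"
  fi
}

check_tool ollama
check_tool docker
```

If either line says "not found", open the corresponding application once and try again in a new Terminal window.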
To install it, first download the Ollama CLI using Homebrew.
Next, download and install the Ollama application.
Once downloaded, open the application.
When prompted, just select “Open”.
It will run a wizard to set everything up, including running Ollama in the background. You will see Ollama running as a llama icon in your menu bar 😄
After installing Ollama, verify that Ollama is running by accessing the following link in your web browser: http://127.0.0.1:11434/
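The same check can be done from Terminal with `curl`. The sketch below assumes Ollama is listening on the default address mentioned above; `check_url` is a helper name invented for this example.

```shell
# check_url URL: report whether an HTTP server answers at URL.
# (check_url is a hypothetical helper; the address is the one from this post.)
check_url() {
  if curl --silent --fail --max-time 5 "$1" >/dev/null 2>&1; then
    echo "reachable: $1"
  else
    echo "not reachable: $1"
  fi
}

check_url "http://127.0.0.1:11434/"
```

If it reports "not reachable", make sure the Ollama application is open and its icon is visible in the menu bar.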
Pull in a Large Language Model (LLM)
Ollama is just the front end for talking to LLMs. You will need to download a model to work with.
Note: These models are usually quite big, ranging from about 3 GB to upwards of 60 GB depending on the number of parameters they have. You will need enough free disk space to download the model.
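Before downloading anything, it can be worth checking how much free space you have. The helper below is a rough sketch: `has_space` is a made-up name, and the 5 GB threshold is just an example value, not a figure for any particular model.

```shell
# has_space NEEDED_GB [DIR]: succeed if the volume holding DIR (default: $HOME)
# has at least NEEDED_GB gigabytes free. (Hypothetical helper for this post.)
has_space() {
  dir="${2:-$HOME}"
  # -P gives one line per volume, -k reports 1 KB blocks; column 4 is "Available".
  free_kb=$(df -Pk "$dir" | awk 'NR==2 {print $4}')
  [ "$free_kb" -ge $(( $1 * 1024 * 1024 )) ]
}

NEEDED_GB=5   # example value only; use the size shown for your model
if has_space "$NEEDED_GB"; then
  echo "At least ${NEEDED_GB} GB free"
else
  echo "Less than ${NEEDED_GB} GB free"
fi
```

For a quick human-readable overview, `df -h "$HOME"` shows the same information.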
Download one of the models mentioned here, by running the pull command in Terminal:
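A minimal sketch of the pull step, using the Ollama CLI's `pull` subcommand. The `MODEL` variable stands in for the model name; `llama2` is only an example I picked for illustration, so swap in whichever model you prefer.

```shell
# Assumption: "llama2" is just an example name from the Ollama Model Library.
MODEL="llama2"

if command -v ollama >/dev/null 2>&1; then
  # Downloads the model to local storage; this can take a while.
  ollama pull "$MODEL"
else
  echo "ollama CLI not found; install Ollama first"
fi
```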
<name_of_model> is to be replaced with a valid model name picked from the Ollama Model Library.
For demonstration purpose, let us download the
You can use Ollama right away with the model you have downloaded, in a Chat style within the terminal by using the
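For chatting from the terminal, the Ollama CLI has a `run` subcommand. The sketch below sends a single prompt and prints the answer; the model name and the prompt are example values I chose, not something this setup requires.

```shell
# Assumptions: "llama2" has already been pulled; the prompt is only an example.
MODEL="llama2"
PROMPT="Why is the sky blue?"

if command -v ollama >/dev/null 2>&1; then
  # With a prompt argument, the answer is printed and the command exits.
  ollama run "$MODEL" "$PROMPT"
else
  echo "ollama CLI not found; install Ollama first"
fi
```

Leaving out the prompt (`ollama run llama2`) opens an interactive chat session instead; typing `/bye` ends it.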
However, it would be much nicer if you could use a UI like ChatGPT and not have to work with Terminal, right?
That is what you will set up next 🤓
Setup Docker Desktop
Next, download and install the Docker Desktop application.
Once downloaded, open the application and it will run a wizard to set everything up, including running Docker in the background. You will see Docker running as an icon in your menu bar 😄
Note: You do not need to sign in or sign up for anything to follow this post, unless you wish to.
Setup Ollama Web UI
Ollama Web UI is a ChatGPT-style web interface. It lets you talk to your local GPT the way you are used to with ChatGPT.
To start it up, open the Terminal app and run the command below.
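A sketch of what such a command can look like. Treat the specifics as assumptions: the image name, the container port (8080), and the volume and container names are what I believe the Ollama Web UI project documented, and the project has since been renamed Open WebUI, so check its README for the current values. Only the host port 3000 comes from this post.

```shell
# Assumptions (verify against the project's README): image, container port,
# volume name and container name. Host port 3000 is the one used in this post.
IMAGE="ghcr.io/ollama-webui/ollama-webui:main"
HOST_PORT=3000

if command -v docker >/dev/null 2>&1; then
  # -d runs in the background; -p maps localhost:3000 to the container's port;
  # --add-host lets the container reach Ollama running on the Mac itself;
  # -v keeps chat data in a named volume across restarts.
  docker run -d \
    -p "$HOST_PORT":8080 \
    --add-host=host.docker.internal:host-gateway \
    -v ollama-webui:/app/backend/data \
    --name ollama-webui \
    --restart always \
    "$IMAGE" || echo "docker run failed; is Docker Desktop running?"
else
  echo "docker CLI not found; install Docker Desktop first"
fi
```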
This command starts the Docker container and makes the Ollama Web UI available on port 3000 of localhost.
If you want to verify this, open Docker Desktop dashboard and you will see
At this point, you are done with the setup 😎
You can now access the Ollama Web UI Chat interface at http://localhost:3000.
Bonus: Adding more models
The Ollama Model Library provides more than one variation of each model. You can find the other variations under the Tags tab on the model’s page. Also note the size listed for the model, to assess whether it is too big for your machine’s storage space.
Then go back to the Chat interface at http://localhost:3000 and select the model from the drop-down.
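A variation is pulled by adding its tag after a colon. The sketch below uses `llama2:13b` purely as an example tag, then lists what is installed locally along with each model's size.

```shell
# Assumption: "llama2:13b" is only an example of the name:tag form.
MODEL_TAG="llama2:13b"

if command -v ollama >/dev/null 2>&1; then
  ollama pull "$MODEL_TAG"   # download that specific variation
  ollama list                # show locally installed models and their sizes
else
  echo "ollama CLI not found; install Ollama first"
fi
```

If you run low on space later, `ollama rm` followed by the model name should remove a model you no longer need.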