There are a number of AI chatbots on the market right now, including ChatGPT, Google Bard, Bing AI Chat, and many more. However, all of them require an internet connection to interact with the AI. What if you want to install a similar Large Language Model (LLM) on your computer and use it locally: an AI chatbot that you can use privately and without internet connectivity? Well, with the new Alpaca model released by Stanford, you can come close to that reality. Yes, you can run a ChatGPT-like language model on your PC offline. So on that note, let's go ahead and learn how to use an LLM locally without the internet.
Run a ChatGPT-Like LLM Locally Without Internet (Private and Secure)
In this article, I have covered everything you need to run a ChatGPT-like LLM on a local PC without the internet.
What Are Alpaca and LLaMA?
Alpaca is a small AI language model developed by a group of computer scientists at Stanford University. The unique thing about Alpaca is how small and cost-effective it is. With just 7 billion parameters, Alpaca is reported by Stanford to perform similarly to OpenAI's text-davinci-003 model. And you can run it on your local computer without requiring an internet connection. That's pretty cool, right?
But how was it trained? Surprisingly, Alpaca is fine-tuned from LLaMA, Meta's large language model, which recently leaked online. To train this language model, the researchers used OpenAI's "text-davinci-003" model to generate 52K high-quality self-instruction examples. With this dataset, they fine-tuned the LLaMA model using Hugging Face's training framework and released Alpaca 7B. You can also use Meta's LLaMA model, but in my testing, Stanford's Alpaca LLM performed much better, and it's also quite fast.
What Kind of Hardware Do You Need to Run Alpaca?
You can use Alpaca 7B on any decent machine. I installed Alpaca 7B on my entry-level PC and it worked quite well. To give you some idea, my PC is powered by a 10th-Gen Intel i3 processor with 256GB of SSD and 8GB of RAM. For GPU, I am using Nvidia’s entry-level GeForce GT 730 GPU with 2GB of VRAM.
Even without a dedicated GPU, you can run Alpaca locally. However, the response time will be slow. Apart from that, there are users who have been able to run Alpaca even on a tiny computer like Raspberry Pi 4. So you can infer that the Alpaca language model can very well run on entry-level computers as well.
Set Up the Software Environment to Run Alpaca and LLaMA
Windows
On Windows, you need to install Python, Node.js, and C++ to get started with using a large language model offline on your computer. Here is how to go about it.
1. First, download Python 3.10 (or below) from here. Scroll down and click on “Windows installer (64-bit)” to download the setup file.
2. Launch the setup file and enable the checkbox next to “Add Python.exe to PATH.” Now, install Python with all default settings.
3. After that, install Node.js version 18.0 (or above) from here. Keep everything default while installing the program.
4. Finally, download the Visual Studio “Community” edition from this link for free.
5. Launch the Visual Studio 2022 setup file, and it will initially download some files. After that, a new window will launch. Here, make sure “Desktop development with C++” is enabled.
6. Finally, click “Install” and wait until it completes the installation.
7. I recommend restarting your computer once everything is installed. Next, open “Command Prompt” and run the below commands to check if Python and Node.js are installed successfully. Both should return the version number. You are now good to go.
python --version
node --version
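If you prefer a single check, here is a small sketch you can paste into any POSIX-style shell (for example, Git Bash on Windows). It uses the standard command names; adjust to `python3` if that is what your system exposes:

```shell
# Print both versions; warn on stderr if a tool is not on PATH.
python --version || echo "Python not found on PATH" >&2
node --version   || echo "Node.js not found on PATH" >&2
```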
Apple macOS
Python generally comes pre-installed on macOS, so you only need to install Node.js (version 18.0 or above). Here is how you can do it:
1. Download the Node.js macOS Installer (version 18.0 or above) from the link here.
2. Next, open the Terminal and run the below command to check if Node.js is installed properly. If you get a version number in return, you are good to go.
node --version
3. Next, check the Python version by running the below command. It should be Python 3.10 or below.
python3 --version
4. If you don’t get output or you happen to have the latest Python version, download Python 3.10 (or below) from here. Scroll down and click on “macOS 64-bit universal2 installer” to download Python. Now, install it on your Mac.
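Since only Python 3.10 or below is expected to work here, it can help to script the check. A minimal sketch, assuming `python3 --version` prints something like "Python 3.10.11":

```shell
# Extract the minor version number (the "10" in 3.10.11) and compare it to 10.
ver="$(python3 --version 2>&1 | awk '{print $2}')"   # e.g. 3.10.11
minor="$(printf '%s' "$ver" | cut -d. -f2)"
if [ "${minor:-99}" -le 10 ] 2>/dev/null; then
  echo "Python $ver should work"
else
  echo "Python $ver is too new; install 3.10 alongside it" >&2
fi
```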
Linux and ChromeOS
On Linux and ChromeOS, you need to set up Python and Node.js before you run offline Alpaca and LLaMA models. Here are the steps to follow.
1. Open the Terminal and run the below command to check the Python version. If it’s Python 3.10 or below, you are all set.
python3 --version
2. In case you have a higher version, you can use the below commands to install Python 3.10 on Linux and ChromeOS.
sudo apt install software-properties-common
sudo add-apt-repository ppa:deadsnakes/ppa
sudo apt-get update
sudo apt-get install python3.10
3. After Python, install Node.js by running the below command.
sudo apt install nodejs
4. After the installation, run the below command to check the Node.js version. It should be 18.0 or higher.
node --version
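On Ubuntu and Debian, the nodejs package in the default repositories can be older than 18, so it is worth scripting this check too. A minimal sketch, assuming `node --version` prints something like "v18.16.0"; if it reports an older major version, you will need a newer build (for example, from NodeSource or nvm):

```shell
# Strip the leading "v" and everything after the first dot to get the major version.
ver="$(node --version 2>/dev/null)"   # e.g. v18.16.0
major="${ver#v}"
major="${major%%.*}"
if [ "${major:-0}" -ge 18 ] 2>/dev/null; then
  echo "Node.js $ver is new enough"
else
  echo "Node.js is missing or too old (need 18+)" >&2
fi
```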
Install Alpaca and LLaMA Models On Your Computer
Once you have set up Python and Node.js, it's time to install and run a ChatGPT-like LLM on your PC. Make sure the Terminal detects both python and node commands before you proceed.
1. Open the Terminal (in my case, Command Prompt) and run the below command to install the Alpaca 7B LLM model (around 4.2GB of disk space required). If you want to install the Alpaca 13B model, replace 7B with 13B. The larger model needs 8.1GB of space.
npx dalai alpaca install 7B
2. Now, type “y” and hit Enter. This will start installing the Alpaca 7B model. The whole process will take 20 to 30 minutes, depending on your internet connectivity and model size.
3. After the installation is complete, you will see a screen like this.
4. You can choose to install LLaMA models as well or move to the next step to test the Alpaca model instantly. Remember, LLaMA is much larger in size; its 7B model takes up to 31GB of space. To install it, run the below command. You can replace 7B with 13B, 30B, or 65B. The largest model takes up to 432GB of space.
npx dalai llama install 7B
5. Finally, run the below command, and it will start the webserver instantly.
npx dalai serve
6. Use a web browser on your PC and open the below address. This will take you to the web UI where you can test Alpaca and LLaMA models locally and without the internet.
http://localhost:3000
7. Here, you need to choose the “Alpaca 7B” or “LLaMA 7B” model from the “model” drop-down menu in the top-right corner. Since I have only installed the Alpaca 7B model, this is my default.
8. You can now start using this ChatGPT-like language model on your PC without internet connectivity. Replace “PROMPT” with your query and click on “Go”.
9. Here is what the resource usage looks like while running the local Alpaca LLM server on my Windows PC.
10. In case you want to delete the downloaded models to free up disk space, open your user profile directory. Here, the “dalai” folder has all the files, including the model. Deleting the “dalai” folder will free up space immediately.
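On macOS and Linux, you can do the same from the Terminal. A sketch, assuming the default location; the DALAI_DIR override is hypothetical, so adjust it if your folder lives elsewhere:

```shell
# The dalai folder in your home directory holds every downloaded model.
DALAI_DIR="${DALAI_DIR:-$HOME/dalai}"
du -sh "$DALAI_DIR" 2>/dev/null || true   # optional: see how much space it uses
rm -rf "$DALAI_DIR"                       # delete the models and free the space
```

On Windows, deleting the "dalai" folder inside your user profile directory via File Explorer does the same thing.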
Use a ChatGPT-Like Service Privately and Completely Offline
So this is how you can run a ChatGPT-like LLM on your local PC and get decent results as well. As time goes by, new and highly efficient LLMs will become available that can run on everything from smartphones to single-board computers like the Raspberry Pi. Anyway, that is all from us. If you want to use ChatGPT 4 for free, head to our linked article for some amazing resources. And in case you want to train an AI chatbot based on your own documents, we have an in-depth guide ready for you. Finally, if you are facing any problems, let us know in the comment section below.
Hey – I successfully installed it, but it's stuck on loading and I'm not getting any response. Using a MacBook Air 2021.
Can I install it on a drive other than C: in Windows?
The installation works for both models, but for some reason I can never get it to work! (no responses from the LLM)
Maybe I'm not willing to give it enough time, or is it really that slow on these specs: a GTX 1060 GPU with 6GB VRAM, an AMD Ryzen 5 1600 6-core/12-thread 3.5GHz CPU, and 2x6GB 16GB 3200MHz RAM?
Thanks for taking the time to write such a useful but short guide. Unfortunately, it doesn’t cover this issue. Or maybe it wasn’t clear enough. Still really cool!
The first run was with Alpaca 13B, but it never managed to give me an answer, not even to 1+1. I asked two questions: one was about "the meaning of life" for fun, and I forget what the other was. Right now I'm installing LLaMA 7B. I'll get back to you with more information as it develops, aka as soon as I test around with LLaMA! Its installation takes way longer…
Can I fine-tune the model to be better at say VBA to Python translations after installing on Linux?
This guide does not work; it's a waste of time. I get that invalid model file error.
Don't waste your time; look for something else.
The Dalai version of LLaMA 7B works fine, but can it compare two text files locally? Is it possible to print the contents of a text file? Using cat, copy, or print, I am unable to implement these commands. Has anyone tried, and is still alive to report how?
Also having the “bad magic” problem.
llama_model_load: loading model from 'models/13B/ggml-model-q4_0.bin' - please wait ...
llama_model_load: invalid model file 'models/13B/ggml-model-q4_0.bin' (bad magic)
main: failed to load model from 'models/13B/ggml-model-q4_0.bin'
not working 🙁
C:\Users\chris>npx dalai alpaca install 7B
Need to install the following packages:
dalai@0.3.1
Ok to proceed? (y) y
npm WARN cleanup Failed to remove some directories [
npm WARN cleanup [
npm WARN cleanup 'C:\\Users\\chris\\AppData\\Local\\npm-cache\\_npx\\3c737cbb02d79cc9\\node_modules',
npm WARN cleanup [Error: EPERM: operation not permitted, rmdir 'C:\Users\chris\AppData\Local\npm-cache\_npx\3c737cbb02d79cc9\node_modules\dalai'] {
npm WARN cleanup errno: -4048,
npm WARN cleanup code: 'EPERM',
npm WARN cleanup syscall: 'rmdir',
npm WARN cleanup path: 'C:\\Users\\chris\\AppData\\Local\\npm-cache\\_npx\\3c737cbb02d79cc9\\node_modules\\dalai'
npm WARN cleanup }
npm WARN cleanup ]
npm WARN cleanup ]
npm notice
npm notice New minor version of npm available! 9.5.1 -> 9.6.6
npm notice Changelog: https://github.com/npm/cli/releases/tag/v9.6.6
npm notice Run npm install -g npm@9.6.6 to update!
npm notice
Please Help – Thanks!
There’s some sort of problem with it now.
llama_model_load: loading model from 'models/7B/ggml-model-q4_0.bin' - please wait ...
llama_model_load: invalid model file 'models/7B/ggml-model-q4_0.bin' (bad magic)
main: failed to load model from 'models/7B/ggml-model-q4_0.bin'
Any fix yet, Jason? Would love to know! Thanks.
Echoing what Fahim said, I’d like to run the Alpaca LLM against my custom dataset offline. Can you advise? Thanks!
The npx dalai alpaca install 7B command stops itself while running; idk why it's happening. HELP?
Have you found a solution? Mine does the same
I had to upgrade my npm to the latest version and used these instructions to download; then it all came down OK.
# Install “dalai” and “alpaca” packages
npm install -g dalai alpaca
# Execute the “dalai alpaca install 7B” command
npx dalai alpaca install 7B
Hi Arjun! Thanks for the excellent write-up.
I was wondering if it’s possible to train the private and offline models with a custom dataset. Similar to this article here: https://beebom.com/how-train-ai-chatbot-custom-knowledge-base-chatgpt-api/amp/ but not sending data to OpenAI API.
I'd appreciate any feedback, thanks!
I have the same concern.
If you find a solution please mail me to khalidreemy@gmail.com
Thanks a lot