How to run a local LLM from your PC

If you’re wondering how to run a local LLM from your PC at home, this guide details exactly how to do it.

An LLM (large language model) is a language-based artificial intelligence that can interpret text inputs and respond to them with reasoning and logic. An example of an LLM is GPT-4, the most powerful language model released to date … that we know of. While running a language model on your home PC won’t be anywhere near as fast as a cloud-based LLM server, it comes with a few benefits you’d never get from a private company.

For one, you’ll have total control over the output of your AI, without any censorship or guardrails in place. Your conversation history stays entirely on your own machine, removing the privacy and security concerns that come with cloud services, which is one of the most attractive reasons to give this a go.

Here’s how you can run a local LLM from your home PC

We can thank /u/YearZero from Reddit for this guide on how to run a local LLM from your home computer.

  • Create a new folder on your PC
  • Download koboldcpp and add it to the newly created folder
  • Head on over to huggingface.co and download an LLM of your choice
    • Preferably a smaller one that your PC can handle.
    • You will need to check your PC’s RAM against the memory requirements of the LLM for this.
  • Put the LLM file into the folder
  • Run koboldcpp
  • In the Threads entry column, enter how many threads your PC has. You can find this out with the dxdiag tool, which reports your computer’s hardware, or with the first snippet after this list
  • Check Streaming Mode
  • Check Use SmartContext
  • Hit Launch and locate the LLM .bin file (these options can also be set from the command line; see the second snippet below)
  • The LLM will launch in your browser (it runs entirely offline)
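
If you’d rather not dig through dxdiag, a few lines of Python will report the same numbers. This is a minimal sketch: it assumes you have Python 3 installed, and psutil is a third-party package (install it with pip install psutil).

    import os
    import psutil  # third-party: pip install psutil

    # Logical CPU threads -- the number to enter in koboldcpp's Threads field.
    print(f"CPU threads: {os.cpu_count()}")

    # RAM that is actually free right now, after Windows and background
    # services have taken their share.
    print(f"Available RAM: {psutil.virtual_memory().available / 2**30:.1f} GB")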
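
You can also skip the GUI and start koboldcpp from a script. Treat the sketch below as an assumption: the flag names (--threads, --smartcontext, --stream) come from koboldcpp’s command-line help at the time of writing and may differ in your version, so check koboldcpp --help before relying on them.

    import subprocess

    # Launch koboldcpp with the same options the guide sets in the GUI.
    # "model.bin" stands in for whatever LLM file you downloaded.
    subprocess.run([
        "koboldcpp.exe",      # the executable you placed in the folder
        "model.bin",          # path to the downloaded LLM file
        "--threads", "8",     # match your CPU's thread count
        "--smartcontext",     # equivalent to ticking Use SmartContext
        "--stream",           # equivalent to ticking Streaming Mode
    ])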

Advantages to running a local LLM

As mentioned above, there are a few clear reasons why you might want to run an LLM from your home PC.

Privacy

One of the main concerns with ChatGPT is that it’s run by a private company that has access to all of the data you input. With a local LLM, every prompt you enter begins and ends with you; nothing ever leaves your machine.

Guardrails

A locally run AI can have as many, or as few, guardrails as you want. You could, for example, have the AI say or do anything you like. This is one of the main reasons you might want to try out a local LLM.

Offline

As the AI runs entirely on your computer’s own hardware, you can use it offline whenever you want.

Disadvantages to running a local LLM

While there are many reasons why you might want to run a local LLM to work around the restrictions imposed by privately run AI services, there are also a few important things that might deter you from trying it. Here they are:

Hardware requirements

You’re going to need a PC with at least 12GB or 16GB of RAM to give this a go. While a smaller LLM might need at least 7GB of free RAM on its own, Windows and other services will eat up the rest. You will also need a beefy CPU, and if you have an AMD or Nvidia graphics card, you can offload much of the processing to it.
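
As a rough sizing rule (an assumption based on common 4-bit quantised models, not a figure from this guide), a model needs roughly its file size in free RAM plus a little overhead for context:

    # Back-of-envelope RAM estimate for a quantised 7B-parameter model.
    # 4.5 bits per weight is an assumed figure, typical of 4-bit
    # quantisation once per-block scaling factors are included.
    params = 7e9
    bits_per_weight = 4.5
    overhead_gb = 1.0  # assumed allowance for context and buffers

    est_gb = params * bits_per_weight / 8 / 2**30 + overhead_gb
    print(f"Estimated RAM needed: {est_gb:.1f} GB")  # prints ~4.7 GB

Larger models, or less aggressive quantisation, push that estimate up quickly, which is why checking your free RAM first matters.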

Speed

A locally run LLM will not be anywhere near as fast as an online, cloud-based one. There’s little you can do about this, unless you own the kind of GPU farm, and the space and energy to run it, that cloud providers do.

Cover image generated with Stable Diffusion, then edited in-house.