Here's a step-by-step guide on how to install and use DeepSeek-R1 on your local system.
Everyone seems to be talking about DeepSeek-R1, the new open-source AI language model from the Chinese AI firm DeepSeek. Some users claim it's on par with, or even better than, OpenAI's o1 in terms of reasoning capabilities.
Currently, DeepSeek is free to use, which is great news for users, but it does raise some questions. With the surge in user volume, how are they managing the server costs?
Running that kind of hardware can't be cheap, right?
The one logical answer here is data. Data is the lifeblood of AI models, and they're probably collecting user data in some way that benefits their quant trading models or some other form of monetization.
So, if you're concerned about data privacy but still want to use R1 without sharing your data, the best way is to run the model locally.
A couple of days back, DeepSeek-R1 was released as a fully open-source model, meaning anyone can take the underlying codebase, adapt it, and even fine-tune it to their own needs.
From a technical standpoint, DeepSeek-R1 (often abbreviated as R1) stems from a large base model called DeepSeek-V3. The lab then refined it through a combination of supervised fine-tuning (SFT) on high-quality human-labeled data and reinforcement learning (RL).
The result is a chatbot that can handle intricate prompts, reveal the reasoning steps behind complex questions (sometimes more transparently than other models), and even render code in the chat interface for quick testing.
It’s honestly very impressive, especially for a model that’s open-source.
To run DeepSeek R-1 locally, we’ll be using a tool called Ollama.
Ollama is a free, open-source tool that allows users to run large language models (LLMs) locally on their computers. It is available for macOS, Linux, and Windows.
Head to the official Ollama website (ollama.com), click the “Download” button, and install it on your system.
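On macOS and Windows this is a regular installer. On Linux, the site instead gives you a one-line install script; check the download page for the current command, but at the time of writing it looks like this:

curl -fsSL https://ollama.com/install.sh | sh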
To confirm that the installation is successful, open a terminal and run the following command:
ollama -v
You should see Ollama's version number instead of an error.
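If the installation succeeded, the output looks something like the following (the exact version number will depend on when you installed):

ollama version is 0.5.7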
Back on the Ollama website, under the Models tab, search for the keyword “deepseek” and you should see “deepseek-r1” as the first item in the search results.
Click on it, and under the Models section you'll notice that there are multiple model sizes, from 1.5 billion to 671 billion parameters. As a rule of thumb, larger models require more powerful GPUs to run.
Smaller models like the 8 billion parameter version can run on GPUs with 8GB of VRAM. Larger models need significantly more resources (refer to the VRAM and GPU requirements section below).
To download and run the 8 billion parameter model, use the following command:
ollama run deepseek-r1:8b
The model will start downloading (around 4.9GB). Ensure you have enough disk space before proceeding.
Once downloaded, the model will run locally on your machine. You can chat with it immediately.
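If you want a different size, you can pull it by changing the tag in the same command. At the time of writing, the deepseek-r1 page lists several distilled variants, for example:

ollama run deepseek-r1:1.5b
ollama run deepseek-r1:14b
ollama run deepseek-r1:32b

The workflow is identical; only the download size and hardware requirements change.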
Let’s test it with this example prompt:
Prompt: What is DeepSeek R-1?
Response: DeepSeek-R-1 is an AI assistant developed by the Chinese company DeepSeek. It is designed to provide responses and assistance across a wide range of topics, including but not limited to mathematics, coding, natural language processing, and more. If you have any questions or need help with something, feel free to ask!
Awesome. It's fast, and it still works even if I disconnect my laptop from the Wi-Fi. Note that even when you are connected to the internet, the model still can't access the web.
Prompt: What’s latest price of AAPL?
Response: As an AI assistant, I don’t have real-time data access, so I can’t provide the latest stock price for Apple Inc. (AAPL). For the most accurate and up-to-date information, I recommend checking financial news platforms or your brokerage service.
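When you're done chatting, type /bye (or press Ctrl + d) to leave the interactive session and get back to your terminal.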
Other things Ollama can do:
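To give a few examples (run ollama --help for the full, current list of commands):

ollama list                  # show the models you've downloaded
ollama ps                    # show models currently loaded in memory
ollama pull deepseek-r1:8b   # download a model without starting a chat
ollama rm deepseek-r1:8b     # delete a model to free up disk space

Ollama also exposes a local HTTP API (on port 11434 by default) that other apps on your machine can call; more on that below.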
The VRAM requirements for DeepSeek-R1 depend on factors like model size, parameter count, and quantization. Broadly, the full 671B-parameter model needs a multi-GPU setup, while the distilled models scale down to a single consumer GPU; as noted above, the 8B version runs comfortably on a card with around 8GB of VRAM.
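A rough back-of-the-envelope way to estimate this (an approximation, not an official figure from DeepSeek or Ollama): weight memory is roughly the parameter count multiplied by the bytes per parameter. Ollama's default tags are typically 4-bit quantized, i.e. about 0.5 bytes per parameter, so the 8B model needs roughly 8 × 0.5 ≈ 4GB just for the weights, plus headroom for the KV cache and runtime overhead. That lines up with the ~4.9GB download and explains why 8GB of VRAM is a comfortable fit.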
Sure, the web chatbot and mobile app for DeepSeek are free and incredibly convenient. You don’t need to set anything up, and features like DeepThink and web search are baked right in. But there are a few reasons why running it locally might be a better choice:
Privacy: your prompts and conversations never leave your machine.
Offline Access: as shown above, the model keeps working even with no internet connection.
Future-Proofing: a free hosted service can later be rate-limited, changed, or put behind a paywall; a local copy is yours to keep.
Flexibility: you can script against the model or plug it into your own tools (see the API sketch right after this list).
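On the flexibility point, here is a minimal sketch of calling the model programmatically through Ollama's local REST API, assuming the default port 11434 and the deepseek-r1:8b tag pulled earlier:

curl http://localhost:11434/api/generate -d '{
  "model": "deepseek-r1:8b",
  "prompt": "Explain what a hash map is in one paragraph.",
  "stream": false
}'

With "stream": false, the reply comes back as a single JSON object with the generated text in its response field, which any script or UI can parse.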
At this point, it’s still unclear how DeepSeek handles user data. If you’re not too worried about data privacy, using the web or mobile app might be the way to go since they’re easier to use and offer features like DeepThink and web search. But if you’re someone who cares about where your data ends up, running the model locally is a good alternative to consider.
DeepSeek models are designed to run well even on hardware that isn’t super powerful. While larger models like DeepSeek-R1-Zero need distributed GPU setups, the distilled versions make it possible to run things smoothly on a single GPU with much lower requirements.
If you don’t like using the terminal, you can always add a simple UI with tools like Gradio or Chatbox AI. I’ll write a guide on setting that up in the next article. For now, I hope this post helps you get started. Let me know your thoughts, or if you’ve run into any issues, feel free to share them in the comments.