Ollama lets you run language models on your own computer: requests to a local model are processed on your machine, without sending text to a third-party cloud service. In this guide, we will install Ollama on macOS, Windows, or Linux, launch a first model, check the API and, if desired, add the Open WebUI graphical interface.

Current as of June 21, 2026. The latest stable release is Ollama v0.30.10, published on June 17. The interface and commands may change in future versions.

What is Ollama and why is it needed?

Ollama is a free tool for downloading and running open models locally. Its official library includes Qwen, Llama, DeepSeek, Gemma, Mistral, and other model families. Ollama manages model files, uses available GPU acceleration, and provides a local API at http://localhost:11434.

Once a local model has been downloaded, you do not need an Internet connection for ordinary chat. A connection is still required for installation, updates, and downloading models. In addition, Ollama offers cloud models: if privacy is a concern, choose a local model tag and check where the request is being sent.

System requirements: how much memory will be needed

The official documentation specifies operating system and GPU compatibility requirements, but does not specify a universal RAM or VRAM minimum. Memory consumption depends on model size, quantization, and context length. Therefore, the table below is a practical guide and not a guarantee of performance.

Configuration	Where to start	Comment
8 GB RAM	`llama3.2:1b` or `llama3.2:3b`	Suitable for learning the basics; close memory-heavy apps
16 GB RAM / unified memory	`qwen3:4b`, sometimes `qwen3:8b`	A practical minimum for compact models
32 GB	8B–14B models	More headroom for context and other running apps
32–64 GB or more	`qwen3:30b`, `qwen3-coder:30b`	The model file is about 19 GB, but additional memory is required
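
As a rough illustration of why a model's file size is only a lower bound on memory use, the sketch below estimates the size of the weights from the parameter count and the number of bits per weight. The function name and the 4-bit example are assumptions made for illustration; real quantized files mix precisions and carry metadata, and the context adds more memory on top, so treat the result as a floor rather than a prediction.

```python
def estimate_weights_gb(params_billion: float, bits_per_weight: float) -> float:
    """Rough size of the model weights in gigabytes (1 GB = 10**9 bytes).

    Illustrative helper, not part of Ollama. It ignores file-format overhead,
    context memory, and whatever the operating system itself needs.
    """
    if params_billion <= 0 or bits_per_weight <= 0:
        raise ValueError("both arguments must be positive")
    # (params_billion * 1e9 weights) * (bits_per_weight / 8 bytes) / 1e9 bytes per GB
    return params_billion * bits_per_weight / 8
```

For example, `estimate_weights_gb(8, 4)` gives 4.0, meaning an 8B model at roughly 4 bits per weight needs about 4 GB for the weights alone, before context and other running applications are counted.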

macOS: Requires macOS Sonoma 14 or later. Apple Silicon uses the GPU through Metal and shared memory; Intel Macs are supported in CPU mode.
Windows: Requires Windows 10 22H2 or later. The installer works without administrator rights.
NVIDIA: According to the official compatibility list, you need compute capability 5.0+ and an up-to-date driver; for recent versions, the documentation specifies driver 531+.
Disk: A Windows installation requires at least 4 GB, and models take up anywhere from hundreds of megabytes to tens or hundreds of gigabytes.

Step 1: Install Ollama on macOS

The simplest way is to open the official download page, download Ollama.dmg, move the application to the Applications folder, and run it. On first launch, Ollama will prompt you to add the ollama command to the system PATH.

Image: Ways to install Ollama on macOS and verify the version in Terminal. The official DMG is the simplest way to install Ollama on macOS, while the Terminal command is convenient for developers.

An alternative official method is to install with one command:

curl -fsSL https://ollama.com/install.sh | sh

Ollama is also available through Homebrew, but Homebrew is a third-party package manager. If you’re already using Homebrew:

brew install ollama

After installation, close and reopen Terminal, then check the version:

ollama --version

Step 2: Install Ollama on Windows

On Windows, download OllamaSetup.exe from the official page. In the v0.30.10 release, the installer is about 1.3 GB. It does not require administrator rights and installs in the user’s home directory by default.

Image: Installing Ollama on Windows with the installer or PowerShell. Install Ollama on Windows with OllamaSetup.exe or the official PowerShell command.

Official installation via PowerShell:

irm https://ollama.com/install.ps1 | iex

Once complete, open a new PowerShell window and run:

ollama --version

How to transfer models to another drive

If drive C is short on space, open “Edit environment variables for your account”, create a variable named OLLAMA_MODELS, and set it to a folder on another drive, for example D:\OllamaModels. After saving, quit Ollama completely from the system tray and launch it again. This procedure is described in the Ollama documentation for Windows.
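
Before pulling a large model, it can help to confirm which folder Ollama will use and how much space is free there. The sketch below is a minimal illustration, not an official tool: `models_dir` and `free_gb` are hypothetical helper names, and the fallback path `~/.ollama/models` is an assumed default that may differ on your system (check the Ollama documentation for your OS).

```python
import os
import shutil
from pathlib import Path
from typing import Mapping, Optional


def models_dir(env: Optional[Mapping[str, str]] = None) -> Path:
    """Return the folder named by OLLAMA_MODELS, or an assumed default."""
    env = os.environ if env is None else env
    custom = env.get("OLLAMA_MODELS")
    if custom:
        return Path(custom)
    # Assumption: default location when OLLAMA_MODELS is not set.
    return Path.home() / ".ollama" / "models"


def free_gb(path: Path) -> float:
    """Free space in GB on the drive that holds (or would hold) `path`."""
    probe = path
    # Walk up to the nearest existing folder so a not-yet-created path works.
    while not probe.exists() and probe != probe.parent:
        probe = probe.parent
    return shutil.disk_usage(probe).free / 1e9
```

Calling `free_gb(models_dir())` in a new terminal session after changing the variable shows whether the new location has room for the model you plan to download.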

Installation on Linux

For most Linux systems, the official project offers the same installation script:

curl -fsSL https://ollama.com/install.sh | sh

After installation, check the service with the commands ollama --version and systemctl status ollama. For NVIDIA, AMD, and experimental Vulkan support, check the current hardware support page.
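
If you want to run the same version check from a script, for example in a setup routine, the following sketch looks for the ollama executable on PATH and returns whatever `ollama --version` prints. The function name `ollama_version` is a hypothetical helper, and the exact wording of the version output is not assumed here.

```python
import shutil
import subprocess
from typing import Optional


def ollama_version(executable: str = "ollama") -> Optional[str]:
    """Return the text printed by `ollama --version`, or None if not found."""
    path = shutil.which(executable)
    if path is None:
        # Not on PATH: restart the terminal or revisit the installation step.
        return None
    result = subprocess.run(
        [path, "--version"], capture_output=True, text=True, timeout=10
    )
    # Some tools print version info to stderr, so fall back to it.
    return (result.stdout or result.stderr).strip() or None
```

A return value of `None` corresponds to the "command not found" case: the installation did not complete, or the current terminal session has not picked up the updated PATH.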

Step 3: Download and run the first model

The ollama run command downloads the model and then opens an interactive chat. On a computer with 8 GB RAM, start with the compact Llama 3.2:

ollama run llama3.2:3b

With 16 GB of memory, try qwen3:4b first, then qwen3:8b if the system has enough headroom:

ollama run qwen3:4b
ollama run qwen3:8b

Image: Comparison of local Ollama models by size and purpose. Start with a small model and move to a larger one only when you have enough spare RAM or unified memory.

Enter a question after the >>> prompt. To exit, type /bye or press Ctrl+D. Useful commands:

ollama list
ollama ps
ollama pull qwen3:8b
ollama rm qwen3:8b

ollama list — show downloaded models.
ollama ps — show models loaded into memory.
ollama pull — download or update the model.
ollama rm — delete the model from the disk.

Which models to choose in June 2026

Model	Download size	Suitable for
`llama3.2:3b`	2.0 GB	First launch, summarization, simple tasks
`qwen3:4b`	2.5 GB	Compact universal assistant
`qwen3:8b`	5.2 GB	Texts, analysis, multilingual tasks
`deepseek-r1:8b`	5.2 GB	Reasoning, mathematics and analytics
`qwen3:30b`	19 GB	More complex universal tasks
`qwen3-coder:30b`	19 GB	Programming and working with more context

The download sizes come from the official model cards for Qwen 3, Qwen3-Coder, DeepSeek-R1, and Llama 3.2. They are file sizes, not exact RAM consumption figures.
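
The starting points suggested in this guide can be written down as a small lookup. This is only a sketch of the article's rule of thumb: `suggest_model` is a hypothetical helper, and the thresholds, especially reserving qwen3:30b for 64 GB and up, are conservative assumptions rather than official requirements.

```python
def suggest_model(ram_gb: float) -> str:
    """Map installed RAM (or unified memory) in GB to a starting model tag."""
    if ram_gb <= 0:
        raise ValueError("ram_gb must be positive")
    if ram_gb < 16:
        return "llama3.2:3b"  # compact model for 8 GB machines
    if ram_gb < 32:
        return "qwen3:4b"     # try qwen3:8b afterwards if memory allows
    if ram_gb < 64:
        return "qwen3:8b"     # 8B to 14B models fit with headroom
    return "qwen3:30b"        # 19 GB file plus additional working memory
```

The returned tag can be passed straight to ollama run. If the model feels slow, step down a tier instead of up.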

Step 4: Check your local API

When Ollama is running, the API is available locally on port 11434. To list the installed models:

curl http://localhost:11434/api/tags
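
The same check can be made from Python using only the standard library. This is a minimal sketch that assumes Ollama is listening on the default address used in this guide; `model_names` and `list_local_models` are illustrative helper names, not part of any Ollama package.

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434"  # default local address from this guide


def model_names(tags_payload: dict) -> list:
    """Extract model names from the JSON returned by /api/tags."""
    return [model["name"] for model in tags_payload.get("models", [])]


def list_local_models(base_url: str = OLLAMA_URL, timeout: float = 5.0) -> list:
    """Ask a running Ollama server which models are downloaded."""
    with urllib.request.urlopen(f"{base_url}/api/tags", timeout=timeout) as resp:
        return model_names(json.load(resp))
```

Calling `list_local_models()` while Ollama is running should return the same tags that ollama list shows. A connection error usually means the server is not started.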

Example of a single request to a model:

curl http://localhost:11434/api/generate \
  -d '{"model":"qwen3:4b","prompt":"Explain what a local LLM is","stream":false}'
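
For use inside a script, the curl request above translates into a few lines of standard-library Python. The sketch assumes the default local address and a non-streaming request, in which case the generated text is returned in the `response` field of the JSON reply. The helper names are hypothetical.

```python
import json
import urllib.request


def build_generate_request(
    model: str, prompt: str, base_url: str = "http://localhost:11434"
) -> urllib.request.Request:
    """Prepare a non-streaming POST request for /api/generate."""
    body = json.dumps({"model": model, "prompt": prompt, "stream": False})
    return urllib.request.Request(
        f"{base_url}/api/generate",
        data=body.encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def generate(model: str, prompt: str, timeout: float = 120.0) -> str:
    """Send the prompt to a running Ollama server and return the answer text."""
    request = build_generate_request(model, prompt)
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        return json.load(resp)["response"]
```

With Ollama running and the model already pulled, `generate("qwen3:4b", "Explain what a local LLM is")` returns the answer as a string. The generous timeout allows for the first request, when the model still has to be loaded into memory.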

Step 5: Install Open WebUI

If the terminal is inconvenient, Open WebUI adds a familiar chat-style interface. First install Docker Desktop, then run the command from the official Open WebUI instructions:

docker run -d -p 3000:8080 \
  --add-host=host.docker.internal:host-gateway \
  -e OLLAMA_BASE_URL=http://host.docker.internal:11434 \
  -v open-webui:/app/backend/data \
  --name open-webui --restart always \
  ghcr.io/open-webui/open-webui:main

Image: Local Open WebUI interface connected to Ollama. After starting the container, open `http://localhost:3000` and create a local administrator account.

The :main tag is updated over time. For a stable working installation, Open WebUI recommends pinning a specific image version. Also, do not expose port 3000 to the Internet without authentication, HTTPS, and basic server hardening.

What to do if Ollama doesn’t work

Command not found: restart the terminal; on macOS, check for the link in /usr/local/bin.
The model is too slow: choose a smaller model, reduce the context, and close memory-hogging applications.
Using CPU instead of GPU: update the driver and check the graphics card against the official support list.
Not enough space: remove unneeded models with ollama rm or move the model folder to another drive.
Open WebUI doesn’t see Ollama: check whether http://localhost:11434/api/tags responds and whether OLLAMA_BASE_URL is set correctly.
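
The last check in the list above can be scripted. The sketch below reports whether a URL answers with HTTP 200; `is_reachable` is a hypothetical helper name, and the two URLs mentioned afterwards are the defaults used in this guide.

```python
import urllib.error
import urllib.request


def is_reachable(url: str, timeout: float = 2.0) -> bool:
    """Return True if the URL answers with HTTP 200 within the timeout."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError, ValueError):
        # Connection refused, timeout, HTTP error status, or malformed URL.
        return False
```

If `is_reachable("http://localhost:11434/api/tags")` is false, the problem is on the Ollama side. If it is true but `is_reachable("http://localhost:3000")` is false, look at the Open WebUI container instead.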

FAQ

Is Ollama completely free?

The tool itself is free. Each model has its own license that must be verified before commercial use. Cloud functions may also have separate terms and conditions.

Can I use Ollama without a video card?

Yes, local models can run on the CPU, but generation is usually noticeably slower. Start with a 1B–3B model.

Does Ollama work without the Internet?

Yes, once Ollama is installed and a local model has been downloaded. An Internet connection is still needed to download and update models, and for cloud functions.

Which model is better for the first launch?

With 8 GB of RAM, start with llama3.2:3b. With 16 GB, try qwen3:4b and then qwen3:8b if the system still has enough free memory.

Conclusion

For a first try, install Ollama using an official method, verify it with ollama --version, and launch a compact model. Don’t start with 30B models just because they look more powerful: a small model that fits entirely in available memory often gives a better, faster local experience.

The cover and step-by-step images are generic illustrative screens. They do not contain real accounts, keys, home directories, or other personal data.

How to Install Ollama in 2026: Step-by-Step Guide

