Running llama.cpp (compiled from source) on AMD Strix Halo (Ryzen AI Max+ 395)

Just a quick doc/note/tutorial for my own future reference.
Here's how to get llama.cpp running with Vulkan support on AMD Ryzen AI Max+ 395 (Strix Halo) based devices. I tried it on a Beelink GTR 9 Pro, but it should work on the Framework Desktop too.
Install a few prerequisites
sudo apt install amd-smi rocminfo glslc libvulkan-dev vulkan-tools mesa-vulkan-drivers clinfo
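Before building anything, it's worth confirming the Vulkan stack can actually see the iGPU. These are standard Vulkan/shaderc tools, nothing llama.cpp-specific:
vulkaninfo --summary
glslc --version
vulkaninfo should list the integrated Radeon GPU (the 8060S on the 395) exposed through Mesa's RADV driver. Note that amd-smi and rocminfo are ROCm utilities; they aren't needed for the Vulkan build, but they're handy for inspecting and monitoring the GPU.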
Download and build llama.cpp
Get it from GitHub:
git clone https://github.com/ggerganov/llama.cpp
cd llama.cpp
Build it with Vulkan support
cmake -B build \
    -DGGML_VULKAN=ON \
    -DCMAKE_BUILD_TYPE=Release
cmake --build build -j$(nproc)
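To double-check that the Vulkan backend actually made it into the binary, recent llama.cpp builds can print the devices they see (the --list-devices flag assumes a reasonably current checkout):
./build/bin/llama-cli --list-devices
This should report a Vulkan device for the Radeon iGPU rather than only the CPU backend.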
Run a local LLM model
Once it's built, we can run a model:
./build/bin/llama-cli \
    -m ~/.lmstudio/models/lmstudio-community/GLM-4.7-Flash-GGUF/GLM-4.7-Flash-Q4_K_M.gguf \
    -ngl 999 \
    -c 4096 \
    -t $(nproc) \
    --color \
    -p "Explain how Vulkan helps LLM inference on AMD GPUs."
