Portable and private AI for Windows PCs without GPUs
While reading Hacker News late one afternoon, I remembered an idea I'd had: put a Large Language Model (LLM) on a USB drive, plug it into any Windows machine, run it straight from the drive without copying anything to the laptop/desktop and without installing anything, then chat with it. A portable, private AI for non-technical Windows users. No internet connection or GPU needed. No login required, either.
Big thanks to llama.cpp, which makes this possible. How? Basically, I gathered the files needed to run llama-server, reused a batch script, and put them all in a folder on the USB drive. Now I can share it with family and friends.
Instructions to package a portable AI
The files mentioned here are the ones I have personally saved on my USB drive.
Step 1. Download a release from llama.cpp. For example, download llama-b5595-bin-win-cpu-x64.zip (the CPU-only Windows x64 build).
Step 2. Extract the ZIP file. These are the expected contents.
Step 3. Create a folder. Name it as you like; I named mine portable-ai.
Step 4. Copy the following files to the portable-ai folder.
ggml.dll
ggml-base.dll
ggml-cpu-x64.dll
ggml-rpc.dll
libcurl-x64.dll
libomp140.x86_64.dll
llama.dll
llama-server.exe
mtmd.dll
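If you want to double-check the copy, here is a small, optional Python sketch (the file names match the list above; the helper itself is mine, not part of llama.cpp) that reports anything missing from the folder:

```python
import pathlib

# Files from the llama.cpp release that llama-server needs (per the list above)
REQUIRED = [
    "ggml.dll", "ggml-base.dll", "ggml-cpu-x64.dll", "ggml-rpc.dll",
    "libcurl-x64.dll", "libomp140.x86_64.dll", "llama.dll",
    "llama-server.exe", "mtmd.dll",
]

def missing_files(folder):
    """Return the names from REQUIRED that are not present in `folder`."""
    folder = pathlib.Path(folder)
    return [name for name in REQUIRED if not (folder / name).exists()]
```

An empty result means the folder is complete.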
Step 5. Download a Large Language Model (the AI itself). Below is a list of AI models with their download links. Choose a model and download the file that has “Q4_K_M” in its name. After downloading, save the model file to the portable-ai folder as well.
| AI Model | Download |
|---|---|
| gemma-2-2b-it-abliterated | Link |
| SmolLM2-1.7B-Instruct | Link |
| Qwen_Qwen3-1.7B | Link |
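After the download finishes, you can optionally sanity-check the file: a valid GGUF model file begins with the 4-byte magic "GGUF", while a failed download (for example, a saved HTML error page) does not. A tiny Python sketch (the helper name is mine):

```python
def looks_like_gguf(path):
    """True if the file starts with the GGUF magic bytes, i.e. b'GGUF'."""
    with open(path, "rb") as f:
        return f.read(4) == b"GGUF"
```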
Step 6. Create a batch script file (a file with a .bat extension). To do this, open Notepad, then copy and paste the code below. Save the file as app_launch_AI.bat. Note that when saving, the file type SHOULD NOT BE “.txt”: in the “Save as type” dropdown, select “All Files”, as shown in the image below. This file should also be in the portable-ai folder.
This script starts llama-server and opens the browser to the llama.cpp UI. Set the ai_model variable in the script to the file name of the model you downloaded: Qwen_Qwen3-1.7B-Q4_K_M.gguf, gemma-2-2b-it-abliterated-Q4_K_M.gguf, or smollm2-1.7b-instruct-q4_k_m.gguf (the script currently uses the Qwen file). Here’s the code:
@echo off
setlocal enabledelayedexpansion
set ai_model=Qwen_Qwen3-1.7B-Q4_K_M.gguf
:: Start the server in a separate window
start "Local AI server" llama-server.exe -m %ai_model% --ctx-size 4096
echo Waiting for server to start...
set max_retries=15
set retry_delay=2
:: Check if server is ready
set server_ready=0
for /l %%i in (1,1,%max_retries%) do (
    >nul 2>&1 powershell -command "$response = try { Invoke-WebRequest http://localhost:8080/ -UseBasicParsing -DisableKeepAlive -TimeoutSec 1 } catch {}; if ($response.StatusCode -eq 200) { exit 0 } else { exit 1 }"
    :: !errorlevel! with delayed expansion is required here; %errorlevel% would be
    :: expanded once when the loop is parsed and never reflect the check above
    if !errorlevel! equ 0 (
        set server_ready=1
        goto server_up
    )
    timeout /t %retry_delay% /nobreak >nul
    :: The parentheses are escaped so they do not close the for-loop block
    echo Checking... ^(Attempt %%i/%max_retries%^)
)
:server_up
if %server_ready% equ 1 (
    echo Server is ready! Opening browser...
) else (
    echo WARNING: Server didn't respond after %max_retries% attempts
    echo Opening browser anyway...
)
start "" "http://localhost:8080/"
endlocal
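For readers more at home in Python, the readiness check above boils down to this polling loop (a sketch, not part of the package; the port and retry counts mirror the batch script):

```python
import time
import urllib.request
import urllib.error

def wait_for_server(url, max_retries=15, retry_delay=2):
    """Poll `url` until it answers HTTP 200; return True on success."""
    for attempt in range(1, max_retries + 1):
        try:
            with urllib.request.urlopen(url, timeout=1) as resp:
                if resp.status == 200:
                    return True
        except (urllib.error.URLError, OSError):
            pass  # server not up yet; keep retrying
        time.sleep(retry_delay)
        print(f"Checking... (Attempt {attempt}/{max_retries})")
    return False
```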
Step 7. The portable-ai folder should have these 11 items. Copy the folder to your USB drive.
app_launch_AI.bat
ggml.dll
ggml-base.dll
ggml-cpu-x64.dll
ggml-rpc.dll
libcurl-x64.dll
libomp140.x86_64.dll
llama.dll
llama-server.exe
mtmd.dll
Qwen_Qwen3-1.7B-Q4_K_M.gguf
Step 8. Double-click app_launch_AI.bat to run the AI. This automatically opens the browser with the user interface to the AI.
Important notes
Item 1. When the AI starts, a command prompt window will appear. Do not close this window while using the AI; close it only when you are done.
Item 2. The user interface opens in the browser at http://localhost:8080. It stays available until the command prompt window (where the server is running) is closed. It looks like this:
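One more note for technical users: while the server is running, llama-server also exposes an OpenAI-compatible HTTP API, so you can script against the AI instead of using the browser UI. A minimal Python sketch (assumes the default localhost:8080 address from the batch script; the helper name is mine):

```python
import json
import urllib.request

def chat(prompt, url="http://localhost:8080/v1/chat/completions"):
    """Send one chat message to the local llama-server and return the reply text."""
    payload = {"messages": [{"role": "user", "content": prompt}]}
    req = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        body = json.load(resp)
    # The response follows the OpenAI chat-completions shape
    return body["choices"][0]["message"]["content"]
```

For example, `chat("Hello!")` returns the model's reply as a string.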