Welcome

Cometforge NPC — Documentation

Give your game’s NPCs persistent memory and a real LLM brain. Drop a component on a pawn, call one Blueprint node, and the character remembers the player across sessions instead of resetting every conversation. Choose where the brain runs — fully on-device, a cloud API, or your own server.

Requirements

Unreal Engine 5.7
Windows (Win64). On-Device mode uses Vulkan.

1. Install & enable

Installed from Fab, the plugin lives under Engine/Plugins/Fab/CometforgeNPC. Enable it in Edit → Plugins → AI → Cometforge NPC, then restart the editor.

2. Pick a brain

Project Settings → Plugins → Cometforge NPC

On-Device (default) — works out of the box. The bundled Ministral 3 3B Instruct (Q4_K_M, ~2 GB, Apache-2.0) ships under the plugin’s Content/Models and is found automatically. Press Play and talk to an NPC. To use a different model, set Model Path to any llama.cpp-compatible GGUF. One model serves the whole cast.
Cloud — set the base URL + API key of any OpenAI-compatible provider.
BYOM — point the base URL at your own server (LM Studio, llama-server). No key needed.

Mode	Model runs	Needs internet	Ships in the game
On-Device (bundled GGUF)	inside the game	No	Yes — model included
Cloud API	a remote server	Yes	No (connector only)
BYOM (your server)	your machine/server	LAN/local	No (connector only)

3. Use it (Blueprint)

Get the Cometforge Subsystem (a Game Instance Subsystem).
Call Talk — NpcId, System (the character prompt), Speaker, Text. It recalls that NPC’s memories, generates a reply, and records the exchange. With Auto Start on (default), you never call Load Model yourself.
Or drop the Cometforge Dialogue widget into your UI for a working chat box with zero node wiring — open it from any interact event with Open Dialogue.

Core nodes: Talk, Remember, Recall, Open Dialogue.

4. Memory & save games

Project Settings → Cometforge NPC → Memory Persistence

Persistent (default) — one world memory, autosaved continuously. NPCs never forget, across sessions, reloads, and slots. Zero config.
Save Game — memory follows your save slots. Call Save Memory To Slot beside your own Save Game to Slot node (same slot name), and Load Memory From Slot beside Load Game from Slot. Reload an earlier save and each NPC knows exactly what it knew then. Stored as <SlotName>_CometforgeMemory.

Extra nodes: Reset All Memory (your “New Game” node), Delete Memory In Slot, and Get / Restore Memory Snapshot.

Known limitation

Core memory storage, persistence, and per-turn memory injection are deterministic and run with every model in every mode. Model-initiated tool use (an NPC deciding on its own to recall a memory or take an action) depends on the chosen model’s training — treat it as a bonus, not a guarantee, especially at small sizes. For Cloud/BYOM, a reasoning-class model performs best.

Third-party & licensing

llama.cpp + ggml — MIT (Source/ThirdParty/LlamaCpp)

Support

Questions or studio partnerships: travis@tntholley.com