Give your game’s NPCs persistent memory and a real LLM brain. Drop a component on a pawn, call one Blueprint node, and the character remembers the player across sessions instead of resetting every conversation. Choose where the brain runs — fully on-device, a cloud API, or your own server.
Requirements
- Unreal Engine 5.7
- Windows (Win64). On-Device mode uses Vulkan.
1. Install & enable
Installed from Fab, the plugin lives under Engine/Plugins/Fab/CometforgeNPC. Enable it in Edit → Plugins → AI → Cometforge NPC, then restart the editor.
2. Pick a brain
Project Settings → Plugins → Cometforge NPC
- On-Device (default) — works out of the box. The bundled Ministral 3 3B Instruct (Q4_K_M, ~2 GB, Apache-2.0) ships under the plugin’s `Content/Models` and is found automatically. Press Play and talk to an NPC. To use a different model, set Model Path to any llama.cpp-compatible GGUF. One model serves the whole cast.
- Cloud — set the base URL + API key of any OpenAI-compatible provider.
- BYOM — point the base URL at your own server (LM Studio, llama-server). No key needed.
| Mode | Model runs | Needs internet | Ships in the game |
|---|---|---|---|
| On-Device (bundled GGUF) | inside the game | No | Yes — model included |
| Cloud API | a remote server | Yes | No (connector only) |
| BYOM (your server) | your machine/server | LAN/local | No (connector only) |
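To make the Cloud and BYOM rows concrete, here is a minimal sketch of the request an OpenAI-compatible connector typically prepares. It is not the plugin's code: `ChatRequest` and `BuildChatRequest` are hypothetical names, and the sketch assumes the base URL already ends in the API version segment (such as `/v1`) and that the server exposes the usual `chat/completions` route. It uses plain standard C++ so it compiles outside the engine.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Hypothetical container for what a connector would send; not a plugin type.
struct ChatRequest {
    std::string Url;
    std::vector<std::pair<std::string, std::string>> Headers;
};

// Builds the URL and headers for an OpenAI-style chat completions call.
// Assumption: BaseUrl already ends in the API version segment (e.g. "/v1").
ChatRequest BuildChatRequest(const std::string& BaseUrl, const std::string& ApiKey) {
    assert(!BaseUrl.empty() && "a base URL is required for Cloud and BYOM");

    // Drop trailing slashes so the joined path never contains "//".
    std::string Base = BaseUrl;
    while (!Base.empty() && Base.back() == '/') {
        Base.pop_back();
    }

    ChatRequest Request;
    Request.Url = Base + "/chat/completions";
    Request.Headers.emplace_back("Content-Type", "application/json");

    // Cloud providers expect a bearer token; a local server usually needs none.
    if (!ApiKey.empty()) {
        Request.Headers.emplace_back("Authorization", "Bearer " + ApiKey);
    }
    return Request;
}
```

The only difference between the two modes in this sketch is the `Authorization` header: an empty key (BYOM) leaves it out, a non-empty key (Cloud) adds it. The address you pass in is whatever your provider or local server documents.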
3. Use it (Blueprint)
- Get the Cometforge Subsystem (a Game Instance Subsystem).
- Call Talk with `NpcId`, `System` (the character prompt), `Speaker`, and `Text`. It recalls that NPC’s memories, generates a reply, and records the exchange. With Auto Start on (default), you never call Load Model yourself.
- Or drop the Cometforge Dialogue widget into your UI for a working chat box with zero node wiring, and open it from any interact event with Open Dialogue.
Core nodes: Talk, Remember, Recall, Open Dialogue.
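The recall, generate, record cycle that Talk performs can be pictured with a small conceptual model. This is a sketch under stated assumptions, not the plugin's implementation: `MemoryStore`, `Generator`, `BuildPrompt`, and this free-standing `Talk` function are hypothetical stand-ins, the model call is a caller-supplied callback, and every stored line is recalled, whereas the real plugin may select memories differently.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <string>
#include <vector>

// Hypothetical stand-in for per-NPC memory: NpcId -> remembered lines.
using MemoryStore = std::map<std::string, std::vector<std::string>>;

// Hypothetical stand-in for the model: takes a prompt, returns a reply.
using Generator = std::function<std::string(const std::string& Prompt)>;

// Places the character prompt first, then recalled memories, then the new line.
std::string BuildPrompt(const std::string& System,
                        const std::vector<std::string>& Memories,
                        const std::string& Speaker,
                        const std::string& Text) {
    std::string Prompt = System + "\n";
    for (const std::string& Memory : Memories) {
        Prompt += "[memory] " + Memory + "\n";
    }
    Prompt += Speaker + ": " + Text;
    return Prompt;
}

// One conversational turn: recall, generate, record.
std::string Talk(MemoryStore& Store, const Generator& Generate,
                 const std::string& NpcId, const std::string& System,
                 const std::string& Speaker, const std::string& Text) {
    assert(!NpcId.empty() && "each NPC needs its own id");

    // Recall: memories are keyed by NpcId, so NPCs never share history.
    std::vector<std::string>& Memories = Store[NpcId];

    // Generate: the model sees the memories on every turn.
    const std::string Reply = Generate(BuildPrompt(System, Memories, Speaker, Text));

    // Record: both sides of the exchange become memories for later turns.
    Memories.push_back(Speaker + ": " + Text);
    Memories.push_back(NpcId + ": " + Reply);
    return Reply;
}
```

Because memories are injected into the prompt on every call, the second turn with the same `NpcId` already contains the first exchange, with no decision required from the model.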
4. Memory & save games
Project Settings → Plugins → Cometforge NPC → Memory Persistence
- Persistent (default) — one world memory, autosaved continuously. NPCs never forget, across sessions, reloads, and slots. Zero config.
- Save Game — memory follows your save slots. Call Save Memory To Slot beside your own Save Game to Slot node (same slot name), and Load Memory From Slot beside Load Game from Slot. Reload an earlier save and each NPC knows exactly what it knew then. Stored as `<SlotName>_CometforgeMemory`.
Extra nodes: Reset All Memory (your “New Game” node), Delete Memory In Slot, and Get / Restore Memory Snapshot.
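Save Game mode amounts to snapshotting NPC memory under a name derived from your slot. The sketch below models that idea in plain standard C++. Only the `<SlotName>_CometforgeMemory` naming comes from this README; `SaveBackend`, `MemorySlotName`, `SnapshotMemoryToSlot`, and `RestoreMemoryFromSlot` are hypothetical names, and an in-memory map stands in for files on disk.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Hypothetical stand-in for per-NPC memory: NpcId -> remembered lines.
using MemoryStore = std::map<std::string, std::vector<std::string>>;

// Hypothetical stand-in for the save directory: storage name -> snapshot.
struct SaveBackend {
    std::map<std::string, MemoryStore> Slots;
};

// Derives the memory storage name from the game's own slot name.
std::string MemorySlotName(const std::string& SlotName) {
    assert(!SlotName.empty() && "use the same slot name as your Save Game");
    return SlotName + "_CometforgeMemory";
}

// Copies the live memory into the slot, replacing any earlier snapshot.
void SnapshotMemoryToSlot(SaveBackend& Disk, const MemoryStore& Live,
                          const std::string& SlotName) {
    Disk.Slots[MemorySlotName(SlotName)] = Live;
}

// Replaces the live memory with the slot's snapshot.
// Returns false and leaves Live untouched if the slot holds no memory.
bool RestoreMemoryFromSlot(const SaveBackend& Disk, MemoryStore& Live,
                           const std::string& SlotName) {
    const auto Found = Disk.Slots.find(MemorySlotName(SlotName));
    if (Found == Disk.Slots.end()) {
        return false;
    }
    Live = Found->second;
    return true;
}
```

Restoring replaces the live memory wholesale, which is why reloading an earlier save rolls each NPC back to what it knew at that point, dropping anything learned afterwards.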
Known limitation
Core memory storage, persistence, and per-turn memory injection are deterministic and run with every model in every mode. Model-initiated tool use (an NPC deciding on its own to recall a memory or take an action) depends on the chosen model’s training — treat it as a bonus, not a guarantee, especially at small sizes. For Cloud/BYOM, a reasoning-class model performs best.
Third-party & licensing
- llama.cpp + ggml — MIT (`Source/ThirdParty/LlamaCpp`)
- Ministral 3 3B Instruct — Apache-2.0, © Mistral AI (`Content/Models`, with NOTICE)
Support
Questions or studio partnerships: travis@tntholley.com