First Inference in Unreal

Unpack a Tryll release, drop the plugin and the server into your Unreal project, configure one project setting, and watch a single Generate node print its answer to the screen during PIE — no chat UI required.

What you'll build

A blank actor with a one-node workflow that, on BeginPlay, connects to a locally-spawned Tryll server, downloads a model on first run, runs one inference turn with the prompt "In one sentence: what is Tryll?", and prints the answer to the viewport and the Output Log.

Before you start

  • Unreal Engine 5.7 and a blank C++ project (Blueprint-only projects can't compile the plugin source).
  • ~5 GB free disk space for the auto-downloaded model and a working internet connection on first run.
  • Familiarity with creating a Blueprint actor and binding events.

You do not need a separate Tryll server install — the Unreal plugin spawns the server for you.


Step 1 — Get the distribution

  1. Download tryll-<version>.7z from the releases page and extract it somewhere temporary (e.g. C:\Downloads\tryll\).
  2. You'll get two sibling folders:

    tryll/
    ├── server/
    │   ├── tryll_server.exe
    │   ├── llama.dll, ggml*.dll, ggml-vulkan.dll
    │   └── data/
    │       ├── server-config.json
    │       ├── models.json
    │       ├── default-canned-responses.txt
    │       └── default-guardrail-patterns.txt
    └── UnrealPlugin/
        ├── TryllClient.uplugin
        ├── Generated/messages_generated.h
        └── Source/
    

The archive ships no model weights — the server lazy-downloads whatever the graph references, into .app-data/models/ next to the exe.
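
For context, each entry in data/models.json pairs a display name (the exact string you'll later put in Default Model Name) with a download source. The fragment below is an illustrative sketch, not the shipped schema; the field names, filename, and URL are all placeholders:

```json
[
  {
    "name": "Gemma 3 4B Instruct (Q4_K_M)",
    "file": "gemma-3-4b-it-Q4_K_M.gguf",
    "url": "https://example.com/models/gemma-3-4b-it-Q4_K_M.gguf",
    "approxDownloadBytes": 3221225472
  }
]
```

The point to internalize: the catalog is keyed by display name, which is why Default Model Name in Step 5 must match an entry character-for-character.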

How the archive is built

ci/scripts/release.py builds the server (Stage A), stages the Unreal plugin source with FlatBuffers codegen (Stage B), builds the docs site (Stage C), and zips everything (Stage D). The server's data/server-config.json is patched with production overrides and models.json is filtered to the shipping audience.


Step 2 — Install the Unreal plugin

Copy the UnrealPlugin/ folder into your project as <YourProject>/Plugins/TryllClient/:

<YourProject>/
├── Plugins/
│   └── TryllClient/
│       ├── TryllClient.uplugin
│       ├── Generated/messages_generated.h
│       └── Source/TryllClient/{Public,Private}/
└── <YourProject>.uproject

Then:

  1. Right-click <YourProject>.uproject → Generate Visual Studio project files.
  2. Open the solution and build in Development Editor.
  3. Launch the editor. Edit → Plugins → Project lists Tryll Client as enabled.

Step 3 — Place the server inside your project

The plugin's auto-launcher runs tryll_server.exe with CWD set to the exe's own folder, so wherever you put it, the data/ folder must sit next to it, and .app-data/models/ (gigabytes of GGUF weights) will be created in the same folder.

Recommended: copy server/ into the project as a top-level Tryll/ folder. It's project-relative, easy to gitignore, and not touched by Unreal cleans:

<YourProject>/
├── Plugins/TryllClient/
├── Tryll/                     <-- copy server/* here
│   ├── tryll_server.exe
│   ├── llama.dll, ggml*.dll, ggml-vulkan.dll
│   └── data/
└── <YourProject>.uproject

Add this to .gitignore so the multi-GB model cache stays out of git:

/Tryll/.app-data/
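
Before launching the editor, you can sanity-check the layout from the project root. A minimal sketch, assuming the recommended Tryll/ placement; the helper and its file list are ours, derived from the tree above, not part of the plugin:

```python
from pathlib import Path

# Files that must sit next to tryll_server.exe, per the layout above.
REQUIRED = [
    "tryll_server.exe",
    "data/server-config.json",
    "data/models.json",
]

def check_server_layout(root):
    """Return the required entries missing under `root` (e.g. <YourProject>/Tryll)."""
    root = Path(root)
    return [rel for rel in REQUIRED if not (root / rel).exists()]
```

Run it against your Tryll/ folder after copying; an empty result means the server will find its data/ folder when the auto-launcher sets the CWD.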

Other placement options

  • A (recommended): <YourProject>/Tryll/. Top-level, project-relative, not wiped by Unreal cleans.
  • B: <YourProject>/ThirdParty/Tryll/. Same idea under a "third-party tools" convention.
  • C: <YourProject>/Plugins/TryllClient/Server/. One folder to copy, but bloats the plugin with binaries and the model cache, and complicates packaging.
  • D: <YourProject>/Binaries/Tryll/. Binaries/ is regenerated by UBT and often gitignored, so your downloaded models can disappear on a clean.
  • E: outside the project (e.g. C:\tryll\). Shared across projects/machines, but each developer needs the same absolute path.

The rest of this tutorial assumes option A.


Step 4 — Configure Tryll Client in Project Settings

Open Edit → Project Settings → Plugins → Tryll Client and set:

  • Auto Launch Server: true (default). The subsystem spawns the server on Initialize.
  • Editor Server Exe Path: Tryll/tryll_server.exe. Resolved relative to the project directory; leaving it empty disables editor auto-launch.
  • Build Server Exe Path: tryll_server.exe (default). Used in packaged builds, relative to the game exe.
  • Allow Auto Model Downloading: true. First-inference convenience: Create Agent will download missing models for you.
  • Connect Max Attempts / Connect Retry Delay Seconds: defaults. The subsystem retries while the freshly-spawned server warms up.

Settings are persisted to Config/DefaultGame.ini under [/Script/TryllClient.TryllRuntimeSettings].
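
After saving, the section looks roughly like the fragment below. The key names are assumptions based on Unreal's usual property-to-ini naming; open your own DefaultGame.ini for the authoritative spelling:

```ini
[/Script/TryllClient.TryllRuntimeSettings]
bAutoLaunchServer=True
EditorServerExePath=Tryll/tryll_server.exe
BuildServerExePath=tryll_server.exe
bAllowAutoModelDownloading=True
```

Editing this file by hand and editing Project Settings are equivalent; the editor reads and writes the same section.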

Allow Auto Model Downloading is a development flag

It removes friction for the first-inference loop, but it lets a CreateAgent call block for minutes while a model downloads. For shipped builds, drive downloads explicitly with Request Download Model and a progress UI. See Enable Auto Model Downloading.


Step 5 — Add the demo actor

  1. Create a blank Blueprint actor — call it BP_TryllDemoActor.
  2. Add a Tryll Agent Component to it. In the Details panel you'll see Workflow Asset, Inline Graph Description, Auto Create On Connect, and Enable Diagnostics.
  3. Leave Workflow Asset empty and expand Inline Graph Description:
    • Under Nodes, add one entry. Set Name = answer, Type = Generate. Leave Params empty.
    • Under Routes, add one entry: Source Node = answer, Exit Name = default, Target Node = END.
    • Set Start Node = answer.
    • Set Default Model Name to a catalog entry — spelling must match Tryll/data/models.json exactly. Recommended: Gemma 3 4B Instruct (Q4_K_M) (~3 GB download). Smaller / quicker: Llama 3.2 3B Instruct (Q4_K_M) (~2 GB download).
  4. Un-check Auto Create On Connect — this tutorial wires the lifecycle explicitly so we can call Configure Session before Create Agent. Auto-create fires on connect, which would race ahead of session configuration.
  5. Drop BP_TryllDemoActor into the level.
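
Conceptually, the inline graph you just entered is a tiny data structure: one node, one route, one start pointer. Sketched as JSON purely for illustration (the component stores this as Unreal properties, not a file, and the field names here are ours):

```json
{
  "startNode": "answer",
  "defaultModelName": "Gemma 3 4B Instruct (Q4_K_M)",
  "nodes": [
    { "name": "answer", "type": "Generate", "params": {} }
  ],
  "routes": [
    { "sourceNode": "answer", "exitName": "default", "targetNode": "END" }
  ]
}
```

Reading it this way makes the Details-panel entries less mysterious: every route is just (source node, exit name, target node), and END is the built-in terminal target.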

Step 6 — Wire the Blueprint

Open BP_TryllDemoActor's Event Graph.

flowchart LR
    BP[Event BeginPlay] --> GS[Get Tryll Subsystem]
    GS --> B1["Bind OnConnectionChanged"]
    B1 --> B2["Bind OnConfigureSessionComplete"]
    B2 --> B3["Bind OnAgentReady"]
    B3 --> B4["Bind OnAnswerText"]
    B4 --> B5["Bind OnTurnComplete"]
    B5 --> Conn[Connect]
    OCC["OnConnectionChanged<br>bConnected=true"] --> CS["Configure Session<br>Engine=LlamaCpp<br>bAllowAutoModelDownloading=true"]
    OCCS["OnConfigureSessionComplete<br>bSuccess=true"] --> CA["Create Agent"]
    OAR[OnAgentReady] --> SM["Send Message<br>In one sentence: what is Tryll?"]
    OAT[OnAnswerText] --> P1["Print String: Text"]
    OTC[OnTurnComplete] --> P2["Print String: Done."]

Or in text:

  1. On Event BeginPlay:
    • Get Game Instance → Get Subsystem (Tryll Subsystem). Store in a variable Tryll.
    • Bind On Connection Changed to HandleConnection.
    • Call Tryll → Connect.
  2. In HandleConnection(bConnected):
    • If bConnected = true, bind On Configure Session Complete to HandleConfigured, then call Tryll → Configure Session with Engine = LlamaCpp and bAllowAutoModelDownloading = true.
  3. In HandleConfigured(bSuccess):
    • If bSuccess = true, call Create Agent on the TryllAgentComponent of this actor.
  4. On the component, bind On Agent Ready to a custom event that calls Send Message with "In one sentence: what is Tryll?".
  5. Bind On Answer Text (Text, bIsFinal) to Print String with Print to Screen = true, Print to Log = true, Duration = 10. Wire Text into the In String pin.
  6. Bind On Turn Complete (Status, DebugInfo, TokensGenerated) to another Print String that prints the literal "Done." (or use Enum to String on Status).

Why not just tick Auto Create On Connect?

Because it calls CreateAgent on the connection event, which fires before ConfigureSession has run. The server rejects the request with RequiresConfiguredSession. Manual wiring is the right pattern any time you need to call ConfigureSession with non-default arguments — like the auto-download flag.


Step 7 — Play

Press Play. On the first run you'll see:

  1. A few seconds while the server starts and the plugin connects.
  2. A long pause (minutes, depending on your network) while the model downloads — there's no progress bar in this minimal setup, but LogTryll reports DownloadProgress lines.
  3. The model loads from disk (a few more seconds the first time it's used in a session).
  4. The answer prints to the viewport and the Output Log:

    LogBlueprintUserMessages: [BP_TryllDemoActor] Tryll is a local small-language-model inference server.
    LogBlueprintUserMessages: [BP_TryllDemoActor] Done.
    

The first line is the single AnswerText frame for this turn (the Generate node defaults to non-streaming); the second is your OnTurnComplete handler. Set stream="true" on the node's Params for token-by-token output.

Subsequent runs skip the download and load steps — the model stays in the server process, and the .app-data/models/ cache survives across PIE sessions.


Where to see what's happening

Open Window → Output Log and use the Categories filter:

  • LogTryllServer: tryll_server.exe stdout, drained by the subsystem and re-emitted in Unreal. Look here for Listening on 0.0.0.0:9100, SessionReady, model load lines, and server-side errors.
  • LogTryll: the plugin itself (subsystem, agent component, connection thread). Connect attempts, ConfigureSession outcome, agent lifecycle, DownloadProgress.
  • LogBlueprintUserMessages: Print String calls. The answer text and your "Done." marker.

You don't need a separate console window for the server — everything funnels through the Unreal Output Log.

A rotating server log file is also written to <YourProject>/Tryll/.app-data/logs/tryll.log (production log mode is the default in the shipped server-config.json).
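
If you prefer watching that file directly, a small polling reader is enough; this sketch is ours and not part of the plugin:

```python
def read_new_lines(path, offset):
    """Return (new_lines, new_offset): lines appended to `path` since byte `offset`.

    Call it in a loop (e.g. every half second), feeding each call the offset
    returned by the previous one, to tail the rotating server log.
    """
    with open(path, "r", encoding="utf-8", errors="replace") as f:
        f.seek(offset)
        lines = f.readlines()
        return lines, f.tell()
```

Start with offset 0 (or the current file size to skip history) and print whatever comes back; when the server rotates the file, restart from offset 0.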


What you built

  • A session connected to a Tryll server that the plugin spawned for you.
  • One agent with a one-node graph.
  • Model output surfaced to Blueprint as OnAnswerText events (flip stream="true" on the node to get token-by-token chunks).

Everything the plugin exposes in Blueprint — connection, session, agents, models, string storages, tool calls — is listed in the Blueprint Catalog.

Troubleshooting

  • Tryll server exe not found at '...' in LogTryll/Error — Editor Server Exe Path doesn't point at your copy. Remember it's resolved relative to the project directory; with the recommended layout it should be Tryll/tryll_server.exe.
  • OnConnectionChanged(false) fires repeatedly — the server crashed or didn't start. Check LogTryllServer for the failure reason and Tryll/.app-data/logs/tryll.log as a fallback.
  • OnConfigureSessionComplete(false) — usually a server-side error reading models.json or initialising the inference engine. Check LogTryllServer.
  • First CreateAgent hangs for minutes — that's the model downloading because Allow Auto Model Downloading is on. Watch LogTryll for DownloadProgress lines, or pre-download from a small UI using Request Download Model instead.
  • OnError fires during CreateAgent — read ErrorMessage. Common causes: Default Model Name doesn't match any entry in models.json, or a route targets a node name that doesn't exist.
  • CreateAgent takes several seconds even after the model is downloaded — normal on the first agent using a given model in a session; the server is loading it from disk into RAM/VRAM. Subsequent agents on the same model are instant.
  • Nothing prints after Send Message — bind both On Answer Text and On Turn Complete. With the default stream="false", OnAnswerText fires exactly once per turn with the full reply, immediately followed by OnTurnComplete.