Getting Started

In about fifteen minutes you will run the Tryll server, connect a client, and stream a reply from a local language model. Pick the client that matches your project: the C++ demo, the Unreal plugin, or the Python client.

Pre-flight checklist

Before you start, have these ready:

  • A running Tryll server listening on 127.0.0.1:9100. The C++ demo (tryll_test_chat) and the Unreal plugin can start one for you automatically — see Auto-launch the Server. For Python, or to run the server manually, see Run the Tryll Server. A quick way to confirm the server is reachable is sketched after this list.
  • At least one language model available. The server reads its catalog from data/models.json. The tutorials pick a small default you can either download via DownloadModel or register by path — see Use Your Own Local Model.
  • Disk space. A small quantised chat model (3B–7B parameters) typically takes 1.5–5 GB on disk, plus a similar amount of RAM or VRAM to load it.
  • A modern Windows machine. Linux and macOS work for the Python client, but the pre-built server currently ships only for Windows x64; other platforms require building from source.
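Before starting a tutorial, you can confirm that something is actually listening on the documented address with a plain TCP connect. This sketch uses only the Python standard library and assumes nothing about the Tryll wire protocol; the host and port come straight from the checklist above.

```python
import socket

def tryll_server_reachable(host: str = "127.0.0.1", port: int = 9100,
                           timeout: float = 2.0) -> bool:
    """Return True if something accepts TCP connections at host:port."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:  # covers connection refused and timeouts
        return False

if __name__ == "__main__":
    if tryll_server_reachable():
        print("Tryll server is reachable on 127.0.0.1:9100")
    else:
        print("Nothing is listening on 127.0.0.1:9100; see Run the Tryll Server")
```

A successful connect only proves a process is bound to the port; the tutorials verify the rest by sending a real request.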

What you will build

Each tutorial ends at the same milestone: the user sends one message, the server runs it through a minimal graph (just a Generate node), and the client prints the streamed reply. Everything else — RAG, tool calls, guardrails — builds on top of this pattern.
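To make that milestone concrete, here is a minimal sketch of the loop in Python. It is illustrative only: the `tryll` module name, the `Client` class, and the streaming `chat` call are hypothetical stand-ins, not the actual Python client API; follow the Python tutorial for the real calls.

```python
# Hypothetical sketch of the Getting Started milestone. The real Python
# client API may differ; follow the Python tutorial for the actual calls.
from tryll import Client  # assumed module and class name, not confirmed by these docs

client = Client("127.0.0.1:9100")  # the address from the pre-flight checklist

# One message in, one streamed reply out: the server routes the prompt
# through a minimal graph consisting of a single Generate node.
for token in client.chat("Hello, Tryll!"):  # assumed streaming iterator
    print(token, end="", flush=True)
print()
```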

After the tutorial

The next step depends on what you want to build; you will find good starting points in How-to Guides.

If you want the why before the how, start with Architecture at a Glance.