First Inference in C++

Build the C++ client library from source, run the bundled test-chat demo, and see streamed tokens on stdout.

Before you start

  • Visual Studio 2022 with the C++ workload, CMake 3.24+, and Git.
  • At least one language model registered in the server's models.json.
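
A hypothetical models.json entry is sketched below. The field names are illustrative only (the server defines the real schema); the model name matches the demo output later on this page, and the .gguf path assumes a llama.cpp-style model file:

{
  "models": [
    {
      "name": "My Local Model",
      "path": "C:/models/my-model.gguf"
    }
  ]
}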

No need to start the server manually

tryll_test_chat discovers and starts tryll_server.exe automatically from the build directory. If you prefer to manage the server yourself, pass --no-managed-server when launching the demo.

Step 1 — build the tree

git clone <tryll-repo-url>
cd tryll\server
cmake --preset default
cmake --build --preset debug

This builds the server, the C++ client library (tryll_client), and the tryll_test_chat console demo. The default preset uses the Visual Studio generator with the prebuilt Vulkan backend, so no GPU SDK is required. For a CPU-only build, add -DTRYLL_VULKAN=OFF.
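
CMake applies command-line -D overrides on top of the preset's cache variables, so a CPU-only configure and build is:

cmake --preset default -DTRYLL_VULKAN=OFF
cmake --build --preset debug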

Step 2 — run the demo

build\test-chat\Debug\tryll_test_chat.exe

The demo automatically discovers tryll_server.exe from the build directory, starts it on port 9100, waits for it to be ready, and then connects. Pick a starting workflow with --workflow demo or --workflow rag if you want something other than the default.
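
For example, to start the demo in the RAG workflow:

build\test-chat\Debug\tryll_test_chat.exe --workflow rag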

If you want to connect to a server you started yourself (or one on another machine), pass --no-managed-server:

build\test-chat\Debug\tryll_test_chat.exe --no-managed-server

You should see:

Connected. Listing models…
  [0] My Local Model (Local)
  [1] Llama-3.2-3B-Instruct (Loaded)
Agent "chat" ready. Type /quit to exit.

You> hello
Assistant> Hi! How can I help you today?
You>

Each token of the assistant's reply appears on stdout as it is produced — exactly what your own application will see through the on_answer_text callback.
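
As a sketch, the callback has the same shape as the lambda used in Step 3 below. The alias name here is hypothetical, not the library's:

#include <functional>
#include <string_view>

// Hypothetical alias, for illustration; mirrors the lambda in Step 3.
// text    - the newly produced text for this update
// isDelta - presumably true when text is an incremental chunk rather than the full reply
// isFinal - presumably true on the last update for a reply
using OnAnswerText = std::function<void(std::string_view text, bool isDelta, bool isFinal)>;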

Step 3 — write your own minimal client

For a smaller starting point than test-chat, use RunAndConnect to spawn the server and connect in a single call. The returned ConnectedSession owns both the child process and the TCP session; its destructor shuts them down in the correct order:

#include <filesystem>
#include <iostream>
#include <string_view>

#include <tryll/TryllClient.h>

int main()
{
    namespace TC = Tryll::Client;

    // Spawn the server and connect. Throws Tryll::TryllError on failure.
    TC::ManagedServerOptions opts;
    opts.exe  = std::filesystem::path{"build/server/Debug/tryll_server.exe"};
    opts.port = 9100;

    auto session = Tryll::TryllClient::RunAndConnect(opts);
    session.GetClient().ConfigureSession(TC::InferenceEngine::LlamaCpp);

    // Build the simplest possible graph: one Generate node.
    TC::GraphDescription graph;
    graph.AddNode("answer", TC::NodeType::Generate)
         .Wire("answer", "default", "END")
         .SetStartNode("answer")
         .SetDefaultModelName("My Local Model");

    auto agent = session.GetClient().CreateAgent(graph);

    // Stream tokens to stdout as they arrive.
    agent.SendText("In one sentence: what is Tryll?",
        [](std::string_view text, bool /*isDelta*/, bool /*isFinal*/)
        {
            std::cout << text << std::flush;
        });
    std::cout << "\n";

    agent.Destroy();
    // session destructor: shuts down the client, then stops the server
}
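
RunAndConnect throws Tryll::TryllError on failure, per the comment in the listing. A minimal error-handling sketch, assuming TryllError exposes a what() message like a standard exception:

// Inside main(), after filling in opts as above.
try
{
    auto session = Tryll::TryllClient::RunAndConnect(opts);
    // ... ConfigureSession, CreateAgent, SendText, as in the listing above ...
}
catch (const Tryll::TryllError& e)
{
    std::cerr << "Could not start or reach the server: " << e.what() << "\n";
    return 1;
}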

Already running a server?

If your server is managed separately (remote host, Docker, etc.), use Connect directly; no ManagedServerOptions needed:

auto client = Tryll::TryllClient::Connect("127.0.0.1", 9100);
client.ConfigureSession(TC::InferenceEngine::LlamaCpp);
// …
client.Shutdown();

Link your application against tryll_client and tryll_common (both already built by the preset):

target_link_libraries(my_app PRIVATE tryll_client tryll_common)
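
In context, a minimal consumer target might look like this. It is a sketch that assumes your app is built from the same CMake tree that provides the tryll_client and tryll_common targets:

add_executable(my_app main.cpp)
target_compile_features(my_app PRIVATE cxx_std_17)  # std::string_view needs C++17
target_link_libraries(my_app PRIVATE tryll_client tryll_common)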

What you built

  • A C++ client that connects, configures the session, creates a one-node agent, and streams a reply to stdout.
  • The same surface as the bundled test-chat demo, which is a good reference when you are ready to add RAG, tool calls, or guardrails.

Troubleshooting

  • FindSiblingServerExe error on launch — the server exe was not found next to the test-chat exe. Rebuild with cmake --build --preset debug or pass --server-exe <path> to point directly at the server binary. Pass --no-managed-server to skip auto-launch entirely and start the server yourself: see Run the Tryll Server.
  • connection refused with --no-managed-server — the server is not running or is on a different port. Check the server log.
  • Linker errors about FlatBuffers — make sure you are linking the generated target that was built alongside tryll_client; the CMake preset handles this automatically.
  • Build picks Ninja accidentally — pass --preset default explicitly or add -G "Visual Studio 17 2022" -A x64 to the configure command.
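
For example, an explicit configure from tryll\server that forces the Visual Studio generator (this bypasses the preset, so the binary directory name build is an assumption here, and any preset cache variables such as TRYLL_VULKAN would need to be passed by hand):

cmake -S . -B build -G "Visual Studio 17 2022" -A x64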