First Inference in C++¶
Build the C++ client library from source, run the bundled
test-chat demo, and see streamed tokens on stdout.
Before you start¶
- Visual Studio 2022 with the C++ workload, CMake 3.24+, and Git.
- At least one language model registered in the server's
models.json.
No need to start the server manually
tryll_test_chat discovers and starts tryll_server.exe automatically
from the build directory. If you prefer to manage the server yourself,
pass --no-managed-server when launching the demo.
Step 1 — build the tree¶
This builds the server, the C++ client library (tryll_client), and
the tryll_test_chat console demo. The default preset uses the
Visual Studio generator with Vulkan `PREBUILT` — no GPU SDK
required. For a CPU-only build, add `-DTRYLL_VULKAN=OFF`.
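A typical invocation looks like the following — the `default` configure preset and `debug` build preset are assumptions taken from the Troubleshooting section; substitute your own preset names if they differ:

```shell
# Configure with the default preset (Visual Studio generator, Vulkan PREBUILT)
cmake --preset default

# CPU-only variant: disable Vulkan at configure time
# cmake --preset default -DTRYLL_VULKAN=OFF

# Build the server, tryll_client, and the tryll_test_chat demo
cmake --build --preset debug
```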
Step 2 — run the demo¶
The demo automatically discovers tryll_server.exe from the build
directory, starts it on port 9100, waits for it to be ready, and then
connects. Pick a starting workflow with --workflow demo or
--workflow rag if you want something other than the default.
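Assuming the demo binary lands under the build tree next to the other outputs (the exact path below is an assumption — adjust it to wherever your generator and configuration placed tryll_test_chat.exe), launching it looks like:

```shell
# Starts tryll_server.exe itself on port 9100, then connects
build\tests\Debug\tryll_test_chat.exe --workflow demo
```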
If you want to connect to a server you started yourself (or one on
another machine), pass --no-managed-server:
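A sketch, assuming the server is already listening; any host/port flags the demo may expose beyond `--no-managed-server` are not shown here:

```shell
# Connect to a server you manage yourself; skips auto-launch
tryll_test_chat.exe --no-managed-server
```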
You should see:
```
Connected. Listing models…
[0] My Local Model (Local)
[1] Llama-3.2-3B-Instruct (Loaded)
Agent "chat" ready. Type /quit to exit.

You> hello
Assistant> Hi! How can I help you today?
You>
```
Each token of the assistant's reply appears on stdout as it is
produced — exactly what your own application will see through the
on_answer_text callback.
Step 3 — write your own minimal client¶
For a smaller starting point than test-chat, use RunAndConnect to
spawn the server and connect in a single call. The returned
ConnectedSession owns both the child process and the TCP session; its
destructor shuts them down in the correct order:
```cpp
#include <filesystem>
#include <iostream>
#include <string_view>

#include <tryll/TryllClient.h>

int main()
{
    namespace TC = Tryll::Client;

    // Spawn the server and connect. Throws Tryll::TryllError on failure.
    TC::ManagedServerOptions opts;
    opts.exe  = std::filesystem::path{"build/server/Debug/tryll_server.exe"};
    opts.port = 9100;
    auto session = Tryll::TryllClient::RunAndConnect(opts);
    session.GetClient().ConfigureSession(TC::InferenceEngine::LlamaCpp);

    // Build the simplest possible graph: one Generate node.
    TC::GraphDescription graph;
    graph.AddNode("answer", TC::NodeType::Generate)
         .Wire("answer", "default", "END")
         .SetStartNode("answer")
         .SetDefaultModelName("My Local Model");
    auto agent = session.GetClient().CreateAgent(graph);

    // Stream tokens to stdout as they arrive.
    agent.SendText("In one sentence: what is Tryll?",
        [](std::string_view text, bool /*isDelta*/, bool /*isFinal*/)
        {
            std::cout << text << std::flush;
        });
    std::cout << "\n";

    agent.Destroy();
    // session destructor: shuts down the client, then stops the server
}
```
Already running a server?
If your server is managed separately (remote host, Docker, etc.) use
Connect directly — no ManagedServerOptions needed:
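A minimal sketch of the direct-connect path — the host/port argument shape of Connect is an assumption here; check tryll/TryllClient.h for the actual declaration:

```cpp
// Hypothetical arguments — verify against the real
// Tryll::TryllClient::Connect signature in the header.
auto client = Tryll::TryllClient::Connect("my-server-host", 9100);
```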
Link it against tryll_client and tryll_common (both already
built by the preset):
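A minimal CMake sketch, assuming your executable target is named `my_client` (the target name and source file are illustrative):

```cmake
# Link your own executable against the libraries the preset already built
add_executable(my_client main.cpp)
target_link_libraries(my_client PRIVATE tryll_client tryll_common)
```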
What you built¶
- A C++ client that connects, configures the session, creates a one-node agent, and streams a reply to stdout.
- The same surface as the bundled test-chat demo, which is a good
reference when you are ready to add RAG, tool calls, or guardrails.
Where to go next¶
- How-to Guides — task recipes for RAG, tool calls, streaming to UI, model setup.
- Concepts — the mental model. Start with Architecture at a Glance.
- C++ Client API — the header / class surface.
Troubleshooting¶
- `FindSiblingServerExe` error on launch — the server exe was not found next to the test-chat exe. Rebuild with `cmake --build --preset debug`, or pass `--server-exe <path>` to point directly at the server binary. Pass `--no-managed-server` to skip auto-launch entirely and start the server yourself: see Run the Tryll Server.
- `connection refused` with `--no-managed-server` — the server is not running or is on a different port. Check the server log.
- Linker errors about FlatBuffers — make sure you are linking the generated target that was built alongside `tryll_client`; the CMake preset handles this automatically.
- Build picks Ninja accidentally — pass `--preset default` explicitly or add `-G "Visual Studio 17 2022" -A x64` to the configure command.