Wire Protocol

This page specifies the byte-level contract a third-party client must implement to talk to a Tryll server without using the shipped C++, Python, or Unreal client libraries. If you are using one of those libraries, you do not need any of this — skip to the how-to guides.

The authoritative source of truth for every structure named below is the FlatBuffers schema file server/schema/messages.fbs. This reference describes the transport around that schema.

Transport

  • TCP only. No TLS; the server is expected to run on localhost or inside a trusted network.
  • Bi-directional. Both client-to-server requests and server-to-client responses use the same framing over the same socket.
  • One connection = one session. Closing the socket destroys every agent owned by the session.

Framing

Every frame is a length-prefixed FlatBuffers payload:

+-------------------+----------------------------------+
|  length (4 bytes) |  FlatBuffers root table (N bytes)|
|  little-endian    |                                  |
|  uint32           |                                  |
+-------------------+----------------------------------+
  • length is the byte count of the FlatBuffers payload that follows, little-endian.
  • The payload's root type is Message (see messages.fbs); Message.body is a union whose variant tag selects the message kind.
  • Maximum frame size: 1 MiB (1 048 576 bytes). Any frame exceeding the cap is rejected with error 5003 FrameTooLarge (see error codes).

No compression, no chunking below the FlatBuffers level, no keep-alive heartbeats. The client is expected to read frames in a loop and dispatch them by the body union tag.
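The framing rules above can be sketched in a few lines of Python. This is a minimal illustration, not the shipped client: `encode_frame`, `recv_exact`, `read_frame`, and the local `FrameTooLarge` exception are hypothetical names, and the returned payload is still FlatBuffers-encoded (decoding it requires code generated from messages.fbs, which is not shown).

```python
import socket
import struct

MAX_FRAME = 1 << 20  # 1 MiB cap (1 048 576 bytes), as stated above


class FrameTooLarge(Exception):
    """Raised locally when a length prefix exceeds the 1 MiB cap (hypothetical name)."""


def encode_frame(payload: bytes) -> bytes:
    """Prefix a serialised FlatBuffers payload with its little-endian uint32 length."""
    if len(payload) > MAX_FRAME:
        raise FrameTooLarge(f"{len(payload)} bytes exceeds {MAX_FRAME}")
    return struct.pack("<I", len(payload)) + payload


def recv_exact(sock: socket.socket, n: int) -> bytes:
    """Read exactly n bytes, looping because recv() may return fewer than asked."""
    chunks = []
    remaining = n
    while remaining:
        chunk = sock.recv(remaining)
        if not chunk:
            raise ConnectionError("socket closed mid-frame")
        chunks.append(chunk)
        remaining -= len(chunk)
    return b"".join(chunks)


def read_frame(sock: socket.socket) -> bytes:
    """Return the payload bytes of the next frame, checking the size cap first."""
    (length,) = struct.unpack("<I", recv_exact(sock, 4))
    if length > MAX_FRAME:
        raise FrameTooLarge(f"peer announced {length} bytes")
    return recv_exact(sock, length)
```

Checking the length prefix before reading the body means an oversized or corrupt prefix never causes a large allocation on the client.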

Message types

All messages are defined in messages.fbs under the MessageBody union. The full catalog (table names as they appear in the schema):

Direction | Message | Corresponding response(s) or note
----------|---------|----------------------------------
S → C | SessionReady | (unsolicited, sent on accept)
C → S | ConfigureSessionRequest | ConfigureSessionResponse
C → S | CreateAgentRequest | CreateAgentResponse
C → S | SendMessageRequest | AnswerText × N, then TurnComplete
C → S | DestroyAgentRequest | Ack
C → S | ListModelsRequest | ListModelsResponse
C → S | DownloadModelRequest | DownloadProgress × N, then DownloadComplete
C → S | LoadModelRequest | LoadModelResponse
C → S | UnloadModelRequest | Ack
C → S | CreateStringStorageRequest | CreateStringStorageResponse
C → S | DestroyStringStorageRequest | Ack
C → S | CreateEmbeddedStringStorageRequest | CreateEmbeddedStringStorageResponse
C → S | DestroyEmbeddedStringStorageRequest | Ack
S → C | ErrorResponse | (replaces any expected response on failure)
S → C | ToolCallNotification | (unsolicited, fire-and-forget)

Session lifecycle

A connection follows this sequence:

sequenceDiagram
    participant C as Client
    participant S as Server
    C->>S: TCP connect
    S-->>C: SessionReady(protocol_version, session_id)
    C->>S: ConfigureSessionRequest(inference_engine, allow_auto_model_downloading?)
    S-->>C: ConfigureSessionResponse
    C->>S: CreateAgentRequest(graph, ...)
    note over S: if allow_auto_model_downloading=true and models absent:
    S-->>C: DownloadProgress(request_id=CreateAgent rid) × N
    S-->>C: CreateAgentResponse(agent_id)
    loop per turn
        C->>S: SendMessageRequest(agent_id, text)
        S-->>C: AnswerText(chunk, is_final=false)
        S-->>C: AnswerText(chunk, is_final=true)
        S-->>C: TurnComplete(status, debug_info?)
    end
    C->>S: DestroyAgentRequest(agent_id)
    S-->>C: Ack
    C->>S: TCP close

The server emits SessionReady before the client has sent any request. A client that sends ConfigureSessionRequest without first reading SessionReady has not yet seen the protocol_version carried in that frame, so it cannot check compatibility before it configures the session.

Important properties:

  1. Request IDs are optional. Every request may carry a client-assigned request_id; the server echoes it back on the matching response and on any ErrorResponse that replaces that response. Use it to correlate when multiple requests are in flight.
  2. Only one turn per agent at a time. Sending a second SendMessageRequest to the same agent before its TurnComplete arrives yields error 3004 AgentBusy.
  3. Multiple agents per session. A session may own many agents; turns on different agents run in parallel.
  4. Unsolicited frames. The server may emit ToolCallNotification or other unsolicited frames at any time. Clients must tolerate frames arriving between an in-flight request and its response.
  5. DownloadProgress under a CreateAgent request_id. When allow_auto_model_downloading=true and models need downloading, the server emits DownloadProgress frames using the CreateAgent request_id (informational only). The terminal frame is always CreateAgentResponse or ErrorResponse — never DownloadComplete.
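Properties 1, 4, and 5 together suggest a small routing table on the client side. The sketch below is one possible shape, under stated assumptions: `Frame` and `Correlator` are hypothetical names, `kind` stands for the body union tag, and request_id is assumed to be an integer (check messages.fbs for the real type). DownloadProgress is treated as informational, so it does not complete the pending request.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional

# Frames that report progress without completing the request they belong to.
NON_TERMINAL = {"DownloadProgress"}


@dataclass
class Frame:
    """Hypothetical decoded frame: union tag name plus the echoed request_id."""
    kind: str
    request_id: Optional[int] = None


class Correlator:
    """Routes frames to the request that is waiting for them (illustrative only)."""

    def __init__(self) -> None:
        self._next_id = 1
        self._pending: Dict[int, Callable[[Frame], None]] = {}
        self.unsolicited: List[Frame] = []

    def register(self, on_frame: Callable[[Frame], None]) -> int:
        """Allocate a request_id and remember who wants the matching frames."""
        rid = self._next_id
        self._next_id += 1
        self._pending[rid] = on_frame
        return rid

    def dispatch(self, frame: Frame) -> None:
        """Deliver a frame to its handler, or set it aside if nobody is waiting."""
        rid = frame.request_id
        handler = self._pending.get(rid) if rid is not None else None
        if handler is None:
            self.unsolicited.append(frame)  # e.g. ToolCallNotification
            return
        if frame.kind not in NON_TERMINAL:
            del self._pending[rid]  # response or ErrorResponse ends the request
        handler(frame)
```

A frame whose request_id is unknown is kept rather than dropped, so an unexpected message never stalls the read loop.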

Streaming answers

For every turn, the server emits one or more AnswerText frames followed by exactly one TurnComplete:

Field | Meaning
------|--------
agent_id | Echoes the target agent.
text | The text chunk.
is_delta | true for delta chunks, false when text holds the accumulated response so far.
is_final | true on the last chunk before TurnComplete. Exactly one AnswerText per turn has is_final = true; a turn that produces no text (e.g. routed to a canned response with empty output) still emits one final chunk with empty text.

TurnComplete carries:

Field | Meaning
------|--------
agent_id | Echoes the target agent.
status | TurnStatus::Success, Error, or Cancelled.
debug_info | JSON string with per-node execution data; populated only when the agent was created with enable_diagnostics = true. Empty string otherwise.
tokens_generated | Total tokens sampled across all generation nodes in this turn (prompt tokens excluded).
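The is_delta and is_final semantics can be illustrated with a small accumulator. This is a sketch under assumptions: the `AnswerText` dataclass stands in for whatever the FlatBuffers-generated accessor looks like, agent_id is assumed to be an integer, and `TurnBuffer` is a hypothetical helper holding the text for one agent's turn.

```python
from dataclasses import dataclass


@dataclass
class AnswerText:
    """Hypothetical decoded form of an AnswerText frame."""
    agent_id: int
    text: str
    is_delta: bool
    is_final: bool


@dataclass
class TurnBuffer:
    """Rebuilds the full answer for one turn from a stream of AnswerText frames."""
    text: str = ""
    saw_final: bool = False

    def feed(self, frame: AnswerText) -> None:
        if self.saw_final:
            raise ValueError("AnswerText received after the final chunk")
        if frame.is_delta:
            self.text += frame.text  # delta chunk: append to what we have
        else:
            self.text = frame.text   # accumulated chunk: replace what we have
        self.saw_final = frame.is_final
```

Once `saw_final` is set, the next frame for that agent should be TurnComplete; a client would typically create a fresh `TurnBuffer` per turn.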

Error responses

When the server cannot satisfy a request, it emits an ErrorResponse instead of the expected response frame:

Field | Meaning
------|--------
request_id | Echoes the request_id of the failed request, if any.
code | Numeric error code.
message | Human-readable description; safe for display.

Inference errors that occur during a turn are not delivered as standalone ErrorResponse frames — they surface via TurnComplete.status = Error, with detail in TurnComplete.debug_info when diagnostics are on. Session-level and protocol-level errors still come through ErrorResponse.
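The two failure paths described above can be kept apart with two exception types. The following is only a sketch: `RequestFailed`, `TurnFailed`, and `raise_for_frame` are hypothetical names, frames are represented as a tag name plus a plain mapping of fields, and the status is assumed to be exposed as the string "Error" (generated FlatBuffers code would more likely expose an enum value).

```python
from typing import Any, Mapping


class RequestFailed(Exception):
    """An ErrorResponse replaced the expected response frame."""

    def __init__(self, code: int, message: str, request_id=None):
        super().__init__(f"[{code}] {message}")
        self.code, self.message, self.request_id = code, message, request_id


class TurnFailed(Exception):
    """A turn ended with TurnComplete.status = Error."""

    def __init__(self, agent_id, debug_info: str):
        super().__init__(f"turn failed for agent {agent_id}")
        self.agent_id, self.debug_info = agent_id, debug_info


def raise_for_frame(kind: str, fields: Mapping[str, Any]) -> None:
    """Raise the matching exception for a failure frame; return None otherwise."""
    if kind == "ErrorResponse":
        raise RequestFailed(fields["code"], fields["message"], fields.get("request_id"))
    if kind == "TurnComplete" and fields["status"] == "Error":
        # debug_info is empty unless the agent was created with diagnostics on
        raise TurnFailed(fields["agent_id"], fields.get("debug_info", ""))
```

Keeping the two types separate lets a caller retry or report a failed turn without tearing down the session.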

Refer to error codes for the full catalog and per-range recovery guidance.

Versioning

There is currently one wire-protocol version. When a breaking change lands, the server will reject incompatible ConfigureSessionRequest frames with error 5004 ProtocolVersionMismatch. Clients should surface the error message verbatim and stop reconnecting.
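A reconnect loop that honours this rule might look like the sketch below. Everything here except the code 5004 is an assumption for illustration: `ServerError`, `is_fatal`, and `connect_with_retry` are hypothetical names, `connect` is a caller-supplied function that performs the TCP connect and session handshake, and retrying other failures is a placeholder policy (the error codes page gives the real per-range guidance).

```python
import time
from typing import Callable, Optional

PROTOCOL_VERSION_MISMATCH = 5004  # code named in the paragraph above


class ServerError(Exception):
    """Hypothetical wrapper for an ErrorResponse received during the handshake."""

    def __init__(self, code: int, message: str):
        super().__init__(message)
        self.code, self.message = code, message


def is_fatal(err: Exception) -> bool:
    """A version mismatch cannot be fixed by reconnecting."""
    return isinstance(err, ServerError) and err.code == PROTOCOL_VERSION_MISMATCH


def connect_with_retry(connect: Callable[[], object], attempts: int = 3,
                       delay: float = 1.0,
                       sleep: Callable[[float], None] = time.sleep) -> object:
    """Call connect() up to `attempts` times, giving up at once on a fatal error."""
    last: Optional[Exception] = None
    for attempt in range(1, attempts + 1):
        try:
            return connect()
        except (ServerError, OSError) as err:
            if is_fatal(err):
                raise  # surface the server's message verbatim, no more attempts
            last = err
        if attempt < attempts:
            sleep(delay)
    assert last is not None
    raise last
```

Passing `sleep` as a parameter keeps the loop easy to exercise in tests without real delays.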

Writing a new client: checklist

  1. Open a TCP socket to the configured host/port.
  2. Read exactly 4 bytes; interpret as little-endian uint32 → length.
  3. Read exactly length bytes; decode as Message per messages.fbs.
  4. Dispatch on the body union tag.
  5. Read SessionReady first (unsolicited), then send ConfigureSessionRequest; block until ConfigureSessionResponse arrives.
  6. Serialise per-agent SendMessageRequests client-side; do not issue a second turn before TurnComplete.
  7. Treat any frame > 1 MiB as a fatal protocol error.
  8. Tolerate unsolicited frames (e.g. ToolCallNotification) interleaved with expected responses.
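Checklist item 6 can be enforced with a small client-side gate, so a second turn for the same agent is refused locally instead of reaching the server and coming back as 3004 AgentBusy. This is a sketch, not part of any shipped client: `TurnGate` and `AgentBusyLocally` are hypothetical names, and agent_id is assumed to be an integer.

```python
import threading
from typing import Set


class AgentBusyLocally(Exception):
    """Raised before sending when the agent already has a turn in flight."""


class TurnGate:
    """Tracks which agents have an outstanding SendMessageRequest."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._busy: Set[int] = set()

    def begin(self, agent_id: int) -> None:
        """Call just before sending SendMessageRequest for agent_id."""
        with self._lock:
            if agent_id in self._busy:
                raise AgentBusyLocally(f"agent {agent_id} has a turn in flight")
            self._busy.add(agent_id)

    def end(self, agent_id: int) -> None:
        """Call when TurnComplete (or an ErrorResponse for the turn) arrives."""
        with self._lock:
            self._busy.discard(agent_id)
```

Different agents never block each other here, which matches the rule that turns on different agents run in parallel.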