Python Client API Reference¶
Full API reference for the tryll_client Python package, auto-generated from
Google-style docstrings. The package lives at server/client-python/tryll_client/.
The public surface — re-exported from tryll_client.__init__ — is:
TryllClient, ConnectedSession, AgentProxy, TryllError,
GraphDescription, InferenceEngine, NodeType, ManagedServer, and the
knowledge-presentation config types.
ConnectedSession is the recommended entry point for local-server deployments;
obtain it via TryllClient.run_and_connect.
Internal modules (wire, codec, _generated) are excluded.
tryll_client ¶
tryll_client — Python client library for the Tryll server.
Quick start::
    from tryll_client import TryllClient, InferenceEngine, GraphDescription
    from tryll_client.graph import GenerateParams, SamplingOverrides
    client = TryllClient.connect("127.0.0.1", 9100)
    client.configure_session(InferenceEngine.LlamaCpp)
    graph = (GraphDescription()
        .add_generate("generate", GenerateParams(
            system_prompt="You are a helpful assistant.",
            stream=True,
            sampling=SamplingOverrides(temperature=0.0, seed=42),
            # default_exit defaults to "" = END
        ))
        .set_start_node("generate")
        .set_default_model_name("Llama 3.2 3B Instruct (Q4_K_M)"))
    agent = client.create_agent(graph)
    response = agent.send_message("Hello!")
    agent.destroy()
    client.shutdown()
Prerequisites
The CMake build for server/ must have run at least once to generate FlatBuffers Python code into tryll_client/_generated/. Run: cmake --build server/build
AgentProxy ¶
Handle for one server-side agent created via TryllClient.create_agent().
The proxy exposes the user-facing turn API (send_message and destroy)
and caches diagnostics from the last completed turn on the last_* properties.
Bind the proxy to its owning client and server-side agent id.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| client | 'TryllClient' | Parent TryllClient that owns the proxy. | required |
| agent_id | int | Server-assigned agent identifier. | required |
| node_baselines | dict[str, 'NodeParamsBase'] \| None | Deep-copied params per graph node name, captured at create_agent. | None |
last_debug_info (property) ¶
JSON diagnostics string from the most recent send_message call.
Returns:
| Type | Description |
|---|---|
| str | Server-produced JSON document for the turn when diagnostics were enabled on agent creation; otherwise an empty string. Also an empty string before the first turn. |
last_ttft_s (property) ¶
Time-to-first-token in seconds for the most recent turn.
Returns:
| Type | Description |
|---|---|
| float \| None | Time-to-first-token in seconds, or None when no streamed chunk arrived (e.g. canned-response paths that skip streaming). |
last_answer_chunk_count (property) ¶
Number of AnswerText chunks received for the last turn.
Typically one chunk per generated token when streaming; useful for verifying the stream was delivered incrementally.
last_tokens_generated (property) ¶
Server-reported generated token count for the last turn.
Authoritative for both streaming and non-streaming modes. Zero if the server has not yet completed a turn.
set_on_answer_text ¶
Register (or clear) a persistent callback for streaming text chunks.
The callback is invoked on the reader thread for every AnswerText
frame received for this agent — including frames from server-initiated
turns (e.g. voice autosend after VoiceInput.end_utterance).
The callback signature is (text: str, is_delta: bool, is_final: bool).
Must return quickly and must not call blocking client methods.
Call with None to unregister the current callback.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| cb | Callable[[str, bool, bool], None] \| None | Callable invoked with (text, is_delta, is_final), or None to unregister. | required |
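Because the callback runs on the reader thread and must return quickly, one possible pattern is to enqueue each chunk and process it elsewhere. The sketch below is only an illustration of that idea: make_chunk_collector and drain_chunks are hypothetical helper names, not part of tryll_client, and only the (text, is_delta, is_final) signature comes from the description above.

```python
import queue


def make_chunk_collector(q):
    """Return a callback with the (text, is_delta, is_final) signature that only enqueues."""
    def on_answer_text(text, is_delta, is_final):
        # put_nowait does not block on an unbounded queue, so the
        # reader thread is released immediately.
        q.put_nowait((text, is_delta, is_final))
    return on_answer_text


def drain_chunks(q):
    """Return every queued chunk in arrival order (call this off the reader thread)."""
    items = []
    while True:
        try:
            items.append(q.get_nowait())
        except queue.Empty:
            return items


chunks = queue.Queue()
callback = make_chunk_collector(chunks)
# With a real agent this would be registered as: agent.set_on_answer_text(callback)
```

A consumer thread or the main loop can then call drain_chunks(chunks) at its own pace without holding up frame dispatch.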
set_on_turn_complete ¶
Register (or clear) a persistent callback for turn-complete notifications.
The callback is invoked on the reader thread once per TurnComplete
frame for this agent. Covers both client-initiated turns (via
:meth:send_message) and server-initiated turns (voice autosend).
The callback signature is
(status: int, debug_info: str, tokens_generated: int)
where status maps to the wire TurnStatus enum
(0 = Ok, 1 = Error, 2 = Cancelled).
Call with None to unregister the current callback.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| cb | Callable \| None | Callable invoked with (status, debug_info, tokens_generated), or None to unregister. | required |
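The integer status can be easier to read once mapped to names. The following sketch mirrors the three wire values listed above in a local IntEnum; the class and the describe_turn helper are assumptions made for illustration and are not exported by tryll_client.

```python
from enum import IntEnum


class TurnStatus(IntEnum):
    """Local mirror of the wire values listed above (hypothetical helper)."""
    OK = 0
    ERROR = 1
    CANCELLED = 2


def describe_turn(status, debug_info, tokens_generated):
    """Format one turn-complete notification as a short log line."""
    try:
        name = TurnStatus(status).name
    except ValueError:
        # Values outside the documented set are reported rather than hidden.
        name = f"UNKNOWN({status})"
    return f"turn {name}: {tokens_generated} tokens, debug_info {len(debug_info)} chars"
```

A function with this signature could be passed to set_on_turn_complete, provided the logging it feeds does not block the reader thread.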
set_on_tool_call ¶
Register (or clear) a callback for tool-call notifications.
The callback is invoked on the reader thread for every
NodeEvent frame with event_type == "tool_call" the server
sends for this agent — i.e. when the graph has a ToolCall node
with notify_client = "true". It must return quickly and must
not call any blocking :class:TryllClient or :class:AgentProxy
methods.
The callback signature is (tool_name: str, arguments_json: str)
where arguments_json is a compact JSON object, e.g.
'{"city": "Berlin"}'. Parse it with :func:json.loads as
needed.
Call with None to unregister the current callback.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| cb | Callable[[str, str], None] \| None | Callable invoked with (tool_name, arguments_json), or None to unregister. | required |
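As a rough sketch of handling the (tool_name, arguments_json) pair, the code below parses the JSON object and looks up a local handler. The registry, the get_weather tool, and decode_tool_call are hypothetical names chosen for the example; only json.loads and the callback signature come from the text above.

```python
import json


def get_weather(city):
    """Hypothetical local tool implementation."""
    return f"weather requested for {city}"


TOOL_HANDLERS = {"get_weather": get_weather}  # hypothetical tool registry


def decode_tool_call(tool_name, arguments_json):
    """Parse the compact JSON arguments and find the matching local handler."""
    args = json.loads(arguments_json) if arguments_json else {}
    if not isinstance(args, dict):
        raise ValueError("tool arguments must be a JSON object")
    handler = TOOL_HANDLERS.get(tool_name)
    if handler is None:
        raise KeyError(f"no handler registered for tool {tool_name!r}")
    return handler, args


handler, args = decode_tool_call("get_weather", '{"city": "Berlin"}')
result = handler(**args)
```

Since the callback must return quickly, a slow handler is better scheduled on another thread than run inside the callback itself.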
set_on_intent_classified ¶
Register (or clear) a callback for intent-classification notifications.
Fired on the reader thread for every NodeEvent with
event_type == "intent_classified" — emitted by
ClassifyIntentNode on its "found" path when
notify_client = "true".
The callback signature is
(intent: str, record_id: str, record_index: int, distance: float).
Call with None to unregister.
set_on_node_event ¶
Register (or clear) a generic NodeEvent fallback callback.
Fired on the reader thread for any NodeEvent whose
event_type is unrecognised or whose typed callback (e.g.
:meth:set_on_tool_call, :meth:set_on_intent_classified) is
not registered. When a typed callback handles the event, this
one is NOT invoked for that event.
The callback signature is
(node_name: str, event_type: str, kv_pairs: list[tuple[str, str]]).
Call with None to unregister.
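A generic fallback often just records what arrived. This minimal sketch assumes the (node_name, event_type, kv_pairs) signature given above and converts the pairs to a dict; seen_events is a hypothetical module-level list used only for illustration.

```python
seen_events = []


def on_node_event(node_name, event_type, kv_pairs):
    """Generic fallback: store the event with its key/value pairs as a dict."""
    # dict() keeps the last value when a key appears more than once.
    seen_events.append({
        "node": node_name,
        "type": event_type,
        "fields": dict(kv_pairs),
    })
```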
send_message ¶
Send a user message and return the complete assistant response.
Blocks until TurnComplete is received from the server,
accumulating all AnswerText chunks into a single string.
Diagnostics from TurnComplete (debug_info,
tokens_generated) are cached on the last_* properties.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| text | str | User-turn text to send to the agent. | required |
| timeout | float | Maximum seconds to wait for the turn to complete. | 120.0 |
Returns:
| Type | Description |
|---|---|
| str | The full concatenated assistant response text. |
Raises:
| Type | Description |
|---|---|
| TryllError | On server-reported errors, decode failures, or timeout. |
change_param ¶
Apply a single-field mutation using a dotted attribute path.
Clones the baseline params captured at create_agent, coerces
value to the leaf field type (when value is a string), assigns
via :func:~tryll_client._generated.node_params_codec.set_dotted,
and sends the full params object. On success the baseline is updated
so later mutations compose.
Example::
    agent.change_param("generate", "sampling.temperature", "0.5")
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| node_name | str | Instance name of the target node. | required |
| path | str | Dotted path to the field (e.g. sampling.temperature). | required |
| value | Any | New value (string from JSON/dialog scripts, or a native Python value). | required |
| timeout | float | Maximum seconds to wait for the server response. | 30.0 |
Raises:
| Type | Description |
|---|---|
| TryllError | If node_name is unknown (code 3005) or the server reports another error. |
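To make the dotted-path idea concrete, here is a minimal standalone sketch of walking a path and coercing a string to the leaf's current type. It is not the library's set_dotted; the Sampling and Params dataclasses, coerce_like, and set_dotted_sketch are hypothetical stand-ins, and the real coercion rules may differ.

```python
from dataclasses import dataclass, field


@dataclass
class Sampling:  # hypothetical stand-in for a nested params object
    temperature: float = 0.8
    seed: int = 0


@dataclass
class Params:  # hypothetical stand-in for a node's typed params
    system_prompt: str = ""
    stream: bool = False
    sampling: Sampling = field(default_factory=Sampling)


def coerce_like(current, value):
    """Coerce a string to the type of the current leaf value; pass other values through."""
    if not isinstance(value, str) or isinstance(current, str):
        return value
    if isinstance(current, bool):  # bool is checked first because it subclasses int
        return value.strip().lower() in ("1", "true", "yes")
    if isinstance(current, int):
        return int(value)
    if isinstance(current, float):
        return float(value)
    return value


def set_dotted_sketch(obj, path, value):
    """Walk 'a.b.c' with getattr, then assign the coerced value on the last segment."""
    *parents, leaf = path.split(".")
    for name in parents:
        obj = getattr(obj, name)
    setattr(obj, leaf, coerce_like(getattr(obj, leaf), value))


params = Params()
set_dotted_sketch(params, "sampling.temperature", "0.5")
set_dotted_sketch(params, "stream", "true")
```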
change_params ¶
Apply a typed node-parameter update to a workflow node at runtime.
The typed params object must match the concrete type of the target node.
Structural fields must match the create-time values or the server
returns ParamNotMutable. On success, updates the internal baseline
used by :meth:change_param.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| node_name | str | Instance name of the target node. | required |
| params | 'NodeParamsBase' | Typed node parameters (generated DTO). | required |
| timeout | float | Maximum seconds to wait for the server response. | 30.0 |
Raises:
| Type | Description |
|---|---|
| TryllError | On server-reported errors (e.g. ParamNotMutable). |
destroy ¶
Request agent destruction on the server and wait for Ack.
After this call returns, the proxy is no longer usable; further
send_message calls will raise :class:TryllError.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| timeout | float | Maximum seconds to wait for the Ack. | 30.0 |
Raises:
| Type | Description |
|---|---|
| TryllError | On server-reported errors or timeout. |
ConnectedSession ¶
RAII wrapper that owns a tryll_client.managed_server.ManagedServer
and a connected TryllClient.
Obtain it via TryllClient.run_and_connect. Use it as a context manager
(recommended) to guarantee teardown in the right order, client first, then
server::
    with TryllClient.run_and_connect(exe=Path("..."), port=9100) as session:
        session.client.configure_session(InferenceEngine.LlamaCpp)
        agent = session.client.create_agent(graph)
        print(agent.send_message("hi"))
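The "client first, then server" teardown order can be illustrated with the standard library's contextlib.ExitStack, which runs registered callbacks in reverse order of registration. FakeServer and FakeClient below are hypothetical stand-ins that only record calls; this is not how ConnectedSession is implemented, just a demonstration of the ordering it guarantees.

```python
from contextlib import ExitStack

events = []


class FakeServer:
    """Hypothetical stand-in for a managed server process."""
    def stop(self):
        events.append("server stopped")


class FakeClient:
    """Hypothetical stand-in for a connected client."""
    def shutdown(self):
        events.append("client shut down")


with ExitStack() as stack:
    server = FakeServer()
    stack.callback(server.stop)       # registered first, so it runs last
    client = FakeClient()
    stack.callback(client.shutdown)   # registered last, so it runs first
```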
TryllClient ¶
Synchronous TCP session to the Tryll server.
Usage::
    client = TryllClient.connect("127.0.0.1", 9100)
    client.configure_session(InferenceEngine.LlamaCpp)
    graph = GraphDescription().add_node(...).set_start_node(...)
    # Exit fields (e.g. default_exit="") live on each node's typed params
    agent = client.create_agent(graph)
    response = agent.send_message("Hello")
    agent.destroy()
    client.shutdown()
connect (classmethod) ¶
Connect to the Tryll server and wait for SessionReady.
run_and_connect (classmethod) ¶
run_and_connect(*, exe, host='127.0.0.1', port=9100, cwd=None, extra_args=(), stdout=None, stderr=None, start_timeout=30.0, stop_timeout=8.0, connect_timeout=30.0)
Spawn tryll_server and connect: the recommended one-call factory.
Starts a local server process (passing --port <port> on the command
line), waits for its TCP port to be ready, then opens a session.
The returned ConnectedSession owns both the server process and
the client; use it as a context manager to guarantee clean teardown::
    with TryllClient.run_and_connect(exe=Path("..."), port=9100) as session:
        session.client.configure_session(InferenceEngine.LlamaCpp)
        agent = session.client.create_agent(graph)
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| exe | Path \| str | Path to the tryll_server executable. | required |
| host | str | Host for the TCP probe and for connect. | '127.0.0.1' |
| port | int | Port passed as --port <port>. | 9100 |
| cwd | Path \| None | Working directory for the child process; None selects the default. | None |
| extra_args | Sequence[str] | Additional CLI arguments after --port <port>. | () |
| stdout | Path \| None | File path for child stdout. | None |
| stderr | Path \| None | File path for child stderr. | None |
| start_timeout | float | Seconds to wait for the TCP port to open. | 30.0 |
| stop_timeout | float | Seconds to wait for graceful server exit. | 8.0 |
| connect_timeout | float | Seconds to wait for the session to connect. | 30.0 |
Returns:
| Type | Description |
|---|---|
| 'ConnectedSession' | A ConnectedSession that owns both the server process and the connected client. |
Raises:
| Type | Description |
|---|---|
| FileNotFoundError | If exe does not exist. |
| TimeoutError | If the port does not open within start_timeout. |
| TryllError | If the session handshake fails. |
configure_session ¶
configure_session(engine, allow_auto_model_downloading=False, game_name='', timeout=30.0, stt_engine=0, tts_engine=0, embedding_engine=0)
Send ConfigureSessionRequest and wait for response.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| engine | InferenceEngine \| int | Inference backend for language models. | required |
| allow_auto_model_downloading | bool | When True, the server may download required models automatically. | False |
| game_name | str | Integration identifier for telemetry grouping. | '' |
| timeout | float | Maximum seconds to wait for the server response. | 30.0 |
| stt_engine | InferenceEngine \| int | Inference backend for STT (speech-to-text) models. | 0 |
| tts_engine | InferenceEngine \| int | Inference backend for TTS (text-to-speech) models. | 0 |
| embedding_engine | InferenceEngine \| int | Inference backend for embedding models. | 0 |
create_string_storage ¶
Create a named StringStorage on the server.
Provide either strings (inline list) or file_path (server-side file path).
The storage can then be referenced by name in node params via string_storage.
create_keyed_string_storage ¶
Create a Map or Multimap-kind StringStorage on the server.
keys and values must be the same length.
For Map kind all keys must be unique.
The storage can then be referenced by name in IntentToInstructionNode
(and other keyed-consumer nodes) via the string_storage param.
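The two rules above (equal lengths, unique keys for the Map kind) can be checked locally before sending the request. This is a sketch of such a pre-check only; check_keyed_storage and its unique_keys flag are hypothetical and do not reflect how the client or server names the Map and Multimap kinds.

```python
def check_keyed_storage(keys, values, unique_keys=True):
    """Raise ValueError if the inputs break the rules stated above.

    unique_keys=True models the Map kind; False models the Multimap kind.
    """
    if len(keys) != len(values):
        raise ValueError(
            f"keys ({len(keys)}) and values ({len(values)}) differ in length"
        )
    if unique_keys:
        seen = set()
        for key in keys:
            if key in seen:
                raise ValueError(f"duplicate key for Map kind: {key!r}")
            seen.add(key)


# Passes silently: two distinct keys, two values.
check_keyed_storage(["greet", "bye"], ["Say hello.", "Say goodbye."])
```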
destroy_string_storage ¶
Destroy a named StringStorage on the server.
Nodes that already hold the storage are unaffected.
create_embedded_string_storage ¶
create_embedded_string_storage(name, config_path=None, strings=None, embedding_model=None, timeout=None)
Create a named EmbeddedStringStorage on the server.
Path A: supply config_path (server-side *.json).
Path B: supply strings (inline list) + embedding_model.
Returns an EmbeddedStorageInfo with record_count and embedding_dim.
When allow_auto_model_downloading=True was passed to
:meth:configure_session the server may download the embedding model
before building the storage. The default timeout is bumped to 30
minutes in that case; otherwise it is 120 seconds.
destroy_embedded_string_storage ¶
Destroy a named EmbeddedStringStorage on the server.
Nodes that already hold the storage are unaffected.
create_voice_input ¶
create_voice_input(model_name, sample_rate=16000, channels=1, bits_per_sample=16, vad_threshold=0.5, vad_min_silence_ms=500, vad_speech_pad_ms=250, hotwords_storage_name='', hotwords_score=1.5, timeout=30.0)
Create a server-side VoiceInput session for speech-to-text.
Returns the opaque voice_input_id used by all subsequent voice operations.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| model_name | str | Catalog name of the STT model. | required |
| sample_rate | int | Sample rate of audio buffers you will push (Hz). | 16000 |
| channels | int | Channel count (1 = mono, typical for mic capture). | 1 |
| bits_per_sample | int | Bit depth of raw PCM samples (only 16-bit is supported). | 16 |
| vad_threshold | float | Silero VAD speech-probability threshold (0.0–1.0). | 0.5 |
| vad_min_silence_ms | int | Silence duration (ms) that closes a speech segment. | 500 |
| vad_speech_pad_ms | int | Padding (ms) added around detected speech. | 250 |
| hotwords_storage_name | str | Name of a previously created hotwords storage. | '' |
| hotwords_score | float | Per-token bias strength applied to every phrase in the storage. 1.5 is a gentle default; 2.5 is aggressive. | 1.5 |
| timeout | float | Maximum seconds to wait for the server response. | 30.0 |
Raises:
| Type | Description |
|---|---|
| TryllError | If the model cannot be found, the hotwords storage name is unknown, or the request times out. |
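When sizing the audio buffers to push, it helps to work out how many bytes a given duration occupies for the chosen format. The helper below is a small arithmetic sketch assuming raw interleaved PCM; pcm_chunk_bytes is a hypothetical name, and the call that actually pushes audio is not covered here.

```python
def pcm_chunk_bytes(duration_ms, sample_rate=16000, channels=1, bits_per_sample=16):
    """Return the byte length of duration_ms of raw interleaved PCM audio."""
    if bits_per_sample != 16:
        # The parameter table above states that only 16-bit samples are supported.
        raise ValueError("only 16-bit PCM is supported")
    bytes_per_frame = channels * (bits_per_sample // 8)
    frames = sample_rate * duration_ms // 1000
    return frames * bytes_per_frame


# 20 ms of 16 kHz mono 16-bit audio: 320 frames of 2 bytes each.
silence = bytes(pcm_chunk_bytes(20))
```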
create_agent ¶
Create a server-side agent and return a proxy handle.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| graph | GraphDescription | Fully-built graph description. | required |
| enable_diagnostics | bool | When True the server serialises per-node execution data into TurnComplete.debug_info for every turn. | False |
| timeout | float \| None | Maximum seconds to wait for the response. Defaults to 30 s normally, or 30 minutes when auto model downloading was enabled in configure_session. | None |
list_models ¶
Request all models known to the server for the session's engine.
load_model ¶
Explicitly load and pin a model into memory.
The model stays in memory until :meth:unload_model is called,
regardless of whether any agents are using it.
Raises :class:TryllError if the model cannot be resolved or loaded.
unload_model ¶
Unpin a previously pinned model.
If no agents are currently using the model it is freed immediately; otherwise it will be freed when the last agent using it is destroyed.
download_model ¶
Initiate a model download on the server and block until complete.
on_progress is called with (model_name, bytes_downloaded, total_bytes, percent) for each progress frame received. Raises TryllError on failure.
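A progress callback can be as simple as formatting the four arguments into a status line. The sketch below assumes that percent is on a 0 to 100 scale and that byte counts are plain integers; neither is confirmed above, and format_progress is a hypothetical helper.

```python
def format_progress(model_name, bytes_downloaded, total_bytes, percent):
    """Render one progress frame as a human-readable line."""
    mib = 1024 * 1024
    return (
        f"{model_name}: {bytes_downloaded / mib:.1f}/{total_bytes / mib:.1f} MiB "
        f"({percent:.0f}%)"
    )


def print_progress(model_name, bytes_downloaded, total_bytes, percent):
    """Callback with the (model_name, bytes_downloaded, total_bytes, percent) signature."""
    print(format_progress(model_name, bytes_downloaded, total_bytes, percent))
```

print_progress has the shape described for on_progress and could be passed in that role.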
TryllError ¶
Bases: Exception
Raised on server errors, protocol errors, or timeouts.
Attributes:
| Name | Description |
|---|---|
| code | Numeric error code reported by the server. |
Initialise a Tryll client error.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| message | str | Human-readable description of the failure, typically copied from the server. | required |
| code | int | Numeric error code from the server. | 0 |
GraphDescription ¶
Fluent builder for a workflow graph sent to the server.
Typical usage::
    graph = (GraphDescription()
        .add_generate("gen", GenerateParams(
            model_name="qwen2.5-0.5b-instruct",
            system_prompt="You are helpful."))
        .set_start_node("gen"))
    # TryllClient is synchronous, so create_agent is called without await
    agent = client.create_agent(graph)
InferenceEngine ¶
Bases: IntEnum
Inference backend. Mirrors the FlatBuffers InferenceEngine enum.
ModelInfo (dataclass) ¶
Summary information for one model returned by list_models().
ModelStatus ¶
Bases: IntEnum
Status of a model as reported by the server.
NodeType ¶
Bases: IntEnum
Node type ordinal. Mirrors the FlatBuffers NodeType enum in messages.fbs.
The union discriminant in NodeParams supersedes this for the wire format;
NodeType is kept for client-side ergonomics and generated convenience APIs.
ManagedServer ¶
ManagedServer(*, exe, host='127.0.0.1', port=9100, cwd=None, extra_args=(), stdout=None, stderr=None, start_timeout=30.0, stop_timeout=8.0)
RAII handle around a child tryll_server process.
Use as a context manager (recommended), or call stop explicitly::
    with ManagedServer.start(exe=Path("..."), port=9100) as srv:
        client = TryllClient.connect(srv.host, srv.port)
        ...
No executable discovery is performed — pass the path explicitly.
The server is always launched with --port <port> so the caller's port
takes precedence over whatever is in server-config.json.
start (classmethod) ¶
start(*, exe, host='127.0.0.1', port=9100, cwd=None, extra_args=(), stdout=None, stderr=None, start_timeout=30.0, stop_timeout=8.0)
Spawn the server and block until its TCP port is ready.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| exe | Path \| str | Path to the tryll_server executable. | required |
| host | str | Host string for the TCP ready-probe. | '127.0.0.1' |
| port | int | TCP port; passed as --port <port>. | 9100 |
| cwd | Path \| None | Working directory for the child process; None selects the default. | None |
| extra_args | Sequence[str] | Additional CLI arguments appended after --port <port>. | () |
| stdout | Path \| None | File path for child stdout redirection. | None |
| stderr | Path \| None | File path for child stderr redirection. | None |
| start_timeout | float | Seconds to wait for the TCP port to open. | 30.0 |
| stop_timeout | float | Seconds to wait for clean exit on stop. | 8.0 |
Returns:
| Type | Description |
|---|---|
| 'ManagedServer' | A ManagedServer handle for the spawned process, returned once its TCP port is ready. |
Raises:
| Type | Description |
|---|---|
| FileNotFoundError | If exe does not exist. |
| TimeoutError | If the port does not open within start_timeout. |
| OSError | If the process cannot be spawned. |
stop ¶
Terminate the child process.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| timeout | float \| None | Seconds to wait before force-killing (defaults to the stop_timeout supplied at construction). | None |
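The "block until its TCP port is ready" step performed by start can be pictured as a polling loop. The following is a generic sketch using only the standard library; wait_for_port is a hypothetical function, and ManagedServer's actual probe may work differently.

```python
import socket
import time


def wait_for_port(host, port, timeout=30.0, interval=0.1):
    """Poll until a TCP connect to (host, port) succeeds, else raise TimeoutError."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            # A successful connect means something is listening; close it at once.
            with socket.create_connection((host, port), timeout=interval):
                return
        except OSError:
            if time.monotonic() >= deadline:
                raise TimeoutError(
                    f"{host}:{port} did not open within {timeout} s"
                ) from None
            time.sleep(interval)
```

A monotonic clock is used for the deadline so that wall-clock adjustments cannot shorten or extend the wait.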
tryll_client.client ¶
TryllClient — synchronous TCP client for the Tryll server.
Threading model
- A background reader thread continuously receives frames and dispatches them to registered pending requests via threading.Event.
- The main (calling) thread sends requests and blocks on the Event.
- All access to _pending is protected by _lock.
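The threading model above can be sketched in a few lines: a lock-protected table of pending requests, each carrying a threading.Event that the reader thread sets when the reply arrives. PendingRequests and its method names are hypothetical and simplified; the real client's internals are not shown in this reference.

```python
import threading


class PendingRequests:
    """Map request ids to an Event plus a slot for the reply (illustrative only)."""

    def __init__(self):
        self._lock = threading.Lock()
        self._pending = {}

    def register(self, request_id):
        """Called by the sending thread before the request goes out."""
        entry = {"event": threading.Event(), "reply": None}
        with self._lock:
            self._pending[request_id] = entry
        return entry

    def resolve(self, request_id, reply):
        """Called by the reader thread when a matching frame arrives."""
        with self._lock:
            entry = self._pending.pop(request_id, None)
        if entry is not None:
            entry["reply"] = reply
            entry["event"].set()

    def wait(self, entry, timeout):
        """Block the calling thread until the reply is set or the timeout passes."""
        if not entry["event"].wait(timeout):
            raise TimeoutError("no reply before the timeout")
        return entry["reply"]


pending = PendingRequests()
entry = pending.register(1)
reader = threading.Thread(target=pending.resolve, args=(1, "ack"))
reader.start()
reply = pending.wait(entry, timeout=5.0)
reader.join()
```

This sketch does not remove an entry whose wait timed out; a fuller version would clean those up under the same lock.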
tryll_client.agent ¶
AgentProxy — client-side handle for a server-side agent.
An AgentProxy is returned by
tryll_client.TryllClient.create_agent and represents a single
server-side agent running a pre-built workflow graph. The proxy is
single-session and not thread-safe beyond what the parent
TryllClient provides.
AgentProxy ¶
Handle for one server-side agent created via TryllClient.create_agent().
The proxy exposes the user-facing turn API (send_message and destroy)
and caches diagnostics from the last completed turn on the last_* properties.
Bind the proxy to its owning client and server-side agent id.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| client | 'TryllClient' | Parent TryllClient that owns the proxy. | required |
| agent_id | int | Server-assigned agent identifier. | required |
| node_baselines | dict[str, 'NodeParamsBase'] \| None | Deep-copied params per graph node name, captured at create_agent. | None |
last_debug_info (property) ¶
JSON diagnostics string from the most recent send_message call.
Returns:
| Type | Description |
|---|---|
| str | Server-produced JSON document for the turn when diagnostics were enabled on agent creation; otherwise an empty string. Also an empty string before the first turn. |
last_ttft_s (property) ¶
Time-to-first-token in seconds for the most recent turn.
Returns:
| Type | Description |
|---|---|
| float \| None | Time-to-first-token in seconds, or None when no streamed chunk arrived (e.g. canned-response paths that skip streaming). |
last_answer_chunk_count (property) ¶
Number of AnswerText chunks received for the last turn.
Typically one chunk per generated token when streaming; useful for verifying the stream was delivered incrementally.
last_tokens_generated (property) ¶
Server-reported generated token count for the last turn.
Authoritative for both streaming and non-streaming modes. Zero if the server has not yet completed a turn.
set_on_answer_text ¶
Register (or clear) a persistent callback for streaming text chunks.
The callback is invoked on the reader thread for every AnswerText
frame received for this agent — including frames from server-initiated
turns (e.g. voice autosend after VoiceInput.end_utterance).
The callback signature is (text: str, is_delta: bool, is_final: bool).
Must return quickly and must not call blocking client methods.
Call with None to unregister the current callback.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| cb | Callable[[str, bool, bool], None] \| None | Callable invoked with (text, is_delta, is_final), or None to unregister. | required |
set_on_turn_complete ¶
Register (or clear) a persistent callback for turn-complete notifications.
The callback is invoked on the reader thread once per TurnComplete
frame for this agent. Covers both client-initiated turns (via
:meth:send_message) and server-initiated turns (voice autosend).
The callback signature is
(status: int, debug_info: str, tokens_generated: int)
where status maps to the wire TurnStatus enum
(0 = Ok, 1 = Error, 2 = Cancelled).
Call with None to unregister the current callback.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| cb | Callable \| None | Callable invoked with (status, debug_info, tokens_generated), or None to unregister. | required |
set_on_tool_call ¶
Register (or clear) a callback for tool-call notifications.
The callback is invoked on the reader thread for every
NodeEvent frame with event_type == "tool_call" the server
sends for this agent — i.e. when the graph has a ToolCall node
with notify_client = "true". It must return quickly and must
not call any blocking :class:TryllClient or :class:AgentProxy
methods.
The callback signature is (tool_name: str, arguments_json: str)
where arguments_json is a compact JSON object, e.g.
'{"city": "Berlin"}'. Parse it with :func:json.loads as
needed.
Call with None to unregister the current callback.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| cb | Callable[[str, str], None] \| None | Callable invoked with (tool_name, arguments_json), or None to unregister. | required |
set_on_intent_classified ¶
Register (or clear) a callback for intent-classification notifications.
Fired on the reader thread for every NodeEvent with
event_type == "intent_classified" — emitted by
ClassifyIntentNode on its "found" path when
notify_client = "true".
The callback signature is
(intent: str, record_id: str, record_index: int, distance: float).
Call with None to unregister.
set_on_node_event ¶
Register (or clear) a generic NodeEvent fallback callback.
Fired on the reader thread for any NodeEvent whose
event_type is unrecognised or whose typed callback (e.g.
:meth:set_on_tool_call, :meth:set_on_intent_classified) is
not registered. When a typed callback handles the event, this
one is NOT invoked for that event.
The callback signature is
(node_name: str, event_type: str, kv_pairs: list[tuple[str, str]]).
Call with None to unregister.
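A fallback handler usually wants the key/value pairs as a mapping. The following sketch converts `kv_pairs` to a dict and stores the event; `seen_events` is an illustrative list, and the registration line assumes `agent` is an AgentProxy.

```python
seen_events = []  # illustrative sink for events not handled by a typed callback


def on_node_event(node_name: str, event_type: str, kv_pairs: list) -> None:
    """Store a generic NodeEvent with its key/value pairs as a dict."""
    fields = dict(kv_pairs)  # if a key repeats, the last value wins
    seen_events.append({"node": node_name, "type": event_type, "fields": fields})


# Registration, assuming `agent` is an AgentProxy:
#   agent.set_on_node_event(on_node_event)
```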
send_message ¶
Send a user message and return the complete assistant response.
Blocks until TurnComplete is received from the server,
accumulating all AnswerText chunks into a single string.
Diagnostics from TurnComplete (debug_info,
tokens_generated) are cached on the last_* properties.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| text | str | User-turn text to send to the agent. | required |
| timeout | float | Maximum seconds to wait for | 120.0 |
Returns:
| Type | Description |
|---|---|
| str | The full concatenated assistant response text. |
Raises:
| Type | Description |
|---|---|
| TryllError | On server-reported errors, decode failures, or if |
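A caller that prefers a sentinel over an exception can wrap the blocking call. The helper below is a sketch, not part of the package: `ask_or_none` and its `error_type` parameter are hypothetical, and in real use `error_type` would be `TryllError`. It assumes `timeout` can be passed by keyword.

```python
def ask_or_none(agent, text, *, timeout=120.0, error_type=Exception):
    """Return the assistant reply, or None if send_message raises error_type."""
    try:
        return agent.send_message(text, timeout=timeout)
    except error_type as exc:
        # TryllError carries a numeric code; other exceptions may not.
        print(f"turn failed (code={getattr(exc, 'code', 0)}): {exc}")
        return None


# Usage, assuming `agent` is an AgentProxy and TryllError is imported:
#   reply = ask_or_none(agent, "Hello!", timeout=30.0, error_type=TryllError)
```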
change_param ¶
Apply a single-field mutation using a dotted attribute path.
Clones the baseline params captured at create_agent, coerces
value to the leaf field type (when value is a string), assigns
via :func:~tryll_client._generated.node_params_codec.set_dotted,
and sends the full params object. On success the baseline is updated
so later mutations compose.
Example::
agent.change_param("generate", "sampling.temperature", "0.5")
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| node_name | str | Instance name of the target node. | required |
| path | str | Dotted path to the field (e.g. | required |
| value | Any | New value (string from JSON/dialog scripts, or a native Python value). | required |
| timeout | float | Maximum seconds to wait for the server | 30.0 |
Raises:
| Type | Description |
|---|---|
| TryllError | If node_name is unknown (code 3005) or the server reports another error. |
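The clone, coerce, and assign steps described above can be illustrated with plain dataclasses. This is a sketch under stated assumptions, not the package's `set_dotted` implementation: `SamplingSketch`, `GenerateSketch`, and `apply_dotted` are hypothetical stand-ins, and coercion is done naively from the type of the current leaf value.

```python
import copy
from dataclasses import dataclass, field


@dataclass
class SamplingSketch:  # hypothetical stand-in for a sampling DTO
    temperature: float = 0.0
    seed: int = 0


@dataclass
class GenerateSketch:  # hypothetical stand-in for a generate-node DTO
    system_prompt: str = ""
    sampling: SamplingSketch = field(default_factory=SamplingSketch)


def apply_dotted(baseline, path, value):
    """Return a mutated deep copy of baseline; the baseline itself is untouched."""
    params = copy.deepcopy(baseline)
    *parents, leaf = path.split(".")
    target = params
    for name in parents:
        target = getattr(target, name)
    current = getattr(target, leaf)  # raises AttributeError for an unknown field
    if isinstance(value, str) and not isinstance(current, str):
        value = type(current)(value)  # coerce "0.5" to 0.5 for a float leaf
    setattr(target, leaf, value)
    return params
```

Note that this naive coercion is wrong for bool leaves (`bool("false")` is `True`), so a real implementation needs explicit parsing for them.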
change_params ¶
Apply a typed node-parameter update to a workflow node at runtime.
The typed params object must match the concrete type of the target node.
Structural fields must match the create-time values or the server
returns ParamNotMutable. On success, updates the internal baseline
used by :meth:change_param.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| node_name | str | Instance name of the target node. | required |
| params | 'NodeParamsBase' | Typed node parameters (generated DTO). | required |
| timeout | float | Maximum seconds to wait for the server | 30.0 |
Raises:
| Type | Description |
|---|---|
| TryllError | On server-reported errors ( |
destroy ¶
Request agent destruction on the server and wait for Ack.
After this call returns, the proxy is no longer usable; further
send_message calls will raise :class:TryllError.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| timeout | float | Maximum seconds to wait for the | 30.0 |
Raises:
| Type | Description |
|---|---|
| TryllError | On server-reported errors or timeout. |
tryll_client.graph¶
tryll_client.graph ¶
GraphDescription builder and associated enums / dataclasses.
Mirrors the C++ Tryll::Client::GraphDescription / InferenceEngine
/ NodeType types. Enum values must stay in sync with
server/schema/messages.fbs.
This module exposes:
- :class:GraphDescription, a fluent builder for a workflow graph sent to the server on CreateAgent.
- The enums used by graph descriptions: :class:InferenceEngine, :class:ModelStatus.
- The typed node-parameter DTOs from tryll_client._generated.node_params.
NodeParamsBase ¶
Marker base for generated node-parameter DTOs.
NodeType ¶
Bases: IntEnum
Node type ordinal. Mirrors the FlatBuffers NodeType enum in messages.fbs.
The union discriminant in NodeParams supersedes this for the wire format;
NodeType is kept for client-side ergonomics and generated convenience APIs.
InferenceEngine ¶
Bases: IntEnum
Inference backend. Mirrors the FlatBuffers InferenceEngine enum.
ModelStatus ¶
Bases: IntEnum
Status of a model as reported by the server.
ModelInfo dataclass ¶
Summary information for one model returned by list_models().
StringStorageKind ¶
Bases: IntEnum
Kind of a StringStorage. Mirrors StringStorageKind in messages.fbs.
NodeDesc dataclass ¶
GraphDescription ¶
Fluent builder for a workflow graph sent to the server.
Typical usage::
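    from tryll_client import GraphDescription
    from tryll_client.graph import GenerateParams

    graph = (GraphDescription()
        .add_generate("gen", GenerateParams(
            model_name="qwen2.5-0.5b-instruct",
            system_prompt="You are helpful."))
        .set_start_node("gen"))
    # client is an already-connected TryllClient; create_agent is a synchronous call
    agent = client.create_agent(graph)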
tryll_client.errors¶
tryll_client.errors ¶
Exception type for Tryll client errors.
All failures surfaced by the synchronous client — server-reported errors,
protocol-level decode failures, and timeouts — are raised as
:class:TryllError. Server-reported errors carry a numeric code that
matches the ranges defined in server/common/include/tryll/ErrorCodes.h;
client-originated errors (timeouts, closed socket) carry code == 0.
TryllError ¶
Bases: Exception
Raised on server errors, protocol errors, or timeouts.
Attributes:
| Name | Type | Description |
|---|---|---|
| code | | Numeric error code from the server's |
Initialise a Tryll client error.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| message | str | Human-readable description of the failure, typically copied from the server's | required |
| code | int | Numeric error code from | 0 |
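Since client-originated errors carry `code == 0` and server-reported errors carry a non-zero code, a handler can branch on that attribute. The helper below is a hypothetical sketch; it reads `code` with `getattr` so that it also works with exceptions that lack the attribute.

```python
def describe_failure(exc: Exception) -> str:
    """Summarise an error by whether it originated in the client or the server."""
    code = getattr(exc, "code", 0)
    if code == 0:
        return f"client-originated failure: {exc}"
    return f"server error {code}: {exc}"


# Usage, assuming TryllError is imported and `agent` is an AgentProxy:
#   try:
#       agent.send_message("Hello!")
#   except TryllError as exc:
#       print(describe_failure(exc))
```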
tryll_client.managed_server¶
tryll_client.managed_server ¶
managed_server — spawn a tryll_server child process and wait for TCP readiness.
Typical usage::
    from pathlib import Path
    from tryll_client import TryllClient, ManagedServer

    with ManagedServer.start(exe=Path("C:/tryll/tryll_server.exe"), port=9100) as srv:
        client = TryllClient.connect(srv.host, srv.port)
        client.configure_session(...)
        # … use client …
        client.shutdown()
No executable discovery is performed here. Pass the path to the server
executable explicitly (or use the helpers in qa-and-eval/tryll_qa to
locate it).
ManagedServer ¶
ManagedServer(*, exe, host='127.0.0.1', port=9100, cwd=None, extra_args=(), stdout=None, stderr=None, start_timeout=30.0, stop_timeout=8.0)
RAII handle around a child tryll_server process.
Use as a context manager (recommended), or call :meth:stop explicitly::
    with ManagedServer.start(exe=Path("..."), port=9100) as srv:
        client = TryllClient.connect(srv.host, srv.port)
        ...
No executable discovery is performed — pass the path explicitly.
The server is always launched with --port <port> so the caller's port
takes precedence over whatever is in server-config.json.
start classmethod ¶
start(*, exe, host='127.0.0.1', port=9100, cwd=None, extra_args=(), stdout=None, stderr=None, start_timeout=30.0, stop_timeout=8.0)
Spawn the server and block until its TCP port is ready.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| exe | Path \| str | Path to | required |
| host | str | Host string for the TCP ready-probe (default | '127.0.0.1' |
| port | int | TCP port; passed as | 9100 |
| cwd | Path \| None | Working directory for the child process (defaults to | None |
| extra_args | Sequence[str] | Additional CLI arguments appended after | () |
| stdout | Path \| None | File path for child stdout redirection ( | None |
| stderr | Path \| None | File path for child stderr redirection ( | None |
| start_timeout | float | Seconds to wait for the TCP port to open (default 30). | 30.0 |
| stop_timeout | float | Seconds to wait for clean exit on :meth: | 8.0 |
Returns:
| Name | Type | Description |
|---|---|---|
| A | 'ManagedServer' | class: |
Raises:
| Type | Description |
|---|---|
| FileNotFoundError | If exe does not exist. |
| TimeoutError | If the port does not open within start_timeout. |
| OSError | If the process cannot be spawned. |
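In Python, both FileNotFoundError and TimeoutError are subclasses of OSError, so code that distinguishes the three failures listed above has to test the specific types first. The helper below is a hypothetical sketch of that ordering; the returned labels are illustrative.

```python
def classify_start_failure(exc: BaseException) -> str:
    """Map a startup exception to a label, checking subclasses before OSError."""
    if isinstance(exc, FileNotFoundError):
        return "missing-executable"
    if isinstance(exc, TimeoutError):
        return "port-not-ready"
    if isinstance(exc, OSError):
        return "spawn-failed"
    return "unexpected"


# Usage, assuming ManagedServer is imported and `exe` is the server path:
#   try:
#       srv = ManagedServer.start(exe=exe, port=9100)
#   except OSError as exc:
#       print(classify_start_failure(exc))
```

The same rule applies to `except` clauses: an `except OSError` placed first would also catch the two more specific errors.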
stop ¶
Terminate the child process.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| timeout | float \| None | Seconds to wait before force-killing (defaults to the stop_timeout supplied at construction). | None |
wait_for_tcp ¶
Block until host:port accepts a TCP connection or timeout_sec expires.
Raises :class:TimeoutError if the port does not open in time.
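A readiness probe of this kind can be sketched with the standard library alone. The function below is an assumption about how such a probe might work, not the package's implementation; `wait_for_tcp_sketch` and `poll_interval` are hypothetical names.

```python
import socket
import time


def wait_for_tcp_sketch(host: str, port: int, timeout_sec: float,
                        poll_interval: float = 0.1) -> None:
    """Poll host:port until a TCP connect succeeds, else raise TimeoutError."""
    deadline = time.monotonic() + timeout_sec
    while True:
        try:
            # A successful connect means the server is accepting connections.
            with socket.create_connection((host, port), timeout=poll_interval):
                return
        except OSError:
            if time.monotonic() >= deadline:
                raise TimeoutError(
                    f"{host}:{port} not accepting connections after {timeout_sec}s"
                ) from None
            time.sleep(poll_interval)
```

Using `time.monotonic()` for the deadline keeps the wait immune to wall-clock adjustments.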
start_server_process ¶
Start exe as a child process, passing --port <port> on the command line.
cwd defaults to the executable's directory (where data/ usually lives).
log_dir is the directory where server_stdout.txt and
server_stderr.txt are written. Defaults to the executable's directory.
Pass Path(os.devnull) to suppress output entirely.
extra_args is appended to the command line after --port <port>; use it
to pass server-startup flags such as --disable-telemetry.
stop_server_process ¶
Terminate proc; kill if it does not exit within terminate_timeout.
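The terminate-then-kill pattern described here can be sketched with `subprocess.Popen`. This is an illustration under the assumption that `proc` is a `Popen` object, not the package's code; `stop_process_sketch` is a hypothetical name and the 8.0 default mirrors the `stop_timeout` default shown earlier.

```python
import subprocess


def stop_process_sketch(proc: subprocess.Popen, terminate_timeout: float = 8.0) -> int:
    """Ask proc to exit; force-kill it if still alive after terminate_timeout."""
    if proc.poll() is None:  # still running
        proc.terminate()  # SIGTERM on POSIX, TerminateProcess on Windows
        try:
            proc.wait(timeout=terminate_timeout)
        except subprocess.TimeoutExpired:
            proc.kill()  # force-kill
            proc.wait()  # reap the child so it does not linger
    return proc.returncode
```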