Retrieve filter grammar¶
The Retrieve node's filter parameter is a JSON document that the server compiles
into a typed predicate and applies against each candidate record's metadata during
vector search. Filters are validated against the knowledge base's
metadata schema at compile time —
unknown fields, wrong types, and malformed grammar are rejected before the filter
takes effect.
Top-level node shape¶
Every filter node is a JSON object with an op key. The supported ops:
| op | Shape | Meaning |
|---|---|---|
| `"and"` | `{"op":"and","args":[...]}` | All children must evaluate to true. |
| `"or"` | `{"op":"or","args":[...]}` | At least one child must evaluate to true. |
| `"not"` | `{"op":"not","arg":{...}}` | Inverts the child. |
| `"eq"` / `"ne"` | `{"op":"eq","lhs":<operand>,"rhs":<operand>}` | Equality / inequality. |
| `"lt"` / `"le"` / `"gt"` / `"ge"` | Same shape as `eq` | Ordering; numeric operands only. |
| `"in"` | `{"op":"in","needle":<operand>,"haystack":<operand>}` | Membership; `haystack` must be a `set<string>` field or an array literal. |
Operands¶
An operand is one of:
{ "knowledge": "<field-name>" } // reference a knowledge-side field
{ "value": <json-literal> } // scalar literal (int | float | string | bool)
// or string-array literal (only as `in.haystack`)
The agent namespace is reserved but not implemented yet — supplying
{"agent": "..."} returns filter.unknown_namespace. In Phase 1, embed agent
state as {"value": ...} literals when building the filter.
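The operand and node shapes above can be wrapped in small client-side helpers so that filters are built from typed calls rather than hand-written JSON. The following is a minimal sketch; the helper names `knowledge`, `value`, `compare`, and `member` are hypothetical and not part of any documented SDK.

```python
import json

# Literal types the grammar accepts as {"value": ...} scalars.
SCALAR_TYPES = (bool, int, float, str)


def knowledge(field):
    """Operand that references a knowledge-side metadata field."""
    return {"knowledge": field}


def value(literal):
    """Operand that wraps a scalar literal or a string-array literal."""
    if isinstance(literal, list):
        # Array literals are string arrays and are only valid as in.haystack.
        if not all(isinstance(item, str) for item in literal):
            raise TypeError("array literals must contain only strings")
        return {"value": list(literal)}
    if not isinstance(literal, SCALAR_TYPES):
        raise TypeError(f"unsupported literal type: {type(literal).__name__}")
    return {"value": literal}


def compare(op, lhs, rhs):
    """Build an eq / ne / lt / le / gt / ge node."""
    return {"op": op, "lhs": lhs, "rhs": rhs}


def member(needle, haystack):
    """Build an in node."""
    return {"op": "in", "needle": needle, "haystack": haystack}


# Agent state is embedded as a literal because the agent namespace is reserved.
level_gate = compare("ge", value(12), knowledge("min_level"))
filter_json = json.dumps(level_gate)
```

The helpers only shape the JSON; schema and type validation still happens on the server when the filter compiles.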
The four most common patterns¶
// 1. agent.level >= knowledge.min_level
// Client knows player_level = 12.
{ "op": "ge",
"lhs": { "value": 12 },
"rhs": { "knowledge": "min_level" } }
// 2. agent.character_class == knowledge.character_class
{ "op": "eq",
"lhs": { "value": "warrior" },
"rhs": { "knowledge": "character_class" } }
// 3. agent.character_class IN knowledge.character_classes
{ "op": "in",
"needle": { "value": "warrior" },
"haystack": { "knowledge": "character_classes" } }
// 4. knowledge.quest_reached IN agent.quests_reached
// Client embeds its quest set as an array literal.
{ "op": "in",
"needle": { "knowledge": "quest_reached" },
"haystack": { "value": ["intro", "rescue_villager", "lost_amulet"] } }
Composing the gates:
{ "op": "and", "args": [
{ "op": "ge",
"lhs": { "value": 12 },
"rhs": { "knowledge": "min_level" } },
{ "op": "in",
"needle": { "value": "mage" },
"haystack": { "knowledge": "character_classes" } }
] }
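In client code, the composed filter would typically be generated from current player state rather than written by hand. A minimal sketch, assuming a hypothetical `build_gate_filter` function and the same `min_level` and `character_classes` fields used above:

```python
import json


def build_gate_filter(player_level, character_class):
    """Combine the level gate and the class gate under a single "and" node."""
    return {
        "op": "and",
        "args": [
            # player_level >= knowledge.min_level
            {"op": "ge",
             "lhs": {"value": player_level},
             "rhs": {"knowledge": "min_level"}},
            # character_class IN knowledge.character_classes
            {"op": "in",
             "needle": {"value": character_class},
             "haystack": {"knowledge": "character_classes"}},
        ],
    }


# Serialized form, ready to be supplied as the filter parameter.
filter_json = json.dumps(build_gate_filter(12, "mage"))
```

Calling `build_gate_filter(12, "mage")` yields the same tree as the JSON example above.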
Validation rules¶
A filter is rejected at compile time (returning error
3007 InvalidParamValue) if any of:
- An `op` value is not in the supported set.
- An operand uses the reserved `agent` namespace.
- A `{"knowledge": "..."}` reference names a field absent from the schema, or a field declared `filterable: false`.
- Operand types are incompatible:
  - `lt`/`le`/`gt`/`ge` on non-numeric operands.
  - `in` whose `haystack` is neither a `set<string>` field nor an array literal.
  - `in` whose `needle` type does not match the haystack's element type.
  - `eq`/`ne` on mismatched types (with `int`↔`float` promotion as the only permitted exception).
- An array literal appears anywhere other than as the `haystack` of an `in`.
- `and`/`or` `args` array is empty.
- Tree depth exceeds 16 or total node count exceeds 256.
Errors include the JSON path of the offending node — e.g.
filter.type_mismatch: at args[1].rhs: cannot compare string with int.
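Some of these rules are purely structural and can be checked on the client before the filter is sent. The sketch below is a hypothetical pre-check, not the server's validator: it covers only the op set, the reserved `agent` namespace, misplaced array literals, empty `args`, and the size limits, and it cannot check schema or type rules. It assumes that the root node has depth 1, that only `op` nodes (not operands) count toward the limits, and that each node carries the keys its op requires.

```python
SUPPORTED_OPS = {"and", "or", "not", "eq", "ne", "lt", "le", "gt", "ge", "in"}
MAX_DEPTH = 16    # limit from the validation rules
MAX_NODES = 256   # limit from the validation rules


def precheck(filter_node):
    """Return a list of (json_path, message) pairs for structural problems."""
    problems = []
    node_count = 0

    def join(path, key):
        return f"{path}.{key}" if path else key

    def check_operand(operand, path, allow_array):
        if "agent" in operand:
            problems.append((path, "agent namespace is reserved"))
        if isinstance(operand.get("value"), list) and not allow_array:
            problems.append((path, "array literal outside in.haystack"))

    def walk(node, path, depth):
        nonlocal node_count
        node_count += 1
        if depth > MAX_DEPTH:  # assumption: the root counts as depth 1
            problems.append((path, "tree depth exceeds 16"))
            return
        op = node.get("op")
        if op not in SUPPORTED_OPS:
            problems.append((path, f"unsupported op {op!r}"))
        elif op in ("and", "or"):
            args = node.get("args", [])
            if not args:
                problems.append((join(path, "args"), "args array is empty"))
            for i, child in enumerate(args):
                walk(child, f"{join(path, 'args')}[{i}]", depth + 1)
        elif op == "not":
            walk(node["arg"], join(path, "arg"), depth + 1)
        elif op == "in":
            check_operand(node["needle"], join(path, "needle"), allow_array=False)
            check_operand(node["haystack"], join(path, "haystack"), allow_array=True)
        else:  # eq, ne, lt, le, gt, ge
            check_operand(node["lhs"], join(path, "lhs"), allow_array=False)
            check_operand(node["rhs"], join(path, "rhs"), allow_array=False)

    walk(filter_node, "", 1)
    if node_count > MAX_NODES:
        problems.append(("", "total node count exceeds 256"))
    return problems
```

The paths follow the same `args[1].rhs` style as the server's error messages, so a problem found locally can be read the same way as one returned from compilation.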
Missing-field semantics¶
When the schema declares a field as optional: true and a record omits the
field, every comparison or in operation referencing that field on that
record evaluates to false. This is intentional: to express "no restriction
applies" for records that omit the field, the filter author should encode
that explicitly with or and a separate gating field.
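The effect of this rule, and the `or` workaround, can be illustrated with a toy local evaluator. This is only a model of the semantics described above, not the server's implementation, and the boolean gating field `level_gated` is a hypothetical schema field introduced for the example.

```python
import operator

MISSING = object()  # sentinel for a field the record omits
COMPARATORS = {"eq": operator.eq, "ne": operator.ne, "lt": operator.lt,
               "le": operator.le, "gt": operator.gt, "ge": operator.ge}


def resolve(operand, record):
    if "knowledge" in operand:
        return record.get(operand["knowledge"], MISSING)
    return operand["value"]


def evaluate(node, record):
    op = node["op"]
    if op == "and":
        return all(evaluate(child, record) for child in node["args"])
    if op == "or":
        return any(evaluate(child, record) for child in node["args"])
    if op == "not":
        return not evaluate(node["arg"], record)
    if op == "in":
        needle = resolve(node["needle"], record)
        haystack = resolve(node["haystack"], record)
        if needle is MISSING or haystack is MISSING:
            return False  # missing field: the operation is false
        return needle in haystack
    lhs, rhs = resolve(node["lhs"], record), resolve(node["rhs"], record)
    if lhs is MISSING or rhs is MISSING:
        return False  # missing field: the comparison is false
    return COMPARATORS[op](lhs, rhs)


level_gate = {"op": "ge",
              "lhs": {"value": 12},
              "rhs": {"knowledge": "min_level"}}

# "Either the record is not level-gated, or the player meets min_level."
ungated_or_level_ok = {"op": "or", "args": [
    {"op": "eq",
     "lhs": {"knowledge": "level_gated"},  # hypothetical gating field
     "rhs": {"value": False}},
    level_gate,
]}

open_record = {"level_gated": False}                   # omits min_level
gated_record = {"level_gated": True, "min_level": 20}

level_only = evaluate(level_gate, open_record)          # False: min_level is missing
with_gate = evaluate(ungated_or_level_ok, open_record)  # True: the gating field matches
too_low = evaluate(ungated_or_level_ok, gated_record)   # False: 12 < 20
```

Without the gating field, the record that omits `min_level` would never be retrieved by the level filter, even though it carries no level restriction.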
Mutating the filter at runtime¶
filter is a mutable parameter:
import json

# player.level is read on the client and embedded as a literal operand.
agent.change_param("retrieve", "filter", json.dumps({
"op": "ge",
"lhs": {"value": player.level},
"rhs": {"knowledge": "min_level"},
}))
The previous filter remains in effect until the new JSON compiles successfully. Sending an empty string clears the filter.
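A client might wrap both operations, setting and clearing, in two small functions. The sketch below assumes only the `change_param` call shown above; `set_level_filter`, `clear_filter`, and the `RecordingAgent` stand-in (including its parameter names) are hypothetical and exist so the example runs without a live agent.

```python
import json


def set_level_filter(agent, player_level):
    """Replace the Retrieve filter with a level gate for the current level."""
    agent.change_param("retrieve", "filter", json.dumps({
        "op": "ge",
        "lhs": {"value": player_level},
        "rhs": {"knowledge": "min_level"},
    }))


def clear_filter(agent):
    """Clear the Retrieve filter by sending an empty string."""
    agent.change_param("retrieve", "filter", "")


class RecordingAgent:
    """Hypothetical stand-in that records change_param calls."""

    def __init__(self):
        self.calls = []

    def change_param(self, node, param, value):
        self.calls.append((node, param, value))


agent = RecordingAgent()
set_level_filter(agent, 12)  # e.g. after the player reaches level 12
clear_filter(agent)          # e.g. when gating is switched off
```

Because the previous filter stays active until the new JSON compiles, a rejected update leaves retrieval behavior unchanged rather than unfiltered.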