Retrieve filter grammar

The Retrieve node's filter parameter is a JSON document that the server compiles into a typed predicate and applies against each candidate record's metadata during vector search. Filters are validated against the knowledge base's metadata schema at compile time — unknown fields, wrong types, and malformed grammar are rejected before the filter takes effect.

Top-level node shape

Every filter node is a JSON object with an op key. The supported ops:

op                           Shape                                                  Meaning
"and"                        {"op":"and","args":[...]}                              All children must evaluate to true.
"or"                         {"op":"or","args":[...]}                               At least one child must evaluate to true.
"not"                        {"op":"not","arg":{...}}                               Invert the child.
"eq" / "ne"                  {"op":"eq","lhs":<operand>,"rhs":<operand>}            Equality / inequality.
"lt" / "le" / "gt" / "ge"    same shape as eq                                       Ordering; numeric operands only.
"in"                         {"op":"in","needle":<operand>,"haystack":<operand>}    Membership; haystack must be a set<string> field or an array literal.

Operands

An operand is one of:

{ "knowledge": "<field-name>" }     // reference a knowledge-side field
{ "value":     <json-literal> }     // scalar literal (int | float | string | bool)
                                    //   or string-array literal (only as `in.haystack`)

The agent namespace is reserved but not implemented yet — supplying {"agent": "..."} returns filter.unknown_namespace. In Phase 1, embed agent state as {"value": ...} literals when building the filter.

The four most common patterns

// 1. agent.level >= knowledge.min_level
//    Client knows player_level = 12.
{ "op": "ge",
  "lhs": { "value": 12 },
  "rhs": { "knowledge": "min_level" } }

// 2. agent.character_class == knowledge.character_class
{ "op": "eq",
  "lhs": { "value": "warrior" },
  "rhs": { "knowledge": "character_class" } }

// 3. agent.character_class IN knowledge.character_classes
{ "op": "in",
  "needle":   { "value":     "warrior" },
  "haystack": { "knowledge": "character_classes" } }

// 4. knowledge.quest_reached IN agent.quests_reached
//    Client embeds its quest set as an array literal.
{ "op": "in",
  "needle":   { "knowledge": "quest_reached" },
  "haystack": { "value":     ["intro", "rescue_villager", "lost_amulet"] } }

Composing the gates:

{ "op": "and", "args": [
    { "op": "ge",
      "lhs": { "value": 12 },
      "rhs": { "knowledge": "min_level" } },
    { "op": "in",
      "needle":   { "value":     "mage" },
      "haystack": { "knowledge": "character_classes" } }
] }
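
Because agent state has to be embedded as literals in Phase 1, clients usually end up assembling these trees in code. The following is a minimal Python sketch of that; the helper names (knowledge, value, compare, member, all_of) are hypothetical and not part of any shipped SDK. They only produce the JSON shapes described above.

```python
import json

def knowledge(field):
    """Operand that references a knowledge-side metadata field."""
    return {"knowledge": field}

def value(literal):
    """Operand that embeds a literal; agent state goes here in Phase 1."""
    return {"value": literal}

def compare(op, lhs, rhs):
    """Build an eq/ne/lt/le/gt/ge node."""
    return {"op": op, "lhs": lhs, "rhs": rhs}

def member(needle, haystack):
    """Build an in node."""
    return {"op": "in", "needle": needle, "haystack": haystack}

def all_of(*children):
    """Build an and node; an empty args array would be rejected, so fail early."""
    if not children:
        raise ValueError("and requires at least one child")
    return {"op": "and", "args": list(children)}

# Same tree as the composed example above, built from client-side values.
player_level = 12
player_class = "mage"
gate = all_of(
    compare("ge", value(player_level), knowledge("min_level")),
    member(value(player_class), knowledge("character_classes")),
)
filter_json = json.dumps(gate)
```

Building the tree from small functions keeps the operand order (lhs/rhs, needle/haystack) in one place, which is where most hand-written filters go wrong.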

Validation rules

A filter is rejected at compile time (returning error 3007 InvalidParamValue) if any of:

  • An op value is not in the supported set.
  • An operand uses the reserved agent namespace.
  • A {"knowledge": "..."} reference names a field absent from the schema, or a field declared filterable: false.
  • Operand types are incompatible:
      • lt/le/gt/ge on non-numeric operands.
      • in whose haystack is neither a set<string> field nor an array literal.
      • in whose needle type does not match the haystack's element type.
      • eq/ne on mismatched types (int/float promotion is the only permitted exception).
  • An array literal appears anywhere other than as the haystack of an in.
  • and/or args array is empty.
  • Tree depth exceeds 16 or total node count exceeds 256.
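
Several of these rules are purely structural and can be checked on the client before a filter is sent. The sketch below is one possible pre-check in Python, not the server's validator. It has no access to the schema, so it cannot catch unknown fields or type mismatches. It also assumes that only op nodes count toward the limits and that the root sits at depth 1; the server may count differently.

```python
MAX_DEPTH = 16
MAX_NODES = 256
COMPARISONS = {"eq", "ne", "lt", "le", "gt", "ge"}

def _join(path, segment):
    """Extend a dotted JSON path; the root is the empty string."""
    return segment if not path else path + "." + segment

def _check_operand(operand, path, allow_array, problems):
    if not isinstance(operand, dict) or len(operand) != 1:
        problems.append((path, "operand must be an object with exactly one key"))
    elif "agent" in operand:
        problems.append((path, "agent namespace is reserved"))
    elif "value" in operand:
        if isinstance(operand["value"], list) and not allow_array:
            problems.append((path, "array literal is only allowed as in.haystack"))
    elif "knowledge" not in operand:
        problems.append((path, "unknown operand namespace"))

def precheck(root):
    """Return a list of (path, message) pairs; an empty list means no structural problem was found."""
    problems, count = [], 0
    stack = [(root, "", 1)]
    while stack:
        node, path, depth = stack.pop()
        count += 1
        if depth > MAX_DEPTH:
            problems.append((path, "tree depth exceeds 16"))
            continue
        op = node.get("op") if isinstance(node, dict) else None
        if op in ("and", "or"):
            args = node.get("args")
            if not isinstance(args, list) or not args:
                problems.append((path, op + " args array is empty"))
                continue
            for i, child in enumerate(args):
                stack.append((child, _join(path, "args[%d]" % i), depth + 1))
        elif op == "not":
            stack.append((node.get("arg"), _join(path, "arg"), depth + 1))
        elif op in COMPARISONS:
            _check_operand(node.get("lhs"), _join(path, "lhs"), False, problems)
            _check_operand(node.get("rhs"), _join(path, "rhs"), False, problems)
        elif op == "in":
            _check_operand(node.get("needle"), _join(path, "needle"), False, problems)
            _check_operand(node.get("haystack"), _join(path, "haystack"), True, problems)
        else:
            problems.append((path, "unsupported op %r" % (op,)))
    if count > MAX_NODES:
        problems.append(("", "total node count exceeds 256"))
    return problems
```

A filter that passes this check can still be rejected by the server, which remains the authority on the schema and on types.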

Errors include the JSON path of the offending node — e.g. filter.type_mismatch: at args[1].rhs: cannot compare string with int.
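
The path in the message can be used to locate the offending node in the filter that was sent. The Python sketch below assumes every message follows the "<code>: at <path>: <detail>" layout of the single example above, which the source does not guarantee; split_error and node_at are hypothetical helper names.

```python
import re

_MESSAGE = re.compile(r"([\w.]+): at ([^:]+): (.*)")
_SEGMENT = re.compile(r"([A-Za-z_]\w*)((?:\[\d+\])*)")

def split_error(message):
    """Split an error message into (code, path, detail)."""
    match = _MESSAGE.fullmatch(message)
    if match is None:
        raise ValueError("unrecognised error layout: %r" % (message,))
    return match.group(1), match.group(2), match.group(3)

def node_at(filter_doc, path):
    """Follow a path such as 'args[1].rhs' into a parsed filter."""
    node = filter_doc
    if not path:
        return node
    for part in path.split("."):
        match = _SEGMENT.fullmatch(part)
        if match is None:
            raise ValueError("unrecognised path segment: %r" % (part,))
        node = node[match.group(1)]                      # object key, e.g. args
        for index in re.findall(r"\[(\d+)\]", match.group(2)):
            node = node[int(index)]                      # array index, e.g. [1]
    return node

# A filter whose second child compares a string field with an int literal.
sent = {"op": "and", "args": [
    {"op": "ge", "lhs": {"value": 12}, "rhs": {"knowledge": "min_level"}},
    {"op": "eq", "lhs": {"knowledge": "character_class"}, "rhs": {"value": 7}},
]}
code, path, detail = split_error(
    "filter.type_mismatch: at args[1].rhs: cannot compare string with int")
offending = node_at(sent, path)
```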

Missing-field semantics

When the schema declares a field as optional: true and a record omits that field, every comparison or in operation that references the field evaluates to false for that record. This is intentional. To make a missing field mean "no restriction applies", the filter author should encode that explicitly, using or together with a separate gating field.
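
To make the rule concrete, here is a small local model of it in Python. It is a sketch for reasoning about filters offline, not the server's evaluator, and it skips all schema and type validation. The gating field level_gated is a hypothetical required boolean invented for this example.

```python
import operator

_MISSING = object()
_COMPARE = {"eq": operator.eq, "ne": operator.ne, "lt": operator.lt,
            "le": operator.le, "gt": operator.gt, "ge": operator.ge}

def resolve(operand, record):
    """Return the operand's value, or _MISSING if the record omits the field."""
    if "knowledge" in operand:
        return record.get(operand["knowledge"], _MISSING)
    return operand["value"]

def evaluate(node, record):
    op = node["op"]
    if op == "and":
        return all(evaluate(child, record) for child in node["args"])
    if op == "or":
        return any(evaluate(child, record) for child in node["args"])
    if op == "not":
        return not evaluate(node["arg"], record)
    if op == "in":
        needle = resolve(node["needle"], record)
        haystack = resolve(node["haystack"], record)
        if needle is _MISSING or haystack is _MISSING:
            return False          # missing field: the in operation is false
        return needle in haystack
    lhs, rhs = resolve(node["lhs"], record), resolve(node["rhs"], record)
    if lhs is _MISSING or rhs is _MISSING:
        return False              # missing field: every comparison is false, ne included
    return _COMPARE[op](lhs, rhs)

level_gate = {"op": "ge", "lhs": {"value": 12}, "rhs": {"knowledge": "min_level"}}

# "No restriction applies" spelled out with or and a gating field (hypothetical name).
open_or_gated = {"op": "or", "args": [
    {"op": "eq", "lhs": {"knowledge": "level_gated"}, "rhs": {"value": False}},
    level_gate,
]}

ungated_record = {"level_gated": False}            # omits min_level
hidden = evaluate(level_gate, ungated_record)      # bare gate excludes the record
visible = evaluate(open_or_gated, ungated_record)  # explicit gate lets it through
```

With the bare level gate, a record that omits min_level is excluded; with the explicit or, the same record is returned because the gating field says no level restriction applies.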

Mutating the filter at runtime

filter is a mutable parameter:

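import json

# Rebuild the level gate from the player's current level and send it to the Retrieve node.
agent.change_param("retrieve", "filter", json.dumps({
    "op": "ge",
    "lhs": {"value": player.level},
    "rhs": {"knowledge": "min_level"},
}))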

The previous filter remains in effect until the new JSON compiles successfully. Sending an empty string clears the filter.
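
Setting and clearing can share one code path. The Python sketch below wraps the change_param call shown above; set_filter is a hypothetical helper, and RecordingAgent is a stand-in that only records calls so the example runs without a live agent. How change_param reports a compile failure is not covered here, so the sketch does not try to handle one.

```python
import json

def set_filter(agent, node):
    """Send a filter tree to the Retrieve node, or clear the filter when node is None."""
    payload = "" if node is None else json.dumps(node)
    agent.change_param("retrieve", "filter", payload)
    return payload

class RecordingAgent:
    """Hypothetical stand-in for a real agent; it only records what was sent."""
    def __init__(self):
        self.calls = []

    def change_param(self, node_name, param, value):
        self.calls.append((node_name, param, value))

agent = RecordingAgent()
set_filter(agent, {"op": "ge",
                   "lhs": {"value": 12},
                   "rhs": {"knowledge": "min_level"}})
set_filter(agent, None)  # sends an empty string, which clears the filter
```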