diff --git a/CHANGELOG.md b/CHANGELOG.md
index ad47b45..b4f96e4 100644
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@@ -167,6 +167,25 @@ version 2: its `{{ … }}` sequences become substitution points, and its
 
 ## OVOS-MSG-1 — Bus Message
 
+### 2
+
+- §2.1.1 — the topic convention made the single authoritative rule
+  every topic-defining specification inherits: a `:` in a topic marks
+  a **dispatch-shaped** topic assembled from identifiers (canonical
+  shape `:`), definable only by a formal
+  specification; all other topics use the dotted
+  `..` form and MUST NOT contain `:`. Separator-hygiene
+  rules for identifiers used as topic components restated per shape.
+- §2 — unknown top-level keys: producers MUST NOT emit them;
+  consumers SHOULD treat such a Message as malformed but MAY ignore
+  the unknown keys, keeping strictness on the producer side. §7
+  consumer conformance aligned.
+- §5.2 — `reply` over an array `destination`: selecting the first
+  element as the new `source` is RECOMMENDED for deterministic
+  convergence; the choice remains implementation-defined and
+  consumers still MUST NOT rely on it.
+- §3.1 walkthrough marked informative.
+
 ### 1
 
 - Initial draft. Formalizes existing OVOS bus behaviour as a single
@@ -401,6 +420,23 @@ version 2: its `{{ … }}` sequences become substitution points, and its
 - §9.6 — the OPTIONAL `listen` field on `ovos.utterance.speak`: when
   `true`, the output stage re-opens the user input channel after the
   response is delivered.
+- Consistency and design review: §4/§9.1 — when the entry topic carries
+  no authoritative `lang`, the orchestrator MUST resolve the utterance
+  language once (OVOS-SESSION-1 §3.2 evidence) and pass the resolved tag
+  to every plugin's `match` call; plugins MAY refine but MUST NOT
+  re-derive independently (`Match.lang` remains the plugin's
+  declaration). §4.4 — RECOMMENDED default match-phase timeout of 10 s;
+  an applied bound MUST be at least any stage-internal collection
+  ceiling. §6.1 — context decay aligned with OVOS-CONTEXT-1 §4: the
+  post-match `turns_remaining` decrement runs after the match round
+  whether or not any intent matched, with freshly written entries
+  exempt; promotion citations corrected to CONTEXT-1 §5.1. §6.5 —
+  orchestrator liveness: the bus loop MUST keep servicing subscriptions
+  (including poll replies for an in-flight plugin) while a `match` call
+  is in flight. §7.1/§7.3 — `active_handlers` stamping suppression MUST
+  key on the Match's reserved `intent_name`, never the producing
+  `pipeline_id`. SESSION-1 registry citations corrected to §2.2;
+  reservation wording made timeless.
 
 ### 1
 
diff --git a/msg-1.md b/msg-1.md
index 3e2e70d..45b2555 100644
--- a/msg-1.md
+++ b/msg-1.md
@@ -1,6 +1,6 @@
 # Bus Message Specification
 
-**Spec ID:** OVOS-MSG-1 · **Version:** 1 · **Status:** Draft
+**Spec ID:** OVOS-MSG-1 · **Version:** 2 · **Status:** Draft
 
 This document defines the **bus message** — the single unit of
 communication exchanged between components of a voice-assistant
@@ -92,8 +92,14 @@ A Message is a **JSON object** with exactly these top-level keys:
 
 Producers **MAY** omit `data` and/or `context` when they would be
 empty; consumers **MUST** treat an absent `data` or `context` as
-equivalent to `{}`. Other top-level keys **MUST NOT** appear;
-consumers **MUST** reject any Message with unknown top-level keys.
+equivalent to `{}`. Producers **MUST NOT** emit any other top-level
+key. A consumer that receives a Message with unknown top-level keys
+**SHOULD** treat it as malformed, but **MAY** instead ignore the
+unknown keys and process the envelope normally. The asymmetry is
+deliberate: strictness belongs on the producer side, where the
+defect originates; a hard consumer-side reject would let a single
+non-conformant emitter sever otherwise-valid traffic for every
+consumer on the bus.
 
 ### 2.1 `type`
 
@@ -104,39 +110,52 @@ match the syntax:
 - no whitespace;
 - lowercase RECOMMENDED for new topics.
 
-Dot- and colon-separated segments are common in topics —
-`assistant.intent.register.keyword`, `XXX.response` — and have no
-normative semantics here; segmenting is a convention used by the
-specifications that define topics, not a feature of the envelope.
-
-#### 2.1.1 Identifiers used as topic components
-
-Some specifications define topics whose `type` string is assembled from
-named identifiers at runtime — for example `:`
-or `.`. For such a topic to be
-unambiguously parseable, the identifiers it uses as components **MUST
-NOT** contain the separator character(s) the topic uses structurally:
-
-- a topic shaped `:` requires A and B to not contain `:`;
-- a topic shaped `.` requires A and B to not contain `.`;
-- a topic shaped `.:` requires A and B to not contain `.`
-  and B and C to not contain `:`.
-
-Each specification declares only what its own separator requires.
-No separator character is globally forbidden in all identifiers;
-identifiers used in topics that do not use that character as a
-structural separator may contain it freely.
+Dots segment a topic into a readable hierarchy —
+`assistant.intent.register.keyword`, `XXX.response`. The dot has no
+normative semantics in the envelope; hierarchy depth and segment
+meaning are conventions of the specifications that define topics.
+The colon, by contrast, **is** normatively reserved — §2.1.1 defines
+the rule, which every topic-defining specification inherits.
+
+#### 2.1.1 The topic convention: colon vs. dot
+
+Two topic shapes exist on the bus, distinguished by one character:
+
+1. **Dispatch topics** contain a `:` and are assembled at
+   runtime from identifiers — the canonical shape is
+   `:`, the per-intent handler-dispatch
+   topic. The `:` **is the marker** that a topic addresses a
+   specific registered handler rather than naming an event. Only a
+   formal specification **MAY** define a colon-bearing topic shape,
+   and it **MUST** define the identifier roles on each side of the
+   `:`.
+2. **All other topics** — events, requests, responses, lifecycle
+   signals — use the dotted form `..` (any depth) and
+   **MUST NOT** contain `:`.
+
+Consequently a consumer **MAY** classify any topic by a single test:
+a `:` anywhere in `type` means a dispatch-shaped topic per the
+specification that defined that shape; no `:` means an ordinary
+dotted topic.
+
+**Separator hygiene.** An identifier used as a component of a topic
+**MUST NOT** contain the character(s) the topic shape uses
+structurally:
+
+- in `:`, neither A nor B may contain `:`;
+- in `.`, neither A nor B may contain `.`;
+- shapes combining both separators impose both constraints on the
+  components they delimit.
+
+Each topic-defining specification declares only what its own
+separators require of its own identifiers; a character is
+constrained only where it is structural.
 
 **Recommended identifier form.** When defining a new identifier
 intended for use as a topic component, prefer values that contain
 only ASCII letters, digits, `_`, and `-`. This avoids accidental
 collision with any separator a current or future topic shape may choose.
 
-**Colon convention.** The `:` character is reserved for use by formal
-specifications as a structural separator in topic shapes. Informally
-defined or application-specific topics **SHOULD** avoid `:` in their
-topic name so the convention remains unambiguous.
-
 ### 2.2 `data`
 
 `data` is a JSON object. It **MAY** be empty (`{}`). Its keys, value
@@ -179,7 +198,7 @@ any other external participant on the bus). Together they tell every
 observer which direction a Message is travelling across that boundary
 at any given moment.
 
-### 3.1 The boundary, illustrated
+### 3.1 The boundary, illustrated (informative)
 
 A typical end-to-end flow, showing how the routing pair flips as the
 Message crosses the boundary:
@@ -330,9 +349,11 @@ Produces a new Message:
      `C.destination`;
    - and is an array of strings, the new context's `source` **MAY**
      be set to the identifier of the component producing the reply
-     (typically one of the array entries). The exact choice is
-     implementation-defined; consumers **MUST NOT** rely on a
-     particular member being chosen.
+     (typically one of the array entries). Selecting the **first
+     element** is RECOMMENDED, so that independently written
+     components converge on the same deterministic choice; the
+     choice remains implementation-defined, and consumers **MUST
+     NOT** rely on a particular member being chosen.
 3. All other `context` keys, including `session` (§4), are preserved
    unchanged. As with `forward`, if the source Message has no
    `session`, the derivation **MAY** populate a default
@@ -434,8 +455,10 @@ A producer **SHOULD**:
 
 ### A **consumer** of Messages **MUST**:
 
-- reject a Message that violates §2 (wrong top-level keys, wrong
-  types, missing or non-string `type`) as malformed;
+- treat a Message that violates §2 (wrong top-level value types,
+  missing or non-string `type`) as malformed; for unknown top-level
+  keys the consumer **SHOULD** treat the Message as malformed but
+  **MAY** ignore the unknown keys (§2);
 - treat an absent `data` or `context` as equivalent to `{}` (§2);
 - tolerate any `context` shape, including an empty object, and
   ignore `context` keys it does not understand (§2.3);
diff --git a/pipeline-1.md b/pipeline-1.md
index 5ea89d4..b4fc4c4 100644
--- a/pipeline-1.md
+++ b/pipeline-1.md
@@ -76,7 +76,7 @@ It does **not** define:
 - **the `session` lifecycle** — `session` is carried opaquely per
   OVOS-MSG-1 §4. The session fields this spec owns are listed in §5;
   other internal fields are owned by other specifications via the
-  OVOS-SESSION-1 §2.1 registry mechanism.
+  OVOS-SESSION-1 §2.2 registry mechanism.
 - **per-plugin behavioural specs** — plugins have no behavioural
   contract beyond §4. A `converse` plugin, a `fallback` plugin, a
   persona plugin, a language-model plugin, a chatbot plugin: each
@@ -177,14 +177,22 @@ Inputs:
   language. A plugin is free to consider all candidates, only the
   first, or any subset; the orchestrator does not prescribe how
   candidates are weighted.
-- `lang` — the **optional** BCP-47 content-language hint sourced
-  from `Message.data.lang` of the entry-topic (§9.1). Present only
-  when the producer authoritatively knew the content language;
-  absent otherwise. The orchestrator **MUST NOT** synthesize a
-  value. The plugin uses this as input to its own language
-  resolution — consulting `session` (OVOS-SESSION-1 §3.2) or
-  applying any other policy — and **MUST** declare the resolved
-  language in `Match.lang`.
+- `lang` — the BCP-47 content-language tag. When the entry-topic
+  (§9.1) carried an authoritative `Message.data.lang`, the
+  orchestrator passes it through. When it did not, the
+  orchestrator **MUST** resolve the utterance language **once**
+  per utterance, from the per-utterance evidence fields of
+  OVOS-SESSION-1 §3.2 (user preference, lang-detect signals), and
+  pass the resolved tag to **every** plugin's `match` call for
+  that utterance. A single resolution point keeps the match round
+  coherent: if each plugin re-derived language independently, the
+  same utterance could be matched in different languages at
+  different pipeline stages, and which language "wins" would be an
+  accident of ordering. A plugin **MAY** refine the received tag
+  (e.g. a multilingual matcher that detects a different content
+  language) but **MUST NOT** re-derive it independently from
+  session evidence, and **MUST** declare the language it actually
+  matched in via `Match.lang`.
 - `session` — the session carrier from `context.session` of the
   utterance Message (OVOS-MSG-1 §4, OVOS-SESSION-1).
 
@@ -294,7 +302,13 @@ value. Because §4.2 permits a plugin to communicate over the bus during
 etc.), the call can block for an unbounded time.
 
 The orchestrator **SHOULD** bound each `match` invocation by a
-deployment-defined time. If a plugin has not returned within the bound,
+deployment-defined time. The **RECOMMENDED** default is **10 s**.
+When a bound is applied, it **MUST** be at least as large as any
+collection ceiling a stage runs internally (e.g. the
+OVOS-COMMON-QUERY-1 §2.1 collection window) — a match-phase timeout
+shorter than a stage's own internal wait guarantees that stage is
+killed mid-collection on every utterance it handles. If a plugin
+has not returned within the bound,
 the orchestrator **MUST** treat the call as if the plugin had raised an
 exception — log the timeout, skip to the next plugin per §6.2, and
 continue normally. Any partial mutation performed by the plugin during the
@@ -328,7 +342,7 @@ discipline.
 ## 5. Session fields owned by this specification
 
 This specification claims four session fields per OVOS-SESSION-1
-§2.1: one **positive** ordering field (§5.1 `pipeline`) and three
+§2.2: one **positive** ordering field (§5.1 `pipeline`) and three
 **negative** filtering fields (§5.2 `blacklisted_pipelines`, §5.3
 `blacklisted_skills`, §5.4 `blacklisted_intents`). All four are
 session-scoped, propagate with the session under OVOS-SESSION-1 §4,
@@ -599,11 +613,10 @@ ovos.utterance.handle ← entry (§9.1)
   │  session = match.updated_session or session   # §4.1, §4.2
   │
   │  ┌── post-match-pre-dispatch window ──────────────┐
-  │  │ engine-side context promotion (CONTEXT-1 §5.3) │
+  │  │ engine-side context promotion (CONTEXT-1 §5.1) │
   │  │ intent-transformer chain runs (TRANSFORM-1     │
   │  │ §3.4) — may modify Match.slots, MUST NOT       │
   │  │ change skill_id / intent_name                  │
-  │  │ post-decay turns_remaining-- (CONTEXT-1 §4)    │
   │  └────────────────────────────────────────────────┘
   │
   │  ovos.intent.matched (§9.2)
@@ -616,9 +629,14 @@ ovos.utterance.handle ← entry (§9.1)
   │  (dialog-transformer chain ← TRANSFORM-1 §3.5)
   │  (tts-transformer chain ← TRANSFORM-1 §3.6)
   │
-  └─ if no plugin matched (or all matches filtered):
-       ovos.intent.unmatched (§9.3)
-       ovos.utterance.handled (§9.5)
+  ├─ if no plugin matched (or all matches filtered):
+  │    ovos.intent.unmatched (§9.3)
+  │    ovos.utterance.handled (§9.5)
+  │
+  └─ post-match decrement turns_remaining--   ← CONTEXT-1 §4
+       (runs after the match round whether or not any intent
+       matched; entries freshly written this round — CONTEXT-1
+       §4.1 — are exempt)
 ```
 
 The flow diagram shows where companion-spec chains plug into this
@@ -628,7 +646,7 @@ the entry topic is emitted and is therefore not visible here. The
 **utterance** and **metadata** transformer chains run after entry
 and before iteration, against the candidate utterance list. The
 **post-match-pre-dispatch window** is where
-CONTEXT-1 §5.3 sanctions engine-side `session.intent_context`
+CONTEXT-1 §5.1 sanctions engine-side `session.intent_context`
 mutation and where TRANSFORM-1 §3.4 inserts the intent-transformer
 chain over the chosen `Match`. **`ovos.utterance.handled` is emitted at
 handler completion** — immediately after `ovos.intent.handler.complete` (or `.error`).
@@ -796,6 +814,14 @@ the duration of a handler invocation will deadlock the first time
 any handler waits for a user reply. Concurrent utterance processing
 is a structural requirement, not an optimisation.
 
+The same liveness requirement applies within the match phase: the
+orchestrator **MUST** continue servicing its bus subscriptions —
+including poll replies destined for an in-flight plugin (a stop
+plugin's pongs, a converse or fallback poll's responses) — while a
+`match` call is in flight. An orchestrator whose bus loop blocks on
+the synchronous `match` return deadlocks every plugin whose match
+strategy involves a bus round-trip (§4.2).
+
 The session is the correlation key for nested lifecycles: the inner
 utterance carries the same `session_id` with
 `session.response_mode` populated (OVOS-CONVERSE-1 §5), which
@@ -868,8 +894,10 @@ The dispatch Message's `context` (OVOS-MSG-1 §4):
   continuation of an already-active skill's participation or its
   termination, not a fresh activation. The orchestrator applies the
   polymorphism rule (§7.0) uniformly and does not otherwise
-  distinguish skill from pipeline-plugin dispatches; suppression
-  is keyed strictly off the reserved-name registry. The push is
+  distinguish skill from pipeline-plugin dispatches: suppression
+  **MUST** be keyed on the Match's `intent_name` appearing in the
+  §7.3 reserved-name registry — never on the producing
+  `pipeline_id`. The push is
   applied after `Match.updated_session` is committed: a plugin that
   mutates `active_handlers` via `updated_session` (e.g., STOP-1's
   global stop wiping the list) sees the stamp applied
@@ -933,11 +961,13 @@ to ordinary dispatches. The one exception is the
 `session.active_handlers` push defined in §7.1, which is
 suppressed on reserved-name dispatches — a reserved name
 represents a continuation or termination of an already-active
-skill's participation, not a fresh activation. The reserving
+skill's participation, not a fresh activation. Stamping
+suppression is keyed on the Match's reserved `intent_name` (this
+registry), never on the producing `pipeline_id`. The reserving
 specification gets exclusive use of the name across the
 deployment's skill set; it gets no other privilege.
 
-Reservations currently in force:
+Reserved intent_names:
 
 | Reserved intent_name | Reserving spec | Meaning of a Match bearing this name |
 |----------------------|----------------|--------------------------------------|
@@ -950,7 +980,7 @@ Reservations currently in force:
 This specification fixes only the registry mechanism (reservation
 listing); the per-name semantics are owned by the reserving
 specification. Other specifications MAY reserve further names by
-adding rows to this table in their own PR.
+adding rows to this table in a revision of this specification.
 
 A plain skill (§7.0) subscribes to a reserved-name dispatch topic
 via framework convention rather than OVOS-INTENT-4 registration —
@@ -1067,7 +1097,7 @@ Payload shape:
 | Field | Type | Required | Meaning |
 |-------|------|----------|---------|
 | `utterances` | array of strings | yes | One or more candidate utterance strings. |
-| `lang` | string | no | BCP-47 language tag of the utterance. **Present only when the producer authoritatively knows the content language** (e.g. a chat client emitting text it locally typed in `de-DE`, or an audio service emitting text from an STT decoder run in `en-US`). When absent, the content language is **not authoritatively known**; the orchestrator **MUST NOT** synthesize a value (in particular, **MUST NOT** fall back to `session.lang` or any per-utterance language signal of OVOS-SESSION-1 §3.2). The absence is propagated through to consumers (pipeline plugins, transformers, skills), each of which decides how to resolve language per its own policy — typically by consulting OVOS-SESSION-1 §3.2 signals (user preference, lang-detect signals) and applying its stage-appropriate consolidation. |
+| `lang` | string | no | BCP-47 language tag of the utterance. **Present only when the producer authoritatively knows the content language** (e.g. a chat client emitting text it locally typed in `de-DE`, or an audio service emitting text from an STT decoder run in `en-US`). When absent, the content language is **not authoritatively known**; producers **MUST NOT** synthesize a value on the wire. On receipt, the orchestrator **MUST** resolve the language once from OVOS-SESSION-1 §3.2 evidence fields and pass the resolved tag to every plugin's `match` call (§4) — a single resolution point keeps all stages matching in the same language; plugins MAY refine but MUST NOT re-derive independently. |
 
 `ovos.utterance.handle` is the only entry topic name this
 specification recognizes. A conformant orchestrator subscribes to