Idempotency and ordering

Every event-shaped payload in the contract carries a partner-generated correlation_id. UUID v4 or UUID v7 is recommended; v7 sorts by time, which makes operational triage easier.
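As an illustration of why v7 helps triage, the sketch below builds a UUID v7 by hand following the RFC 9562 layout (48-bit Unix-millisecond timestamp first, then version, variant, and random bits). The helper name `uuid7` and the `now_ms` parameter are assumptions made for this example; newer Python releases (3.14+) include a built-in `uuid.uuid7`, which should be preferred where available.

```python
import os
import time
import uuid


def uuid7(now_ms=None):
    """Build a UUID v7: 48-bit Unix-millisecond timestamp, then random bits."""
    if now_ms is None:
        now_ms = time.time_ns() // 1_000_000
    ts = now_ms & ((1 << 48) - 1)
    rand = int.from_bytes(os.urandom(10), "big")  # 80 random bits
    rand_a = rand >> 68                 # top 12 random bits
    rand_b = rand & ((1 << 62) - 1)     # low 62 random bits
    # Layout: timestamp | version (7) | rand_a | variant (0b10) | rand_b
    value = (ts << 80) | (0x7 << 76) | (rand_a << 64) | (0b10 << 62) | rand_b
    return uuid.UUID(int=value)


# Two ids minted one millisecond apart (timestamps fixed for the example).
earlier = uuid7(now_ms=1_700_000_000_000)
later = uuid7(now_ms=1_700_000_000_001)
```

Because the timestamp occupies the leading hex digits, sorting the string forms of these ids orders them by creation time, which is what makes log and table scans easier to read.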

Idempotency keys

Surface	Idempotency key
Realtime calls (executor → planner)	`(partner_id, correlation_id)`
Confirmation calls (executor → planner)	`(partner_id, correlation_id)`
Webhook dispatch (planner → executor)	`(planner_id, correlation_id)`
Poll-mode dispatch acks (executor → planner)	`(executor_id, cursor)`

A replay returns 200 OK with replay: true where the response schema supports it, and produces no additional state mutation: no extra movement, no extra exception, no duplicate work-status, no duplicate supervisor task, no double dispatch.
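A minimal receiver-side sketch of these semantics, assuming an in-memory store. The `Planner` class, its `handle` method, and the response shape are hypothetical names chosen for illustration; a real receiver would keep this state in durable storage.

```python
from dataclasses import dataclass, field


@dataclass
class Planner:
    """Hypothetical in-memory receiver; not the contract's implementation."""
    seen: dict = field(default_factory=dict)       # (partner_id, correlation_id) -> response
    movements: list = field(default_factory=list)  # stand-in for mutated state

    def handle(self, partner_id, correlation_id, movement):
        key = (partner_id, correlation_id)
        if key in self.seen:
            # Replay: return the stored result, flag it, mutate nothing.
            return {**self.seen[key], "replay": True}
        self.movements.append(movement)
        response = {"status": 200, "replay": False}
        self.seen[key] = response
        return response


planner = Planner()
first = planner.handle("partner-a", "c-1", {"sku": "X", "qty": 1})
second = planner.handle("partner-a", "c-1", {"sku": "X", "qty": 1})
```

The second call reports `replay: True` and leaves `planner.movements` with a single entry.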

Retention

The receiving side must retain enough idempotency state to absorb the longest reasonable retry window. Recommended retention:

30 days for realtime and confirmation correlation_ids on the planner.
24 hours for webhook correlation_ids on the executor, matching the webhook dead-letter queue (DLQ) horizon.

Trim older entries on a rolling basis. Retention is tracked per (partner_id, correlation_id) pair; absolute storage cost is modest even at high throughput.
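One way to sketch rolling retention, assuming an in-memory map of first-seen timestamps. `IdempotencyStore`, `seen_before`, `trim`, and the injectable `clock` are hypothetical names for this example; the two constants mirror the recommended windows above.

```python
import time

REALTIME_RETENTION_S = 30 * 24 * 3600  # 30 days, planner side
WEBHOOK_RETENTION_S = 24 * 3600        # 24 hours, executor side


class IdempotencyStore:
    """Hypothetical store: remembers keys for a fixed retention window."""

    def __init__(self, retention_s, clock=time.time):
        self.retention_s = retention_s
        self.clock = clock
        self.first_seen = {}  # (sender_id, correlation_id) -> first-seen timestamp

    def seen_before(self, sender_id, correlation_id):
        """Record the key; return True if it was already retained."""
        key = (sender_id, correlation_id)
        if key in self.first_seen:
            return True
        self.first_seen[key] = self.clock()
        return False

    def trim(self):
        """Drop entries older than the window; return how many were dropped."""
        cutoff = self.clock() - self.retention_s
        expired = [k for k, t in self.first_seen.items() if t < cutoff]
        for k in expired:
            del self.first_seen[k]
        return len(expired)
```

Calling `trim()` periodically (for example from a scheduled job) keeps the map bounded. Once an entry is trimmed, a retry with that correlation_id is treated as new, which is why the window has to cover the longest retry horizon.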

Ordering guarantees

Channel	Ordering
Kafka dispatch	Preserved per warehouse via partition key on `warehouse_frn`.
Webhook dispatch	Serialized per `(planner_id, warehouse_id)`. The next event is not delivered until the previous one is acknowledged or DLQ’d.
Polling dispatch	Cursor-ordered. Callers must process pages strictly in order and acknowledge only after durable processing.
Realtime	Best-effort. Movements may arrive out of wall-clock order; the planner reconciles via the confirmation path.
Confirmation	Strictly per-window. Two confirmations for the same window key are deduplicated.

Cross-warehouse ordering is not guaranteed in any channel and should not be relied on.
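The webhook row can be pictured as one queue per `(planner_id, warehouse_id)` with at most one event in flight. The sketch below is an assumption about how a sender might enforce that, not the planner's actual dispatcher; `SerializedDispatcher` and its methods are hypothetical.

```python
from collections import defaultdict, deque


class SerializedDispatcher:
    """Hypothetical queueing: one in-flight event per (planner_id, warehouse_id)."""

    def __init__(self):
        self.queues = defaultdict(deque)  # key -> pending events, oldest first
        self.in_flight = set()            # keys with an unresolved delivery
        self.dlq = []                     # dead-lettered events

    def enqueue(self, planner_id, warehouse_id, event):
        self.queues[(planner_id, warehouse_id)].append(event)

    def next_event(self, planner_id, warehouse_id):
        """Return the head event, or None while the previous one is unresolved."""
        key = (planner_id, warehouse_id)
        if key in self.in_flight or not self.queues[key]:
            return None
        self.in_flight.add(key)
        return self.queues[key][0]

    def resolve(self, planner_id, warehouse_id, acknowledged):
        """Ack or dead-letter the in-flight event, unblocking the next one."""
        key = (planner_id, warehouse_id)
        event = self.queues[key].popleft()
        if not acknowledged:
            self.dlq.append(event)
        self.in_flight.discard(key)
```

Queues for different warehouses advance independently, which is also why no ordering holds across warehouses.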

Replay vs duplicate detection

A replay is intentional: the sender retried after a network failure, so the same (partner_id, correlation_id) reached the receiver a second time. The contract returns 200 OK with replay: true where the response schema supports it.

A duplicate is a sender bug: different correlation_ids for what is logically the same event. The contract cannot detect duplicates at the protocol level; the sender is responsible for using the same correlation_id across all retries of the same logical action.

Common sender mistakes that produce duplicates:

Generating a new UUID on each retry attempt instead of reusing the original.
Restarting the dispatch publisher and re-emitting from a stale offset without remembering already-sent correlation_ids.
Splitting one logical movement into two physical retries with two different correlation_ids when a partial write succeeded.
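The first mistake in the list is the easiest to avoid. A minimal sketch, assuming a caller-supplied `send` function that raises `ConnectionError` on network failure; `send_with_retries` and `flaky_send` are hypothetical names, not part of the contract.

```python
import uuid


def send_with_retries(payload, send, max_attempts=3):
    """Retry a logical action, reusing one correlation_id for every attempt."""
    correlation_id = str(uuid.uuid4())  # generated once, outside the retry loop
    last_error = None
    for _ in range(max_attempts):
        try:
            return send({**payload, "correlation_id": correlation_id})
        except ConnectionError as exc:
            last_error = exc  # a real sender would back off before retrying
    raise last_error


attempts = []


def flaky_send(body):
    """Simulated transport: fails twice, then succeeds."""
    attempts.append(body["correlation_id"])
    if len(attempts) < 3:
        raise ConnectionError("simulated network failure")
    return {"status": 200}


result = send_with_retries({"movement": "pick"}, flaky_send)
```

All three attempts carry the same correlation_id, so the receiver can treat the later ones as replays. To survive a process restart as well, the id would need to be persisted alongside the logical action before the first send.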

If you see ledger drift after rollout, audit your correlation_id generation first. The contract’s idempotency surface is robust; bugs almost always originate sender-side.

Cursor advancement (poll mode)

Poll-mode dispatch uses a cursor instead of correlation-id idempotency:

GET  /wes/v1/dispatch/pending?since={cursor}&limit=100
POST /wes/v1/dispatch/ack    { "cursor": "<cursor>" }

The cursor opaquely encodes the planner’s internal commit position. The executor must:

1. GET /dispatch/pending with the last acked cursor.
2. Process the returned page durably (the events become facts on the executor side before the ack).
3. POST /dispatch/ack with the page’s next_cursor.
4. Repeat until next_cursor is empty or equal to the last acked cursor.

Acking before durable processing loses events if the executor crashes before the write completes. Acking a cursor that is earlier than the last one acked is a no-op, because cursors are monotonic.
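The loop can be sketched as follows, with the HTTP calls abstracted behind caller-supplied functions. The `drain` helper, the `events` response field, and the use of an empty string as the initial cursor are assumptions for illustration; only `next_cursor` and the two endpoints come from the contract.

```python
def drain(fetch_pending, process_durably, ack, cursor):
    """Run the poll loop until nothing new remains; return the last acked cursor."""
    while True:
        page = fetch_pending(cursor)       # GET /wes/v1/dispatch/pending?since={cursor}
        process_durably(page["events"])    # must be durable before the ack
        next_cursor = page.get("next_cursor")
        if not next_cursor or next_cursor == cursor:
            return cursor                  # empty or unchanged cursor: caught up
        ack(next_cursor)                   # POST /wes/v1/dispatch/ack
        cursor = next_cursor


# In-memory stand-in for the planner, keyed by the cursor passed to `since`.
pages = {
    "": {"events": ["e1", "e2"], "next_cursor": "c1"},
    "c1": {"events": ["e3"], "next_cursor": "c2"},
    "c2": {"events": [], "next_cursor": "c2"},
}
processed, acked = [], []
last = drain(pages.__getitem__, processed.extend, acked.append, cursor="")
```

The ack always follows `process_durably`, so a crash between the two causes the same page to be fetched again rather than skipped; processing therefore needs to tolerate seeing a page twice.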