Operational runbooks

First-response playbooks for the most common production issues on Chest Gate.

Last updated April 22, 2026

Each runbook lists the trigger that surfaces the issue, the checks to run in order, and the recovery steps. Work top to bottom. If a check clears, the scenario is not this one. Move on.

Webhook failure rate spike

Trigger: delivery success drops below 95% for 10+ minutes.
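As a rough illustration of the trigger above, the sketch below treats it as a sustained breach: the success rate must stay under 95% for ten consecutive one-minute samples. The per-minute bucketing, the Sample type, and the function names are assumptions made for the example, not Chest Gate's actual alerting logic.

```python
from dataclasses import dataclass


@dataclass
class Sample:
    """One minute of delivery counts (hypothetical shape)."""
    minute: int      # minutes since an arbitrary reference point
    delivered: int   # successful deliveries in that minute
    attempted: int   # total delivery attempts in that minute


def breach_minutes(samples, threshold=0.95):
    """Length of the most recent unbroken run of minutes below the threshold."""
    run = 0
    for s in reversed(samples):
        # Assumption: a minute with no traffic ends the run instead of extending it.
        if s.attempted == 0:
            break
        if s.delivered / s.attempted < threshold:
            run += 1
        else:
            break
    return run


def should_trigger(samples, threshold=0.95, window=10):
    """True when the breach has lasted at least `window` consecutive minutes."""
    return breach_minutes(samples, threshold) >= window
```

Requiring consecutive minutes, not a single aggregate over the window, keeps one bad minute from paging anyone. Whether the real alert works this way is not stated here.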

Checks

Verify the downstream endpoint is reachable and its TLS certificate is valid.
Inspect the response body and status code in the delivery log.
Confirm webhook secret rotations are synced with the downstream consumer.
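The first and third checks above can be scripted. The sketch below uses only the Python standard library. It assumes the endpoint serves HTTPS and that deliveries are signed with a hex-encoded HMAC-SHA256 over the raw request body; the signing scheme and all function names are assumptions for illustration, so match them to whatever your integration actually uses.

```python
import hashlib
import hmac
import socket
import ssl


def fetch_not_after(host: str, port: int = 443, timeout: float = 5.0) -> str:
    """Open a verified TLS connection and return the certificate's notAfter string.

    Raises OSError if the endpoint is unreachable, or an ssl.SSLError subclass
    if the certificate fails validation (expired, wrong hostname, untrusted).
    """
    ctx = ssl.create_default_context()
    with socket.create_connection((host, port), timeout=timeout) as sock:
        with ctx.wrap_socket(sock, server_hostname=host) as tls:
            return tls.getpeercert()["notAfter"]


def days_until_expiry(not_after: str, now: float) -> float:
    """Days between `now` (epoch seconds) and the certificate expiry time."""
    return (ssl.cert_time_to_seconds(not_after) - now) / 86400


def signature_matches(secret: bytes, body: bytes, received_hex: str) -> bool:
    """Recompute the signature with `secret` and compare in constant time.

    Assumed scheme: hex(HMAC-SHA256(secret, raw_body)).
    """
    expected = hmac.new(secret, body, hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, received_hex)
```

If a logged delivery verifies against the previous secret but not the current one, the rotation has probably not reached the consumer yet.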

Recovery

Replay failed deliveries once the endpoint is back.
If a secret was compromised, rotate it and notify the consumer.

Missing referrer payouts

Trigger: calls are attributed but the referrer share is zero on settled rows.

Checks

Confirm the attribution header or API key is present on the paid request.
Confirm the deployment has a non-zero referrer bps (basis points; 100 bps = 1%).
Check whether the rows are actually settled or still distribute_failed.
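To see why a zero bps setting produces a zero referrer share even when attribution is present, it helps to work the arithmetic. The sketch below assumes amounts are integers in the smallest currency unit and that the split rounds down; neither detail is confirmed here, and the function name is hypothetical.

```python
def referrer_share(amount: int, referrer_bps: int) -> int:
    """Referrer's cut of `amount`, where 10,000 bps equals 100%.

    Assumptions: `amount` is in the smallest currency unit and any
    fractional remainder is dropped (floor division).
    """
    if amount < 0:
        raise ValueError("amount must be non-negative")
    if not 0 <= referrer_bps <= 10_000:
        raise ValueError("referrer_bps must be between 0 and 10000")
    return amount * referrer_bps // 10_000
```

For example, 250 bps on an amount of 1,000,000 units gives 25,000 units. Under the rounding assumption, very small amounts can also yield zero: 5 bps on 100 units floors to 0.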

Recovery

Fix the attribution source and run one fresh paid-call test to confirm.
Retry failed distributions and verify the final split amounts on-chain.

Attribution mismatch report

Trigger: partner-reported volume doesn't match your dashboard.

Checks

Compare partner logs against call IDs and timestamps in the dashboard.
Validate the wallet / handle mapping the integration is actually using.
Check for downstream filtering or delayed event ingestion on the partner side.
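The comparison in the first check comes down to a set difference on call IDs. The sketch below assumes both sides can export a list of call IDs for the same time window; the function and key names are hypothetical.

```python
def reconcile(dashboard_ids, partner_ids):
    """Split call IDs into those each side has and the other lacks."""
    dashboard, partner = set(dashboard_ids), set(partner_ids)
    return {
        # Recorded by the dashboard but absent from the partner's logs:
        # candidates for partner-side filtering or delayed ingestion.
        "missing_at_partner": sorted(dashboard - partner),
        # Reported by the partner but absent from the dashboard:
        # candidates for a wrong wallet / handle mapping.
        "unknown_to_dashboard": sorted(partner - dashboard),
    }
```

Comparing IDs before timestamps avoids false mismatches caused by clock skew or timezone differences between the two exports.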

Recovery

Replay affected deliveries for the partner to ingest.
Share an export snapshot for joint reconciliation.