Runbook — enforce unique (org_id, external_ref) on tenants
Status: prepared, not yet applied to production. Migration 0024_tired_stick is committed but must be applied manually, after the dedup below.
Why
resolveOrCreateTenantByRef (src/lib/api/tenant-resolve.ts) documents and relies on "a unique index on (org_id, external_ref)" as its race-safety source of truth, and its insert-conflict branch matches tenants_org_external_ref. That constraint never existed — the schema only declared a non-unique index("tenants_external_ref_idx"). Consequences:
- The race-recovery branch is dead code.
- Concurrent first-touch requests for the same
external_ref(e.g. a platform like Ezzeefy onboarding a shop viaX-Sendoka-Tenant-Ref) can create duplicate tenants, and the failing insert path threw uncaught → raw 500 (the symptom fixed separately in thewithApiAuthtry/catch hotfix).
This change adds the missing unique("tenants_org_external_ref_unique") so the data model and the code agree.
Apply order (must be in this order)
- Audit. Run Step 1 of
scripts/dedup-tenants-by-external-ref.sqlagainst production (read-only).- Zero rows → skip to step 3. The bug may never have fired; nothing to clean.
- Dedup (only if step 1 found duplicates). Follow steps 2–5 of the script. Per duplicate group: pick a survivor (oldest active), repoint the 14 tenant-scoped tables, resolve the per-tenant unique collisions by hand (
tenant_usage,suppressions,templates,audiences,messaging_pools), delete the loser, all in one transaction. Snapshot the DB first. - Apply the migration. Once the audit returns zero rows:
or apply the single statement directly:DATABASE_URL=<prod> npm run db:migrate # applies 0024 (prod is at 0023)
TheALTER TABLE tenants ADD CONSTRAINT tenants_org_external_ref_unique UNIQUE (org_id, external_ref); -- (and DROP INDEX tenants_external_ref_idx; — folded into 0024)ADD CONSTRAINTfails if any duplicate non-nullexternal_refremains — that failure is the safety net telling you dedup isn't complete.
Notes
external_refis nullable; Postgres treats NULLs as distinct, so tenants created without a ref (e.g. viaPOST /tenantswith noexternal_ref) are unaffected — many NULLs per org remain legal.- After this,
resolveOrCreateTenantByRef's conflict-recovery becomes real: a lost race surfaces the unique violation and re-selects the winner instead of creating a duplicate. - Confirm the original production stack trace (incident ~`2026-06-25T11:15:10Z
) names this insert — thewithApiAuthhotfix now routes it tocaptureException`.