Open Gaps
Residual items after the ninth hardening pass.
Closed in the most recent pass (June 2026 — session 9)
Developer experience
- Engagement webhook events: `message.opened` / `message.clicked` / `message.unsubscribed` / `message.complained` now fan out (track routes, unsubscribe paths, SES complaint). Complaint moved off `message.bounced` to its own event.
- Bulk webhook replay: `POST /v1/webhooks/:id/replay-failed` (100/call, oldest-first, `replayed_at` paging marker so re-invokes don't duplicate). Docs' broken paths fixed.
- v1 test-fire: `POST /v1/webhooks/:id/test-fire` (API-key parity with the dashboard button, shared 20/min budget).
- Webhook auto-disable: 10 consecutive exhausted deliveries flip `enabled=false` + audit; any 2xx resets the streak; re-enable clears it. `last_response_body` (1 KB cap) captured per attempt.
- `delivery_id` in webhook payload body + `X-Sendoka-Delivery-Id` header; `Retry-After` on all 429s.
- Message list filters: `to` + `created_after/before` on v1 emails/sms + internal; date-range filter UI on the dashboard (the dead Filter button).
- Suppressions dashboard: `/overview/suppressions` + `/api/internal/suppressions` CRUD.
- Resend: `POST /api/internal/messages/:id/resend` clones as due-now `scheduled` so the hardened cron path does provider/usage/fanout; button on message detail.
- Verify-webhooks recipe rewritten: old doc showed a `t=...,v1=...` signature format that never existed on the wire; now V2-first with Python + Go examples.
- Test-mode semantics documented: already correct in code (env-split usage, no quota impact, webhooks fire); docs/api/authentication.md now says so explicitly.
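The auto-disable rule in the list above can be read as a small state machine. The following is a minimal sketch, not the shipped code: `EndpointState`, `applyOutcome`, and `reEnable` are hypothetical names, and only the threshold of 10, the reset on any 2xx, and the reset on re-enable come from the notes.

```typescript
// Hypothetical model of the webhook auto-disable streak.
const AUTO_DISABLE_THRESHOLD = 10;

interface EndpointState {
  enabled: boolean;
  consecutiveExhausted: number;
}

type DeliveryOutcome = { kind: "success" } | { kind: "exhausted" };

function applyOutcome(state: EndpointState, outcome: DeliveryOutcome): EndpointState {
  if (outcome.kind === "success") {
    // Any 2xx resets the streak.
    return { ...state, consecutiveExhausted: 0 };
  }
  const streak = state.consecutiveExhausted + 1;
  // The tenth consecutive exhausted delivery flips the endpoint off.
  return {
    enabled: state.enabled && streak < AUTO_DISABLE_THRESHOLD,
    consecutiveExhausted: streak,
  };
}

function reEnable(): EndpointState {
  // Re-enabling clears the streak so one more failure does not disable again.
  return { enabled: true, consecutiveExhausted: 0 };
}
```

In a real handler the audit entry would presumably be written at the point where `enabled` turns false; that side effect is left out here.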
Enterprise
- SSO enforcement (`sso_connections.enforce_sso`): SSO-only login, membership-scoped with owner break-glass and verified-domain proof for unknown emails; enabling bumps members' `session_version`.
- Session policies: `session_max_age_hours` + `max_sessions_per_user` on organizations, strictest-across-memberships, enforced in the JWT callback / at login.
- SP metadata endpoint: `GET /api/auth/saml/metadata` for IdP import.
- SCIM Groups: `/scim/v2/Groups` CRUD + group→role mapping (owner excluded at zod + rank table + DB CHECK; directory `default_role` can no longer be `owner`, legacy directories clamped at provision time).
- Permission pass finished: gate matrix documented in `permissions.ts`; sole violation fixed (internal messages DELETE was member+, now developer+); UI gates aligned.
- Tenant key minting hardened: `POST /v1/keys` now validates `tenant_id` against the org's tenants. Platform admin token gap was already closed by `/v1/keys` + `/v1/domains` accepting `tenant_id`; platforms.md now documents the full no-cookie provisioning flow.
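"Strictest-across-memberships" in the session-policy item above can be illustrated as taking the smallest configured limit over all of a user's organizations. This is a sketch under assumptions: the `OrgSessionPolicy` shape, the helper names, and the use of `null` for "no limit configured" are hypothetical; only the two column names come from the notes.

```typescript
// Hypothetical per-organization policy row; null means the org sets no limit.
interface OrgSessionPolicy {
  session_max_age_hours: number | null;
  max_sessions_per_user: number | null;
}

// Smallest configured value wins; null only if no org configures a limit.
function strictest(values: Array<number | null>): number | null {
  const configured = values.filter((v): v is number => v !== null);
  return configured.length === 0 ? null : Math.min(...configured);
}

function effectivePolicy(memberships: OrgSessionPolicy[]): OrgSessionPolicy {
  return {
    session_max_age_hours: strictest(memberships.map((m) => m.session_max_age_hours)),
    max_sessions_per_user: strictest(memberships.map((m) => m.max_sessions_per_user)),
  };
}
```

For a user in three organizations with max ages of 72 h, 24 h, and unset, the effective max age under this reading is 24 h.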
Marketing honesty
- SOC 2 → "in progress", SLA → "uptime target", VPC → "on roadmap"; DPA footer link now resolves to `/legal/dpa`.
Migrations this pass: 0017 (webhook auto-disable + response body), 0018 (replayed_at), 0019 (enforce_sso + session policies), 0020 (scim groups), 0021 (mapped_role CHECK).
Closed in the prior pass (May 2026 — session 8)
Cost / abuse protection
- Provisioning rate limit (10/hr/org): `src/lib/api/provision-gate.ts` wired into POST `/api/v1/{brands,campaigns,phone-numbers}` and the internal mirrors. Each AWS provisioning call costs real money; runaway scripts or a compromised key now stop at quota with `PROVISION_LIMIT_EXCEEDED`.
- Idempotency on POST `/v1/brands` + `/v1/campaigns` + `/v1/phone-numbers`: `Idempotency-Key` header replays the cached 201 within the TTL window. Closes the dup-side risk that the reconcile cron couldn't fix.
Inbound + observability
- Inbound SMS persistence: new `inbound_sms` table + `/api/internal/inbound-sms` list + `inbound.sms` webhook event. Free-form replies that aren't STOP/HELP are now persisted and pushed to customer webhooks instead of being dropped.
- Per-request structured context: `src/lib/request-context.ts` (AsyncLocalStorage) auto-injects `request_id` / `org_id` / `tenant_id` / `route` into every `log.*` and `logError` line within the request lifecycle. Wired into `withApiAuth`.
- Sentry shim: `src/lib/observability.ts` opt-in via `SENTRY_DSN` env + `npm i @sentry/nextjs`. `warn` / `error` log lines + `logError` calls forward to Sentry when configured; no-op otherwise.
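The per-request context item above relies on Node's `AsyncLocalStorage`. A minimal sketch of the pattern follows; `runWithContext` and `buildLogLine` are hypothetical names for illustration and are not taken from `src/lib/request-context.ts`. Only the four injected field names come from the notes.

```typescript
import { AsyncLocalStorage } from "node:async_hooks";

// Fields named in the notes; everything else here is an illustrative assumption.
interface RequestContext {
  request_id: string;
  org_id?: string;
  tenant_id?: string;
  route?: string;
}

const storage = new AsyncLocalStorage<RequestContext>();

// Runs fn with ctx visible to every call made inside it, sync or async.
function runWithContext<T>(ctx: RequestContext, fn: () => T): T {
  return storage.run(ctx, fn);
}

// Merges the ambient context (if any) into a structured log line.
function buildLogLine(level: string, message: string, fields: Record<string, unknown> = {}): string {
  return JSON.stringify({ level, message, ...storage.getStore(), ...fields });
}
```

An auth wrapper would call `runWithContext` once per request, so handlers can log without threading `request_id` through every function signature.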
Coverage + tests
- sns-sms route end-to-end tests: added inbound-persistence success/miss paths + duplicate-delivery dedupe assertion. 11 tests covering: signature reject, rate limit, topic allowlist, SUCCESS → delivered + fanout, hard-block → bounced + suppress, spurious notification, STOP tenant scope, free-form inbound persistence, no-match drop, duplicate delivery.
- OpenAPI parity for `kind: alphanumeric`: `PhoneNumber`, `ProvisionPhoneNumber`, new `RegisterAlphanumericSender` schemas; POST `/phone-numbers` requestBody now uses `oneOf` with `kind` discriminator.
Retention
- Audit log archive-to-blob: `/api/cron/retention` accepts `ARCHIVE_AUDIT_LOGS=true` + `BLOB_READ_WRITE_TOKEN` and serializes rows > 365d to Vercel Blob (`audit-archive/YYYY-MM-DD.json`) before delete. Upload failure skips the delete so the next run retries.
- Idempotency TTL tuning: `IDEMPOTENCY_TTL_HOURS` env (default 24, range 1–168) tunes the cache window per-deploy.
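The TTL tuning item above can be sketched as a small env parser. The default of 24 and the 1–168 range come from the notes; whether out-of-range values are clamped or rejected is not stated, so the clamping below is an assumption, as is the function name.

```typescript
const DEFAULT_TTL_HOURS = 24;
const MIN_TTL_HOURS = 1;
const MAX_TTL_HOURS = 168; // one week

// Assumption: unparseable values fall back to the default,
// out-of-range values are clamped rather than rejected.
function resolveIdempotencyTtlHours(raw: string | undefined): number {
  const parsed = Number.parseInt(raw ?? "", 10);
  if (Number.isNaN(parsed)) return DEFAULT_TTL_HOURS;
  return Math.min(MAX_TTL_HOURS, Math.max(MIN_TTL_HOURS, parsed));
}
```

A caller would pass `process.env.IDEMPOTENCY_TTL_HOURS` once at startup and reuse the result for every cache write.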
Sales surface + compliance
- `/pricing` route: dedicated public page replacing the inline anchor on the home page. Three tiers (Hacker / Team / Enterprise) with a full feature comparison table covering the three new enterprise features (Dedicated IPs, India DLT, SAML/SCIM) plus volume bands, compliance (SOC 2, GDPR, HIPAA), and FAQ.
- GDPR per-user data export + erasure: `POST /api/internal/user/data-export` returns profile + memberships + sessions + 2FA + author-attributed audit entries (10k cap). `DELETE /api/internal/user` confirms via password OR email match, refuses when the user is the sole owner of an org (`SOLE_OWNER` 409 unless `delete_owned_orgs: true`), and anonymizes audit entries rather than dropping them. UI panel on `/overview/settings/security`.
Enterprise: built out end-to-end
- Dedicated IPs + IP pools: real AWS SESv2 calls (`CreateDedicatedIpPool` MANAGED scaling + `CreateConfigurationSet` + `PutConfigurationSetDeliveryOptions`). New `/api/cron/ip-pool-warmup` (every 15 min) polls `GetDedicatedIps` and advances `provisioning` → `warming` → `active`. Send pipeline resolves `domain → pool → configurationSet` and passes `ConfigurationSetName` into `SendEmailCommand`; paused pools reject sends with `IP_POOL_PAUSED`. `/api/internal/ip-pools/[id]` PATCH + DELETE + `/api/internal/ip-pools/[id]/assignments` POST/DELETE round out the CRUD.
- India DLT registration: Gupshup HTTP client behind `DLT_PROVIDER=gupshup` env (Kaleyra / MSG91 can plug in via the same interface). Headers + templates CRUD at `/api/internal/dlt/{headers,templates}`. New `/api/cron/dlt-status-poll` (every 30 min) flips `submitted` → `verified` / `approved` / `rejected` via `pollDltStatus`. Send-time gate (`enforceDltTemplate`) on the v1 SMS route requires an `approved` `dlt_template_id` for any `+91` recipient.
- SAML / SSO + SCIM: real samlify-based AuthnResponse verification at `/api/auth/saml/callback` (signature check against `sso_connections.certificate`, JIT user provisioning, one-shot token exchange via a NextAuth Credentials `samlToken` arm). SCIM 2.0 `/Users` GET/POST + `/Users/[id]` GET/PATCH/PUT/DELETE with sha256-bearer-token auth against `scim_directories`. `/api/internal/scim` (enterprise-only) mints + returns plaintext bearer once.
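The SCIM item above describes sha256-bearer-token auth where the plaintext token is returned only once. One way to sketch that pattern with Node's built-in `crypto` module is shown below. The function names, the 32-byte token length, and the hex storage format are assumptions for illustration, not details of the actual `scim_directories` schema.

```typescript
import { createHash, randomBytes, timingSafeEqual } from "node:crypto";

function hashToken(token: string): string {
  return createHash("sha256").update(token, "utf8").digest("hex");
}

// Mint once: hand the plaintext to the caller, persist only the hash.
function mintToken(): { plaintext: string; tokenHash: string } {
  const plaintext = randomBytes(32).toString("base64url");
  return { plaintext, tokenHash: hashToken(plaintext) };
}

// Verify an Authorization header against the stored hash in constant time.
function verifyBearer(authorization: string | null, storedHash: string): boolean {
  const token = /^Bearer (.+)$/.exec(authorization ?? "")?.[1];
  if (!token) return false;
  const presented = Buffer.from(hashToken(token), "hex");
  const expected = Buffer.from(storedHash, "hex");
  return presented.length === expected.length && timingSafeEqual(presented, expected);
}
```

Because only the hash is stored, a leaked database row cannot be replayed as a bearer token, which is why the plaintext can be shown just once at mint time.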
Closed in the prior pass (May 2026 — session 7)
SMS audit + critical security
- SNS verify cert fetch timeout + 5-min replay window: `src/lib/api/sns-verify.ts`. Captured signed payloads can no longer replay indefinitely.
- SNS-SMS topic allowlist + SubscribeURL host gate + per-IP rate limit: `src/app/api/webhooks/sns-sms/route.ts` rejects notifications from un-allowlisted topics, validates `SubscribeURL` against the same `sns.*.amazonaws.com` regex as the cert host, caps 600/min/IP.
- STOP suppression now tenant-scoped: was org-wide (multi-tenant leak per the schema's own warning); STOP-from-recipient lookup now returns `(orgId, tenantId)` and is passed into `addSuppression`.
- SMS hard-block → bounced + auto-suppress: `BLOCKED` / `EXPIRED` / `INVALID_NUMBER` / `OPTED_OUT` map to `message.bounced` with auto-suppression (was collapsed to `failed` with no suppression).
- v1 batch SMS / email gate parity: `withApiAuth(..., "send:{channel}")`, `filterSuppressed` per item, sender-registration check (SMS), per-item reject for unsupported `template` / `scheduled_at` / `from_pool` / `media_url`.
- v1 `/sms/{id}` scope + env filter + audit: GET requires `read:messages`, DELETE requires `send:sms`, both filter on `ctx.environment` + `ctx.tenantId`. DELETE emits `message.canceled`.
- Idempotency in-flight lock release: `storeResponseIfIdempotent` releases the Redis lock after the cache row lands so retries within 30s see a 200 replay instead of 409. New `releaseIdempotencyLock` helper for error paths.
- Send/DB ordering hardened: failure-row insert on v1 sms + emails uses `onConflictDoNothing` so a partial happy-path insert can't PK-collide and mask the original error; `.catch(() => {})` swapped for `logError` on usage + insert failures.
- `pickFromNumber` tenant scoping fixed: platform-root caller no longer matches every tenant's pool (Drizzle `and(...)` was silently dropping the `undefined` predicate).
- Provider tightening: `AWS.SNS.SMS.SenderID` only set for non-E.164 originators (US/CA reject the attribute); `provisionPhoneNumber` derives `MessageType` from the campaign use case (marketing → PROMOTIONAL); `mapStatus` expanded to recognize VERIFIED / APPROVED / REGISTERED + the failure synonyms DENIED / REQUIRES_AUTHENTICATION / REVOKED.
- 403 wording: `requireDeveloperSession` sites stopped returning "Owner role required"; 20 routes now correctly say "Developer or owner role required."
- Internal GET-by-id serialize parity: brands / campaigns / phone-numbers detail endpoints now use the same `serialize()` shape as the list endpoints.
- Audit emission on v1 send routes: `message.sent` + `message.canceled` added to the AuditAction union, emitted from v1 sms POST + DELETE.
- Atomic claim on scheduled-send cron: overlapping cron ticks no longer double-send. Drain uses `UPDATE … RETURNING` with `FOR UPDATE SKIP LOCKED` against the inner SELECT; transient `sending` status acts as the claim marker, stale-claim recovery (>5 min) lets a crashed runner's rows be re-picked.
- Reconcile cron for orphan registrations: new `/api/cron/reconcile-registrations` runs every 15 min, retries brand/campaign create when AWS failed at POST time leaving `providerBrandId` / `providerCampaignId` NULL. Rows older than 7 days flip to `failed` with `RECONCILE_TIMEOUT`.
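Two of the SNS checks in the list above, the 5-minute replay window and the `SubscribeURL` host gate, lend themselves to a short sketch. The exact regex in `sns-verify.ts` is not shown in these notes, so `SNS_HOST_PATTERN` below is an assumed approximation of `sns.*.amazonaws.com`; the function names and the symmetric treatment of clock skew are also assumptions.

```typescript
const REPLAY_WINDOW_MS = 5 * 60 * 1000;

// Assumed pattern: a single region label between "sns." and ".amazonaws.com".
const SNS_HOST_PATTERN = /^sns\.[a-z0-9-]+\.amazonaws\.com$/;

// Rejects messages whose signed Timestamp is more than 5 minutes from now.
function isWithinReplayWindow(timestamp: string, now: Date = new Date()): boolean {
  const signedAt = Date.parse(timestamp);
  if (Number.isNaN(signedAt)) return false;
  return Math.abs(now.getTime() - signedAt) <= REPLAY_WINDOW_MS;
}

// Accepts only https URLs whose host is an SNS regional endpoint.
function isAllowedSnsUrl(raw: string): boolean {
  try {
    const url = new URL(raw);
    return url.protocol === "https:" && SNS_HOST_PATTERN.test(url.hostname);
  } catch {
    return false; // not a parseable URL
  }
}
```

Anchoring the pattern at both ends is what stops a lookalike host such as `sns.us-east-1.amazonaws.com.evil.example` from passing the gate.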
Alphanumeric sender path (UK / AU / EU)
- Schema: `phone_numbers.kind` (`number` | `alphanumeric`), nullable `e164`, new `sender_id` text + `(orgId, senderId)` unique. Migration 0009.
- Validation: `registerAlphanumericSenderSchema` enforces 1–11 chars and rejects `iso_country` in US/CA/IN (carrier rules forbid alphanumeric there).
- Routes: POST `/api/v1/phone-numbers` and `/api/internal/phone-numbers` branch on `kind=alphanumeric`: no AWS round-trip, row inserted as `verified` immediately.
- Send gate: v1 sms + batch detect non-E.164 `from` and look up by `senderId` instead of `e164`.
- UI: `/overview/sms/numbers` gets a "Register sender ID" button + modal alongside "Provision number"; Provision modal country dropdown limited to US/CA with helper text pointing at the sender-ID path.
- Country guard: both POST routes reject `iso_country` outside `{US, CA}` for `kind=number` with `422 COUNTRY_NOT_PROVISIONABLE`, plus a catch-block matcher for AWS `INVALID_PARAMETER` / `isoCountryCode` as a fallback.
- Docs: `docs/api/sms.md` + `/docs#send-sms` updated for both sender kinds; importable Postman collection at `docs/api/sendoka-sms.postman_collection.json` + served from `public/`.
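The validation rule for alphanumeric senders above can be sketched as a plain function. The real code uses a zod schema (`registerAlphanumericSenderSchema`) whose definition is not shown here, so this dependency-free version is only an approximation: the function name, the result type, the error strings, and the letters-and-digits charset are assumptions. The 1–11 length bound and the US/CA/IN rejection come from the notes.

```typescript
const ALPHANUMERIC_BLOCKED_COUNTRIES = new Set(["US", "CA", "IN"]);

type ValidationResult = { ok: true } | { ok: false; error: string };

function validateAlphanumericSender(senderId: string, isoCountry: string): ValidationResult {
  if (senderId.length < 1 || senderId.length > 11) {
    return { ok: false, error: "sender_id must be 1-11 characters" };
  }
  // Assumption: letters and digits only; the notes specify length, not charset.
  if (!/^[A-Za-z0-9]+$/.test(senderId)) {
    return { ok: false, error: "sender_id must be alphanumeric" };
  }
  if (ALPHANUMERIC_BLOCKED_COUNTRIES.has(isoCountry.toUpperCase())) {
    return { ok: false, error: "alphanumeric senders are not allowed in this country" };
  }
  return { ok: true };
}
```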
Tests
- `sns-verify.test`, `suppression.test`, `pool.test`, `sms-registration.test`, `sms.test`: 262 pass / 4 skipped, no regressions.
Closed in the prior pass (April 2026 — session 6)
- SDK packages removed: `packages/sdk-node` and `packages/sdk-python` deleted along with `docs/developer-tools/sdks.md`. The REST API is the surface contract; we'll revisit official SDKs once we have ≥3 customers asking. Code samples in marketing, docs, and the dashboard quickstart now use `curl` / `fetch` / `requests` directly.
Closed two passes back (April 2026 — session 5)
- Platform mode (tenants): new `tenants` table plus `tenant_id` nullable columns on api_keys, domains, webhook_endpoints, messages, suppressions, templates, audiences, contacts, inbound_messages, messaging_pools. All send + CRUD routes filter / stamp `ctx.tenantId` so a tenant-bound key can't cross-send, cross-read, cross-suppress, or cross-pool. `/api/v1/tenants` CRUD + `/tenants/:id/usage`; internal `/overview/tenants` with usage counts and paginated list. See `docs/features/platforms.md`.
- Tenant-scoped audience send: `/api/v1/audiences/:id/send` + internal dashboard mirror filter audience, template, and contacts by tenant.
- Dashboard audience send now posts to `/api/internal/audiences/:id/send` (session-auth) instead of `/api/v1/...` (was 401 with cookies).
- API keys UI surfaces tenant picker + allowed-domain multiselect on create; row badges show binding.
- Audit failures surface in logs: `logAudit` now internally calls `logError` on DB failure so silent drops become visible even when the caller uses `.catch(() => {})`.
- Idempotency store failure logging: `storeIdempotency` callsites swap `.catch(() => {})` for `logError("idempotency.store_failed", ...)`.
- Domain cache invalidation: `invalidateDomainCache()` called from add/verify/delete so tenant-binding changes don't linger behind the 5-min TTL.
- v1 fanout moved to `after()`: email/sms send + their batch siblings hand the response back before endpoint delivery runs.
- Dynamic audience insert chunking: chunks sized by serialized byte count (~0.5 MB budget) instead of row count so large HTML bodies don't exceed the Neon HTTP statement cap.
- Toast UX: dashboard toasts now surface `request_id` when present and keep error toasts up longer so users can copy it.
- Sidebar grouping: nav organized into Sends / Configure / Admin sections instead of a flat 10-item list.
- Schema barrel documented as server-only to head off a client-bundle regression where `@/lib/db/schema` drags Drizzle into the browser.
- Tenants list paginated via cursor; `/api/internal/tenants?cursor=…` + "Load more" button.
- Cron send-scheduled indentation cleaned up.
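The byte-budgeted chunking mentioned in the list above can be sketched as a greedy packer. This is an illustration under assumptions: `chunkByBytes` is a hypothetical name, JSON serialization stands in for whatever the real insert path measures, and 512 KiB is used as the "~0.5 MB" budget.

```typescript
const CHUNK_BYTE_BUDGET = 512 * 1024; // ~0.5 MB

// Greedily packs rows into chunks whose serialized size stays within budget.
// A single row larger than the budget still gets a chunk of its own.
function chunkByBytes<T>(rows: T[], budget: number = CHUNK_BYTE_BUDGET): T[][] {
  const encoder = new TextEncoder();
  const chunks: T[][] = [];
  let current: T[] = [];
  let currentBytes = 0;
  for (const row of rows) {
    const size = encoder.encode(JSON.stringify(row)).length;
    if (current.length > 0 && currentBytes + size > budget) {
      chunks.push(current);
      current = [];
      currentBytes = 0;
    }
    current.push(row);
    currentBytes += size;
  }
  if (current.length > 0) chunks.push(current);
  return chunks;
}
```

Compared with a fixed row count, this keeps chunks of small rows large and automatically shrinks chunks when individual rows carry heavy HTML bodies.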
Closed in earlier passes (April 2026 — session 4)
- IP allowlist per API key: `api_keys.allowed_cidrs`; request IP (X-Forwarded-For / X-Real-IP) matched against CIDRs in `withApiAuth` with IPv4 + IPv6 support. 403 `IP_NOT_ALLOWED` on miss.
- CSV import for contacts: `POST /api/internal/audiences/:id/import-csv` (text/csv or JSON body). Header row selects columns, extras become contact metadata. Up to 10k rows per call, audited as `audience.csv_imported`.
- A/B subject variants on audience send: `subject_variants` (2–5 strings) assigns each contact deterministically by hash. Variant letter lands in `messages.metadata.subject_variant`.
- Domain warmup ramp: `domains.warmup_started_at` + `warmup_days`. Daily cap grows `50 × 2^day` for the configured window; exceeding returns 429 `WARMUP_LIMIT_EXCEEDED`.
- Per-device session tracking: new `user_sessions` table + `sessionId` JWT claim. Revoked rows reject future requests via JWT callback. `/api/internal/sessions` lists active devices; DELETE revokes one. `/overview/settings/security` Devices card surfaces the list.
- Async org export: `org_exports` job table, `/api/cron/run-exports` every 10 min, `/api/internal/org/export/async` enqueues; download endpoint streams complete jobs.
- Retention crons: nightly `/api/cron/retention` trims `health_probes` (>30d), `audit_logs` (>365d), `daily_stats` (>180d), delivered `webhook_deliveries` (>30d).
- Public status page: `/status` renders last 30 days from `health_probes`; cron probes every 5 min.
- Dashboard audience surface: `/overview/audiences` + `/api/internal/audiences` CRUD + inline CSV / send drawer.
- Owner-gated UI audit: billing page now reads role via `requireSession`, cancel-send buttons on message list/detail gated on `useCanWrite`, templates test-send now `requireWriteSession` (rejects viewers).
- Tests: vitest covers `schedule.ts` (timezone conversion), `cidr.ts` (v4 + v6 matching), `csv.ts` (quoted fields, CRLF, escaped quotes). 81 tests total.
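The warmup ramp formula `50 × 2^day` from the list above works out as follows. This sketch assumes `day` is a zero-based count of whole days since `warmup_started_at` and that no warmup cap applies outside the configured window; neither detail is confirmed by the notes, and `warmupDailyCap` is a hypothetical name.

```typescript
const MS_PER_DAY = 24 * 60 * 60 * 1000;

// Returns the daily send cap during warmup, or null when no warmup cap applies.
function warmupDailyCap(warmupStartedAt: Date, warmupDays: number, now: Date): number | null {
  const day = Math.floor((now.getTime() - warmupStartedAt.getTime()) / MS_PER_DAY);
  // Assumption: outside the configured window there is no warmup cap.
  if (day < 0 || day >= warmupDays) return null;
  return 50 * 2 ** day;
}
```

Under these assumptions the cap is 50 on day 0, 100 on day 1, 400 on day 3, and 409,600 on day 13, after which a 14-day window ends.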
Closed in earlier passes (April 2026 — session 3)
- Non-blocking webhook fan-out: SES, SNS-SMS, and inbound-email handlers wrap `fanoutWebhookEvent` in `after(() => ...)` so the HTTP response returns before fan-out runs.
- `/api/health` endpoint: Postgres + Upstash probes, JSON body with `status`, `checks`, `uptime_s`. Returns 503 when any probe fails.
- Stripe in-app cancel flow: billing portal session created with `flow_data.subscription_cancel` and current active subscription, surfaced on `/overview/settings/billing`.
- GitHub Actions CI: `.github/workflows/ci.yml` runs `tsc --noEmit`, `eslint`, and `vitest run` on push + PR.
- Per-key rate limits: `api_keys.rate_limit_per_minute` column (nullable; falls back to plan default). Enforced before plan bucket, emits `X-RateLimit-Scope: key|org`.
- Message search: `/api/internal/messages` accepts `?q=` across to/from/subject (email) or to/from/body (SMS) with escaped-ilike; UI adds a debounced search box.
- Audit log dashboard: `/overview/settings/audit` with filter by action, client-side text filter, cursor pagination.
- Onboarding checklist: server-rendered on `/overview` based on actual state (email verified, domain added, domain verified, key created, first send).
- Template preview + test-send: `/api/internal/templates/preview` renders variables, `/api/internal/templates/test-send` fires a real send tagged `template-test`. Expandable UI per template on `/overview/templates`.
- Deliverability card: `/overview` groups current-period email sends by sender domain and surfaces bounce rate with warn/critical thresholds.
- Hot-path logger wire-up: all cron + webhook handlers emit `log.info` on success and `logError` on failure.
- Tests: vitest covers unsubscribe token round-trip/tampering, from-address parse/format, tracking link rewrite + pixel injection, scopes, idempotency body hash.
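The `/api/health` contract in the list above (a body with `status`, `checks`, and `uptime_s`, and a 503 when any probe fails) can be sketched as a pure function over probe results. The `ProbeResult` shape, the `latency_ms` field, and the `"ok"` / `"degraded"` status strings are assumptions; the notes name only the three top-level keys and the 503 rule.

```typescript
// Hypothetical probe result; the real check payload may differ.
type ProbeResult = { ok: boolean; latency_ms: number };

interface HealthResponse {
  httpStatus: 200 | 503;
  body: {
    status: "ok" | "degraded";
    checks: Record<string, ProbeResult>;
    uptime_s: number;
  };
}

// Any failing probe turns the whole response into a 503.
function buildHealthResponse(checks: Record<string, ProbeResult>, uptimeSeconds: number): HealthResponse {
  const allOk = Object.values(checks).every((c) => c.ok);
  return {
    httpStatus: allOk ? 200 : 503,
    body: { status: allOk ? "ok" : "degraded", checks, uptime_s: Math.floor(uptimeSeconds) },
  };
}
```

Keeping the status decision separate from the probes themselves makes the 503 rule easy to unit-test without a database or Redis connection.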
Closed in prior passes
- Active-org cookie honored by `requireSession()`; `/api/internal/*` observes the switcher.
- Backup codes bcrypt-hashed; login runs `bcrypt.compare` across the hash array.
- Password reset bumps `users.session_version`; JWT carries the version and is invalidated on mismatch.
- CORS preflight + headers set on `/api/v1/*` from `proxy.ts`.
- Permissions model: `owner` vs `member`. Destructive / admin internal endpoints require `owner`.
- Cancel scheduled send: `DELETE /api/v1/{emails,sms}/{id}` flips `scheduled` → `canceled`.
- Member removal + role update: `PATCH / DELETE /api/internal/team/members`.
- Audit: domain add/remove/verify, webhook create/remove/rotate, plan upgrade/downgrade all emit entries.
- Resend verification email: `POST /api/auth/resend-verification` (rate-limited 3/hour via Upstash).
- 2FA dashboard UI: setup (QR + verify), backup codes, disable w/ password.
- Templates dashboard UI.
- Webhook delivery inspection on `/overview/webhooks`.
- OpenAPI 3.1 spec at `/api/openapi.json`.
- Structured logger at `src/lib/log.ts`.
- Tracking scaffold: `applyTracking` rewrites links and injects pixel when `track_opens` or `track_clicks` is set on a send.
Still open
Operational follow-ups for the new enterprise surfaces
- SAML test harness: IdP-side configuration testing (Okta + Azure AD) currently only documented; no automated end-to-end test for AuthnResponse parsing.
- DLT provider plurality: Gupshup is the only wired provider. Kaleyra / MSG91 / direct DLT-portal need their own `pollDltStatus` shims when customers ask.
- SES `@authenio/samlify-node-xmllint`: optional schema validator. Production deploys should install it or `verifyAndExtract` throws.
Data
- Neon HTTP driver caveat: multi-statement transactions only via `sql.transaction([...])`. `db.transaction(...)` unavailable.
Verified clean
- `tsc --noEmit` passes with no errors after the current pass.
- `vitest run`: 262 pass / 4 skipped / 0 fail.
- All Drizzle tables re-exported in the schema barrel.
- Migration 0009 applied to dev DB; journal backfilled for 0000-0009.
- CI job defined at `.github/workflows/ci.yml`.