OdontoX App Performance — Neon Recommended Driver Split + TanStack Query

Date: 2026-05-03
Owner: sshssn
Status: Spec, awaiting user review

Problem

The app is “terribly slow” across the board: first load, navigation between modules, lists, saves, and — notably — re-entering a module the user just left. Investigation found two root causes:

Server: wrong Neon driver everywhere. getDatabase() in server/src/lib/db.ts is used by 50+ routes. On Cloudflare Workers it opens a fresh @neondatabase/serverless Pool (WebSocket) to Neon in ap-southeast-1 (Singapore) on every request. Neon’s HTTP driver (getDatabaseHttp) exists but is consumed by exactly one route (user-devices.ts). Every read pays a TLS+WebSocket handshake to Singapore that a single HTTP fetch could replace.
Client: no persistent data cache. No @tanstack/react-query, no swr, no QueryClient. There is a hand-rolled in-memory cache.ts with 90s/5min stale-while-revalidate, but it is allowlist-scoped (~10 endpoints), wiped on reload, not shared across tabs, and does not power most pages. Returning to a module re-fetches everything.

Cloudflare Hyperdrive was considered and explicitly deferred — the user prefers Neon’s recommended path first.

Goals

Server roundtrip latency: reduce by roughly 70% or more on read-heavy endpoints by removing the WebSocket handshake from the hot path.
Client perceived latency: re-entering a module renders cached data instantly, with background revalidation.
Zero auth/security regressions. Zero cross-tenant data leakage in the persisted cache.
Migration is gradual and per-route revertible. No big-bang.

Non-goals

Cloudflare Hyperdrive (deferred — separate spec if revisited).
Edge KV response caching (deferred — Phase 3 was dropped from current scope).
Database schema changes, index tuning, or N+1 query fixes (separate work, may follow).
Mobile app (Expo) — uses native fetch, separate stack.
Durable Object (CLINIC_HUB) SSE refactor.
Auth flow (Firebase, JWT refresh, WebAuthn) changes.

Architecture overview

Three independent phases, each shippable on its own:

Phase 1 ─ Server: Neon driver split
   getDatabase()     →  getReadDb()  (HTTP, default for GETs)
                     →  getWriteDb() (HTTP, alias of getReadDb for single-statement writes)
                     →  getTxDb()    (Pool, only for interactive transactions, with waitUntil cleanup)

Phase 2 ─ Client: TanStack Query + persistent cache
   <PersistQueryClientProvider> wraps <App />
   QueryClient { staleTime: 30s, gcTime: 24h }
   persistOptions { maxAge: 24h, buster: build-hash }
   localStorage persister, clinicId in every query key

Phase 4 ─ Bundle/CSS audit
   Tailwind purge correctness, lazy-load heavy chunks, dead-icon scrub

(Phase 3 — KV edge cache — deferred. Numbering preserved to match the brainstorming discussion.)

Phase 1 — Neon driver split (server)

Driver decision matrix (from Neon’s current docs, May 2026)

Use case	Driver	Why
One-shot read (most GETs)	`neon(url)` HTTP	Single fetch round-trip, no WS handshake, lowest cold-path latency
One-shot write (single INSERT/UPDATE/DELETE)	`neon(url)` HTTP	Same — works for non-transactional writes
Multi-statement atomic batch (no branching logic)	`neon(url).transaction([...])`	One HTTP round-trip, atomic, no interactive control
Interactive transaction (BEGIN, conditional, COMMIT)	`Pool` (WebSocket)	Only path that supports `client.query('BEGIN')` flows
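
The matrix can also be kept as data, so that a lint rule or review checklist reads from one source of truth. The sketch below is only an illustration: `UseCase`, `DriverChoice`, and `pickDriver` are hypothetical names, and the strings in `api` simply repeat the driver entry points listed in the table.

```typescript
// Illustrative names only: UseCase, DriverChoice, and pickDriver are not from the codebase.
type UseCase =
  | "one-shot-read"
  | "one-shot-write"
  | "atomic-batch"
  | "interactive-transaction";

interface DriverChoice {
  transport: "http" | "websocket";
  api: string; // driver entry point, as named in the matrix
}

// One row per line of the decision matrix.
const DRIVER_MATRIX: Record<UseCase, DriverChoice> = {
  "one-shot-read": { transport: "http", api: "neon(url)" },
  "one-shot-write": { transport: "http", api: "neon(url)" },
  "atomic-batch": { transport: "http", api: "neon(url).transaction([...])" },
  "interactive-transaction": { transport: "websocket", api: "Pool" },
};

function pickDriver(useCase: UseCase): DriverChoice {
  return DRIVER_MATRIX[useCase];
}
```

Only the last row needs a WebSocket, which is why the Pool path is expected to be the exception rather than the default.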

Changes to `server/src/lib/db.ts`

Replace the current getDatabase() + getDatabaseHttp() shape with three explicit accessors:

getReadDb(connStr?) → returns drizzle(neon(url)) (HTTP). Default for all GET handlers and any single-statement mutation.
getWriteDb(connStr?) → for now, alias of getReadDb (HTTP works for single-statement writes too). Distinct name documents intent and gives us a hook if we ever need a write-only knob.
getTxDb(ctx, connStr?) → returns a Pool-backed Drizzle instance plus an end() helper the route is required to register with ctx.waitUntil(). Used only by routes that need interactive transactions. (ctx comes first because TypeScript does not allow a required parameter after an optional one.)

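The existing getDatabase() is kept during migration so unmigrated routes keep working, and is removed once all callsites are migrated. It becomes a thin alias of getReadDb() only after the db.transaction( audit in the migration order is done: Drizzle’s neon-http adapter does not support interactive db.transaction(), so an unmigrated transaction route would fail against an HTTP-backed alias. The AsyncLocalStorage per-request connection cache is removed: the HTTP driver has no connection state to cache, and a Pool can’t safely outlive a single request handler on Workers anyway.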
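
A minimal sketch of the cleanup contract for the Pool-backed accessor follows. It uses structural stand-ins so it compiles without the real packages: `PoolLike` is assumed to be satisfied by the `Pool` from @neondatabase/serverless and `WaitUntilContext` by the Workers execution context. `openTxPool` and `withTx` are hypothetical helper names, and the Drizzle wrapping is left out.

```typescript
// Structural stand-ins (assumptions): the real Pool and ExecutionContext
// are expected to satisfy these shapes.
interface PoolLike {
  end(): Promise<void>;
}

interface WaitUntilContext {
  waitUntil(promise: Promise<unknown>): void;
}

interface TxHandle<P extends PoolLike> {
  pool: P;
  end: () => Promise<void>;
}

// Creates the pool and an idempotent end(): repeated calls share one promise,
// so a route cannot close the same pool twice.
function openTxPool<P extends PoolLike>(createPool: () => P): TxHandle<P> {
  const pool = createPool();
  let ending: Promise<void> | undefined;
  const end = (): Promise<void> => {
    if (!ending) ending = pool.end();
    return ending;
  };
  return { pool, end };
}

// Runs the route's work and always registers cleanup with waitUntil,
// whether the work resolves or throws.
async function withTx<P extends PoolLike, T>(
  ctx: WaitUntilContext,
  createPool: () => P,
  work: (pool: P) => Promise<T>,
): Promise<T> {
  const { pool, end } = openTxPool(createPool);
  try {
    return await work(pool);
  } finally {
    ctx.waitUntil(end());
  }
}
```

Wrapping the route body this way would move the "MUST call ctx.waitUntil(end())" rule out of code review and into the helper, at the cost of one more abstraction.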

Migration order

Add new accessors to db.ts.
Migrate read-heavy routes first (highest impact, lowest risk):
- routes/clinic.ts (settings, stats)
- routes/patients.ts (list, get, search)
- routes/appointments.ts (list, day view)
- routes/billing.ts, routes/invoices.ts (read paths)
- routes/staff.ts, routes/services.ts, routes/inventory.ts
Migrate single-statement writes module by module.
Audit every db.transaction( callsite and every BEGIN raw SQL. Migrate those to getTxDb() + ctx.waitUntil(end()). Likely candidates: payment processing, appointment-conflict booking, invoice creation, inventory stock adjustments. Each gets its own commit so a regression rollback is one revert.
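
The audit step could be scripted roughly as below. This is a sketch, not the project's tooling: `auditSource` and `AuditHit` are hypothetical names, the patterns are plain regular expressions over source text, and a match only flags a line for human review (a comment containing the word "begin" would also match).

```typescript
type TxSignal = "db.transaction(" | "BEGIN" | "FOR UPDATE";

interface AuditHit {
  line: number; // 1-based line number within the scanned source
  signal: TxSignal;
  text: string; // trimmed source line, for the review report
}

// The three markers named in the migration plan. No global flag is used,
// so RegExp.test carries no state between calls.
const SIGNALS: ReadonlyArray<{ signal: TxSignal; pattern: RegExp }> = [
  { signal: "db.transaction(", pattern: /\bdb\.transaction\(/ },
  { signal: "BEGIN", pattern: /\bBEGIN\b/i },
  { signal: "FOR UPDATE", pattern: /\bFOR\s+UPDATE\b/i },
];

function auditSource(source: string): AuditHit[] {
  const hits: AuditHit[] = [];
  source.split("\n").forEach((text, index) => {
    for (const { signal, pattern } of SIGNALS) {
      if (pattern.test(text)) {
        hits.push({ line: index + 1, signal, text: text.trim() });
      }
    }
  });
  return hits;
}
```

Running it over each route file before migration gives a per-file list of lines to route to getTxDb() or to clear as false positives.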

What this won’t break (auth)

Auth endpoints are mostly single statements: login reads user + password hash; logout deletes one row; session check is a single SELECT. These migrate cleanly to HTTP. Refresh is the exception to check: it reads a refresh-token row and then writes a new one, so it is two statements and belongs in the transaction audit if rotation must be atomic. JWT signing, cookie handling, Firebase Admin, and WebAuthn challenges do not touch the driver layer.

Risks

Risk	Mitigation
HTTP driver used where a transaction was needed → split-write inconsistency	Pre-migration grep for `db.transaction(`, `BEGIN`, `FOR UPDATE`. Each match audited and routed to `getTxDb` if interactive
Pool routes leak (no `pool.end()`) → Worker hangs / billing	`getTxDb()` returns `{ db, end }`. Route MUST call `ctx.waitUntil(end())`. Reviewer checks for this in every Tx-route diff
Drizzle behavior subtly different across `neon-http` vs `neon-serverless` adapters	Spot-test each migrated route in staging before promoting. Both adapters share the same Drizzle query builder API; differences are at the driver edge only

Phase 2 — TanStack Query (client)

Packages

@tanstack/react-query (v5)
@tanstack/react-query-persist-client
@tanstack/query-sync-storage-persister

`QueryClient` defaults

staleTime: 30_000              // 30s: remounts within 30s are served from cache, no refetch
gcTime:    24 * 60 * 60_000    // 24h — match maxAge so persistence isn't garbage-collected
retry:     1
refetchOnWindowFocus: false    // disabled — too noisy for clinical workflows

Persister

createSyncStoragePersister({
  storage: window.localStorage,
  key: `odontox-rq-${activeClinicId}`,    // tenant-scoped storage key
})

persistOptions: {
  persister,
  maxAge: 24 * 60 * 60_000,                // 24h — match gcTime
  buster: __BUILD_HASH__,                  // injected at build time → auto-invalidate on deploy
}

The build-hash buster is the deploy-safety net: any production deploy invalidates every user’s persisted cache, eliminating the “stale data after release” class of bug.
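
Put together, the defaults and persister settings above might be wired up as in the following configuration sketch. The imports are the three packages listed earlier; `buildPersistOptions` is a hypothetical helper name, `__BUILD_HASH__` is assumed to be a bundler-injected constant, and `retry: removeOldestQuery` is an addition not stated in the spec (it tells the persister to evict the oldest query and try again when a storage write fails).

```typescript
import { QueryClient } from "@tanstack/react-query";
import { removeOldestQuery } from "@tanstack/react-query-persist-client";
import { createSyncStoragePersister } from "@tanstack/query-sync-storage-persister";

// Assumption: replaced with the build hash by the bundler at build time.
declare const __BUILD_HASH__: string;

const DAY_MS = 24 * 60 * 60_000;

export const queryClient = new QueryClient({
  defaultOptions: {
    queries: {
      staleTime: 30_000, // 30s
      gcTime: DAY_MS, // kept equal to maxAge so persisted entries are not collected early
      retry: 1,
      refetchOnWindowFocus: false,
    },
  },
});

// Hypothetical helper: builds the persistOptions object for one clinic.
export function buildPersistOptions(activeClinicId: string) {
  const persister = createSyncStoragePersister({
    storage: window.localStorage,
    key: `odontox-rq-${activeClinicId}`, // tenant-scoped storage key
    retry: removeOldestQuery, // assumption: evict oldest query when a write fails
  });
  return { persister, maxAge: DAY_MS, buster: __BUILD_HASH__ };
}
```

The returned object would be passed as `persistOptions` to `<PersistQueryClientProvider client={queryClient}>`; because the storage key depends on the clinic, the options have to be built after the active clinic is known.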

Cache key conventions

Every query key is a tuple starting with [resource, clinicId, ...params]:

['patients', clinicId, { search, page }]
['appointments', clinicId, { date }]
['invoice', clinicId, invoiceId]

clinicId is read from localStorage['odontox-active-clinic-id'] via a small hook. Querying without a clinicId is a programming error; the hook throws in dev, no-ops in prod.
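
A small sketch of the key convention and the clinicId guard, written as plain functions so it runs outside React. `readActiveClinicId`, `tenantKey`, and `ReadableStore` are hypothetical names; only the storage key `odontox-active-clinic-id` and the `[resource, clinicId, ...params]` shape come from the text above.

```typescript
// Minimal read-only view of localStorage, so the guard can be tested with a fake.
interface ReadableStore {
  getItem(key: string): string | null;
}

const ACTIVE_CLINIC_KEY = "odontox-active-clinic-id";

// Throws in dev when no clinic is active; in prod returns null so the caller
// can skip the query instead of issuing an unscoped one.
function readActiveClinicId(store: ReadableStore, isDev: boolean): string | null {
  const clinicId = store.getItem(ACTIVE_CLINIC_KEY);
  if (clinicId) return clinicId;
  if (isDev) throw new Error("Query attempted without an active clinicId");
  return null;
}

// Builds a key of the form [resource, clinicId, ...params].
function tenantKey<P extends readonly unknown[]>(
  resource: string,
  clinicId: string,
  ...params: P
): readonly [string, string, ...P] {
  return [resource, clinicId, ...params];
}
```

In a hook, the null case would typically map to `enabled: false` on the query, which is one way to implement "no-ops in prod".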

Auth-safety lifecycle

Event	Action
Login success	`queryClient.clear()` — wipe any leftover cache from a previous session
Logout	`queryClient.clear()` + remove every localStorage key starting with `odontox-rq-` (`removeItem` takes one exact key, not a wildcard), for privacy on shared devices
Clinic switch	Full reload happens already (intentional). Persister key includes `clinicId`, so the new clinic loads its own bucket. On reload, also remove stale buckets older than 24h to keep localStorage tidy
401 from any query	Global `onError` on the `QueryCache` → existing redirect-to-login flow (unchanged)
Mutation success	`invalidateQueries({ queryKey: [resource, clinicId] })` — refetch tenant-scoped views
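
Since `localStorage.removeItem` only accepts an exact key, clearing every per-clinic bucket on logout means enumerating keys by prefix. A minimal sketch, assuming the `odontox-rq-` prefix from the persister key; `removePersistedBuckets` and `EnumerableStore` are hypothetical names.

```typescript
// The subset of the Web Storage interface the cleanup needs.
interface EnumerableStore {
  readonly length: number;
  key(index: number): string | null;
  removeItem(key: string): void;
}

const BUCKET_PREFIX = "odontox-rq-";

// Removes every key that starts with the prefix and returns the removed keys.
function removePersistedBuckets(
  store: EnumerableStore,
  prefix: string = BUCKET_PREFIX,
): string[] {
  // Collect first: removing while iterating by index would shift later keys.
  const doomed: string[] = [];
  for (let i = 0; i < store.length; i++) {
    const key = store.key(i);
    if (key !== null && key.startsWith(prefix)) doomed.push(key);
  }
  doomed.forEach((key) => store.removeItem(key));
  return doomed;
}
```

On logout this would be called as `removePersistedBuckets(window.localStorage)` right after `queryClient.clear()`. The same loop could be extended to drop stale buckets on clinic switch, if each bucket's age is readable from its stored value.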

Coexistence with existing `cache.ts`

TanStack Query is added alongside the existing hand-rolled cache. No big-bang migration. Per-module rollout:

Patients module → migrate first (highest traffic, simplest CRUD shape)
Appointments
Billing / invoices
Inventory
Settings / staff / services
Remaining

For each module: existing fetch wrapper stays for non-migrated callsites; migrated components switch to useQuery / useMutation. The hand-rolled cache.ts allowlist shrinks as routes migrate. After full migration, cache.ts is deleted.
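
One way to let migrated and unmigrated callsites share the same fetch wrapper is to build the query options from it. The sketch below assumes a JSON fetcher with the signature shown; `patientsListQuery`, `FetchJson`, `PatientListParams`, and the `/api/patients` path are hypothetical and stand in for whatever the existing wrapper exposes.

```typescript
// Hypothetical shapes for illustration; not taken from the codebase.
interface PatientListParams {
  search: string;
  page: number;
}

// Stand-in for the existing fetch wrapper that unmigrated callsites keep using.
type FetchJson = (path: string) => Promise<unknown>;

// Returns the { queryKey, queryFn } pair a migrated component would hand to useQuery.
function patientsListQuery(
  clinicId: string,
  params: PatientListParams,
  fetchJson: FetchJson,
) {
  const path =
    `/api/patients?search=${encodeURIComponent(params.search)}&page=${params.page}`;
  return {
    queryKey: ["patients", clinicId, params] as const,
    queryFn: () => fetchJson(path),
  };
}
```

Because the network call still goes through the existing wrapper, auth headers and error handling stay in one place while the module switches to useQuery.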

What this won’t break

Auth endpoints are not wrapped in useQuery — they use direct fetch (signIn, signOut, refreshToken). No change.
Server-Sent Events (CLINIC_HUB Durable Object stream) does not go through Query. No change.
File uploads (R2, DICOM) — useMutation wraps these but the underlying upload code is untouched.
Existing cache.ts allowlist routes — coexist until migrated.

Risks

Risk	Mitigation
Persisted cache shows previous clinic’s data	`clinicId` in every query key + per-clinic localStorage key
Persisted cache shows previous user’s data on shared devices	`queryClient.clear()` on login AND logout; clear localStorage on logout
Deploy ships breaking change while users have stale persisted cache	`buster` = build hash → auto-invalidates on every deploy
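localStorage quota exceeded on long sessions	`gcTime` and `maxAge` both 24h; persister configured with `retry: removeOldestQuery` so the oldest queries are dropped when a write fails (with no retry strategy the failed write is silently skipped)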
User sees stale data right after a write	`useMutation.onSuccess` invalidates relevant query keys

Phase 4 — Bundle/CSS audit

Smaller win, separate scope. Three targets:

CSS: 490KB CSS bundle is suspicious for a Tailwind project. Verify tailwind.config.ts content glob actually purges. Spot-check dist/assets/*.css for unused class prefixes.
Lazy-load heavy chunks: confirm @react-pdf/renderer, dicom-parser, and Chart.js (if present) are all behind React.lazy() or dynamic imports. Anything bundled into the entry chunk that’s only used on one page is a target.
Icon imports: spot-check pages for import { ... } from 'lucide-react' patterns that pull the whole pack. Tree-shaking should handle this, but a misconfigured import can defeat it.

Output: a short report of findings + targeted fixes. No spec sub-design needed.

Testing strategy

Phase 1: existing route tests pass. Add a per-route smoke test that confirms it returns 200 with expected shape after migration. Manual staging test for transaction routes.
Phase 2: snapshot test for QueryClient config; integration test for clinic-switch cache isolation (login as user with 2 clinics, populate cache, switch, assert no leak); manual test of login/logout cache clear.
Phase 4: bundle-size diff before/after. Lighthouse score before/after on /dashboard, /patients, /appointments.
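
The before/after bundle comparison can be reduced to a diff over two maps of asset name to byte size. This is a sketch with hypothetical names (`diffBundleSizes`, `SizeMap`); it assumes content hashes have already been stripped from file names so that the same asset lines up across builds.

```typescript
type SizeMap = Record<string, number>; // asset name -> size in bytes

interface SizeDelta {
  asset: string;
  before: number; // 0 when the asset is new
  after: number; // 0 when the asset was removed
  delta: number; // after - before; negative means the asset shrank
}

// Lists every asset whose size changed, largest savings first.
function diffBundleSizes(before: SizeMap, after: SizeMap): SizeDelta[] {
  const assets = Array.from(
    new Set([...Object.keys(before), ...Object.keys(after)]),
  );
  return assets
    .map((asset) => {
      const b = before[asset] ?? 0;
      const a = after[asset] ?? 0;
      return { asset, before: b, after: a, delta: a - b };
    })
    .filter((d) => d.delta !== 0)
    .sort((x, y) => x.delta - y.delta);
}
```

Feeding it the sizes of dist/assets before and after each fix gives the short findings report Phase 4 calls for.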

Rollback

Phase 1: revert the affected route’s commit. Routes that still call getDatabase() follow whatever the alias points to, so flipping the alias from the HTTP path back to the Pool path restores the old behavior for them.
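Phase 2: feature-flag persistence in the <PersistQueryClientProvider> wrapper. With the flag off, render a plain <QueryClientProvider> instead: migrated components call useQuery, which throws without a QueryClient in context, so the provider cannot simply be removed. Data still flows through the same serverComm wrappers Query is built on.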
Phase 4: standard build-config revert.

Out-of-scope follow-ups

Hyperdrive evaluation (separate spec when prioritized).
KV edge response caching (Phase 3 deferred).
N+1 query audit, especially in tenant-scoped list endpoints.
DB index audit on hot columns (patients.clinic_id, appointments.scheduled_at, etc.).
Mobile app caching parity.