DICOM AI Quota, Safety & Observability

Date: 2026-04-27
Status: Approved

Overview

Add hard usage quotas to DICOM AI analysis, a first-time beta terms modal, a persistent AI disclaimer, patient history entries on analysis completion, and a superadmin usage monitor. Together, these changes control cost, manage liability, and give operators visibility into per-clinic AI consumption.

1. Data Model

1.1 New table: app.dicom_quota

clinic_id          text  — indexed, no FK (integrity via authenticated session)
period_start       text  — "YYYY-MM-DD", start of current billing period (PK part)
monthly_studies    integer  default 0
daily_studies      integer  default 0
day_key            text  — "YYYY-MM-DD", tracks which day daily_studies covers
tokens_input       integer  default 0
tokens_output      integer  default 0
updated_at         timestamp
Composite PK: (clinic_id, period_start)
One row per clinic per billing period. Auto-created on first analysis of a new period.
Daily reset logic (inline, no cron):
On every quota load, if day_key !== today (UTC):
UPDATE dicom_quota
SET daily_studies = 0, day_key = today, updated_at = now
WHERE clinic_id = ? AND period_start = ?
Reset happens atomically on the first request of a new day. No background job.
Billing period calculation (from clinic_modules.created_at for moduleKey = 'dicom_imaging'):
anchor_day = day_of_month(module.created_at)   -- e.g. 15
current_period_start = most recent date where day_of_month = anchor_day, on or before today
next_reset = first date after current_period_start where day_of_month = anchor_day (normally current_period_start + 1 month)
Edge case: anchor_day 29–31 in short months → clamp to last day of that month.
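The period calculation above can be sketched as a pair of pure functions. This is an illustration under assumptions, not the shipped code: billingPeriod, anchorDateFor and the other helper names are hypothetical, and all dates are treated as UTC. The next reset is derived from the anchor day rather than by adding a month to a clamped start, so an anchor of 31 yields Feb 28 followed by Mar 31.

```typescript
// Sketch only: helper names are hypothetical, not taken from the codebase.

/** Last calendar day of the given UTC month (month is 0-based and may overflow). */
function lastDayOfMonth(year: number, month: number): number {
  // Day 0 of the following month is the last day of this one.
  return new Date(Date.UTC(year, month + 1, 0)).getUTCDate();
}

/** Anchor date for a given month, clamped to the last day of short months. */
function anchorDateFor(year: number, month: number, anchorDay: number): Date {
  const day = Math.min(anchorDay, lastDayOfMonth(year, month));
  return new Date(Date.UTC(year, month, day));
}

function toDateKey(d: Date): string {
  return d.toISOString().slice(0, 10); // "YYYY-MM-DD"
}

function billingPeriod(
  moduleCreatedAt: Date,
  now: Date,
): { periodStart: string; nextReset: string } {
  const anchorDay = moduleCreatedAt.getUTCDate();
  const year = now.getUTCFullYear();
  let month = now.getUTCMonth();
  const today = Date.UTC(year, month, now.getUTCDate());

  let start = anchorDateFor(year, month, anchorDay);
  if (start.getTime() > today) {
    // This month's anchor is still ahead, so the period began last month.
    // Date.UTC normalises month -1 to December of the previous year.
    month -= 1;
    start = anchorDateFor(year, month, anchorDay);
  }
  const next = anchorDateFor(year, month + 1, anchorDay);
  return { periodStart: toDateKey(start), nextReset: toDateKey(next) };
}
```

For a module created on the 15th, a request on 2026-04-27 falls in the period starting 2026-04-15 with the next reset on 2026-05-15.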

1.2 New column on app.clinic_modules

dicom_ai_terms_accepted_at   timestamp  nullable
Added to the row where moduleKey = 'dicom_imaging'. Null = terms not yet accepted. Set once, never cleared.

2. Limits

Limit                            Value      Scope
Monthly analyses                 500        Per clinic, per billing period
Daily analyses                   20         Per clinic, per UTC calendar day
Monthly tokens (input + output)  5,000,000  Per clinic, per billing period
Concurrent analyses              3          Per clinic, in-flight at once
Max DICOM slices per call        30         Per request
max_completion_tokens            1000       Per AI call (down from 8192)
temperature                      0.2        Per AI call (was 0.1)

3. Server: Quota Enforcement

3.1 Route: POST /api/v1/protected/ai/dicom-analysis

Updated request body adds optional field:
{ patientFileId, imageBase64, modality?, bodyPart?, sliceCount?: number }
sliceCount defaults to 1. Reject with 400 if > 30.
Enforcement order:
1. Compute period_start from clinic_modules.created_at anchor
2. Load (or create) dicom_quota row; apply daily reset if day_key ≠ today
3. Check monthly_studies >= 500         → 429 { reason: "monthly_studies" }
4. Check (tokens_input + tokens_output) >= 5_000_000 → 429 { reason: "monthly_tokens" }
5. Check daily_studies >= 20            → 429 { reason: "daily_studies" }
6. Check sliceCount > 30                → 400 { reason: "slice_limit" }
7. Acquire concurrency slot via clinic-hub DO
   → if activeAnalyses >= 3            → 429 { reason: "concurrent" }
8. Call analyzeDicomImage() with max_completion_tokens: 1000, temperature: 0.2
9. On success: UPDATE dicom_quota — increment monthly_studies, daily_studies,
   tokens_input += result.usage.input_tokens,
   tokens_output += result.usage.output_tokens
10. Release concurrency slot (always, in finally block)
429 response shape:
{
  "error": "quota_exceeded",
  "reason": "monthly_studies | monthly_tokens | daily_studies | concurrent",
  "limit": 500,
  "used": 500,
  "resetsAt": "2026-05-15T00:00:00Z"
}
resetsAt is ISO 8601 UTC. For daily_studies, it’s midnight tonight UTC. For monthly limits, it’s the next billing anchor date. For concurrent, it is omitted.
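As a rough sketch of checks 3 to 5 and the resetsAt rules, the function below takes a loaded quota row and returns either null or a 429 body. The names (QuotaRow, evaluateQuota, LIMITS) are hypothetical, and it assumes nextReset arrives as a "YYYY-MM-DD" string. The daily reset is modelled by ignoring daily_studies when day_key is stale; the real route would also persist that reset.

```typescript
// Sketch only: type and function names are hypothetical.
type QuotaReason = "monthly_studies" | "monthly_tokens" | "daily_studies";

interface QuotaRow {
  monthlyStudies: number;
  dailyStudies: number;
  dayKey: string; // "YYYY-MM-DD" (UTC)
  tokensInput: number;
  tokensOutput: number;
}

interface QuotaExceeded {
  error: "quota_exceeded";
  reason: QuotaReason;
  limit: number;
  used: number;
  resetsAt: string; // ISO 8601 UTC
}

const LIMITS = { monthlyStudies: 500, dailyStudies: 20, monthlyTokens: 5_000_000 };

/** Next UTC midnight after `now`, without milliseconds. */
function nextUtcMidnight(now: Date): string {
  const next = new Date(
    Date.UTC(now.getUTCFullYear(), now.getUTCMonth(), now.getUTCDate() + 1),
  );
  return next.toISOString().replace(".000Z", "Z");
}

function evaluateQuota(row: QuotaRow, now: Date, nextReset: string): QuotaExceeded | null {
  const todayKey = now.toISOString().slice(0, 10);
  // Inline daily reset: a stale day_key means today's count is effectively zero.
  const dailyUsed = row.dayKey === todayKey ? row.dailyStudies : 0;
  const tokensUsed = row.tokensInput + row.tokensOutput;
  const monthlyResetsAt = `${nextReset}T00:00:00Z`;

  // Same order as the enforcement list: monthly studies, monthly tokens, daily studies.
  if (row.monthlyStudies >= LIMITS.monthlyStudies) {
    return { error: "quota_exceeded", reason: "monthly_studies",
      limit: LIMITS.monthlyStudies, used: row.monthlyStudies, resetsAt: monthlyResetsAt };
  }
  if (tokensUsed >= LIMITS.monthlyTokens) {
    return { error: "quota_exceeded", reason: "monthly_tokens",
      limit: LIMITS.monthlyTokens, used: tokensUsed, resetsAt: monthlyResetsAt };
  }
  if (dailyUsed >= LIMITS.dailyStudies) {
    return { error: "quota_exceeded", reason: "daily_studies",
      limit: LIMITS.dailyStudies, used: dailyUsed, resetsAt: nextUtcMidnight(now) };
  }
  return null; // within quota; the concurrency check happens separately
}
```

Keeping this logic free of database and network calls makes the check order easy to unit test on its own.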

3.2 New route: GET /api/v1/protected/ai/dicom-quota

Returns current period usage for the authenticated clinic:
{
  "monthly_studies": 312,
  "monthly_limit": 500,
  "daily_studies": 8,
  "daily_limit": 20,
  "tokens_input": 2100000,
  "tokens_output": 480000,
  "tokens_limit": 5000000,
  "period_start": "2026-04-15",
  "next_reset": "2026-05-15",
  "terms_accepted": true
}

3.3 New route: POST /api/v1/protected/ai/dicom-terms-accept

Sets dicom_ai_terms_accepted_at = now() on the clinic_modules row for dicom_imaging. Idempotent — if already set, returns 200 with no change.
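The idempotency rule can be illustrated with a small pure function. This is a sketch only: DicomModuleRow and acceptTerms are hypothetical names, and the real route would apply the same "set only if null" rule in its database update.

```typescript
// Sketch only: names are hypothetical, not taken from the codebase.
interface DicomModuleRow {
  dicomAiTermsAcceptedAt: Date | null;
}

function acceptTerms(
  row: DicomModuleRow,
  now: Date,
): { row: DicomModuleRow; changed: boolean } {
  // Already accepted: keep the original timestamp and report no change.
  if (row.dicomAiTermsAcceptedAt !== null) return { row, changed: false };
  return { row: { ...row, dicomAiTermsAcceptedAt: now }, changed: true };
}
```

Calling it twice leaves the first acceptance timestamp in place, which matches "set once, never cleared".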

4. Durable Object: Concurrency Lock

4.1 Changes to ClinicHub

Two new RPC methods. No new fetch() routes, no alarms, no ctx.storage writes.
private activeAnalyses = 0;
private slotTimestamps = new Map<string, number>(); // requestId → acquired_at ms

acquireAnalysisSlot(requestId: string): { granted: boolean; active: number } {
  this.sweepStaleSlots(); // lazy cleanup — only runs when someone tries to acquire
  if (this.activeAnalyses >= 3) return { granted: false, active: this.activeAnalyses };
  this.activeAnalyses++;
  this.slotTimestamps.set(requestId, Date.now());
  return { granted: true, active: this.activeAnalyses };
}

releaseAnalysisSlot(requestId: string): void {
  if (this.slotTimestamps.has(requestId)) {
    this.slotTimestamps.delete(requestId);
    this.activeAnalyses = Math.max(0, this.activeAnalyses - 1);
  }
}

private sweepStaleSlots(): void {
  const cutoff = Date.now() - 5 * 60 * 1000; // 5 minutes
  for (const [id, ts] of this.slotTimestamps) {
    if (ts < cutoff) {
      this.slotTimestamps.delete(id);
      this.activeAnalyses = Math.max(0, this.activeAnalyses - 1);
    }
  }
}
Design constraints:
  • activeAnalyses is in-memory only — no ctx.storage calls, no I/O on every analysis
  • If the DO hibernates, counter resets to 0, which is correct (no active WebSockets = no in-flight analyses)
  • Stale slot sweep is lazy — runs only inside acquireAnalysisSlot, never on a timer or alarm
  • No new wake-up paths introduced — the DO fires only when an analysis starts or ends
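On the calling side, the acquire and release pair is easiest to get right when wrapped in a single helper with a finally block. The sketch below assumes the route reaches the two methods through an asynchronous stub; AnalysisSlotHub and withAnalysisSlot are hypothetical names, not existing code.

```typescript
// Sketch only: the interface mirrors the two RPC methods above; the wrapper is hypothetical.
interface AnalysisSlotHub {
  acquireAnalysisSlot(requestId: string): Promise<{ granted: boolean; active: number }>;
  releaseAnalysisSlot(requestId: string): Promise<void>;
}

type SlotResult<T> =
  | { ok: true; value: T }
  | { ok: false; reason: "concurrent"; active: number };

async function withAnalysisSlot<T>(
  hub: AnalysisSlotHub,
  requestId: string,
  run: () => Promise<T>,
): Promise<SlotResult<T>> {
  const slot = await hub.acquireAnalysisSlot(requestId);
  if (!slot.granted) {
    // Caller maps this to 429 { reason: "concurrent" }.
    return { ok: false, reason: "concurrent", active: slot.active };
  }
  try {
    return { ok: true, value: await run() };
  } finally {
    // Runs on success and on failure, so a thrown analysis error cannot leak the slot.
    await hub.releaseAnalysisSlot(requestId);
  }
}
```

If a release is ever missed anyway (for example, the Worker is terminated mid-request), the 5 minute sweep in acquireAnalysisSlot reclaims the slot.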

5. UI Changes

5.1 Usage counter — DicomPage.tsx header

Fetches GET /api/v1/protected/ai/dicom-quota once on mount (no polling). Renders a compact bar in the workstation header:
AI Analyses  ████████░░  312 / 500  ·  Resets 15 May
             8 / 20 today
Color states based on monthly_studies / 500:
  • < 80%: green
  • 80–95%: amber
  • ≥ 95%: red
Clicking the bar expands a small popover showing token usage breakdown.
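The colour thresholds can be expressed as a small helper. This is a sketch with a hypothetical function name (usageColor); it compares integers rather than a floating-point ratio so the 80% and 95% boundaries are exact.

```typescript
// Sketch only: usageColor is a hypothetical helper name.
type UsageColor = "green" | "amber" | "red";

function usageColor(used: number, limit: number): UsageColor {
  // used / limit >= 0.95  is equivalent to  used * 100 >= limit * 95
  if (used * 100 >= limit * 95) return "red";
  if (used * 100 >= limit * 80) return "amber";
  return "green";
}
```

With the 500-study limit, 312 used renders green, 400 turns amber, and 475 turns red.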

5.2 First-time terms modal — DicomViewer.tsx

Triggered when “Analyse with AI” is clicked and terms_accepted === false (from cached quota response). Analysis does not start until confirmed. Modal content (warm, not alarming):
Ruby AI Analysis — Beta
You’re about to use AI-powered radiograph analysis. A few things to know:
  • This feature is in beta — results are thorough but always need clinical review
  • Ruby highlights findings, but a licensed dentist must make all clinical decisions
  • Each analysis uses your clinic’s monthly quota (500 studies/period)
  • Completed reports are saved to the patient file for reference
[I understand — start analysis] [Cancel]
On confirm: calls POST /api/v1/protected/ai/dicom-terms-accept, then proceeds to analysis. Updates local quota state so modal never shows again this session. Cache-busts the quota fetch so the header reflects terms_accepted: true.

5.3 Persistent AI disclaimer — DicomViewer.tsx results panel

Non-dismissable banner always rendered at the top of the AI results panel whenever results are visible. Not a toast.
⚠  AI-generated analysis — A licensed dentist must review all findings
   and make all clinical decisions. This is not a clinical diagnosis.
Small text, amber tint, consistent with medical software conventions.

5.4 Quota error messages — DicomViewer.tsx

When the POST returns 429 (or 400 for slice_limit), show a persistent amber banner (not a toast) with friendly copy:
reason           Message
monthly_studies  “Your clinic has used all 500 AI analyses for this period. Resets on [resetsAt date].”
monthly_tokens   “Your clinic’s monthly AI token budget is exhausted. Resets on [resetsAt date].”
daily_studies    “Daily limit reached — 20 analyses per day keeps costs fair for everyone. Back to full speed tomorrow.”
concurrent       “An analysis is already running. Please wait a few seconds and try again.”
slice_limit      “This study has too many frames to analyse at once. Please select a smaller slice range (max 30).”
resetsAt is parsed from the 429 body and formatted as a human-readable date (e.g., “15 May”).
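One way to turn an error body into banner copy is sketched below. The helper names (formatResetDate, quotaErrorMessage) are hypothetical, the date is formatted in UTC, and the "the next reset" fallback for a missing resetsAt is an assumption that the spec does not cover.

```typescript
// Sketch only: helper names are hypothetical; the copy comes from the messages listed above.
type ErrorReason =
  | "monthly_studies" | "monthly_tokens" | "daily_studies" | "concurrent" | "slice_limit";

const MONTHS = ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
  "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"];

/** "2026-05-15T00:00:00Z" becomes "15 May" (UTC, locale-independent). */
function formatResetDate(resetsAt: string): string {
  const d = new Date(resetsAt);
  return `${d.getUTCDate()} ${MONTHS[d.getUTCMonth()]}`;
}

function quotaErrorMessage(reason: ErrorReason, resetsAt?: string): string {
  // Assumption: fall back to generic wording if the body carries no resetsAt.
  const when = resetsAt ? formatResetDate(resetsAt) : "the next reset";
  switch (reason) {
    case "monthly_studies":
      return `Your clinic has used all 500 AI analyses for this period. Resets on ${when}.`;
    case "monthly_tokens":
      return `Your clinic’s monthly AI token budget is exhausted. Resets on ${when}.`;
    case "daily_studies":
      return "Daily limit reached — 20 analyses per day keeps costs fair for everyone. Back to full speed tomorrow.";
    case "concurrent":
      return "An analysis is already running. Please wait a few seconds and try again.";
    case "slice_limit":
      return "This study has too many frames to analyse at once. Please select a smaller slice range (max 30).";
  }
}
```

Typing reason as a union lets the compiler flag any new reason that lacks a message.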

5.5 Patient history tab — PatientDetails.tsx

After a successful analysis, a history entry is written to the patient timeline in the History tab (existing TabsContent value="history" at line 768). Entry format:
[Brain icon]  DICOM AI Analysis
              Ruby analysed [filename] · [modality] · [urgencyLevel] urgency
              [timestamp]   [View report →]
Implementation: after the POST /api/v1/protected/ai/dicom-analysis succeeds on the client, append the entry to whatever data source the history tab reads from (to be confirmed during implementation — check existing history data pattern). If the history tab uses a separate API, emit a cache-bust or refetch trigger after analysis completes.

6. Superadmin: DICOM Usage Monitor

6.1 New component: ui/src/components/superadmin/DicomUsageMonitor.tsx

New tab or section in the superadmin panel. Fetches from:
GET /api/v1/protected/superadmin/dicom-usage
Returns all clinics’ current-period quota rows joined with clinic name:
[
  {
    "clinic_id": "...",
    "clinic_name": "Bright Smiles Dental",
    "period_start": "2026-04-15",
    "next_reset": "2026-05-15",
    "monthly_studies": 312,
    "daily_studies": 8,
    "tokens_input": 2100000,
    "tokens_output": 480000,
    "terms_accepted_at": "2026-04-16T09:23:00Z"
  }
]
UI: sortable table with columns — Clinic, Studies Used, Daily Used, Tokens Used, Period, Next Reset. Color-code usage bars same as DicomPage (green/amber/red). No write actions needed.

6.2 Langfuse

Use Langfuse for: cost-per-run breakdowns, model performance, trace debugging, prompt version analysis. The superadmin table is for quota health and abuse detection; Langfuse is for cost forensics.

7. Files Touched

File                                                Change
server/src/schema/dicom_quota.ts                    New schema file
server/src/schema/clinic_modules.ts                 Add dicomAiTermsAcceptedAt column
server/src/schema/index.ts                          Export new schema
server/src/lib/schema-ensure.ts                     Runtime migration for new column + table
server/src/routes/ai.ts                             Quota enforcement on POST, new GET quota route, new POST terms-accept route
server/src/routes/superadmin.ts                     New GET dicom-usage route
server/src/lib/ai/agents/dicom-analysis.ts          Update max_completion_tokens to 1000, temperature to 0.2, accept requestId for slot tracking
server/src/durable-objects/clinic-hub.ts            Add acquireAnalysisSlot / releaseAnalysisSlot RPC methods
ui/src/components/files/DicomPage.tsx               Add usage counter bar
ui/src/components/files/DicomViewer.tsx             Add terms modal, persistent disclaimer, quota error banners
ui/src/components/patients/PatientDetails.tsx       History tab: render DICOM analysis entries
ui/src/components/superadmin/DicomUsageMonitor.tsx  New superadmin monitor component

8. Out of Scope

  • Multi-slice / batch analysis (slice cap enforced but multi-select UI not added here)
  • Token-level streaming (current single-POST pattern unchanged)
  • Quota override or manual limit bumps (superadmin can adjust via DB for now)
  • Automated Langfuse alerts (configured separately in Langfuse dashboard)