Status: active · Started: 2026-04-07

ClinicOps Copilot

Agentic operations layer for a synthetic clinic.

Tech stack: Claude Sonnet 4.5 (via OpenRouter) · FastAPI · PostgreSQL + FHIR R4 · Synthea (synthetic patients) · SQLite events store · Streamlit · Terraform (AWS Lambda + RDS) · GitHub Actions · uv · Python 3.11

The problem

Most healthcare AI demos on GitHub show what a LangChain tutorial does in a notebook. They don’t show what shipping an agent into a clinical operations workflow actually requires:

  • Integrating with messy real-world systems (EHR, scheduling, billing, coverage)
  • Instrumenting every tool call so an ops team can audit what the AI did
  • Shipping a CLI someone’s IT team can actually run on their laptop
  • Proving correctness with an eval harness that runs on every PR
  • Handing over a one-command deploy to a cloud account

ClinicOps Copilot is an attempt at the full end-to-end version of that.

The approach

Three Claude agents operate over a synthetic FHIR R4 PostgreSQL database seeded by Synthea. A FastAPI gateway routes requests to the right agent. Every tool call streams to a SQLite events store. A Streamlit dashboard reads from the events store in real time so the ops team can see exactly what the AI is doing.
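
As a rough illustration of the events-store idea above, the sketch below wraps a tool call so that every invocation, successful or not, lands as one row in SQLite. The table name, column names, and the helpers `open_events_store`, `call_tool_logged`, and `recent_events` are assumptions made for this example, not the project's actual schema or API.

```python
import json
import sqlite3
import time
from typing import Any, Callable

# Hypothetical schema: one row per tool call, successful or failed.
SCHEMA = """
CREATE TABLE IF NOT EXISTS tool_events (
    id          INTEGER PRIMARY KEY AUTOINCREMENT,
    ts          REAL NOT NULL,   -- Unix timestamp when the call finished
    agent       TEXT NOT NULL,
    tool        TEXT NOT NULL,
    arguments   TEXT NOT NULL,   -- JSON-encoded keyword arguments
    result      TEXT,            -- JSON-encoded return value, NULL on error
    error       TEXT,            -- repr() of the exception, NULL on success
    duration_ms REAL NOT NULL
)
"""


def open_events_store(path: str = ":memory:") -> sqlite3.Connection:
    """Open (or create) the events store and make sure the table exists."""
    conn = sqlite3.connect(path)
    conn.execute(SCHEMA)
    return conn


def call_tool_logged(
    conn: sqlite3.Connection,
    agent: str,
    tool: str,
    fn: Callable[..., Any],
    arguments: dict[str, Any],
) -> Any:
    """Run fn(**arguments) and record the call, even if it raises."""
    started = time.perf_counter()
    result: Any = None
    error: str | None = None
    try:
        result = fn(**arguments)
        return result
    except Exception as exc:
        error = repr(exc)
        raise  # the caller still sees the original failure
    finally:
        conn.execute(
            "INSERT INTO tool_events "
            "(ts, agent, tool, arguments, result, error, duration_ms) "
            "VALUES (?, ?, ?, ?, ?, ?, ?)",
            (
                time.time(),
                agent,
                tool,
                json.dumps(arguments, default=str),
                None if error else json.dumps(result, default=str),
                error,
                (time.perf_counter() - started) * 1000,
            ),
        )
        conn.commit()


def recent_events(conn: sqlite3.Connection, limit: int = 50) -> list[tuple]:
    """Newest-first view of the log, the kind of query a dashboard could poll."""
    return conn.execute(
        "SELECT agent, tool, arguments, error FROM tool_events "
        "ORDER BY id DESC LIMIT ?",
        (limit,),
    ).fetchall()
```

Writing the row in a `finally` block means failed calls are audited too, which is usually what an ops team wants to see first.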

Agent | Status | Role | Tool calls
Scheduler | shipped | Books, reschedules, cancels appointments. Handles double-bookings, slot conflicts, provider availability. | find_open_slots, book_appointment, cancel_appointment, lookup_patient
Eligibility | shipped | Checks insurance coverage status from FHIR Coverage resource. Flags expired plans, missing prior auth. | lookup_coverage, check_active_period, get_payor_rules
Triage | shipped | Routes new patient intents to the right downstream agent or human. Handles Spanish code-switching. | classify_intent, route_to_agent, escalate_to_human
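
To make the Eligibility row more concrete, here is a minimal sketch of what a check like `check_active_period` might do with a FHIR R4 Coverage resource. The `status` and `period.start` / `period.end` fields are part of the FHIR R4 Coverage definition; the function signature and the assumption that dates arrive as full `YYYY-MM-DD` values (rather than partial FHIR dates such as `2025`) are made for this example only.

```python
from datetime import date


def check_active_period(coverage: dict, on: date) -> bool:
    """Return True if the Coverage is marked active and `on` falls in its period.

    Assumes period.start / period.end, when present, begin with a full
    YYYY-MM-DD date. A missing bound is treated as open-ended.
    """
    if coverage.get("status") != "active":
        return False

    period = coverage.get("period") or {}
    start = period.get("start")
    end = period.get("end")

    # Compare on the date part only; any time component is ignored.
    if start and on < date.fromisoformat(start[:10]):
        return False
    if end and on > date.fromisoformat(end[:10]):
        return False
    return True


# Hypothetical Coverage resource, trimmed to the fields used here.
example_coverage = {
    "resourceType": "Coverage",
    "status": "active",
    "period": {"start": "2025-01-01", "end": "2025-12-31"},
}
```

With this sketch, `check_active_period(example_coverage, date(2026, 1, 15))` returns `False`, which is the "expired plan" case the agent is meant to flag.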

A fourth Billing/RCM agent is planned for Phase 2.

Architecture decisions worth defending

  • No LangChain, no LlamaIndex. Custom tool-use loop on the OpenAI Python SDK pointed at OpenRouter. Single provider, no provider-switching code paths, no abstraction tax.
  • FHIR R4 over a custom schema. Most real clinics already expose FHIR through their EHR. Meeting reality where it is matters more than schema cleverness.
  • SQLite events store, not a vendor observability platform. Local-first observability survives environments where Datadog and friends don’t reach.
  • Terraform module, not a Helm chart. Infra teams in clinical settings are more comfortable with Terraform. Optimize for their on-call rotation, not for engineering taste.
  • Eval harness on every PR. 20 golden test cases (booking conflicts, coverage edge cases, Spanish code-switching) are the contract. If an eval fails, the build fails. No clever LLM-as-judge, just deterministic pass/fail on tool call sequences.
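
The "deterministic pass/fail on tool call sequences" idea from the last bullet can be sketched as follows. The tool names come from the agent table above; the `GoldenCase` type, the `run_evals` function, the two example prompts with their expected sequences, and the `fake_agent` stand-in are all hypothetical and only illustrate the shape of such a harness.

```python
from dataclasses import dataclass
from typing import Callable, Iterable


@dataclass(frozen=True)
class GoldenCase:
    name: str
    prompt: str
    expected_tools: tuple[str, ...]  # exact tool names, in call order


def run_evals(
    cases: Iterable[GoldenCase],
    run_agent: Callable[[str], Iterable[str]],
) -> list[str]:
    """Run each case and return one failure message per mismatch.

    `run_agent` takes a prompt and returns the tool names the agent
    called, in order. An empty return list means every case passed.
    """
    failures = []
    for case in cases:
        actual = tuple(run_agent(case.prompt))
        if actual != case.expected_tools:
            failures.append(
                f"{case.name}: expected {list(case.expected_tools)}, "
                f"got {list(actual)}"
            )
    return failures


# Hypothetical golden cases; the real suite is described as having 20.
CASES = [
    GoldenCase(
        name="book_simple",
        prompt="Book Maria Lopez with Dr. Chen tomorrow at 9am",
        expected_tools=("lookup_patient", "find_open_slots", "book_appointment"),
    ),
    GoldenCase(
        name="coverage_expired",
        prompt="Is the plan for patient 123 still active?",
        expected_tools=("lookup_coverage", "check_active_period"),
    ),
]


def fake_agent(prompt: str) -> list[str]:
    """Stand-in for the real agent so the sketch runs without an API key."""
    if prompt.startswith("Book"):
        return ["lookup_patient", "find_open_slots", "book_appointment"]
    return ["lookup_coverage", "check_active_period"]
```

In CI, a small wrapper script could call `run_evals(CASES, real_agent)` and exit non-zero when the returned list is not empty, which is what would fail the build.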

Why this matters

The interesting work in clinical AI isn’t the model. It’s the infrastructure around the model: tool-use loops that work under failure, observability that survives air-gapped environments, evals that catch silent degradations. ClinicOps Copilot is my attempt at that whole stack, end to end, for a realistic clinical workflow.

Try it live

Ask a natural-language question about a synthetic clinical operations dataset. The copilot classifies the intent, generates SQL, runs it against a seeded SQLite database, and summarizes the result.

Run it locally

Every project is open source. Clone the repo, install dependencies with uv sync, drop your API keys into .env, and run the CLI. The README walks you through each step.

git clone https://github.com/deepmind11/clinic-ops-copilot.git
cd clinic-ops-copilot
uv sync
cp .env.example .env  # add your keys
uv run python -m clinic_ops_copilot --help

Domain

  • Healthcare operations
  • FHIR R4
  • Clinical AI deployment
