📊 Sprint 4.1 · M1 + M4 Production Deepening · 5-25 Apr 2027

First post-production sprint. We have 4 months of real-world data from pilot + production. Now close the gaps the data revealed: triage 90→95%, SOAP override 12→8%, CPG corpus expansion, doctor-preference layer.

1. 🎯 Sprint Summary

Sprint: 4.1 (M1 + M4 Production Deepening)
Duration: 5-25 Apr 2027 (3 weeks · 15 working days)
Modules: M1 PSPA · M4 DRPA · refinement only · no new module surface
Goal: Triage accuracy 90→95% · SOAP override 12→8% · CPG corpus +500 passages · doctor preference layer · BM rojak corpus +500 phrases · clinical override insight dashboard
Capacity: 5 FTE (2 BE + 1 FE + 1 prompt eng + 0.5 QA + 0.5 DevOps) + 0.5 Founder + 1.0 Doc Zam (heavy clinical)
Velocity target: 75 SP (15 days × 5 FTE × ~1 SP/day)
Demo date: 25 Apr 2027 · sprint demo + 4.2 prep

2. 📊 Production Data Audit (4 months)

Triage data (M1): 18,400 classifications · 92% accuracy against Doc Zam HITL review · the 8% of errors are mostly BM-rojak inputs containing English drug names · 30% of misses are pediatric fever cases (<5y) that the AI under-triages
SOAP data (M4): 4,200 encounters · 12% override rate · most overrides on "Plan" section (not Subjective/Objective) · top reasons: doctor adds patient-specific instruction, doctor changes Rx, doctor adds follow-up clause
CPG citation gaps: 8% of SOAPs cite CPG sections that don't exist in our pgvector corpus (drift) · 12% of doctor-overrides cite a CPG passage we don't have
BM rojak corpus: 23% of patient inputs use medical-term-in-English-with-BM-grammar pattern · current corpus catches 78% · target 95%
Doctor preference signal: Clear pattern — Dr A always adds "kembali jika fever 3 hari" follow-up · Dr B always specifies dose timing · these are personalisable
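
The citation-gap finding above (SOAPs citing CPG sections that are missing from the corpus) amounts to a set-membership check. A minimal sketch, assuming each SOAP note carries a list of cited section IDs and the corpus can be exported as a set of IDs; `SoapNote`, `find_citation_drift`, `drift_rate` and the ID format are hypothetical names for illustration, not the production schema:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SoapNote:
    encounter_id: str
    cited_sections: tuple[str, ...]  # hypothetical CPG section IDs cited by the note


def find_citation_drift(notes, corpus_section_ids):
    """Map encounter_id -> cited section IDs that are absent from the corpus."""
    known = set(corpus_section_ids)
    drift = {}
    for note in notes:
        missing = [s for s in note.cited_sections if s not in known]
        if missing:
            drift[note.encounter_id] = missing
    return drift


def drift_rate(notes, corpus_section_ids):
    """Fraction of notes with at least one citation missing from the corpus."""
    if not notes:
        return 0.0
    return len(find_citation_drift(notes, corpus_section_ids)) / len(notes)
```

Run over the 4,200 audited encounters, `drift_rate` would be the figure reported as 8% above, provided drift is counted per SOAP rather than per citation.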

3. 🚦 Pre-Sprint Gate Checklist

  • 30-day production stability watch passed (1-31 Mar)
  • Production data audit report shared with team · 5+ pages
  • CPG corpus expansion source identified (MOH-CPG · DynaMed sandbox · UpToDate sample)
  • Doc Zam clinical case panel scheduled (4 panels through sprint)
  • Eval framework set up · 200 historical encounters as gold-set · re-classified by Doc Zam
  • Feature flag for doctor-preference layer staged
  • Production support rota covers sprint (no team member > 30% on support)

4. 🧩 Sprint Scope

  • Triage classifier improvements: Pediatric fever sub-classifier · BM-rojak EN-medical-term parser · confidence floor recalibration
  • BM rojak corpus expansion: +500 phrases · backfill from 4-month chat logs (PII-stripped)
  • CPG corpus expansion: +500 passages · auto-detect citation drift · alert dashboard
  • SOAP override-rate reducer: Plan-section sub-prompt · doctor-preferred follow-up phrasing · Rx editor friction reduction
  • Doctor preference layer: Per-doctor learned patterns · auto-apply Dr A's "kembali fever 3 hari" template · per-doctor sign-off speed
  • Clinical override insight dashboard: Filament page · override-by-doctor · override-by-section · override-by-CPG · trends
  • Eval harness: 200-encounter gold set re-run nightly · regression alerts
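
The eval-harness item can be sketched as an accuracy computation over the gold set plus a comparison against the baseline locked on D1. This is a rough illustration only: the triage labels, the function names and the 1-point tolerance are assumptions, not agreed values.

```python
def accuracy(predictions, gold_labels):
    """Share of gold-set encounters where the predicted label matches Doc Zam's label."""
    if not gold_labels:
        raise ValueError("gold set is empty")
    if len(predictions) != len(gold_labels):
        raise ValueError("predictions and gold labels differ in length")
    correct = sum(p == g for p, g in zip(predictions, gold_labels))
    return correct / len(gold_labels)


def check_regression(current, baseline, tolerance=0.01):
    """Return an alert message if accuracy fell more than `tolerance` below baseline, else None.

    `tolerance` (assumed 1 percentage point) absorbs run-to-run noise.
    """
    drop = baseline - current
    if drop > tolerance:
        return (
            f"REGRESSION: accuracy {current:.1%} is {drop:.1%} "
            f"below baseline {baseline:.1%}"
        )
    return None
```

A nightly CI job would call `accuracy` on the 200 gold encounters and fail the build, or page the team, when `check_regression` returns a message.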

5. 📅 Day-by-Day Plan (15 days)

D1 · Mon 5 Apr · Kickoff + Eval Harness
Sprint planning · 200-encounter gold set committed · eval harness CI integration · baseline numbers locked.
D2 · Tue 6 Apr · Triage Failure Mode Analysis
Doc Zam + Prompt Eng · review 100 misclassified cases · categorise · build fix-plan.
D3 · Wed 7 Apr · Pediatric Sub-Classifier
Sub-prompt for pediatric <5y fever · age-aware blocklist expansion · re-eval gold set.
D4 · Thu 8 Apr · BM Rojak Corpus Expansion
+250 phrases from chat logs · re-evaluate · second 250 phrases scheduled.
D5 · Fri 9 Apr · Mid-Demo + Course Correct
Show triage improvements live · Doc Zam validation · gold-set accuracy check.
D6 · Mon 12 Apr · Confidence Floor Recalibration
Confidence floor sweep · ROC analysis · Doc Zam picks operating point · deploy.
D7 · Tue 13 Apr · CPG Corpus Expansion (250)
First 250 missing CPG passages indexed · citation drift dashboard built.
D8 · Wed 14 Apr · SOAP Override Analysis
Doc Zam + Prompt Eng · review 50 overrides · sub-prompt fix targets identified.
D9 · Thu 15 Apr · SOAP Plan-Section Sub-Prompt
New Plan-section prompt · doctor-style follow-up phrasing · re-eval on 50 cases.
D10 · Fri 16 Apr · Mid-Demo Round 2
Show SOAP override improvement · Doc Zam panel review of 20 fresh cases.
D11 · Mon 19 Apr · Doctor Preference Layer (BE)
Schema · per-doctor preference store · auto-apply pipeline · feature flag gated.
D12 · Tue 20 Apr · Doctor Preference Layer (FE+UX)
UI for "edit my preferences" · learned-pattern review · accept/reject UI.
D13 · Wed 21 Apr · Override Insight Dashboard
Filament page · override trends · drilldown · per-doctor + per-tenant filters.
D14 · Thu 22 Apr · Hardening + Production Rollout
Feature flags flipped on per-tenant · Doc Zam tenant first · monitor 24h.
D15 · Fri 23 Apr · Demo Prep + Polish
Demo deck · gold-set final numbers · staging-prod parity verified.
+ · Mon 25 Apr · Sprint Demo + Retro
9am demo · 11am retro · 2pm 4.2 kickoff prep.
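
The D6 confidence floor sweep can be sketched as follows, assuming each gold-set prediction is logged as a (confidence, was_correct) pair and that predictions at or above the floor are auto-accepted while the rest go to human review. The function name and row layout are illustrative only.

```python
def sweep_confidence_floor(scored, floors):
    """Evaluate candidate confidence floors.

    scored: list of (confidence, was_correct) pairs from the gold set.
    Returns one (floor, tpr, fpr, coverage) row per floor, where
      tpr      = share of correct predictions that are auto-accepted,
      fpr      = share of incorrect predictions that are auto-accepted,
      coverage = share of all predictions that are auto-accepted.
    """
    n_correct = sum(1 for _, ok in scored if ok)
    n_wrong = len(scored) - n_correct
    rows = []
    for floor in floors:
        accepted = [(c, ok) for c, ok in scored if c >= floor]
        tp = sum(1 for _, ok in accepted if ok)
        fp = len(accepted) - tp
        tpr = tp / n_correct if n_correct else 0.0
        fpr = fp / n_wrong if n_wrong else 0.0
        coverage = len(accepted) / len(scored) if scored else 0.0
        rows.append((floor, tpr, fpr, coverage))
    return rows
```

Plotting `tpr` against `fpr` across floors gives the ROC curve. The operating point remains a clinical decision for Doc Zam, since a higher floor sends more cases to human review in exchange for fewer wrong auto-accepted classifications.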

6. 📦 Deliverables

FR | Item | SP
FR-4.1.1 | Eval harness · 200-encounter gold set · CI integration | 5
FR-4.1.2 | Pediatric <5y sub-classifier | 5
FR-4.1.3 | BM rojak corpus +500 | 5
FR-4.1.4 | Confidence floor recalibration | 3
FR-4.1.5 | CPG corpus +500 + drift dashboard | 8
FR-4.1.6 | SOAP Plan-section sub-prompt | 5
FR-4.1.7 | Rx editor friction reduction | 3
FR-4.1.8 | Doctor preference schema + apply | 8
FR-4.1.9 | Doctor preference UI | 5
FR-4.1.10 | Override insight dashboard | 8
FR-4.1.11 | Production rollout · per-tenant flag | 5
FR-4.1.12 | Doc Zam case panel reviews × 4 | 8
FR-4.1.13 | Pen-test light review (regression check) | 3
FR-4.1.14 | Sprint retro + 4.2 prep | 2
TOTAL | | 73 SP
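
As a quick arithmetic check, the deliverables can be summed and compared with the velocity target from the Sprint Summary (15 days × 5 FTE × ~1 SP/day). The variable names are illustrative.

```python
# Story points per deliverable, copied from the table above.
deliverables = {
    "FR-4.1.1": 5, "FR-4.1.2": 5, "FR-4.1.3": 5, "FR-4.1.4": 3,
    "FR-4.1.5": 8, "FR-4.1.6": 5, "FR-4.1.7": 3, "FR-4.1.8": 8,
    "FR-4.1.9": 5, "FR-4.1.10": 8, "FR-4.1.11": 5, "FR-4.1.12": 8,
    "FR-4.1.13": 3, "FR-4.1.14": 2,
}

total = sum(deliverables.values())  # committed scope in SP
target = 15 * 5 * 1                 # working days × FTE × SP per FTE-day
headroom = target - total           # unallocated SP
```

The committed scope of 73 SP sits 2 SP under the 75 SP target, which leaves little buffer for a production support spike.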

7. 👥 Team Capacity

Role | Allocation | Focus
Eng Lead / BE | 1.0 FTE | Doctor preference + override dashboard
BE Dev 2 | 1.0 FTE | CPG corpus pipeline + eval harness
FE Dev | 1.0 FTE | Override dashboard + preference UI
Prompt Eng | 1.0 FTE | BM rojak + classifier sub-prompts + SOAP Plan
Founder | 0.5 FTE | Architecture · clinical alignment · phase coordination
Doc Zam | 1.0 FTE (heavy) | 4 case panels · daily review · gold-set re-classification
QA | 0.5 FTE | Eval harness · regression · prod rollout verification
DevOps | 0.5 FTE | Production support + corpus pipeline ops

8. 🔔 Sprint Ceremonies

  • Mon 5 Apr 9am — Sprint Planning (90 min)
  • Daily 9am — Standup (15 min · Doc Zam joins Tue/Thu)
  • Fri 9 Apr + Fri 16 Apr 4pm — Mid-sprint demos (45 min each)
  • Tue 6 + Wed 14 + Mon 19 + Thu 22 Apr — Doc Zam clinical case panels (90 min each)
  • Mon 25 Apr 9am — Sprint Demo (60 min)
  • Mon 25 Apr 11am — Sprint Retro (60 min)

9. 🩺 Doc Zam Sign-off Items

  • Production data audit interpretations clinically valid
  • Pediatric fever sub-classifier — false-negative rate ≤ 5% on 30 fresh cases
  • BM rojak corpus expansion — sample 50 phrases · clinical accuracy
  • CPG corpus +500 — sample 100 passages · clinically representative
  • SOAP Plan-section improvements — 50 case re-eval < 8% override
  • Doctor preference layer — clinically appropriate · doesn't bias diagnosis
  • Override insight dashboard — surfaces clinically meaningful patterns
  • Final demo (25 Apr) — written sign-off in repo
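
On samples this small, the percentage thresholds above translate into very few permitted failures. A worked sketch, assuming each rate is taken over all cases in the sample (the plan does not state the denominator) and using a hypothetical helper `max_failures`:

```python
def max_failures(n_cases, limit_percent, inclusive):
    """Largest failure count whose rate over n_cases still satisfies the limit.

    inclusive=True  means rate <= limit_percent
    inclusive=False means rate <  limit_percent
    Integer arithmetic avoids floating-point edge cases at the boundary.
    """
    allowed = 0
    for k in range(n_cases + 1):
        if inclusive:
            within = k * 100 <= limit_percent * n_cases
        else:
            within = k * 100 < limit_percent * n_cases
        if within:
            allowed = k
    return allowed


# Pediatric sub-classifier: false-negative rate <= 5% on 30 fresh cases.
pediatric_budget = max_failures(30, 5, inclusive=True)

# SOAP Plan-section: override rate < 8% on the 50-case re-eval.
soap_budget = max_failures(50, 8, inclusive=False)
```

Under these assumptions the pediatric gate tolerates at most 1 false negative in 30 cases (2 would be 6.7%), and the SOAP gate tolerates at most 3 overrides in 50 (4 would be exactly 8%, which does not meet a strict "< 8%").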

10. 🎬 Demo Agenda — 25 Apr 9am (60 min)

Time (min) | Segment
0-5 | Recap · production data audit · v1.1 narrative
5-15 | Triage improvements · gold-set numbers (90→95%)
15-25 | SOAP override reduction · live encounter walk-through
25-35 | CPG corpus expansion · drift dashboard
35-45 | Doctor preference layer · Doc Zam personalised SOAP
45-55 | Override insight dashboard · cross-tenant trends
55-60 | Doc Zam sign-off · 4.2 (Wearables) kickoff prep

11. 🛡️ Contingency

Risk | Trigger | Response
Triage accuracy improvement plateau | Gold set < 93% by D7 | Tighten sub-prompts · model upgrade if needed · accept 93% as v1.1 milestone
CPG corpus licensing | UpToDate sample denied | Stick with MOH-CPG public · expand internally
Doctor preference UI rejected | Doc Zam concern about bias | Strict guardrails · only Plan-section style · never Assessment
Production support spike | > 40% capacity drained | Slip 2 deliverables to 4.2 · maintain quality
Eval harness flaky | Inconsistent runs | Pin model version · seed-fix · isolate variance source
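
The guardrail named for the doctor preference risk (Plan-section style only, never Assessment, feature-flag gated) can be sketched as below. This is a minimal illustration: it treats a SOAP note as a dict of section name to text, and `DoctorPreference`, `apply_preferences` and `ALLOWED_SECTIONS` are hypothetical names, not the planned schema.

```python
from dataclasses import dataclass

# Guardrail: learned preferences may only touch the Plan section.
ALLOWED_SECTIONS = frozenset({"plan"})


@dataclass(frozen=True)
class DoctorPreference:
    doctor_id: str
    section: str   # SOAP section the template targets
    template: str  # phrasing the doctor habitually adds


def apply_preferences(soap, preferences, doctor_id, flag_enabled):
    """Return a copy of `soap` with the doctor's allowed templates appended.

    The input dict is never mutated. Preferences for other doctors, or for
    sections outside ALLOWED_SECTIONS, are skipped.
    """
    result = dict(soap)
    if not flag_enabled:
        return result
    for pref in preferences:
        if pref.doctor_id != doctor_id:
            continue
        if pref.section not in ALLOWED_SECTIONS:
            continue  # never auto-edit Subjective, Objective or Assessment
        current = result.get(pref.section, "")
        if pref.template in current:
            continue  # already present, avoid duplicating the clause
        result[pref.section] = (current + "\n" + pref.template).strip()
    return result
```

Keeping the allow-list in one constant makes the clinical constraint easy to audit. Any widening beyond the Plan section would be an explicit change requiring Doc Zam's sign-off.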