1. 🎯 Sprint Summary
| Sprint | 4.1 (M1 + M4 Production Deepening) |
| Duration | 5 - 25 Apr 2027 (3 weeks · 15 working days) |
| Modules | M1 PSPA · M4 DRPA — refinement only · no new module surface |
| Goal | Triage accuracy 90→95% · SOAP override 12→8% · CPG corpus +500 passages · doctor preference layer · BM rojak corpus +500 phrases · clinical override insight dashboard |
| Capacity | 5 FTE (2 BE + 1 FE + 1 prompt eng + 0.5 QA + 0.5 DevOps) + 0.5 Founder + 1.0 Doc Zam (heavy clinical) |
| Velocity target | 75 SP (15 days × 5 FTE × ~1 SP/day) |
| Demo date | 25 Apr 2027 · sprint demo + 4.2 prep |
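The velocity arithmetic in the summary can be checked against the Deliverables table with a few lines of Python. This is only a sketch of the planning calculation; the variable names are illustrative, and the per-item story points are copied from FR-4.1.1 to FR-4.1.14.

```python
# Capacity arithmetic behind the velocity target (figures from the summary table).
WORKING_DAYS = 15
ENGINEERING_FTE = 5.0
SP_PER_FTE_DAY = 1.0  # the "~1 SP/day" planning assumption

velocity_target = WORKING_DAYS * ENGINEERING_FTE * SP_PER_FTE_DAY

# Story points per deliverable, FR-4.1.1 through FR-4.1.14.
deliverable_sp = [5, 5, 5, 3, 8, 5, 3, 8, 5, 8, 5, 8, 3, 2]
planned_sp = sum(deliverable_sp)

# Unallocated points left as slack for production support or overruns.
buffer_sp = velocity_target - planned_sp

print(f"target={velocity_target:.0f} SP, planned={planned_sp} SP, buffer={buffer_sp:.0f} SP")
# prints: target=75 SP, planned=73 SP, buffer=2 SP
```

The 2 SP gap between target and planned work is the only slack in the plan, which is why the contingency for a support spike is to slip deliverables rather than absorb them.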
2. 📊 Production Data Audit (4 months)
Triage data (M1): 18,400 classifications · 92% accuracy against Doc Zam HITL review · the 8% of errors fall mostly on BM-rojak inputs with English drug names · 30% of misses are pediatric fever cases (<5y) where the AI under-triages
SOAP data (M4): 4,200 encounters · 12% override rate · most overrides on "Plan" section (not Subjective/Objective) · top reasons: doctor adds patient-specific instruction, doctor changes Rx, doctor adds follow-up clause
CPG citation gaps: 8% of SOAPs cite CPG sections that don't exist in our pgvector corpus (drift) · 12% of doctor-overrides cite a CPG passage we don't have
BM rojak corpus: 23% of patient inputs use medical-term-in-English-with-BM-grammar pattern · current corpus catches 78% · target 95%
Doctor preference signal: clear per-doctor patterns · Dr A always adds a "kembali jika fever 3 hari" ("return if fever lasts 3 days") follow-up · Dr B always specifies dose timing · both are personalisable
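The citation-gap finding above (SOAPs citing CPG sections that are not in the corpus) reduces to a set-membership check. A minimal sketch follows, assuming each SOAP carries a list of cited CPG passage ids and that the set of ids indexed in the corpus can be loaded up front; the function names and the sample ids are hypothetical.

```python
def find_citation_drift(
    cited_by_soap: dict[str, list[str]], corpus_ids: set[str]
) -> dict[str, list[str]]:
    """Map each SOAP id to the cited CPG ids that are missing from the corpus."""
    drift: dict[str, list[str]] = {}
    for soap_id, cited in cited_by_soap.items():
        missing = [cpg_id for cpg_id in cited if cpg_id not in corpus_ids]
        if missing:
            drift[soap_id] = missing
    return drift


def drift_rate(cited_by_soap: dict[str, list[str]], corpus_ids: set[str]) -> float:
    """Share of SOAPs with at least one citation that does not resolve."""
    if not cited_by_soap:
        return 0.0
    return len(find_citation_drift(cited_by_soap, corpus_ids)) / len(cited_by_soap)


# Hypothetical sample data; real ids would come from the corpus index.
corpus_ids = {"cpg-dengue-3.2", "cpg-asthma-1.4"}
cited_by_soap = {
    "soap-001": ["cpg-dengue-3.2"],
    "soap-002": ["cpg-asthma-1.4", "cpg-htn-2.1"],
}

print(find_citation_drift(cited_by_soap, corpus_ids))  # {'soap-002': ['cpg-htn-2.1']}
print(drift_rate(cited_by_soap, corpus_ids))           # 0.5
```

The missing ids double as a backlog for corpus expansion, since each one names a passage that a SOAP already tried to cite.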
3. 🚦 Pre-Sprint Gate Checklist
- 30-day production stability watch passed (1-31 Mar)
- Production data audit report shared with team · 5+ pages
- CPG corpus expansion source identified (MOH-CPG · DynaMed sandbox · UpToDate sample)
- Doc Zam clinical case panel scheduled (4 panels through sprint)
- Eval framework set up · 200 historical encounters as gold-set · re-classified by Doc Zam
- Feature flag for doctor-preference layer staged
- Production support rota covers sprint (no team member > 30% on support)
4. 🧩 Sprint Scope
- Triage classifier improvements: Pediatric fever sub-classifier · BM-rojak EN-medical-term parser · confidence floor recalibration
- BM rojak corpus expansion: +500 phrases · backfill from 4-month chat logs (PII-stripped)
- CPG corpus expansion: +500 passages · auto-detect citation drift · alert dashboard
- SOAP override-rate reducer: Plan-section sub-prompt · doctor-preferred follow-up phrasing · Rx editor friction reduction
- Doctor preference layer: Per-doctor learned patterns · auto-apply Dr A's "kembali fever 3 hari" template · per-doctor sign-off speed
- Clinical override insight dashboard: Filament page · override-by-doctor · override-by-section · override-by-CPG · trends
- Eval harness: 200-encounter gold set re-run nightly · regression alerts
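The eval-harness item in the scope list amounts to scoring the gold set on every run and flagging a drop against the locked baseline. One way this could look is sketched below; the `GoldCase` type, the `urgent`/`routine` labels, and the one-point tolerance are assumptions for illustration, not the harness's actual design.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GoldCase:
    case_id: str
    expected: str   # label assigned during clinical re-classification
    predicted: str  # label produced by the current classifier build


def accuracy(cases: list[GoldCase]) -> float:
    """Fraction of gold cases where the prediction matches the expected label."""
    if not cases:
        raise ValueError("gold set is empty")
    correct = sum(1 for c in cases if c.expected == c.predicted)
    return correct / len(cases)


def check_regression(
    cases: list[GoldCase], baseline: float, tolerance: float = 0.01
) -> tuple[float, bool]:
    """Return (current accuracy, True if it fell more than `tolerance` below baseline)."""
    current = accuracy(cases)
    return current, current < baseline - tolerance


# Toy gold set with hypothetical labels; the real set has 200 encounters.
cases = [
    GoldCase("g1", "urgent", "urgent"),
    GoldCase("g2", "routine", "routine"),
    GoldCase("g3", "urgent", "routine"),  # an under-triage miss
    GoldCase("g4", "routine", "routine"),
]

current, regressed = check_regression(cases, baseline=0.90)
print(f"accuracy={current:.2f} regressed={regressed}")
# prints: accuracy=0.75 regressed=True
```

A nightly job could exit non-zero when `regressed` is true so that CI raises the alert.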
5. 📅 Day-by-Day Plan (15 days)
D1 · Mon 5 Apr · Kickoff + Eval Harness
Sprint planning · 200-encounter gold set committed · eval harness CI integration · baseline numbers locked.
D2 · Tue 6 Apr · Triage Failure Mode Analysis
Doc Zam + Prompt Eng · review 100 misclassified cases · categorise · build fix-plan.
D3 · Wed 7 Apr · Pediatric Sub-Classifier
Sub-prompt for pediatric <5y fever · age-aware blocklist expansion · re-eval gold set.
D4 · Thu 8 Apr · BM Rojak Corpus Expansion
+250 phrases from chat logs · re-evaluate · second 250 phrases scheduled.
D5 · Fri 9 Apr · Mid-Demo + Course Correct
Show triage improvements live · Doc Zam validation · gold-set accuracy check.
D6 · Mon 12 Apr · Confidence Floor Recalibration
Confidence floor sweep · ROC analysis · Doc Zam picks operating point · deploy.
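One reading of the floor sweep is sketched here: treat each score as the model's confidence that a case is urgent, flag every case at or above a candidate floor, and tabulate the true-positive and false-positive rates so an operating point can be picked. The scores, labels, and candidate floors are made-up illustration data, and the binary urgent/non-urgent framing is an assumption.

```python
def roc_points(
    scores: list[float], labels: list[bool], floors: list[float]
) -> list[tuple[float, float, float]]:
    """Return (floor, true-positive rate, false-positive rate) for each candidate floor."""
    positives = sum(labels)
    negatives = len(labels) - positives
    if positives == 0 or negatives == 0:
        raise ValueError("need at least one positive and one negative case")
    points = []
    for floor in floors:
        # A case is flagged urgent when its score reaches the floor.
        tp = sum(1 for s, y in zip(scores, labels) if s >= floor and y)
        fp = sum(1 for s, y in zip(scores, labels) if s >= floor and not y)
        points.append((floor, tp / positives, fp / negatives))
    return points


# Hypothetical confidence scores and ground-truth "urgent" labels.
scores = [0.95, 0.80, 0.65, 0.40, 0.30, 0.10]
labels = [True, True, False, True, False, False]

for floor, tpr, fpr in roc_points(scores, labels, floors=[0.25, 0.50, 0.75]):
    print(f"floor={floor:.2f} TPR={tpr:.2f} FPR={fpr:.2f}")
# prints:
# floor=0.25 TPR=1.00 FPR=0.67
# floor=0.50 TPR=0.67 FPR=0.33
# floor=0.75 TPR=0.67 FPR=0.00
```

Because under-triage is the costlier error in this setting, the chosen operating point would likely favour a high true-positive rate even at the expense of more false positives; that trade-off is the clinical call reserved for Doc Zam.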
D7 · Tue 13 Apr · CPG Corpus Expansion (250)
First 250 missing CPG passages indexed · citation drift dashboard built.
D8 · Wed 14 Apr · SOAP Override Analysis
Doc Zam + Prompt Eng · review 50 overrides · sub-prompt fix targets identified.
D9 · Thu 15 Apr · SOAP Plan-Section Sub-Prompt
New Plan-section prompt · doctor-style follow-up phrasing · re-eval on 50 cases.
D10 · Fri 16 Apr · Mid-Demo Round 2
Show SOAP override improvement · Doc Zam panel review of 20 fresh cases.
D11 · Mon 19 Apr · Doctor Preference Layer (BE)
Schema · per-doctor preference store · auto-apply pipeline · feature flag gated.
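A flag-gated auto-apply step for the preference store might look like the sketch below. It assumes a SOAP draft is a dict of section name to text and that learned preferences are stored as per-doctor phrases tagged with a section; the `DoctorPreference` type, the ids, and the sample clinical text are hypothetical. The section allow-list reflects the guardrail named in the contingency table (Plan-section style only, never Assessment).

```python
from dataclasses import dataclass

# Guardrail: preferences may only touch the Plan section.
ALLOWED_SECTIONS = {"plan"}


@dataclass(frozen=True)
class DoctorPreference:
    doctor_id: str
    section: str
    text: str  # phrase the doctor habitually adds


def apply_preferences(
    soap: dict[str, str],
    prefs: list[DoctorPreference],
    doctor_id: str,
    flag_enabled: bool,
) -> dict[str, str]:
    """Return a copy of the draft with the doctor's allowed phrases appended."""
    result = dict(soap)
    if not flag_enabled:
        return result  # feature flag off: draft passes through unchanged
    for pref in prefs:
        if pref.doctor_id != doctor_id or pref.section not in ALLOWED_SECTIONS:
            continue
        current = result.get(pref.section, "")
        if pref.text not in current:  # do not repeat a phrase already present
            result[pref.section] = f"{current}\n{pref.text}".strip()
    return result


# Hypothetical draft and preferences for illustration only.
draft = {"assessment": "Viral URTI.", "plan": "Paracetamol PRN."}
prefs = [
    DoctorPreference("dr-a", "plan", "Kembali jika fever 3 hari."),
    DoctorPreference("dr-a", "assessment", "Should never be applied."),
]

applied = apply_preferences(draft, prefs, doctor_id="dr-a", flag_enabled=True)
print(applied["plan"])
print(applied["assessment"])
```

Returning a copy rather than mutating the draft keeps the original model output available, which matters if override rates are later compared with and without the preference layer.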
D12 · Tue 20 Apr · Doctor Preference Layer (FE+UX)
UI for "edit my preferences" · learned-pattern review · accept/reject UI.
D13 · Wed 21 Apr · Override Insight Dashboard
Filament page · override trends · drilldown · per-doctor + per-tenant filters.
D14 · Thu 22 Apr · Hardening + Production Rollout
Feature flags flipped on per-tenant · Doc Zam tenant first · monitor 24h.
D15 · Fri 23 Apr · Demo Prep + Polish
Demo deck · gold-set final numbers · staging-prod parity verified.
+Mon 25 Apr · Sprint Demo + Retro
9am demo · 11am retro · 2pm 4.2 kickoff prep.
6. 📦 Deliverables
| FR | Item | SP |
|---|---|---|
| FR-4.1.1 | Eval harness · 200-encounter gold set · CI integration | 5 |
| FR-4.1.2 | Pediatric <5y sub-classifier | 5 |
| FR-4.1.3 | BM rojak corpus +500 | 5 |
| FR-4.1.4 | Confidence floor recalibration | 3 |
| FR-4.1.5 | CPG corpus +500 + drift dashboard | 8 |
| FR-4.1.6 | SOAP Plan-section sub-prompt | 5 |
| FR-4.1.7 | Rx editor friction reduction | 3 |
| FR-4.1.8 | Doctor preference schema + apply | 8 |
| FR-4.1.9 | Doctor preference UI | 5 |
| FR-4.1.10 | Override insight dashboard | 8 |
| FR-4.1.11 | Production rollout · per-tenant flag | 5 |
| FR-4.1.12 | Doc Zam case panel reviews × 4 | 8 |
| FR-4.1.13 | Pen-test light review (regression check) | 3 |
| FR-4.1.14 | Sprint retro + 4.2 prep | 2 |
| TOTAL | | 73 |
7. 👥 Team Capacity
| Role | Allocation | Focus |
|---|---|---|
| Eng Lead / BE | 1.0 FTE | Doctor preference + override dashboard |
| BE Dev 2 | 1.0 FTE | CPG corpus pipeline + eval harness |
| FE Dev | 1.0 FTE | Override dashboard + preference UI |
| Prompt Eng | 1.0 FTE | BM rojak + classifier sub-prompts + SOAP Plan |
| Founder | 0.5 FTE | Architecture · clinical alignment · phase coordination |
| Doc Zam | 1.0 FTE (heavy) | 4 case panels · daily review · gold-set re-classification |
| QA | 0.5 FTE | Eval harness · regression · prod rollout verification |
| DevOps | 0.5 FTE | Production support + corpus pipeline ops |
8. 🔔 Sprint Ceremonies
- Mon 5 Apr 9am — Sprint Planning (90 min)
- Daily 9am — Standup (15 min · Doc Zam joins Tue/Thu)
- Fri 9 Apr + Fri 16 Apr 4pm — Mid-sprint demos (45 min each)
- Tue 6 + Wed 14 + Mon 19 + Thu 22 Apr — Doc Zam clinical case panels (90 min each)
- Mon 25 Apr 9am — Sprint Demo (60 min)
- Mon 25 Apr 11am — Sprint Retro (60 min)
9. 🩺 Doc Zam Sign-off Items
- Production data audit interpretations clinically valid
- Pediatric fever sub-classifier — false-negative rate ≤ 5% on 30 fresh cases
- BM rojak corpus expansion — sample 50 phrases · clinical accuracy
- CPG corpus +500 — sample 100 passages · clinically representative
- SOAP Plan-section improvements — 50 case re-eval < 8% override
- Doctor preference layer — clinically appropriate · doesn't bias diagnosis
- Override insight dashboard — surfaces clinically meaningful patterns
- Final demo (25 Apr) — written sign-off in repo
10. 🎬 Demo Agenda — 25 Apr 9am (60 min)
| Time (min) | Segment |
|---|---|
| 0-5 | Recap · production data audit · v1.1 narrative |
| 5-15 | Triage improvements · gold-set numbers (90→95%) |
| 15-25 | SOAP override reduction · live encounter walk-through |
| 25-35 | CPG corpus expansion · drift dashboard |
| 35-45 | Doctor preference layer · Doc Zam personalised SOAP |
| 45-55 | Override insight dashboard · cross-tenant trends |
| 55-60 | Doc Zam sign-off · 4.2 (Wearables) kickoff prep |
11. 🛡️ Contingency
| Risk | Trigger | Response |
|---|---|---|
| Triage accuracy improvement plateaus | Gold-set accuracy < 93% by D7 | Tighten sub-prompts · model upgrade if needed · accept 93% as v1.1 milestone |
| CPG corpus licensing | UpToDate sample denied | Stick with MOH-CPG public · expand internally |
| Doctor preference UI rejected | Doc Zam concern about bias | Strict guardrails · only Plan-section style · never Assessment |
| Production support spike | > 40% capacity drained | Slip 2 deliverables to 4.2 · maintain quality |
| Eval harness flaky | Inconsistent results across runs | Pin model version · fix random seeds · isolate variance source |