
LoRA (r=16) fine-tuning pipeline on 54 synthetic SOAP note pairs | Notebook: notebooks/med-gemma-4b-soap-lora.ipynb
Input Transcript
45-year-old male, new onset T2DM, A1c 9.1%. BMI 34. Fasting glucose 210. Reports polydipsia, polyuria x3 weeks. Family history of diabetes (mother, sister). No prior medications.
Output (unstructured)
The patient is a 45 year old male who presents with new onset diabetes. His A1c is 9.1% which is elevated. BMI is 34 indicating obesity. He has symptoms of polydipsia and polyuria for 3 weeks. I would recommend starting metformin and lifestyle modifications. Follow up in 3 months to recheck A1c. Consider referral to endocrinology if not improving. He should also be screened for complications including eye exam and foot exam.
Output (structured SOAP)
SUBJECTIVE: 45-year-old male presenting with new-onset polyuria and polydipsia x3 weeks. FHx: T2DM in mother and sister. No prior medications. Reports increased thirst and frequent urination.
OBJECTIVE: A1c 9.1% (elevated). Fasting glucose 210 mg/dL. BMI 34 (obese). Vitals otherwise stable.
ASSESSMENT: Type 2 diabetes mellitus, newly diagnosed with hyperglycemia. ICD-10: E11.65 - Type 2 diabetes mellitus with hyperglycemia.
PLAN:
1. Start metformin 500mg BID, titrate to 1000mg BID over 2 weeks.
2. Diabetic education: diet, exercise, glucose monitoring.
3. Ophthalmology referral for baseline diabetic eye exam.
4. Comprehensive metabolic panel, lipid panel, urine microalbumin.
5. Recheck A1c in 3 months. Target <7%.
6. Return precautions: DKA symptoms (nausea, vomiting, abdominal pain).
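
The difference between the two outputs can be checked mechanically. The sketch below is one possible way to test a note for the four SOAP section headers and for an ICD-10-shaped code. It is not the notebook's scoring code, which is not shown here. The function names, the upper-case `NAME:` header convention, and the regular expression are assumptions made for illustration, and the regex checks the code's format only, not its clinical correctness.

```python
import re

# Assumption: section headers are upper case and followed by a colon,
# as in the structured SOAP output above.
SOAP_SECTIONS = ("SUBJECTIVE", "OBJECTIVE", "ASSESSMENT", "PLAN")

# Rough ICD-10-CM shape: letter, digit, alphanumeric, optional "." plus 1-4
# alphanumerics. Format check only; it does not validate that the code exists.
ICD10_RE = re.compile(r"\b[A-Z]\d[0-9A-Z](?:\.[0-9A-Z]{1,4})?\b")


def missing_sections(note: str) -> list:
    """Return the SOAP section names that have no 'NAME:' header in the note."""
    return [name for name in SOAP_SECTIONS if not re.search(rf"\b{name}:", note)]


def find_icd10_codes(note: str) -> list:
    """Return every substring of the note that looks like an ICD-10 code."""
    return ICD10_RE.findall(note)


# Shortened versions of the two outputs, for illustration only.
structured = (
    "SUBJECTIVE: polyuria and polydipsia x3 weeks. "
    "OBJECTIVE: A1c 9.1%. "
    "ASSESSMENT: Type 2 diabetes mellitus. ICD-10: E11.65. "
    "PLAN: Start metformin 500mg BID."
)
unstructured = (
    "The patient is a 45 year old male who presents with new onset diabetes. "
    "I would recommend starting metformin."
)

for label, note in (("structured", structured), ("unstructured", unstructured)):
    print(label, "missing:", missing_sections(note), "codes:", find_icd10_codes(note))
```

A note counts as complete (4/4 sections) when `missing_sections` returns an empty list.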
| Metric | Base MedGemma 4B | Fine-tuned | Relative change |
|---|---|---|---|
| SOAP completeness (4/4 sections) | 6/10 | 10/10 | +67% |
| ICD-10 exact code match | 4/10 | 9/10 | +125% |
| Structured output consistency | 3/10 | 10/10 | +233% |
| Drug name extraction | 5/10 | 9/10 | +80% |
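
The percentages in the last column of the table are consistent with relative change over the base score, (fine-tuned - base) / base, rather than a difference in percentage points. A minimal sketch of that arithmetic, using the counts from the table (the function and variable names are illustrative, and the denominator of 10 is taken from the "/10" scores):

```python
def relative_change_pct(base_hits: int, tuned_hits: int) -> float:
    """Relative improvement of the fine-tuned score over the base score, in percent."""
    if base_hits == 0:
        raise ValueError("relative change is undefined when the base score is 0")
    return (tuned_hits - base_hits) / base_hits * 100


# (base hits, fine-tuned hits) out of 10 evaluation samples, copied from the table.
results = {
    "SOAP completeness (4/4 sections)": (6, 10),
    "ICD-10 exact code match": (4, 9),
    "Structured output consistency": (3, 10),
    "Drug name extraction": (5, 9),
}

for metric, (base, tuned) in results.items():
    rel = relative_change_pct(base, tuned)
    points = (tuned - base) * 10  # absolute gain in percentage points
    print(f"{metric}: {rel:+.0f}% relative, {points:+d} percentage points")
```

Reporting the percentage-point gain alongside the relative figure helps because, with only 10 samples, a single extra correct output moves the score by 10 points.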

Base Model: MedGemma 4B IT
Method: LoRA (r=16, alpha=32)
Target Modules: q_proj, k_proj, v_proj, o_proj
Trainable Params: ~0.5%
Training Data: 54 SOAP pairs
Epochs: 3
Learning Rate: 2e-4
Precision: BF16 + NF4
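
For orientation, the settings listed above could map onto Hugging Face `peft` and `transformers` configuration objects roughly as follows. This is a sketch under several assumptions, not the notebook's actual code. It assumes `torch`, `peft`, `transformers`, and `bitsandbytes` are installed, reads "BF16 + NF4" as NF4 4-bit quantized base weights with BF16 compute, and picks `task_type="CAUSAL_LM"` without confirmation from the source. Dropout, batch size, and other settings not listed above are left at library defaults.

```python
import torch
from peft import LoraConfig
from transformers import BitsAndBytesConfig

# Assumption: "BF16 + NF4" means 4-bit NF4 base weights with BF16 compute.
quant_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

# Rank, alpha, and target modules come from the settings listed above.
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,  # effective LoRA scaling is lora_alpha / r = 2.0
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    task_type="CAUSAL_LM",  # assumption: not stated in the source
)

# Remaining listed hyperparameters, kept as plain values for a trainer of choice.
NUM_EPOCHS = 3
LEARNING_RATE = 2e-4
```

These objects would typically be passed to `from_pretrained(..., quantization_config=quant_config)` and `peft.get_peft_model(model, lora_config)` respectively. The model-loading step is omitted here because the exact checkpoint and model class used in the notebook are not shown.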