MedScribe AI

Project Overview

MedScribe AI is a multi-agent clinical documentation pipeline that runs entirely locally on Google Health AI Developer Foundations (HAI-DEF) models. It converts unstructured clinical input, from audio dictation to medical images, into structured, FHIR R4-compliant medical records and SOAP notes.
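To make the FHIR R4 output concrete, here is a minimal sketch of a Condition resource carrying one ICD-10 code, built as a plain dictionary. The helper name `build_condition`, the patient ID, and the sample code are illustrative assumptions, not taken from the MedScribe AI codebase. The `system` URI is the identifier FHIR defines for ICD-10; a project using ICD-10-CM would use the corresponding `icd-10-cm` URI instead.

```python
import json


def build_condition(patient_id: str, icd10_code: str, display: str) -> dict:
    """Build a minimal FHIR R4 Condition resource as a plain dict.

    Hypothetical helper for illustration only; a real record would
    usually also carry clinicalStatus, verificationStatus, and dates.
    """
    return {
        "resourceType": "Condition",
        # Reference to the patient this condition belongs to.
        "subject": {"reference": f"Patient/{patient_id}"},
        "code": {
            "coding": [
                {
                    # FHIR-defined identifier for the ICD-10 code system.
                    "system": "http://hl7.org/fhir/sid/icd-10",
                    "code": icd10_code,
                    "display": display,
                }
            ],
            "text": display,
        },
    }


# Illustrative values only, not real patient data.
condition = build_condition("example-123", "J18.9", "Pneumonia, unspecified")
print(json.dumps(condition, indent=2))
```

Keeping the resource as a plain dictionary makes it straightforward to serialize to JSON and to validate against a FHIR server or validator later.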

Built for the Kaggle MedGemma Impact Challenge, the architecture avoids large, high-latency closed-source APIs. Inference runs on local hardware, so patient data is processed directly on the host machine.

Core AI Agents & Flow

1. MedASR

Runs fast ASR transcription of clinical dictation, tuned for accurate recognition of medical terminology.

2. MedSigLIP

Acts as the visual gatekeeper. Zero-shot classifies attached medical images by specialty (radiology, dermatology, etc.).
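Zero-shot classification with an image-text model generally works by comparing an image embedding against text embeddings of the candidate labels and picking the closest one. The sketch below shows only that comparison step, with toy 3-dimensional vectors standing in for real encoder outputs. The function names, the "pathology" label, and all vector values are hypothetical; the actual MedSigLIP integration is not shown in this document.

```python
import math


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def zero_shot_classify(image_embedding, label_embeddings):
    """Return the best-matching label and the score for every label."""
    scores = {
        label: cosine(image_embedding, emb)
        for label, emb in label_embeddings.items()
    }
    best = max(scores, key=scores.get)
    return best, scores


# Toy embeddings (assumed values); a real model produces far larger vectors.
label_embeddings = {
    "radiology": [0.9, 0.1, 0.0],
    "dermatology": [0.1, 0.9, 0.1],
    "pathology": [0.0, 0.2, 0.9],
}
image_embedding = [0.8, 0.2, 0.1]

specialty, scores = zero_shot_classify(image_embedding, label_embeddings)
print(specialty)
```

Because the labels are just text prompts, adding a specialty only requires one more text embedding, with no retraining.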

3. MedGemma 4B IT

The cognitive engine. It fuses the visual findings and the transcript context to generate structured ICD-10 codes and SOAP notes.

4. TxGemma

The safety rail. Checks the generated treatment plan against established pharmaceutical guidelines to flag potential drug-drug interactions.
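The four agents above can be read as stages applied in order to a shared encounter record. The following is a minimal sketch of that flow under the assumption that the stages run sequentially; every class, function, and field name is hypothetical, and each stage is a stub that returns placeholder text instead of calling a model.

```python
from dataclasses import dataclass, field
from typing import Callable, Optional


@dataclass
class Encounter:
    """Shared state passed from one stage to the next (hypothetical)."""
    audio_path: Optional[str] = None
    image_path: Optional[str] = None
    transcript: str = ""
    specialty: str = ""
    soap_note: str = ""
    icd10_codes: list[str] = field(default_factory=list)
    safety_warnings: list[str] = field(default_factory=list)


def transcribe(enc: Encounter) -> Encounter:
    # Stub for the MedASR stage: a real stage would run speech recognition.
    if enc.audio_path:
        enc.transcript = f"[transcript of {enc.audio_path}]"
    return enc


def classify_image(enc: Encounter) -> Encounter:
    # Stub for the MedSigLIP stage: the label here is a fixed placeholder.
    if enc.image_path:
        enc.specialty = "radiology"
    return enc


def generate_note(enc: Encounter) -> Encounter:
    # Stub for the MedGemma stage: combines transcript and image context.
    enc.soap_note = (
        f"S: {enc.transcript}\n"
        f"O: image specialty={enc.specialty or 'n/a'}\n"
        "A: [assessment]\n"
        "P: [plan]"
    )
    return enc


def check_safety(enc: Encounter) -> Encounter:
    # Stub for the TxGemma stage: a real stage would append warnings here.
    enc.safety_warnings = []
    return enc


Stage = Callable[[Encounter], Encounter]
PIPELINE: tuple[Stage, ...] = (transcribe, classify_image, generate_note, check_safety)


def run_pipeline(enc: Encounter, stages: tuple[Stage, ...] = PIPELINE) -> Encounter:
    """Apply each stage in order and return the final encounter."""
    for stage in stages:
        enc = stage(enc)
    return enc


result = run_pipeline(Encounter(audio_path="visit.wav", image_path="chest.png"))
print(result.soap_note)
```

Modeling each agent as a function over one record keeps the stages independently testable and makes it easy to skip a stage when an input, such as an image, is absent.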

Developer

Mayank

steeltroops.vercel.app

Engineered the complete agentic architecture, FastAPI routing framework, model serving optimization, and the frontend glassmorphism dashboard.