AI-Powered Dermatological Assistant: Bridging Healthcare Gaps Through Multimodal Intelligence
Millions of people lack access to specialized dermatological care because of geographic and technological disparities. We present a multimodal framework that combines image-based diagnosis with a visual question answering pipeline, powered by DINOv2 and a compressed LLaVA model. The system diagnoses skin diseases, explains its findings, and is optimized for low-resource settings.
This project introduces a clinical-grade Vision-Language Model (VLM) that performs dermatological diagnosis from images and natural language prompts. The assistant is trained in four stages: auxiliary classification, medical reasoning, interaction optimization, and resource-efficient deployment through structured pruning. The final model achieves 82.05% diagnostic accuracy and a 9/10 patient interaction score while operating in under 4.5 GB of memory.
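To make the structured pruning stage concrete, here is a minimal sketch of one common criterion: dropping whole output rows of a weight matrix with the smallest L2 norm. The project description does not say which criterion or granularity was used, so the function name `prune_rows_by_l2`, the `keep_ratio` parameter, and the L2-norm ranking are all assumptions made for illustration.

```python
import math


def prune_rows_by_l2(weight, keep_ratio):
    """Keep the rows (output units) with the largest L2 norm; drop the rest.

    Hypothetical helper: the L2-norm criterion is an assumption, not the
    project's documented method.
    """
    if not 0.0 < keep_ratio <= 1.0:
        raise ValueError("keep_ratio must be in (0, 1]")

    n_keep = max(1, int(len(weight) * keep_ratio))
    norms = [math.sqrt(sum(w * w for w in row)) for row in weight]

    # Rank rows by norm, take the strongest n_keep, then restore original order
    # so the surviving units keep their relative positions.
    ranked = sorted(range(len(weight)), key=lambda i: norms[i], reverse=True)
    kept = sorted(ranked[:n_keep])
    return [weight[i] for i in kept], kept


# Toy 4x2 layer: two rows carry almost no weight and are removed.
layer = [[0.1, 0.0], [3.0, 4.0], [0.0, 0.2], [1.0, 1.0]]
pruned, kept_indices = prune_rows_by_l2(layer, keep_ratio=0.5)
```

Because entire rows are removed, the resulting matrix is genuinely smaller, which is what lets structured pruning reduce memory use without sparse kernels. In a real network the matching input columns of the next layer would have to be removed as well.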
Key contributions:
- Integration of DINOv2 and LLaVA for robust image-text understanding.
- Domain-specific fine-tuning and question-answering for medical settings.
- Progressive enhancement through reasoning, Direct Preference Optimization (DPO), and pruning.
- Local and global impact potential, especially in under-resourced areas.
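For readers unfamiliar with DPO, the standard per-example loss can be sketched in a few lines. This is the generic DPO objective, not the project's training code: the function name `dpo_loss`, the argument names, and the default `beta=0.1` are assumptions, and the inputs are summed log-probabilities of a preferred ("chosen") and a dispreferred ("rejected") response under the trained policy and a frozen reference model.

```python
import math


def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """Standard DPO loss for one preference pair: -log(sigmoid(beta * margin)).

    Hypothetical signature; beta=0.1 is an assumed default, not a value
    reported by the project.
    """
    # How much more the policy favors each response than the reference does.
    chosen_gain = policy_chosen_logp - ref_chosen_logp
    rejected_gain = policy_rejected_logp - ref_rejected_logp
    z = beta * (chosen_gain - rejected_gain)

    # -log(sigmoid(z)) equals softplus(-z); branch for numerical stability.
    if z >= 0:
        return math.log1p(math.exp(-z))
    return -z + math.log1p(math.exp(z))


# Policy identical to the reference: no preference signal yet.
neutral = dpo_loss(-1.0, -1.0, -1.0, -1.0)
# Policy raises the chosen answer and lowers the rejected one: loss drops.
improved = dpo_loss(-1.0, -3.0, -2.0, -2.0)
```

The loss falls as the policy shifts probability toward the preferred response relative to the reference, which is how preference data can steer interaction quality without a separate reward model.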
📄 View Poster Below:
