We work on surgical AI, vision-language models (VLMs), explainability, and reasoning-aware perception.
The first explainability-driven benchmark for Vision-Language Models in robotic surgery: Grad-CAM, causal graphs, and attention-alignment metrics reveal a gap between accuracy and reasoning in current surgical VLMs.