University Archives
Poster Presentation
College of Engineering & Science
Oladipo, Eyiara, and Avishek Mukherjee. "Visual Assistance for Visually Impaired Users Using Multimodal AI."
Large language models have advanced rapidly in recent years, but they still struggle with visual understanding, reasoning grounded in real-world environments, and interaction with the physical world. This project addresses these limitations by integrating real-time video segmentation with an audio interface, enabling users to communicate with the system and receive live visual feedback. The system is intended to assist visually impaired and elderly individuals by acting as an intelligent guide that can identify objects and provide contextual support in real time. Key features include object recognition and scene understanding, a memory module that stores and retrieves object locations to aid in item recovery, and depth-based obstacle detection with real-time alerts. The system leverages Meta's Segment Anything Model 3 (SAM3) for visual segmentation and Google Gemini 2.5 Lite for higher-level reasoning and scene interpretation. Performance evaluation covers both technical metrics, such as a target minimum frame rate of 30 FPS, and usability for the intended user population.
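The memory module and the depth-based obstacle alert described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the names `ObjectMemory`, `Sighting`, and `nearest_obstacle`, the 1.0 m alert distance, and the convention that a depth of zero marks an invalid pixel are all assumptions made for illustration. In a real pipeline, the labels and depth values would presumably come from the segmentation and depth components.

```python
import time
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Sighting:
    """One remembered observation of an object (hypothetical structure)."""
    label: str
    location: str      # free-text description, e.g. "kitchen counter"
    timestamp: float   # seconds since the epoch


class ObjectMemory:
    """Keeps only the most recent sighting per object label."""

    def __init__(self) -> None:
        self._last_seen: Dict[str, Sighting] = {}

    def remember(self, label: str, location: str,
                 timestamp: Optional[float] = None) -> None:
        # Labels are stored case-insensitively so "Keys" and "keys" match.
        ts = time.time() if timestamp is None else timestamp
        self._last_seen[label.lower()] = Sighting(label, location, ts)

    def recall(self, label: str) -> Optional[Sighting]:
        # Returns None when the object has never been seen.
        return self._last_seen.get(label.lower())


def nearest_obstacle(depth_m: List[List[float]],
                     alert_distance_m: float = 1.0) -> Optional[float]:
    """Return the nearest valid depth (meters) if it is closer than the
    alert distance, otherwise None. Depths <= 0 are treated as invalid
    (an assumption; sensors differ in how they mark missing data)."""
    valid = [d for row in depth_m for d in row if d > 0]
    if not valid:
        return None
    nearest = min(valid)
    return nearest if nearest < alert_distance_m else None


# Example usage with made-up values.
memory = ObjectMemory()
memory.remember("Keys", "kitchen counter", timestamp=1.0)
found = memory.recall("keys")
alert = nearest_obstacle([[2.0, 0.0], [0.8, 3.0]])
```

Keeping only the latest sighting per label keeps lookups simple for item recovery; a fuller design might retain a history of sightings or restrict the depth check to the region directly ahead of the user.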
Published Conference Proceedings
