Camin - AI Visual Assistant

Camin is a Progressive Web App (PWA) that acts as a digital visual assistant for visually impaired people. It combines web technologies and AI to provide real-time auditory and haptic feedback about the user's surroundings: it identifies obstacles during navigation and provides detailed scene descriptions.

The Problem

Visually impaired individuals often rely on traditional tools like white canes or guide dogs, which have limitations in describing the nature of obstacles or the details of a scene. Navigating unfamiliar environments independently involves safety risks, and identifying specific objects or reading text requires sighted assistance, reducing personal autonomy.

The Solution

The solution is a mobile-first web application with two core modes:

Active Route (Real-time Navigation): Uses TensorFlow.js running entirely in the browser to detect objects (cars, people, obstacles) at 10+ frames per second. It estimates the approximate distance and position (left/right/center) of each obstacle and gives immediate feedback via Text-to-Speech (TTS) and haptic vibration, varying the intensity with proximity. Because detection runs on the device, safety alerts are not delayed by network round trips.

Scene Analysis (Deep Understanding): Captures high-resolution images and processes them via Supabase Edge Functions. It uses Google's Gemini 2.5 Flash model to generate a rich text description of the scene: reading text, identifying colors, and describing context. The description is read aloud to the user via the Web Speech API.

https://camin.lovable.app/
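The Active Route feedback described above can be sketched as a few pure functions. This is a minimal illustration, not the project's actual code: it assumes detections arrive with a `[x, y, width, height]` bounding box (the layout used by the COCO-SSD model for TensorFlow.js), uses the box height relative to the frame as a rough stand-in for distance, and all names, thresholds, and vibration patterns are hypothetical.

```typescript
// Illustrative sketch only: the names, thresholds, and patterns below are
// assumptions, not values taken from the Camin source.
type Position = "left" | "center" | "right";
type Proximity = "near" | "medium" | "far";

interface Detection {
  label: string;
  // [x, y, width, height] in pixels, the bbox layout returned by COCO-SSD.
  bbox: [number, number, number, number];
}

interface ObstacleAlert {
  position: Position;
  proximity: Proximity;
  vibration: number[]; // on/off durations in ms, the format navigator.vibrate() accepts
  speech: string;
}

// Split the frame into three equal vertical bands using the box's center.
function classifyPosition(bbox: Detection["bbox"], frameWidth: number): Position {
  const centerX = (bbox[0] + bbox[2] / 2) / frameWidth;
  if (centerX < 1 / 3) return "left";
  if (centerX > 2 / 3) return "right";
  return "center";
}

// Rough proximity proxy: a box that fills more of the frame height is treated
// as closer. The 0.6 and 0.3 cut-offs are hypothetical.
function classifyProximity(bbox: Detection["bbox"], frameHeight: number): Proximity {
  const heightRatio = bbox[3] / frameHeight;
  if (heightRatio >= 0.6) return "near";
  if (heightRatio >= 0.3) return "medium";
  return "far";
}

// Stronger pattern for closer obstacles; an empty pattern means no vibration.
const VIBRATION_PATTERNS: Record<Proximity, number[]> = {
  near: [400, 100, 400],
  medium: [200],
  far: [],
};

const POSITION_PHRASES: Record<Position, string> = {
  left: "on the left",
  center: "ahead",
  right: "on the right",
};

function buildAlert(d: Detection, frameWidth: number, frameHeight: number): ObstacleAlert {
  const position = classifyPosition(d.bbox, frameWidth);
  const proximity = classifyProximity(d.bbox, frameHeight);
  return {
    position,
    proximity,
    vibration: VIBRATION_PATTERNS[proximity],
    speech: `${d.label} ${POSITION_PHRASES[position]}, ${proximity}`,
  };
}
```

In the browser, the result could be handed to `navigator.vibrate(alert.vibration)` and `speechSynthesis.speak(new SpeechSynthesisUtterance(alert.speech))`, keeping the classification logic separate from the device APIs so it can be tested on its own.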

My Role

Full Stack Developer

Tech Stack

Next.js, Tailwind CSS, React.js, TypeScript, Vite, TensorFlow, Google Gemini 2.5 Flash, Supabase, Web Speech API, Lovable

Key Decisions

1. Accessibility-first design: The UI uses high-contrast visuals and large touch targets, but it is optimized primarily for screen readers and voice feedback. The "Active Route" mode works even with the screen off or the phone in a pocket, since it relies only on audio and haptic feedback.
2. Hybrid AI architecture: I chose to run object detection on the client (Edge AI) using TensorFlow.js to eliminate network latency, which is critical for safety alerts. Conversely, I offloaded the heavy "Scene Analysis" to the cloud (Gemini), where latency is acceptable in exchange for higher accuracy and detail.
3. Progressive Web App (PWA): Instead of a native app, I built it as a PWA to ensure cross-platform compatibility and easy distribution, while still accessing native hardware features like the camera and vibration motor.
4. Privacy-by-design: Real-time processing happens locally on the user's device. Images sent to the cloud for analysis are processed statelessly and not stored, respecting user privacy.
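The cloud half of the hybrid architecture, with its stateless handling of images, might look roughly like the sketch below. It is an assumption-laden illustration rather than the project's code: the function name `analyze-scene`, the request and response shapes, and the injected `invoke` callback are all hypothetical. The request carries only the image and its MIME type, with no user identifiers.

```typescript
// Illustrative sketch only: "analyze-scene" and every type below are
// hypothetical, not taken from the Camin source.
interface SceneRequest {
  functionName: string;
  body: { mimeType: string; imageBase64: string };
}

interface InvokeResult {
  data: { description?: string } | null;
  error: { message: string } | null;
}

// Injected so the flow can be exercised without a network connection.
type Invoke = (name: string, options: { body: unknown }) => Promise<InvokeResult>;

// Turn a base64 image data URL (e.g. from canvas.toDataURL("image/jpeg"))
// into a request body that contains nothing but the image itself.
function buildSceneRequest(dataUrl: string): SceneRequest {
  const match = /^data:(image\/[a-z0-9.+-]+);base64,(.+)$/i.exec(dataUrl);
  if (!match) throw new Error("Expected a base64 image data URL");
  return {
    functionName: "analyze-scene",
    body: { mimeType: match[1], imageBase64: match[2] },
  };
}

// Surface failures explicitly so the caller can speak an error message
// instead of staying silent.
function extractDescription(result: InvokeResult): string {
  if (result.error) throw new Error(`Scene analysis failed: ${result.error.message}`);
  const text = result.data?.description?.trim();
  if (!text) throw new Error("Scene analysis returned no description");
  return text;
}

async function describeScene(dataUrl: string, invoke: Invoke): Promise<string> {
  const request = buildSceneRequest(dataUrl);
  const result = await invoke(request.functionName, { body: request.body });
  return extractDescription(result);
}
```

With supabase-js, `invoke` could plausibly wrap `supabase.functions.invoke(name, options)`, which resolves to an object with `data` and `error` fields, and the returned text could then be read aloud with `speechSynthesis.speak(new SpeechSynthesisUtterance(text))`.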
