Help Blind People See with Their Ears and Fingers

Google has developed an application that translates what a camera (such as the one in a smartphone) “sees” into sound for blind people. The vOICe for Android is a free downloadable application that Google introduced in 2016.

The vOICe for Android adds a sonic augmented reality overlay to the live camera view in real time, giving even totally blind people live, detailed information about their visual environment that they would otherwise not perceive. It may also serve as an interactive mobile learning app for teaching visual concepts to blind children. In essence, The vOICe for Android is a universal translator that maps images to sounds. The application can run on a smartphone, a tablet, or a suitably equipped pair of smart glasses.

Once the application is started, it continuously takes snapshots. Each snapshot is translated into sound by a polyphonic left-to-right scan that associates height with pitch and brightness with loudness. For example, a bright rising line on a dark background sounds like a rising pitch sweep, and a small bright spot sounds like a short beep. The encoded views have a resolution of up to 176 x 64 pixels (more than an implant with 10,000 electrodes would provide).
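The scan described above can be illustrated with a minimal Python sketch. It is not the application's actual code: the function name, the sample rate, the one-second scan time, and the 500 to 5000 Hz frequency range are all assumptions chosen for illustration. Each image row gets its own sine tone (higher rows get higher pitch), each pixel's brightness sets that tone's loudness, and the columns are played one after another from left to right.

```python
import math


def image_to_soundscape(image, sample_rate=16000, scan_seconds=1.0,
                        f_low=500.0, f_high=5000.0):
    """Turn a grayscale image into mono audio samples (illustrative sketch).

    image: list of rows, row 0 is the top; each value is brightness in 0..1.
    The timing and frequency parameters are assumed values, not taken
    from the application.
    """
    rows = len(image)
    cols = len(image[0])
    samples_per_col = int(sample_rate * scan_seconds / cols)

    # One tone per row, spaced exponentially; the top row is the highest pitch.
    freqs = [
        f_low * (f_high / f_low) ** ((rows - 1 - r) / max(rows - 1, 1))
        for r in range(rows)
    ]

    samples = []
    for c in range(cols):                      # left-to-right scan
        for n in range(samples_per_col):
            t = (c * samples_per_col + n) / sample_rate
            # Polyphonic mix: brightness scales the loudness of each row's tone.
            value = sum(
                image[r][c] * math.sin(2 * math.pi * freqs[r] * t)
                for r in range(rows)
            )
            samples.append(value / rows)       # keep the mix within -1..1
    return samples
```

With this sketch, a single bright pixel in an otherwise dark image produces silence everywhere except during the time slot of its column, which corresponds to the "short beep" mentioned above.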

The talking compass speaks the current heading. By default, it speaks only heading changes, but options for continuous heading feedback and for disabling the talking compass are available. Together with the camera-based soundscapes, the talking compass may help the user walk in a straight line, and the compass continues to work in the dark, where the camera fails to give useful feedback.
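The default "speak only on change" behaviour might be sketched as follows. This is a hypothetical illustration: the class name, the eight compass points, and the `continuous` flag are assumptions, and the application's real granularity and wording are not specified here.

```python
# Assumed eight-point compass; the application's own vocabulary may differ.
POINTS = ["north", "north-east", "east", "south-east",
          "south", "south-west", "west", "north-west"]


def heading_name(degrees):
    """Map a heading in degrees (0 = north, clockwise) to the nearest point."""
    index = int((degrees % 360) / 45 + 0.5) % 8
    return POINTS[index]


class TalkingCompass:
    """Returns the text to speak, or None when nothing should be said."""

    def __init__(self, continuous=False):
        self.continuous = continuous   # True: speak every update
        self.last = None               # last heading that was spoken

    def update(self, degrees):
        name = heading_name(degrees)
        if self.continuous or name != self.last:
            self.last = name
            return name
        return None
```

In this sketch, repeated readings that round to the same compass point stay silent, so the user hears a new announcement only when drifting off course.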

The talking locator announces nearby street names and intersections as determined from GPS or local cell towers. On request, it can report the current speed and altitude, and the user can adjust its verbosity.

The talking face detector announces the number of human faces detected by the camera. It can detect and announce up to dozens of faces in a single view. If only one face is detected, it additionally says whether the face is located near the top, bottom, left, right, or center of the view, and announces when the face is within close-up range. It can also be set up to report skin color. The face detector is not a face recognition system and does not identify individuals, so it raises few privacy concerns. Moreover, the talking face detector can be turned off.
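The single-face position announcement could work along these lines. The sketch below is an assumption-laden illustration, not the application's method: the function name, the central-third rule for "center", and the 50 percent height threshold for "close-up" are all hypothetical. It takes a face bounding box (such as one reported by any face detector) and the size of the camera view.

```python
def describe_face(x, y, w, h, view_w, view_h, close_up_fraction=0.5):
    """Classify a face bounding box as top/bottom/left/right/center.

    (x, y) is the top-left corner of the box, in pixels; w and h are its size.
    Returns (position, close_up). Thresholds are assumed values.
    """
    # Offset of the face center from the view center, as a fraction of the view.
    cx = (x + w / 2) / view_w - 0.5
    cy = (y + h / 2) / view_h - 0.5

    if abs(cx) <= 1 / 6 and abs(cy) <= 1 / 6:
        position = "center"            # inside the central third on both axes
    elif abs(cx) >= abs(cy):
        position = "left" if cx < 0 else "right"
    else:
        position = "top" if cy < 0 else "bottom"

    # Treat a face that fills much of the view height as close-up.
    close_up = h / view_h >= close_up_fraction
    return position, close_up
```

For a 176 x 64 view, `describe_face(0, 20, 20, 24, 176, 64)` yields `("left", False)` under these assumed thresholds.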

Haptic feedback lets the user feel the live camera view through the touch screen. The perceptual effect is quite crude, limited by the simplicity of the phone’s vibrator.

