Giving Mobile Cameras Real World Vision

Tuesday, March 01, 2016

Human vision is an intricate phenomenon. Tens of billions of neurons in your brain form an image processing machine more powerful than any in existence today. Half of your brain is devoted to processing images you see so you can navigate the world, see through distortions, adapt to variations in lighting and see in three dimensions, all while walking, running and driving.

“People can instantly recognize distortions in images and their severity, while also seeing through distortions to accomplish depth, detection and recognition tasks,” WNCG Prof. Alan Bovik states.

Smartphones and digital cameras have few of these capabilities. Smartphone pictures are often grainy or otherwise distorted by low light, noise, blur, saturation, over- or underexposure and the compression needed to store and send an image. These distortions cause advanced computer vision algorithms to fail, meaning they can no longer accurately recreate the 3D world or find and recognize objects within it.

Correcting the viewer’s perception of image errors in a smartphone is an important step towards the design of computer vision algorithms in future devices that will help users better navigate and recognize the world around them.

“There has been a lot of work on having smartphones take better pictures, but our work goes further than that,” Prof. Bovik states. “We are interested in having your cell phone, handheld camera, or a camera on the outside of your car use computer vision so you could better navigate a city, find and recognize faces in a large crowd under difficult conditions, and avoid vehicle collisions.”

To help move cameras and computer vision into the world of the future, the NSF recently awarded Prof. Bovik a $450,000 grant. With help from his graduate student, Deepti Ghadiyaram, the WNCG team will tackle challenges in computer vision, distortion detection and algorithm design to bring cameras one step closer to the processing power of the human brain and vision system.

“The most sophisticated learning engines, such as those used at Google with many layers of adaptation, have less than one millionth of the neural connectivity of the visual portions of the brain, which is like a single blade compared to an acre of grass,” Prof. Bovik states. “We can’t model or implement that kind of learning power.”

By using the little we do know, Prof. Bovik continues, the team plans to improve cameras, phones, the internet, televisions, robots, cinema and other viewing experiences.

The WNCG team will lay the foundation for future smart cameras that can recognize 3D street scenes and objects by creating algorithms that can detect, analyze and respond to a wide range of distortions and can estimate 3D depth. Even under difficult conditions caused by low light, moving subjects or users' unsteady hands, these advances would produce clear images less affected by environmental distortions and distractions.

Since they have the potential to improve the capabilities of other camera devices as well, these advances wouldn’t just benefit consumers. Advanced computer algorithms that can penetrate distortions could also impact surveillance and security cameras, mobile telemedicine camera devices and military cameras operating under difficult battlefield conditions.