Team LIVE Wins Computer Vision for Public Safety Challenge
WNCG students Marius Facktor and Abhinau Venkataramanan and WNCG alumnus Praful Gupta from Prof. Al Bovik’s Laboratory for Image & Video Engineering (LIVE) have been named Phase 2 winners in the Enhancing Computer Vision for Public Safety Challenge.
The challenge is hosted by the Public Safety Communications Research Division of the National Institute of Standards and Technology. The competition aims to support public safety missions by advancing computer vision algorithms and no-reference (NR) metrics that assess image or video quality.
“First responders use cameras and computer vision to save people's lives,” explains Marius Facktor, one of the students from Team LIVE. Search-and-rescue missions increasingly rely on this technology; one example is robots that can locate survivors in a burning building.
When it comes to computer vision, the best results come from high-quality images. However, real-world situations are often far from ideal. Direct sunlight, rain, motion blur, or other issues can distort images, disrupting computer vision and reducing the reliability of the tools that use it.
“Put garbage in; get garbage out, essentially,” Facktor remarked. “So if that robot in a burning building says there are no people, should you trust it? Or could it have been mistaken because of the smoke?”
It’s a question Facktor and his teammates, as members of the LIVE Lab, were well equipped to consider. Prof. Bovik and his lab group have been researching no-reference quality assessment for years. In other words, how can the quality of an image be evaluated in the absence of a reference image for comparison? While much of the LIVE Lab’s research has focused on human perception, the Enhancing Computer Vision for Public Safety Challenge examines the same problem through the lens of computer vision.
The challenge consists of two parts: In Phase 1, teams pitch a concept to demonstrate their capabilities, knowledge, and skills, and detail a proposed solution and approach. From these entries, up to 10 teams move forward to the next round. In Phase 2, teams have six months to collect image datasets with various natural impairments. Using those images, the teams run a computer vision algorithm and report the failure rate for each image in the set.
Facktor, Venkataramanan, and Gupta created a dataset of dashcam images taken while driving around Austin. The team considered several impairment categories, including direct sunlight, nighttime, oncoming headlights, rain, out-of-focus, oily lens, and snow. They chose to test their images with YOLOv3—an object-detection algorithm that can identify various vehicles, signage, people, and other common roadside items.
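The article doesn't spell out how a per-image failure rate is computed, but a common approach is to match the detector's boxes against ground-truth boxes by intersection-over-union (IoU) and count the fraction of ground-truth objects that go unmatched. The sketch below assumes that definition (the team's actual metric may differ); `failure_rate` and its 0.5 IoU threshold are illustrative choices, not the challenge's specification.

```python
def iou(a, b):
    """Intersection-over-union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area = lambda r: (r[2] - r[0]) * (r[3] - r[1])
    union = area(a) + area(b) - inter
    return inter / union if union else 0.0

def failure_rate(truth, detections, thresh=0.5):
    """Fraction of ground-truth boxes with no detection at IoU >= thresh.

    Greedy matching: each detection can satisfy at most one truth box.
    """
    missed = 0
    unused = list(detections)
    for t in truth:
        best = max(unused, key=lambda d: iou(t, d), default=None)
        if best is None or iou(t, best) < thresh:
            missed += 1
        else:
            unused.remove(best)
    return missed / len(truth) if truth else 0.0
```

For example, with two ground-truth boxes and one detection that overlaps only the first, half the objects are missed and the failure rate is 0.5.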
A dashcam image containing motion blur, one of the impairment categories Team LIVE studied.
Photo courtesy Marius Facktor.
The group first tested YOLO against a training set to score how well the algorithm performed on each image. They then turned to creating their NR model: Venkataramanan computed an average saliency map and used log Gabor filters to extract features from each image. The group then used a regression tool with the training set to map between image features and YOLO performance scores.
“The idea was that images with similar impairments would also have similar features and also perform similarly in YOLO,” Facktor stated. “So, we used these extracted features to predict YOLO failure rate on all the other images in our dataset.”
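The pipeline described above can be sketched end to end: band-pass each image with a small bank of radial log-Gabor filters, summarize the responses as features, and fit a regression from features to failure rates. This is a minimal NumPy illustration, not the team's implementation; the filter parameters, the mean/std feature summary, the plain least-squares fit, and the random toy data are all assumptions made for the example.

```python
import numpy as np

def log_gabor_bank(shape, f0s=(0.1, 0.2, 0.4), sigma_ratio=0.55):
    """Radial log-Gabor filters defined in the frequency domain (illustrative)."""
    h, w = shape
    fy = np.fft.fftfreq(h)[:, None]
    fx = np.fft.fftfreq(w)[None, :]
    radius = np.sqrt(fx**2 + fy**2)
    radius[0, 0] = 1.0  # avoid log(0) at DC; the DC response is zeroed below
    filters = []
    for f0 in f0s:
        g = np.exp(-(np.log(radius / f0) ** 2) / (2 * np.log(sigma_ratio) ** 2))
        g[0, 0] = 0.0
        filters.append(g)
    return filters

def features(img, filters):
    """Mean and std of each band-pass response as a simple feature vector."""
    F = np.fft.fft2(img)
    feats = []
    for g in filters:
        resp = np.abs(np.fft.ifft2(F * g))
        feats.extend([resp.mean(), resp.std()])
    return np.array(feats)

# Toy "training set": random grayscale patches with placeholder failure rates.
rng = np.random.default_rng(0)
imgs = [rng.random((32, 32)) for _ in range(20)]
fail = rng.random(20)  # stand-in for measured YOLO failure rates

bank = log_gabor_bank((32, 32))
X = np.stack([features(im, bank) for im in imgs])
X = np.hstack([X, np.ones((len(X), 1))])       # append a bias column
w, *_ = np.linalg.lstsq(X, fail, rcond=None)   # least-squares regression

# Predict the failure rate of a new, unseen image from its features alone.
pred = features(rng.random((32, 32)), bank) @ w[:-1] + w[-1]
```

The key idea, as Facktor describes, is that the regression never looks at detections on the new image: it predicts the detector's failure rate purely from how the image's frequency-domain statistics resemble those of training images with known scores.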
Team LIVE was one of five recipients of the contest’s Phase 2 Award. They also received a CDVL Distribution Prize for providing their dataset to the Consumer Digital Video Library for further research and development.
“I truly hope our dataset will be useful for some researcher in the future,” Facktor remarked. “It would be incredibly rewarding if this helps to improve the abilities of first responders.”