by Staff Writers
Providence RI (SPX) Mar 13, 2015
Researchers from Brown and Johns Hopkins universities have come up with a new way to evaluate how well computers can divine information from images. The team describes its new system as a "visual Turing test," after the legendary computer scientist Alan Turing's test of the extent to which computers display human-like intelligence.
"There have been some impressive advances in computer vision in recent years," said Stuart Geman, the James Manning Professor of Applied Mathematics at Brown. "We felt that it might be time to raise the bar in terms of how these systems are evaluated and benchmarked."
Traditional computer vision benchmarks tend to measure an algorithm's performance in detecting objects within an image (the image has a tree, or a car or a person), or how well a system identifies an image's global attributes (scene is outdoors or in the nighttime).
"We think it's time to think about how to do something deeper -- something more at the level of human understanding of an image," Geman said.
For example, it's one thing to be able to recognize that an image contains two people. But to be able to recognize that the image depicts two people walking together and having a conversation is a much deeper understanding. Similarly, describing an image as depicting a person entering a building is a richer understanding than saying it contains a person and a building.
The system Geman and his colleagues developed, described this week in the Proceedings of the National Academy of Sciences, is designed to test for such a contextual understanding of photos.
It works by generating a string of yes or no questions about an image, which are posed sequentially to the system being tested. Each question is progressively more in-depth and based on the responses to the questions that have come before.
For example, an initial question might ask a computer if there's a person in a given region of a photo. If the computer says yes, then the test might ask if there's anything else in that region -- perhaps another person. If there are two people, the test might ask: "Are person1 and person2 talking?"
As a group, the questions are geared toward gauging the computer's understanding of the contextual "storyline" of the photo. "You can build this notion of a storyline about an image by the order in which the questions are explored," Geman said.
Because the questions are computer-generated, the system is more objective than having a human simply query a computer about an image. There is a role for a human operator, however. The human's role is to tell the test system when a question is unanswerable because of the ambiguities of the photo.
For instance, asking the computer if a person in a photo is carrying something is unanswerable if most of the person's body is hidden by another object. The human operator would flag that question as ambiguous.
The first version of the test was generated based on a set of photos depicting urban street scenes. But the concept could conceivably be expanded to all kinds of photos, the researchers say.
Geman and his colleagues hope that this new test might spur computer vision researchers to explore new ways of teaching computers how to look at images. Most current computer vision algorithms are taught how to look at images using training sets in which objects are annotated by humans.
y looking at millions of annotated images, the algorithms eventually learn how to identify objects. But it would be very difficult to develop a training set with all the possible contextual attributes of a photo annotated. So true context understanding may require a new machine learning technique.
"As researchers, we tend to 'teach to the test,'" Geman said. "If there are certain contests that everybody's entering and those are the measures of success, then that's what we focus on. So it might be wise to change the test, to put it just out of reach of current vision systems."
Space Technology News - Applications and Research
|The content herein, unless otherwise known to be public domain, are Copyright 1995-2014 - Space Media Network. All websites are published in Australia and are solely subject to Australian law and governed by Fair Use principals for news reporting and research purposes. AFP, UPI and IANS news wire stories are copyright Agence France-Presse, United Press International and Indo-Asia News Service. ESA news reports are copyright European Space Agency. All NASA sourced material is public domain. Additional copyrights may apply in whole or part to other bona fide parties. Advertising does not imply endorsement, agreement or approval of any opinions, statements or information provided by Space Media Network on any Web page published or hosted by Space Media Network. Privacy Statement All images and articles appearing on Space Media Network have been edited or digitally altered in some way. Any requests to remove copyright material will be acted upon in a timely and appropriate manner. Any attempt to extort money from Space Media Network will be ignored and reported to Australian Law Enforcement Agencies as a potential case of financial fraud involving the use of a telephonic carriage device or postal service.|