



TECH SPACE
Collecting just the right data
by Staff Writers
Boston MA (SPX) Jul 30, 2014


Calculating the mutual information between two nodes in a graph is like injecting blue dye into one of them and measuring the concentration of blue at the other. Crucial to the new algorithm are the elimination of loops in the graph (orange) and a technique that prevents intermediary nodes (black) from distorting the long-range calculation of mutual information (blue). Image courtesy Jose-Luis Olivares/MIT.

Much artificial-intelligence research addresses the problem of making predictions based on large data sets. An obvious example is the recommendation engines at retail sites like Amazon and Netflix.

But some types of data are harder to collect than online click histories - information about geological formations thousands of feet underground, for instance. And in other applications - such as trying to predict the path of a storm - there may just not be enough time to crunch all the available data.

Dan Levine, an MIT graduate student in aeronautics and astronautics, and his advisor, Jonathan How, the Richard Cockburn Maclaurin Professor of Aeronautics and Astronautics, have developed a new technique that could help with both problems.

For a range of common applications in which data is either difficult to collect or too time-consuming to process, the technique can identify the subset of data items that will yield the most reliable predictions.

So geologists trying to assess the extent of underground petroleum deposits, or meteorologists trying to forecast the weather, can make do with just a few targeted measurements, saving time and money.

Levine and How, who presented their work at the Uncertainty in Artificial Intelligence conference this week, consider the special case in which something about the relationships between data items is known in advance.

Weather prediction provides an intuitive example: Measurements of temperature, pressure, and wind velocity at one location tend to be good indicators of measurements at adjacent locations, or of measurements at the same location a short time later, but the correlation grows weaker the farther out you move either geographically or chronologically.

Graphic Content
Such correlations can be represented by something called a probabilistic graphical model. In this context, a graph is a mathematical abstraction consisting of nodes - typically depicted as circles - and edges - typically depicted as line segments connecting nodes.

A network diagram is one example of a graph; a family tree is another. In a probabilistic graphical model, the nodes represent variables, and the edges represent the strength of the correlations between them.

Levine and How developed an algorithm that can efficiently calculate just how much information any node in the graph gives you about any other - what in information theory is called "mutual information."
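
For readers who want a concrete number, the following minimal sketch computes mutual information for the simplest possible case: two jointly Gaussian scalar variables. It uses the standard closed-form expression in terms of the correlation coefficient; the function name and arguments are illustrative assumptions and are not taken from Levine and How's work.

```python
import math

def gaussian_mutual_information(var_x, var_y, cov_xy):
    """Mutual information, in nats, between two jointly Gaussian scalars.

    Hypothetical helper for illustration only; it is not the researchers' algorithm.
    """
    rho_sq = cov_xy ** 2 / (var_x * var_y)  # squared correlation coefficient
    if rho_sq >= 1.0:
        return math.inf  # perfectly correlated: one variable determines the other
    return -0.5 * math.log(1.0 - rho_sq)

# Uncorrelated variables share no information; stronger correlation shares more.
weak = gaussian_mutual_information(1.0, 1.0, 0.2)
strong = gaussian_mutual_information(1.0, 1.0, 0.9)
print(weak, strong)
```

Only variances and the covariance appear in the formula; the average values of the two variables play no role.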

As Levine explains, one of the obstacles to performing that calculation efficiently is the presence of "loops" in the graph, or nodes that are connected by more than one path.

Calculating mutual information between nodes, Levine says, is kind of like injecting blue dye into one of them and then measuring the concentration of blue at the other. "It's typically going to fall off as we go further out in the graph," Levine says.
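
The fall-off can be illustrated with a small sketch, under two stated assumptions: the variables are jointly Gaussian, and exactly one path connects the two nodes of interest. In that setting the correlation between the end nodes is the product of the correlations along the path, a standard property of Gaussian tree models. The function name and the 0.8 edge correlation are hypothetical choices for illustration, not values from the researchers' paper.

```python
import math

def path_mutual_information(edge_correlations):
    """Mutual information (nats) between the two ends of a single path.

    Assumes a tree-structured Gaussian model in which each edge is labelled
    with the correlation coefficient between the nodes it joins.
    """
    rho = math.prod(edge_correlations)  # end-to-end correlation along the path
    return -0.5 * math.log(1.0 - rho ** 2)

# The "dye" fades as the path gets longer (every edge has correlation 0.8 here).
for hops in range(1, 5):
    print(hops, path_mutual_information([0.8] * hops))
```

With loops present there is no single product to take, which is why the calculation becomes harder.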

"If there's a unique path between them, then we can compute it pretty easily, because we know what path the blue dye will take. But if there are loops in the graph, then it's harder for us to compute how blue other nodes are because there are many different paths."

So the first step in the researchers' technique is to calculate "spanning trees" for the graph.

A tree is just a graph with no loops: In a family tree, for instance, a loop might mean that someone was both parent and sibling to the same person. A spanning tree is a tree that touches all of a graph's nodes but dispenses with the edges that create loops.
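
As a rough illustration of the idea, the sketch below extracts one spanning tree from a small graph by keeping an edge only if it joins two parts of the graph that are not yet connected. This is a generic union-find approach chosen for brevity; the article does not say how the researchers construct their spanning trees, and the function name and example graph are assumptions.

```python
def spanning_tree(num_nodes, edges):
    """Return a loop-free subset of edges that still links every connected node."""
    parent = list(range(num_nodes))  # each node starts in its own component

    def find(x):
        # Follow parent links to the component's representative, compressing the path.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    tree = []
    for u, v in edges:
        root_u, root_v = find(u), find(v)
        if root_u != root_v:      # edge joins two separate components: keep it
            parent[root_u] = root_v
            tree.append((u, v))
        # otherwise the edge would close a loop, so it is dropped
    return tree

# Nodes 0, 1 and 2 form a loop; node 3 hangs off node 2.
edges = [(0, 1), (1, 2), (2, 0), (2, 3)]
tree = spanning_tree(4, edges)
print(tree)  # [(0, 1), (1, 2), (2, 3)]
```

The edge (2, 0) is discarded because nodes 0 and 2 are already linked through node 1, so every pair of nodes in the result is joined by exactly one path.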

Betting the Spread
Most of the nodes that remain in the graph, however, are "nuisances," meaning that they don't contain much useful information about the node of interest.

The key to Levine and How's technique is a way to use those nodes to navigate the graph without letting their short-range influence distort the long-range calculation of mutual information.

That's possible, Levine explains, because the probabilities represented by the graph are Gaussian, meaning that they follow the bell curve familiar as the model of, for instance, the dispersion of characteristics in a population.

A Gaussian distribution is exhaustively characterized by just two measurements: the average value - say, the average height in a population - and the variance - a measure of how widely the bell spreads out.
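
One property of Gaussians is worth seeing in numbers: when a correlated variable is measured, the updated average depends on the reading, but the remaining variance does not. The sketch below shows this for two scalar variables using the standard Gaussian conditioning formulas; the function name and the example numbers are hypothetical and are not drawn from the researchers' paper.

```python
def condition_on_observation(mean_x, var_x, mean_y, var_y, cov_xy, y_observed):
    """Update a Gaussian belief about x after observing a correlated variable y."""
    gain = cov_xy / var_y
    posterior_mean = mean_x + gain * (y_observed - mean_y)  # shifts with the reading
    posterior_var = var_x - gain * cov_xy                   # ignores the reading
    return posterior_mean, posterior_var

# Two very different readings of y give different means but the same variance.
mean_a, var_a = condition_on_observation(0.0, 1.0, 0.0, 1.0, 0.5, y_observed=2.0)
mean_b, var_b = condition_on_observation(0.0, 1.0, 0.0, 1.0, 0.5, y_observed=-3.0)
print(mean_a, var_a)
print(mean_b, var_b)
```

Because the posterior variance can be computed before any reading is taken, candidate measurements can be ranked by how much uncertainty each would remove.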

"The uncertainty in the problem is really a function of the spread of the distribution," Levine says. "It doesn't really depend on where the distribution is centered in space." As a consequence, it's often possible to calculate variance across a probabilistic graphical model without relying on the specific values of the nodes. "The usefulness of data can be assessed before the data itself becomes available," Levine says.









