



EARTH OBSERVATION
How hard is it to 'de-anonymize' cellphone data?
by Larry Hardesty, MIT News Office
Boston MA (SPX) Mar 28, 2013


Rendering by Christine Daniloff/MIT of an original image by Yves-Alexandre de Montjoye et al.

The proliferation of sensor-studded cellphones could lead to a wealth of data with socially useful applications - in urban planning, epidemiology, operations research and emergency preparedness, among other things. Of course, before being released to researchers, the data would have to be stripped of identifying information. But how hard could it be to protect the identity of one unnamed cellphone user in a data set of hundreds of thousands or even millions?

According to a paper appearing this week in Scientific Reports, harder than you might think. Researchers at MIT and the Universite Catholique de Louvain, in Belgium, analyzed data on 1.5 million cellphone users in a small European country over a span of 15 months and found that just four points of reference, with fairly low spatial and temporal resolution, were enough to uniquely identify 95 percent of them.

In other words, to extract the complete location information for a single person from an "anonymized" data set of more than a million people, all you would need to do is place him or her within a couple of hundred yards of a cellphone transmitter, sometime over the course of an hour, four times in one year. A few Twitter posts would probably provide all the information you needed, if they contained specific information about the person's whereabouts.

The first author on the paper is Yves-Alexandre de Montjoye, a graduate student in the research group of Toshiba Professor of Media Arts and Science Sandy Pentland. He's joined by Cesar Hidalgo, an assistant professor of media arts and science; Vincent Blondel, a visiting professor at MIT and a professor of applied mathematics at Universite Catholique; and Michel Verleysen, a professor of electrical engineering at Universite Catholique.

Focusing the debate
Hidalgo's group specializes in applying the tools of statistical physics to a wide range of subjects, from communications networks to genetics to economics. In this case, he and de Montjoye were able to use those tools to uncover a simple mathematical relationship between the resolution of spatiotemporal data and the likelihood of identifying a member of a data set.

According to their formula, the probability of identifying someone goes down if the resolution of the measurements decreases, but less than you might think. Reporting the time of each measurement as imprecisely as sometime within a 15-hour span, or location as imprecisely as somewhere amid 15 adjacent cell towers, would still enable the unique identification of half the people in the sample data set.
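To make "decreasing the resolution" concrete, here is a minimal Python sketch. It is not the researchers' formula, which the article does not reproduce. It assumes each measurement is a (tower, hour) pair, with hours counted from the start of the data set, and the names coarsen, coarsen_trace, and tower_to_region are hypothetical. Coarsening merges towers into regions and hours into wider time bins, and two traces become indistinguishable only once every one of their coarsened points coincides.

```python
def coarsen(point, hours_per_bin, tower_to_region):
    """Blur one (tower, hour) measurement to a (region, time_bin) pair."""
    tower, hour = point
    return (tower_to_region[tower], hour // hours_per_bin)


def coarsen_trace(points, hours_per_bin, tower_to_region):
    """Apply the same blurring to every measurement in a trace."""
    return {coarsen(p, hours_per_bin, tower_to_region) for p in points}


# Toy data (invented for illustration): two users with different traces.
trace_a = {("t1", 8), ("t2", 13)}
trace_b = {("t3", 9), ("t2", 14)}

# Hypothetical grouping of adjacent towers into larger regions.
tower_to_region = {"t1": "r1", "t3": "r1", "t2": "r2"}
# Identity mapping: every tower is its own region (full spatial resolution).
identity = {tower: tower for tower in tower_to_region}

# At one-hour, single-tower resolution the traces differ.
fine_a = coarsen_trace(trace_a, 1, identity)
fine_b = coarsen_trace(trace_b, 1, identity)

# With 15-hour bins and merged towers they collapse to the same set.
coarse_a = coarsen_trace(trace_a, 15, tower_to_region)
coarse_b = coarsen_trace(trace_b, 15, tower_to_region)
```

In this toy case the two users only become indistinguishable after both the spatial and the temporal resolution are reduced sharply, which is the qualitative behavior the article describes.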

But while its initial application may be discouraging, de Montjoye and Hidalgo hope that their formula will provide a way for researchers and policy analysts to reason more rigorously about the privacy safeguards that need to be put in place when they're working with aggregated location data.

"Both Cesar and I deeply believe that we all have a lot to gain from this data being used," de Montjoye says. "This formula is something that could be useful to help the debate and decide, OK, how do we balance things out, and how do we make it a fair deal for everyone to use this data?"

Everybody's different
In the data set that the researchers analyzed, the location of a cellphone was inferred solely from that of the cell tower it was connected to, and the time of the connection was given as falling within a one-hour interval. Each cellphone had a unique, randomly generated identifying number, so that its movement could be traced over time. But there was no information connecting that number to the phone's owner.

The researchers randomly selected a representative sampling from the set of 1.5 million cellphone traces and, for each trace, began choosing points at random. For 95 percent of the traces, just four randomly selected points were enough to distinguish them from all other traces in the database. In the worst (or, from another perspective, best) case, 11 measurements were necessary.
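The test described above can be sketched in a few lines of Python. This is an illustrative reconstruction under stated assumptions, not the authors' code: each trace is assumed to be a set of (tower, hour) pairs keyed by an anonymous ID, and the function names is_unique and fraction_unique, like the toy data, are invented for the example.

```python
import random


def is_unique(trace_id, traces, known_points):
    """Return True if no other trace contains all of the known points."""
    return all(
        not known_points <= points  # <= is the subset test for sets
        for other_id, points in traces.items()
        if other_id != trace_id
    )


def fraction_unique(traces, num_points, rng):
    """Draw num_points random points from each trace and report the
    fraction of traces that those points single out."""
    unique = 0
    for trace_id, points in traces.items():
        k = min(num_points, len(points))
        # sorted() turns the set into a list so the draw is reproducible.
        known = set(rng.sample(sorted(points), k))
        if is_unique(trace_id, traces, known):
            unique += 1
    return unique / len(traces)


# Toy data (invented): anonymous ID -> set of (tower, hour) measurements.
traces = {
    "a": {("t1", 8), ("t2", 12), ("t3", 18)},
    "b": {("t1", 8), ("t2", 12), ("t4", 18)},
    "c": {("t5", 9), ("t6", 13), ("t7", 19)},
}

rng = random.Random(0)
share = fraction_unique(traces, 3, rng)
```

In the toy data, users "a" and "b" share two of their three points, so knowing only those two points cannot tell them apart, while a single distinctive point such as ("t3", 18) already singles out "a".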

"There's a concern with this data, to what extent can we preserve anonymity," says Luis Bettencourt, a professor at the Santa Fe Institute who studies social systems. "What they are showing here, quite clearly, is that it's very hard to preserve anonymity."

But for Bettencourt, the uniqueness of people's trajectories through cities is itself precisely the type of information that analysis of cellphone data is meant to uncover. "This is interesting, from a scientific point of view, to understand how people use urban space," Bettencourt says. "It shows what kind of social systems cities are."

The researchers suspect that similar relationships might hold for other types of data. "I would not be surprised if a similar result - maybe requiring more points - would, for example, extend to web browsing," Hidalgo says.

"The space of potential combinations is really large. When a person is, in some sense, being expressed in a space in which the total number of combinations is huge, the probability that two people would have the same exact trajectory - whether it's walking or browsing - is almost nil."



















The content herein, unless otherwise known to be public domain, is Copyright 1995-2014 - Space Media Network. AFP, UPI and IANS news wire stories are copyright Agence France-Presse, United Press International and Indo-Asia News Service. ESA Portal Reports are copyright European Space Agency. All NASA sourced material is public domain. Additional copyrights may apply in whole or part to other bona fide parties. Advertising does not imply endorsement, agreement or approval of any opinions, statements or information provided by Space Media Network on any Web page published or hosted by Space Media Network.