TECH SPACE
System could let thousands of researchers contribute to data analysis projects
by Staff Writers
Boston MA (SPX) Nov 02, 2017



In the analysis of big data sets, the first step is usually the identification of "features" - data points with particular predictive power or analytic utility. Choosing features usually requires some human intuition. For instance, a sales database might contain revenues and date ranges, but it might take a human to recognize that average revenues - revenues divided by the sizes of the ranges - is the really useful metric.
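The sales example can be made concrete with a short Python sketch. The records, field names and the `average_daily_revenue` function are illustrative assumptions, not taken from any real database or from FeatureHub itself.

```python
from datetime import date

# Hypothetical sales records: total revenue earned over a date range.
sales = [
    {"revenue": 3100.0, "start": date(2017, 1, 1), "end": date(2017, 1, 31)},
    {"revenue": 3000.0, "start": date(2017, 2, 1), "end": date(2017, 2, 10)},
]

def average_daily_revenue(record):
    """Revenue divided by the length of the date range, counted in days."""
    days = (record["end"] - record["start"]).days + 1  # both end dates included
    return record["revenue"] / days

# One feature value per record.
features = [average_daily_revenue(r) for r in sales]
print(features)
```

The two raw revenue totals look similar, but the derived feature shows that the second period earned about three times as much per day.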

MIT researchers have developed a new collaboration tool, dubbed FeatureHub, intended to make feature identification more efficient and effective. With FeatureHub, data scientists and experts on particular topics could log on to a central site and spend an hour or two reviewing a problem and proposing features. Software then tests myriad combinations of features against target data, to determine which are most useful for a given predictive task.

In tests, the researchers recruited 32 analysts with data science experience, who spent five hours each with the system, familiarizing themselves with it and using it to propose candidate features for each of two data-science problems.

The predictive models produced by the system were tested against those submitted to Kaggle, a platform that hosts data-science competitions. The Kaggle entries had been scored on a 100-point scale, and the FeatureHub models came within three and five points, respectively, of the winning entries for the two problems.

But where the top-scoring entries were the result of weeks or even months of work, the FeatureHub entries were produced in a matter of days. And while 32 collaborators on a single data science project is a lot by today's standards, Micah Smith, an MIT graduate student in electrical engineering and computer science who helped lead the project, has much larger ambitions.

FeatureHub, as its name suggests, was inspired by GitHub, an online repository of open-source programming projects, some of which have drawn thousands of contributors. Smith hopes that FeatureHub might someday attain a similar scale.

"I do hope that we can facilitate having thousands of people working on a single solution for predicting where traffic accidents are most likely to strike in New York City or predicting which patients in a hospital are most likely to require some medical intervention," he says.

"I think that the concept of massive and open data science can be really leveraged for areas where there's a strong social impact but not necessarily a single profit-making or government organization that is coordinating responses."

Smith and his colleagues presented a paper describing FeatureHub at the IEEE International Conference on Data Science and Advanced Analytics. His coauthors on the paper are his thesis advisor, Kalyan Veeramachaneni, a principal research scientist at MIT's Laboratory for Information and Decision Systems, and Roy Wedge, who began working with Veeramachaneni's group as an MIT undergraduate and is now a software engineer at Feature Labs, a data science company based on the group's work.

FeatureHub's user interface is built on top of a common data-analysis software suite called the Jupyter Notebook, and the evaluation of feature sets is performed by standard machine-learning software packages. Features must be written in the Python programming language, but their design has to follow a template that intentionally keeps the syntax simple. A typical feature might require between five and 10 lines of code.
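The article does not reproduce the template, so the following is only a guess at what a feature of that size might look like. It assumes a feature is a function that receives the dataset (here, a plain list of row dictionaries) and returns one number per row; the `hour_of_day` name and the `timestamp` field are hypothetical.

```python
# Hypothetical feature template: take the dataset, return one value per row.
def hour_of_day(rows):
    """Candidate feature: the hour at which each event was recorded."""
    values = []
    for row in rows:
        # Timestamps are assumed to look like "YYYY-MM-DD HH:MM".
        hour = int(row["timestamp"][11:13])
        values.append(hour)
    return values

# Two made-up rows, e.g. from a table of traffic incidents.
rows = [
    {"timestamp": "2017-11-02 08:15"},
    {"timestamp": "2017-11-02 17:40"},
]
print(hour_of_day(rows))
```

Keeping every feature to the same input and output shape is what would let the system pool contributions from many users and evaluate them side by side.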

The MIT researchers wrote code that mediates between the other software packages and manages data, pooling features submitted by many different users and tracking those collections of features that perform best on particular data analysis tasks.
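How such pooling and tracking might work can be sketched in a few lines. This is not the researchers' code: the feature pool, the toy data and the choice of a leave-one-out nearest-neighbour score are all assumptions made for illustration, standing in for the standard machine-learning packages the system actually relies on.

```python
from itertools import combinations

# Hypothetical pool of features submitted by different contributors:
# feature name -> one value per row of the dataset.
pool = {
    "informative": [1.0, 1.2, 0.9, 3.0, 3.1, 2.9],
    "noise": [0.5, 0.1, 0.3, 0.5, 0.1, 0.3],
}
target = [0, 0, 0, 1, 1, 1]  # label to predict for each row

def loo_accuracy(names):
    """Leave-one-out accuracy of a 1-nearest-neighbour classifier."""
    rows = list(zip(*(pool[n] for n in names)))
    correct = 0
    for i, row in enumerate(rows):
        # Find the closest other row by squared Euclidean distance.
        nearest = min(
            (j for j in range(len(rows)) if j != i),
            key=lambda j: sum((a - b) ** 2 for a, b in zip(row, rows[j])),
        )
        correct += target[nearest] == target[i]
    return correct / len(rows)

# Score every non-empty combination of pooled features, then rank by
# accuracy (highest first) and, on ties, by the smaller feature set.
leaderboard = sorted(
    (
        (loo_accuracy(combo), combo)
        for size in range(1, len(pool) + 1)
        for combo in combinations(pool, size)
    ),
    key=lambda entry: (-entry[0], len(entry[1])),
)
best_score, best_combo = leaderboard[0]
print(best_score, best_combo)
```

Exhaustively scoring every combination is only practical for a handful of features; with thousands of submissions a real system would need a cheaper search strategy.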

In the past, Veeramachaneni's group has developed software that automatically generates features by inferring relationships between data from the manner in which they're organized. When that organizational information is missing, however, the approach is less effective.

Still, Smith imagines, automatic feature synthesis could be used in conjunction with FeatureHub, getting projects started before volunteers have begun to contribute to them, saving the grunt work of enumerating the obvious features, and augmenting the best-performing sets of features contributed by humans.
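As a rough illustration of the automatic approach, the sketch below generates the "obvious" features from the way two tables are linked. It is a simplified stand-in, not the group's software; the tables, the `synthesize` helper and the set of aggregates are hypothetical.

```python
from statistics import mean

# Hypothetical relational data: each order row references a customer by id.
customers = [{"id": 1}, {"id": 2}]
orders = [
    {"customer_id": 1, "amount": 20.0},
    {"customer_id": 1, "amount": 40.0},
    {"customer_id": 2, "amount": 15.0},
]

# Aggregates applied mechanically to every parent-child relationship.
AGGREGATES = {"count": len, "sum": sum, "mean": mean}

def synthesize(parents, children, key, column):
    """Apply every aggregate to each parent's child values of `column`.

    Assumes every parent has at least one child row; `mean` would raise
    an error on an empty list.
    """
    features = {}
    for parent in parents:
        values = [c[column] for c in children if c[key] == parent["id"]]
        features[parent["id"]] = {
            f"{name}_{column}": fn(values) for name, fn in AGGREGATES.items()
        }
    return features

generated = synthesize(customers, orders, "customer_id", "amount")
print(generated)
```

The sketch depends entirely on knowing that orders point to customers, which is the kind of organizational information the automatic approach needs.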

Research Report: FeatureHub: Towards collaborative data science

