. 24/7 Space News .
CHIP TECH
Chip makes parallel programs run faster with less code
by Staff Writers
Boston MA (SPX) Jun 24, 2016


"Multicore systems are really hard to program," says Daniel Sanchez, an assistant professor in MIT's Department of Electrical Engineering and Computer Science. "You have to explicitly divide the work that you're doing into tasks, and then you need to enforce some synchronization between tasks accessing shared data. What this architecture does, essentially, is to remove all sorts of explicit synchronization, to make parallel programming much easier." Image courtesy Christine Daniloff/MIT.

Computer chips have stopped getting faster. For the past 10 years, chips' performance improvements have come from the addition of processing units known as cores. In theory, a program on a 64-core machine would be 64 times as fast as it would be on a single-core machine. But it rarely works out that way. Most computer programs are sequential, and splitting them up so that chunks of them can run in parallel causes all kinds of complications.

In the May/June issue of the Institute of Electrical and Electronics Engineers' journal Micro, researchers from MIT's Computer Science and Artificial Intelligence Laboratory (CSAIL) will present a new chip design they call Swarm, which should make parallel programs not only much more efficient but easier to write, too.

In simulations, the researchers compared Swarm versions of six common algorithms with the best existing parallel versions, which had been individually engineered by seasoned software developers. The Swarm versions were between three and 18 times as fast, but they generally required only one-tenth as much code - or even less. And in one case, Swarm achieved a 75-fold speedup on a program that computer scientists had so far failed to parallelize.

"Multicore systems are really hard to program," says Daniel Sanchez, an assistant professor in MIT's Department of Electrical Engineering and Computer Science, who led the project. "You have to explicitly divide the work that you're doing into tasks, and then you need to enforce some synchronization between tasks accessing shared data. What this architecture does, essentially, is to remove all sorts of explicit synchronization, to make parallel programming much easier. There's an especially hard set of applications that have resisted parallelization for many, many years, and those are the kinds of applications we've focused on in this paper."

Many of those applications involve the exploration of what computer scientists call graphs. A graph consists of nodes, typically depicted as circles, and edges, typically depicted as line segments connecting the nodes. Frequently, the edges have associated numbers called "weights," which might represent, say, the strength of correlations between data points in a data set, or the distances between cities.

Graphs crop up in a wide range of computer science problems, but their most intuitive use may be to describe geographic relationships. Indeed, one of the algorithms that the CSAIL researchers evaluated is the standard algorithm for finding the fastest driving route between two points.

Setting priorities
In principle, exploring graphs would seem to be something that could be parallelized: Different cores could analyze different regions of a graph or different paths through the graph at the same time. The problem is that with most graph-exploring algorithms, it gradually becomes clear that whole regions of the graph are irrelevant to the problem at hand. If, right off the bat, cores are tasked with exploring those regions, their exertions end up being fruitless.

Of course, fruitless analysis of irrelevant regions is a problem for sequential graph-exploring algorithms, too, not just parallel ones. So computer scientists have developed a host of application-specific techniques for prioritizing graph exploration. An algorithm might begin by exploring just those paths whose edges have the lowest weights, for instance, or it might look first at those nodes with the lowest number of edges.

What distinguishes Swarm from other multicore chips is that it has extra circuitry for handling that type of prioritization. It time-stamps tasks according to their priorities and begins working on the highest-priority tasks in parallel. Higher-priority tasks may engender their own lower-priority tasks, but Swarm slots those into its queue of tasks automatically.

Occasionally, tasks running in parallel may come into conflict. For instance, a task with a lower priority may write data to a particular memory location before a higher-priority task has read the same location. In those cases, Swarm automatically backs out the results of the lower-priority tasks. It thus maintains the synchronization between cores accessing the same data that programmers previously had to worry about themselves.

Indeed, from the programmer's perspective, using Swarm is pretty painless. When the programmer defines a function, he or she simply adds a line of code that loads the function into Swarm's queue of tasks. The programmer does have to specify the metric - such as edge weight or number of edges - that the program uses to prioritize tasks, but that would be necessary, anyway. Usually, adapting an existing sequential algorithm to Swarm requires the addition of only a few lines of code.

Keeping tabs
The hard work falls to the chip itself, which Sanchez designed in collaboration with Mark Jeffrey and Suvinay Subramanian, both MIT graduate students in electrical engineering and computer science; Cong Yan, who did her master's as a member of Sanchez's group and is now a PhD student at the University of Washington; and Joel Emer, a professor of the practice in MIT's Department of Electrical Engineering and Computer Science, and a senior distinguished research scientist at the chip manufacturer NVidia.

The Swarm chip has extra circuitry to store and manage its queue of tasks. It also has a circuit that records the memory addresses of all the data its cores are currently working on. That circuit implements something called a Bloom filter, which crams data into a fixed allotment of space and answers yes/no questions about its contents. If too many addresses are loaded into the filter, it will occasionally yield false positives - indicating "yes, I'm storing that address" - but it will never yield false negatives.

The Bloom filter is one of several circuits that help Swarm identify memory access conflicts. The researchers were able to show that time-stamping makes synchronization between cores easier to enforce. For instance, each data item is labeled with the time stamp of the last task that updated it, so tasks with later time-stamps know they can read that data without bothering to determine who else is using it.

Finally, all the cores occasionally report the time stamps of the highest-priority tasks they're still executing. If a core has finished tasks that have earlier time stamps than any of those reported by its fellows, it knows it can write its results to memory without courting any conflicts.


Thanks for being here;
We need your help. The SpaceDaily news network continues to grow but revenues have never been harder to maintain.

With the rise of Ad Blockers, and Facebook - our traditional revenue sources via quality network advertising continues to decline. And unlike so many other news sites, we don't have a paywall - with those annoying usernames and passwords.

Our news coverage takes time and effort to publish 365 days a year.

If you find our news sites informative and useful then please consider becoming a regular supporter or for now make a one off contribution.
SpaceDaily Contributor
$5 Billed Once


credit card or paypal
SpaceDaily Monthly Supporter
$5 Billed Monthly


paypal only


.


Related Links
Massachusetts Institute of Technology
Computer Chip Architecture, Technology and Manufacture
Nano Technology News From SpaceMart.com






Comment on this article via your Facebook, Yahoo, AOL, Hotmail login.

Share this article via these popular social media networks
del.icio.usdel.icio.us DiggDigg RedditReddit GoogleGoogle

Previous Report
CHIP TECH
Finessing Miniaturized Magnetics into the Microelectronics Mix
Washington DC (SPX) Jun 22, 2016
A newly-announced DARPA program is betting that unprecedented on-chip integration of workhorse electronic components, such as transistors and capacitors, with less-familiar magnetic components with names like circulators and isolators, will open an expansive pathway to more capable electromagnetic systems. The Magnetic, Miniaturized, and Monolithically Integrated Components (M3IC), program ... read more


CHIP TECH
Russian Moon Base to Hold Up to 12 People

US may approve private venture moon mission: report

Fifty Years of Moon Dust

Airbus Defence and Space to guide lunar lander to the Moon

CHIP TECH
Curiosity rover analysis suggests Mars has oxygen-rich history

NASA Scientists Discover Unexpected Mineral on Mars

Hardware for Journey to Mars is a 'Big Catch'

Opportunity Wraps up Work on 'Wheel Scuff'

CHIP TECH
Blue Origin has fourth successful rocket booster landing

TED Talks aim for wider global reach

Disney brings its brand to Shanghai with new theme park

Tech, beauty intersect in Silicon Valley

CHIP TECH
China's newest rocket ready for blast-off

China preparing for new era of space economy

China to send Chang'e-4 to south pole of moon's far-side

Experts Fear Chinese Space Station Could Crash Into Earth

CHIP TECH
Down to Earth: Returned astronaut relishes little things

NASA Ignites Fire Experiment Aboard Space Cargo Ship

A Burial Plot for the International Space Station

Three astronauts touch down after 6 months in space

CHIP TECH
LSU Chemistry Experiment Aboard Historic Suborbital Space Flight

Spaceflight contracts India's PSLV to launch 12 Planet Dove nanosats

Purdue experiment aboard Blue Origin suborbital rocket a success

Ariane 5 delivers its heaviest commercial payload

CHIP TECH
Newborn Planet Discovered Around Young Star

NASA's K2 Finds Newborn Exoplanet Around Young Star

"Electric Wind" Can Strip Earth-Like Planets of Oceans and Atmospheres

San Francisco State University astronomer helps discover giant planet orbiting 2 suns

CHIP TECH
Huge helium discovery 'a life-saving find'

Unveiling the distinctive features of a promising industrial microorganism

Scientists consider building cities of the future out of bone

Quantum calculations broaden the understanding of crystal catalysts









The content herein, unless otherwise known to be public domain, are Copyright 1995-2024 - Space Media Network. All websites are published in Australia and are solely subject to Australian law and governed by Fair Use principals for news reporting and research purposes. AFP, UPI and IANS news wire stories are copyright Agence France-Presse, United Press International and Indo-Asia News Service. ESA news reports are copyright European Space Agency. All NASA sourced material is public domain. Additional copyrights may apply in whole or part to other bona fide parties. All articles labeled "by Staff Writers" include reports supplied to Space Media Network by industry news wires, PR agencies, corporate press officers and the like. Such articles are individually curated and edited by Space Media Network staff on the basis of the report's information value to our industry and professional readership. Advertising does not imply endorsement, agreement or approval of any opinions, statements or information provided by Space Media Network on any Web page published or hosted by Space Media Network. General Data Protection Regulation (GDPR) Statement Our advertisers use various cookies and the like to deliver the best ad banner available at one time. All network advertising suppliers have GDPR policies (Legitimate Interest) that conform with EU regulations for data collection. By using our websites you consent to cookie based advertising. If you do not agree with this then you must stop using the websites from May 25, 2018. Privacy Statement. Additional information can be found here at About Us.