In a study published in Frontiers in Science, scientists from Purdue University and the Georgia Institute of Technology describe practical strategies to overcome the hardware bottlenecks of current computing platforms. They focus on reducing data movement between processors and memory as a route to lower energy use in AI workloads. Most conventional computers follow the von Neumann model, which separates processing units from memory. Each time an AI system needs information, data must travel between these units; this bottleneck, known as the memory wall, dominates both latency and power consumption in AI processing.
The team contends that integrating processing functions within or directly beside memory would lessen this bottleneck. Such designs would make it possible to deploy new classes of algorithms that keep data local, cutting transfer overhead and supporting AI applications that must operate efficiently at the edge rather than in large server farms.
"Language processing models have grown 5,000-fold in size over the last four years. This alarmingly rapid expansion makes it crucial that AI is as efficient as possible. That means fundamentally rethinking how computers are designed," said Kaushik Roy, Professor of Electrical and Computer Engineering at Purdue University and lead author of the study.
The researchers look to the human brain as a template for more efficient AI computing. In biological neural networks, information storage and processing occur in the same place, avoiding the constant shuttling of data that characterizes standard digital machines.
When a neuron receives signals from other neurons, it accumulates electrical charge, raising its membrane potential. Once this potential crosses a threshold, the neuron fires a spike that passes information on to other cells, so communication occurs mainly when meaningful changes happen.
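The study does not prescribe a particular neuron model, but the behavior described above maps onto the common leaky integrate-and-fire abstraction. The Python sketch below is an illustrative version of that idea, not the authors' implementation; the function name, the leak factor, and the threshold value are assumptions chosen for clarity.

```python
import numpy as np

def lif_neuron(input_current, threshold=1.0, leak=0.9, dt=1.0):
    """Simulate a single leaky integrate-and-fire neuron over discrete time steps.

    input_current : 1-D array of input values, one per time step
    threshold     : membrane potential at which the neuron fires a spike
    leak          : fraction of the potential retained each step (models decay)
    """
    potential = 0.0
    spikes = []
    for current in input_current:
        potential = leak * potential + current * dt  # integrate input, with leak
        if potential >= threshold:
            spikes.append(1)          # emit a spike only when the threshold is crossed
            potential = 0.0           # reset after firing
        else:
            spikes.append(0)          # otherwise stay silent: nothing is transmitted
    return np.array(spikes)

# Sparse output: the neuron communicates only when enough charge has accumulated.
inputs = np.array([0.0, 0.3, 0.4, 0.5, 0.0, 0.0, 0.2, 0.9, 0.0])
print(lif_neuron(inputs))  # -> [0 0 0 1 0 0 0 1 0]
```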
This operating principle has inspired spiking neural networks, or SNNs, which process information as sequences of discrete events in time. SNNs are well suited to tasks driven by irregular, occasional inputs, while conventional deep learning networks continue to dominate data-heavy jobs such as face recognition, image classification, image analysis, and 3D reconstruction.
"The capabilities of the human brain have long been an inspiration for AI systems. Machine learning algorithms came from the brain's ability to learn and generalize from input data. Now we want to take this to the next level and recreate the brain's efficient processing mechanisms," said Adarsh Kosta, co-author and researcher at Purdue University.
The authors argue that neuro-inspired computing could extend AI beyond large-scale data centers into mobile and embedded platforms. They highlight autonomous drones as one example where on-board intelligence must respond instantly to changing conditions.
In a search-and-rescue mission, a drone must sense its surroundings, detect and track objects, choose actions, and plan flight paths in real time. If it relies heavily on remote cloud services, communication delays can disrupt these time-critical functions, so much of the processing needs to stay on the vehicle.
In these situations, the computing hardware must remain light and energy-efficient so the drone can carry it and still achieve useful range. One option is to fit event-based cameras that emit data only when pixel intensities change sufficiently, instead of streaming continuous video frames.
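As a rough illustration of that idea, the sketch below emulates event-based sensing by comparing two consecutive frames and reporting only the pixels whose intensity changed by more than a threshold. Real event cameras detect changes asynchronously at each pixel in hardware, so this frame-difference version, along with the `frames_to_events` name and the threshold value, is a simplified assumption for illustration only.

```python
import numpy as np

def frames_to_events(prev_frame, new_frame, threshold=0.15):
    """Toy model of event-based sensing: emit an event only where a pixel's
    intensity changed by more than `threshold`, instead of sending every pixel.

    Returns a list of (row, col, polarity) tuples, where polarity is +1 for
    brightening and -1 for dimming.
    """
    diff = new_frame.astype(float) - prev_frame.astype(float)
    rows, cols = np.where(np.abs(diff) > threshold)
    return [(r, c, 1 if diff[r, c] > 0 else -1) for r, c in zip(rows, cols)]

# A mostly static scene produces only a handful of events.
prev = np.full((4, 4), 0.5)
new = prev.copy()
new[1, 2] = 0.9   # one pixel brightens
new[3, 0] = 0.1   # one pixel dims
print(frames_to_events(prev, new))   # -> [(1, 2, 1), (3, 0, -1)]
```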
Because event-driven sensors generate less data, they use less power and reduce the bandwidth required for perception. However, their intermittent, timing-dependent outputs do not match well with traditional frame-based processors and algorithms.
SNN-based processors align more closely with the operation of event-based cameras because they naturally handle temporal spike trains. By exploiting these sparse event streams, SNNs can extract relevant information while avoiding constant, redundant computation.
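A hypothetical sketch of what avoiding redundant computation can mean in practice: the layer update below touches synaptic weights only for inputs that actually spiked, so the work per time step scales with the number of events rather than with the input size. The layer structure, leak, and threshold are illustrative assumptions, not details taken from the study.

```python
import numpy as np

def event_driven_layer_step(potentials, weights, input_spikes, threshold=1.0, leak=0.95):
    """One time step of a spiking layer driven by a sparse input spike vector.

    Only the weight columns matching active inputs are read, so the synaptic
    work done each step is proportional to the number of events.
    """
    active = np.flatnonzero(input_spikes)          # indices of inputs that spiked
    potentials *= leak                              # passive decay every step
    if active.size:                                 # skip all synaptic work if no events
        potentials += weights[:, active].sum(axis=1)
    out_spikes = potentials >= threshold
    potentials[out_spikes] = 0.0                    # reset neurons that fired
    return potentials, out_spikes.astype(int)

rng = np.random.default_rng(0)
weights = rng.uniform(0.0, 0.6, size=(3, 8))        # 8 inputs feeding 3 neurons
potentials = np.zeros(3)
spikes_in = np.array([0, 0, 1, 0, 0, 0, 1, 0])       # only two events this step
potentials, spikes_out = event_driven_layer_step(potentials, weights, spikes_in)
print(spikes_out)
```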
This combination of neuromorphic algorithms and event-based sensing could give drones either more capability at a given battery size or longer operating time for the same mission profile. The researchers suggest that similar efficiency gains could support AI systems in transportation, medical devices, and other domains that need low-power, on-device intelligence.
"AI is one of the most transformative technologies of the 21st century. However, to move it out of data centers and into the real world, we need to dramatically reduce its energy use. With less data transfer and more efficient processing, AI can fit into small, affordable devices with batteries that last longer," said Tanvi Sharma, co-author and researcher at Purdue University.
The authors note that to fully exploit SNNs, new hardware must tackle the memory wall directly. They emphasize compute-in-memory, or CIM, architectures as a promising way to execute AI calculations where the data resides.
CIM systems perform arithmetic inside or at the edge of memory arrays, cutting down on the cost of shuttling values between separate logic and storage blocks. This layout is particularly useful for SNNs, which must frequently read and update neuron membrane potentials over time.
The paper outlines two primary CIM options. Analog implementations use currents flowing through the memory cells themselves to carry out vector-matrix operations, while digital schemes embed standard logic operations within or alongside the memory arrays.
Digital CIM offers higher numerical accuracy and robustness to noise but tends to consume more energy than analog approaches. Analog CIM can be more energy-efficient, though it faces challenges in precision, variability, and design complexity.
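The following sketch contrasts the two styles in simplified form: a "digital" path computes the vector-matrix product exactly, while an "analog" path models each memory cell's current as conductance times voltage, sums the currents along a column, and adds device variability and read noise to stand in for the precision limits mentioned above. Real analog arrays use nonnegative conductances (often paired cells to represent signed weights), so the signed weights here, like the noise levels, are assumptions made purely for illustration.

```python
import numpy as np

rng = np.random.default_rng(42)

def digital_cim_matvec(weights, inputs):
    """Digital CIM sketch: exact multiply-accumulate performed next to the memory array."""
    return weights @ inputs

def analog_cim_matvec(conductances, voltages, variability=0.05, read_noise=0.01):
    """Analog CIM sketch: each cell's current is conductance * voltage, and
    currents summed along a column yield one dot product. Device-to-device
    variability and read noise limit the achievable precision.
    """
    perturbed = conductances * (1 + variability * rng.standard_normal(conductances.shape))
    currents = perturbed @ voltages
    return currents + read_noise * rng.standard_normal(currents.shape)

weights = rng.uniform(-1, 1, size=(4, 6))   # illustrative stored weights / conductances
inputs = rng.uniform(0, 1, size=6)          # illustrative input activations / voltages

print("digital:", np.round(digital_cim_matvec(weights, inputs), 3))
print("analog :", np.round(analog_cim_matvec(weights, inputs), 3))
```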
The authors survey several device technologies that could underpin CIM hardware but conclude that no single choice dominates across all metrics. Instead, they recommend selecting device types and circuit techniques on an application-specific basis.
They argue that future platforms should treat algorithms, circuits, and memory technology as parts of a single co-designed stack. With this approach, each use case can adopt the most suitable building blocks rather than forcing one universal solution.
"Co-designing the hardware and algorithms together is the only way to break through the memory wall and deliver fast, lightweight, low-power AI," said Roy. "This collaborative design approach could also create platforms that are far more versatile by switching between traditional AI networks and neuro-inspired networks depending on the application."
The study concludes that brain-inspired algorithms combined with compute-in-memory hardware could ease the energy burden of AI and make advanced capabilities more accessible on small devices. The authors point to ongoing research in neuromorphic computing as a route to systems that dynamically select between conventional and spiking networks depending on workload demands.
Research Report: Breaking the memory wall: next-generation artificial intelligence hardware
Related Links
Frontiers