24/7 Space News
New insights into training dynamics of deep classifiers
An MIT study offers a first theoretical analysis of the training dynamics of deep classifiers, including convolutional neural networks, and provides new insights into how key properties emerge during training.
Boston MA (SPX) Mar 09, 2023

A new study from researchers at MIT and Brown University characterizes several properties that emerge during the training of deep classifiers, a type of artificial neural network commonly used for classification tasks such as image classification, speech recognition, and natural language processing.

The paper, "Dynamics in Deep Classifiers Trained with the Square Loss: Normalization, Low Rank, Neural Collapse, and Generalization Bounds," published in the journal Research, is the first of its kind to theoretically explore the dynamics of training deep classifiers with the square loss and how properties such as rank minimization, neural collapse, and dualities between the activation of neurons and the weights of the layers are intertwined.

In the study, the authors focused on two types of deep classifiers: fully connected deep networks and convolutional neural networks (CNNs).

A previous study examined the structural properties that develop in large neural networks at the final stages of training. That study focused on the last layer of the network and found that deep networks trained to fit a training dataset will eventually reach a state known as "neural collapse." When neural collapse occurs, the network maps multiple examples of a particular class (such as images of cats) to a single template of that class. Ideally, the templates for each class should be as far apart from each other as possible, allowing the network to accurately classify new examples.
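As a loose illustration (not code from either study), neural collapse can be quantified by comparing the within-class scatter of last-layer features to the between-class scatter of the class-mean templates: collapsed features sit almost exactly on their class mean. A minimal numpy sketch with synthetic, already-collapsed features:

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic last-layer features: 3 classes, 50 examples each, tightly
# clustered around well-separated class means (the collapsed regime).
means = np.array([[10.0, 0.0], [-5.0, 8.0], [-5.0, -8.0]])
feats = np.concatenate([m + 0.01 * rng.standard_normal((50, 2)) for m in means])
labels = np.repeat(np.arange(3), 50)

def collapse_ratio(feats, labels):
    """Within-class variability divided by between-class variability.
    Values near 0 indicate neural collapse."""
    global_mean = feats.mean(axis=0)
    within, between = 0.0, 0.0
    for c in np.unique(labels):
        cls = feats[labels == c]
        mu = cls.mean(axis=0)
        within += ((cls - mu) ** 2).sum()
        between += len(cls) * ((mu - global_mean) ** 2).sum()
    return within / between

print(collapse_ratio(feats, labels))  # tiny value (far below 1): collapsed
```

With features scattered randomly rather than clustered, the same ratio would be large, which is why it is a convenient diagnostic for the final stage of training.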

An MIT group based at the MIT Center for Brains, Minds and Machines studied the conditions under which networks can achieve neural collapse. They found that deep networks trained to fit their training data will display neural collapse given three ingredients: stochastic gradient descent (SGD), weight decay regularization (WD), and weight normalization (WN). In contrast to the empirical approach of the earlier study, the MIT group took a theoretical one, proving that neural collapse emerges from the minimization of the square loss using SGD, WD, and WN.
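The three ingredients can be sketched on a toy problem. The following is an illustrative numpy example (not the paper's code, and a linear model rather than a deep network): the square loss is minimized by mini-batch SGD, each update includes a weight-decay term, and the weights are parametrized in weight-normalized form as a unit direction times a learned scale.

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy regression: recover w_true under the square loss.
n, d = 200, 5
X = rng.standard_normal((n, d))
w_true = rng.standard_normal(d)
y = X @ w_true

# Weight normalization (WN): w = g * v / ||v||  (direction v, scale g).
v = rng.standard_normal(d)
g = 1.0
lr, wd = 0.05, 1e-4  # learning rate and weight-decay strength

for step in range(500):
    batch = rng.choice(n, size=32, replace=False)      # stochastic mini-batch
    Xb, yb = X[batch], y[batch]
    norm_v = np.linalg.norm(v)
    u = v / norm_v
    w = g * u
    grad_w = 2.0 / len(batch) * Xb.T @ (Xb @ w - yb)   # square-loss gradient
    grad_g = grad_w @ u                                # chain rule through WN
    grad_v = (g / norm_v) * (grad_w - (grad_w @ u) * u)
    # SGD step with weight decay (WD) on both parameters.
    g -= lr * (grad_g + wd * g)
    v -= lr * (grad_v + wd * v)

w = g * v / np.linalg.norm(v)
print(np.max(np.abs(w - w_true)))  # close to zero: the data is fit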

Co-author and MIT McGovern Institute postdoc Akshay Rangamani states, "Our analysis shows that neural collapse emerges from the minimization of the square loss with highly expressive deep neural networks. It also highlights the key roles played by weight decay regularization and stochastic gradient descent in driving solutions towards neural collapse."

Weight decay is a regularization technique that prevents the network from over-fitting the training data by reducing the magnitude of the weights. Weight normalization scales the weight matrices of a network so that they have a similar scale. Low rank refers to a property of a matrix where it has a small number of non-zero singular values. Generalization bounds offer guarantees about the ability of a network to accurately predict new examples that it has not seen during training.
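Two of these definitions are easy to make concrete. In the hypothetical numpy snippet below, low rank is read off from the singular values of a weight matrix, and the weight-decay portion of an SGD update is shown shrinking a matrix's norm:

```python
import numpy as np

rng = np.random.default_rng(2)

# A rank-2 matrix disguised as 10x10: the product of 10x2 and 2x10
# factors has at most 2 non-zero singular values.
A = rng.standard_normal((10, 2)) @ rng.standard_normal((2, 10))
s = np.linalg.svd(A, compute_uv=False)
rank = int(np.sum(s > 1e-10 * s[0]))  # count numerically non-zero values
print(rank)  # 2

# Weight decay shrinks all weights multiplicatively each step.
W = rng.standard_normal((4, 4))
lr, wd = 0.1, 0.01
W_decayed = W - lr * wd * W           # the decay part of an SGD + WD update
print(np.linalg.norm(W_decayed) < np.linalg.norm(W))  # True: the norm shrinks
```

A low-rank weight matrix is, in effect, a compressed map: it sends its inputs through a narrow bottleneck, which is the structural bias the paper connects to generalization.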

The authors found that the same theoretical observation that predicts a low-rank bias also predicts the existence of an intrinsic SGD noise in the weight matrices and in the output of the network. This noise is not generated by the randomness of the SGD algorithm but by an interesting dynamic trade-off between rank minimization and fitting of the data, which provides an intrinsic source of noise similar to what happens in dynamical systems in the chaotic regime. Such a random-like search may be beneficial for generalization because it may prevent over-fitting.

"Interestingly, this result validates the classical theory of generalization showing that traditional bounds are meaningful. It also provides a theoretical explanation for the superior performance in many tasks of sparse networks, such as CNNs, with respect to dense networks," comments co-author and MIT McGovern Institute postdoc Tomer Galanti. In fact, the authors prove new norm-based generalization bounds for CNNs with localized kernels, that is, networks with sparse connectivity in their weight matrices.

In this case, generalization can be orders of magnitude better than for densely connected networks. The result also goes against a number of recent papers expressing doubts about past approaches to generalization. Thus far, the fact that CNNs, and not dense networks, represent the success story of deep learning has been almost completely ignored by machine learning theory; the theory presented here suggests this is an important insight into why deep networks work as well as they do.
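The sparsity of localized kernels is easy to see by writing a convolution as an equivalent dense matrix. In this illustrative numpy sketch (my construction, not the paper's), a 1-D convolution acting on 64 inputs is banded and weight-shared, so it has 3 free parameters where a dense layer of the same shape has thousands:

```python
import numpy as np

def conv_matrix(kernel, n):
    """Dense matrix equivalent of a 1-D 'valid' cross-correlation."""
    k = len(kernel)
    M = np.zeros((n - k + 1, n))
    for i in range(n - k + 1):
        M[i, i:i + k] = kernel  # the same kernel slides along the diagonal
    return M

n, k = 64, 3
kernel = np.array([1.0, -2.0, 1.0])
M = conv_matrix(kernel, n)

# Same input/output shapes as a dense (n-k+1) x n layer, but:
dense_params = (n - k + 1) * n   # every entry is a free parameter
conv_params = k                  # only the shared kernel is free
print(dense_params, conv_params) # 3968 3

# Each output touches only k inputs: localized (sparse) connectivity.
print(np.count_nonzero(M[0]))    # 3
```

Norm-based bounds reward exactly this structure: with so few shared weights, the relevant norms of the layer stay small, which is the intuition behind the orders-of-magnitude gap described above.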

"This study provides one of the first theoretical analyses covering optimization, generalization, and approximation in deep networks and offers new insights into the properties that emerge during training," says co-author Tomaso Poggio, the Eugene McDermott Professor at the Department of Brain and Cognitive Sciences at MIT and co-director of the Center for Brains, Minds and Machines. "Our results have the potential to advance our understanding of why deep learning works as well as it does."

Research Report: "Dynamics in Deep Classifiers Trained with the Square Loss: Normalization, Low Rank, Neural Collapse, and Generalization Bounds"

Related Links
McGovern Institute for Brain Research