Shrinking massive neural networks used to model language


Published:
December 3, 2020

A new approach could lower computing costs and increase accessibility to state-of-the-art natural language processing.

WNCG student Tianlong Chen is the lead author of an artificial intelligence study positing that, hidden within massive neural networks, there exist leaner subnetworks that can complete the same task more efficiently. The study is co-authored by WNCG assistant professor Zhangyang "Atlas" Wang, along with Jonathan Frankle of MIT CSAIL, and Shiyu Chang, Sijia Liu, and Yang Zhang, all of the MIT-IBM Watson AI Lab.
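
The idea that a large trained network hides a much sparser subnetwork that can do the same job is often explored through magnitude pruning. The sketch below is only an illustration of that general technique, not the method from the paper; the model, layer sizes, and sparsity level are arbitrary assumptions chosen for clarity.

import torch
import torch.nn as nn

# Illustrative two-layer network; the sizes are arbitrary assumptions.
model = nn.Sequential(nn.Linear(784, 256), nn.ReLU(), nn.Linear(256, 10))

def magnitude_prune(model, sparsity=0.8):
    """Zero out the smallest-magnitude weights, keeping a sparse 'subnetwork'.

    Returns a dict of binary masks (one per weight matrix) that could be
    re-applied after each optimizer step to keep pruned weights at zero.
    """
    masks = {}
    for name, param in model.named_parameters():
        if param.dim() < 2:      # skip biases
            continue
        k = int(sparsity * param.numel())
        # Threshold = magnitude of the k-th smallest weight in this matrix.
        threshold = param.abs().flatten().kthvalue(k).values
        mask = (param.abs() > threshold).float()
        param.data.mul_(mask)    # zero out the pruned weights in place
        masks[name] = mask
    return masks

masks = magnitude_prune(model, sparsity=0.8)
kept = sum(int(m.sum()) for m in masks.values())
total = sum(m.numel() for m in masks.values())
print(f"Kept {kept}/{total} weights ({100 * kept / total:.1f}%)")

In practice this prune step is typically interleaved with retraining, so the surviving weights recover the accuracy of the full model.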

The study will be presented this month at the Conference on Neural Information Processing Systems (NeurIPS).

Tianlong Chen joined Texas ECE in Fall 2020. He received his bachelor's degree (B.Sc.) in Applied Mathematics and a dual degree (B.Eng.) in Computer Science from the School of the Gifted Young, University of Science and Technology of China, in 2017.

Zhangyang "Atlas" Wang is an Assistant Professor of Electrical and Computer Engineering at The University of Texas at Austin beginning in Fall 2020. He was an Assistant Professor of Computer Science and Engineering, at the Texas A&M University, from 2017 to 2020. Prof. Wang is broadly interested in the fields of machine learning, computer vision, optimization, and their interdisciplinary applications.

Read the article by Daniel Ackerman of the MIT News Office:

https://news.mit.edu/2020/neural-model-language-1201
