Thesis: A Study of Generalization in Deep Neural Networks

Published in University of California, Santa Barbara, 2021

Recommended citation: Vamshi C Madala. 2021. A study of generalization in deep neural networks. University of California, Santa Barbara. https://www.proquest.com/docview/2604320736/abstract/967E2748BD3A4BFAPQ/1?accountid=14522

One of the important challenges in deep learning today is explaining the remarkable generalization ability of deep neural networks: how they avoid the curse of dimensionality and perform exceptionally well in tasks such as computer vision, natural language processing and, more recently, physical problems like protein folding. Various bounds on the generalization error of DNNs have been proposed; however, all of them have been shown empirically to be numerically vacuous. In this study we approach the problem of understanding generalization in DNNs by investigating how different attributes of DNNs, both structural (width, depth, kernel parameters, skip connections, etc.) and functional (intermediate feature representations, receptive fields of CNN kernels, etc.), affect generalization. We present experimental results showing which of these attributes influence generalization the most and which the least, and discuss the results in relation to the theoretical bounds.
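As a rough illustration of the kind of experiment the abstract describes, the sketch below trains small MLPs of varying width and depth on a synthetic task and reports the train/test accuracy gap as a crude proxy for generalization. This is a hypothetical example, not code from the thesis; the dataset, architectures, and hyperparameters are all illustrative assumptions.

```python
# Hypothetical sketch: vary width and depth of an MLP and measure the
# train/test accuracy gap as a crude proxy for generalization.
import torch
import torch.nn as nn

torch.manual_seed(0)

def make_data(n=2000, d=20):
    # Synthetic binary task: label depends on the sum of the first 5 features.
    X = torch.randn(n, d)
    y = (X[:, :5].sum(dim=1) > 0).long()
    return X[:1000], y[:1000], X[1000:], y[1000:]

def make_mlp(d_in, width, depth, n_classes=2):
    layers, d = [], d_in
    for _ in range(depth):
        layers += [nn.Linear(d, width), nn.ReLU()]
        d = width
    layers += [nn.Linear(d, n_classes)]
    return nn.Sequential(*layers)

def accuracy(model, X, y):
    with torch.no_grad():
        return (model(X).argmax(dim=1) == y).float().mean().item()

X_tr, y_tr, X_te, y_te = make_data()
for width in (16, 64, 256):
    for depth in (1, 3, 6):
        model = make_mlp(X_tr.shape[1], width, depth)
        opt = torch.optim.Adam(model.parameters(), lr=1e-3)
        loss_fn = nn.CrossEntropyLoss()
        for _ in range(200):  # full-batch training for simplicity
            opt.zero_grad()
            loss_fn(model(X_tr), y_tr).backward()
            opt.step()
        gap = accuracy(model, X_tr, y_tr) - accuracy(model, X_te, y_te)
        print(f"width={width:4d} depth={depth}  generalization gap={gap:.3f}")
```

The grid over width and depth mirrors the structural attributes named above; functional attributes (e.g. intermediate feature representations) would require probing activations rather than only the final outputs.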

Download paper here: Link1