On Learnable Parameters of Optimal and Suboptimal Deep Learning Models

NU author(s): Ziwei Zheng, Dr Huizhi Liang, Dr Varun Ojha

Licence

This work is licensed under a Creative Commons Attribution 4.0 International License (CC BY 4.0).


Abstract

We scrutinize the structural and operational aspects of deep learning models, focusing on the statistics, distribution, node interaction, and visualization of their learnable parameters (weights). By establishing correlations between the variance in weight patterns and overall network performance, we investigate why deep learning models reach optimal or suboptimal performance. Our empirical analysis covers the widely used datasets MNIST, Fashion-MNIST, and CIFAR-10, and several model families: deep neural networks (DNNs), convolutional neural networks (CNNs), and vision transformers (ViTs). This lets us pinpoint the characteristics of learnable parameters that correlate with successful networks. Through extensive experiments on these diverse architectures, we shed light on the critical factors that influence the functionality and efficiency of DNNs. Our findings reveal that successful networks, irrespective of dataset or model, are invariably similar to other successful networks in their converged weight statistics and distribution, while poorly performing networks vary in their weights. In addition, our research shows that the learnable parameters of widely varied deep learning models such as DNNs, CNNs, and ViTs exhibit similar learning characteristics.


Publication metadata

Author(s): Zheng Z, Liang H, Snasel V, Latora V, Pardalos P, Nicosia G, Ojha V

Publication type: Conference Proceedings (inc. Abstract)

Publication status: Published

Conference Name: 38th International Conference on Neural Information Processing

Year of Conference: 2025

Pages: 126-140

Online publication date: 24/06/2025

Acceptance date: 21/08/2024

Date deposited: 22/10/2024

Publisher: Springer, Singapore

URL: https://doi.org/10.1007/978-981-96-6582-2_9

DOI: 10.1007/978-981-96-6582-2_9

ePrints DOI: 10.57711/n3qt-8894

Series Title: Lecture Notes in Computer Science

ISBN: 9789819665815

