MATHEMATICAL PROPERTIES OF ACTIVATION FUNCTIONS IN ARTIFICIAL INTELLIGENCE DEVELOPMENTS: Analysis and Implications for Deep Neural Architectures

Academic Article

Publication Date:

2026

Short description:

MATHEMATICAL PROPERTIES OF ACTIVATION FUNCTIONS IN ARTIFICIAL INTELLIGENCE DEVELOPMENTS: Analysis and Implications for Deep Neural Architectures / Ferrara, Massimiliano; Ciccia, Celeste. - In: THE JOURNAL OF THE INDIAN ACADEMY OF MATHEMATICS. - ISSN 0970-5120. - 48:1(2026), pp. 1-9.

abstract:

Activation functions govern the expressive power and training dynamics of deep neural networks through their analytical properties. This paper provides a rigorous mathematical analysis of six fundamental activation functions – Linear, Sigmoid, Hyperbolic Tangent, ReLU, Parametric ReLU, and Exponential Linear Unit – examining how regularity, gradient structure, and spectral properties influence representational capacity, gradient flow stability, and convergence behavior in deep architectures. We establish formal results on the representational collapse of linear activations, derive sharp gradient decay bounds for saturating functions, prove gradient preservation theorems for piecewise-linear activations, and characterize the convergence advantages of smooth non-saturating units. Our analysis yields a unified mathematical framework connecting activation function properties to network trainability, with direct implications for the design of deep learning architectures in sequential decision-making, continuous control, and safety-critical applications.
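
As a rough illustration of the six functions named in the abstract, the following sketch uses their standard textbook definitions. It is not taken from the article: the function names, the alpha defaults, and the helper chain_factor_bound are assumptions made for this example. The helper reflects only the elementary observation that a product of derivatives each bounded by 1/4 (the sigmoid case) shrinks geometrically with depth. It ignores weight matrices and is not the article's sharp bound.

```python
import math

# Standard textbook definitions; the alpha defaults are illustrative assumptions.
def linear(x):
    return x

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def tanh(x):
    return math.tanh(x)

def relu(x):
    return max(0.0, x)

def prelu(x, alpha=0.25):
    # alpha is a learned parameter in practice; fixed here for illustration.
    return x if x > 0 else alpha * x

def elu(x, alpha=1.0):
    return x if x > 0 else alpha * (math.exp(x) - 1.0)

# Derivatives of the two saturating functions and of ReLU.
def sigmoid_grad(x):
    s = sigmoid(x)
    return s * (1.0 - s)  # maximum value 1/4, attained at x = 0

def tanh_grad(x):
    return 1.0 - math.tanh(x) ** 2  # maximum value 1, attained at x = 0

def relu_grad(x):
    return 1.0 if x > 0 else 0.0  # exactly 1 on the active side

def chain_factor_bound(max_grad, depth):
    """Hypothetical helper: upper bound on a product of `depth` activation
    derivatives, each at most `max_grad` (weights are ignored)."""
    return max_grad ** depth

if __name__ == "__main__":
    for depth in (5, 10, 20):
        print(depth, chain_factor_bound(0.25, depth), chain_factor_bound(1.0, depth))
```

Under these assumptions, the sigmoid factor decays as 0.25 raised to the depth, while the factor for an active ReLU path stays at 1. This is the informal intuition behind the contrast the abstract draws between saturating and piecewise-linear activations.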

Iris type:

1.1 Journal article

List of contributors: