Optimising neural network training efficiency through spectral parameter-based multiple adaptive learning rates
The process of training deep neural networks heavily involves solving optimization problems. Finding suitable values for its many hyperparameters makes training neural networks challenging. The learning rate, or step size, is one of the most crucial hyperparameters in gradient-based optimization...
| Main Author: | Koay, Yeong Lin |
|---|---|
| Format: | Final Year Project / Dissertation / Thesis |
| Published: | 2023 |
| Subjects: | Q Science (General); QA Mathematics |
| Online Access: | http://eprints.utar.edu.my/6338/ http://eprints.utar.edu.my/6338/1/4._Revised_Dissertation_Koay_Yeong_Lin.pdf |
| author | Koay, Yeong Lin |
|---|---|
| building | UTAR Institutional Repository |
| collection | Online Access |
| description | The process of training deep neural networks heavily involves solving optimization problems. Finding suitable values for its many hyperparameters makes training neural networks challenging. The learning rate, or step size, is one of the most crucial hyperparameters in gradient-based optimization. A small learning rate may lead to slow convergence or leave the loss stuck in a local minimum, whereas a large learning rate may hinder convergence or cause divergence. Currently, most common optimization algorithms use a fixed learning rate or a simplified adaptive updating scheme in every iteration. In this project, we propose a stochastic gradient descent method with multiple adaptive learning rates (MAdaGrad) and Adam with multiple adaptive learning rates (MAdaGrad Adam). In deriving the updating formula, we minimize a log-determinant norm while requiring the update to satisfy the secant equation. The constrained minimization is handled with a Lagrange multiplier, which is approximated using the Newton-Raphson method. The proposed algorithms update the learning rates in every iteration based on the approximated spectrum of the Hessian of the loss function. The methods were compared with existing deep-learning optimizers, namely stochastic gradient descent (SGD) and Adam, on several datasets. The numerical results show that the proposed methods perform better than SGD and Adam. Hence, MAdaGrad and MAdaGrad Adam can serve as alternative optimizers in machine learning. (A hedged reconstruction of this kind of update is sketched after the metadata table below.) |
| first_indexed | 2025-11-15T19:41:52Z |
| format | Final Year Project / Dissertation / Thesis |
| id | utar-6338 |
| institution | Universiti Tunku Abdul Rahman |
| institution_category | Local University |
| last_indexed | 2025-11-15T19:41:52Z |
| publishDate | 2023 |
| recordtype | eprints |
| repository_type | Digital Repository |
| spelling | utar-6338 2024-04-14T10:51:15Z Koay, Yeong Lin (2023) Optimising neural network training efficiency through spectral parameter-based multiple adaptive learning rates. Master dissertation/thesis, UTAR. 2023. Final Year Project / Dissertation / Thesis. NonPeerReviewed. application/pdf. http://eprints.utar.edu.my/6338/1/4._Revised_Dissertation_Koay_Yeong_Lin.pdf http://eprints.utar.edu.my/6338/ |
| title | Optimising neural network training efficiency through spectral parameter-based multiple adaptive learning rates |
| topic | Q Science (General); QA Mathematics |
| url | http://eprints.utar.edu.my/6338/ http://eprints.utar.edu.my/6338/1/4._Revised_Dissertation_Koay_Yeong_Lin.pdf |
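
The abstract describes the derivation only in words: a log-determinant norm is minimized subject to the secant equation, the constraint is handled with a Lagrange multiplier, and the multiplier is approximated by the Newton-Raphson method. The record reproduces none of the thesis's formulas, so the block below is a hedged reconstruction of the kind of constrained problem that wording suggests, assuming a diagonal Hessian approximation D = diag(d_1, ..., d_n) and a weakened (scalar) secant condition; the thesis's actual objective, constraint, and notation may well differ.

```latex
% Hypothetical sketch only: the record reproduces no formulas, so the objective,
% constraint, and notation below are assumptions rather than the thesis's derivation.
% Diagonal model D = diag(d_1,...,d_n), step s_k = w_{k+1} - w_k,
% gradient change y_k = \nabla f(w_{k+1}) - \nabla f(w_k).
\[
\min_{d_1,\dots,d_n > 0} \;\; \varphi(D) = \operatorname{tr}(D) - \ln\det(D)
\qquad \text{subject to} \qquad s_k^{\top} D\, s_k = s_k^{\top} y_k .
\]
% Stationarity of the Lagrangian L(D,\mu) = \varphi(D) - \mu (s_k^T D s_k - s_k^T y_k):
\[
\frac{\partial L}{\partial d_i} = 1 - \frac{1}{d_i} - \mu\, s_{k,i}^{2} = 0
\quad\Longrightarrow\quad
d_i = \frac{1}{1 - \mu\, s_{k,i}^{2}} .
\]
% The multiplier \mu is the root of a scalar equation, approximated by Newton-Raphson
% iterations of the kind the abstract mentions:
\[
g(\mu) = \sum_{i=1}^{n} \frac{s_{k,i}^{2}}{1 - \mu\, s_{k,i}^{2}} - s_k^{\top} y_k = 0,
\qquad
\mu_{t+1} = \mu_t - \frac{g(\mu_t)}{g'(\mu_t)},
\qquad
g'(\mu) = \sum_{i=1}^{n} \frac{s_{k,i}^{4}}{\bigl(1 - \mu\, s_{k,i}^{2}\bigr)^{2}} .
\]
% Each d_i then acts as an approximate spectral (curvature) parameter, and the
% per-parameter learning rates are alpha_i = 1 / d_i.
```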
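
As a rough, runnable illustration of "multiple adaptive learning rates driven by an approximated Hessian spectrum", the sketch below estimates per-parameter curvature with a simple diagonal secant ratio d_i ≈ y_i / s_i and steps each parameter with its own rate α_i = 1/d_i, clipped to a safe range. This is an assumption-based stand-in, not the thesis's MAdaGrad or MAdaGrad Adam update; the function names, the safeguards, and the toy quadratic are all illustrative.

```python
# Illustrative sketch only: per-parameter adaptive learning rates from a diagonal
# secant (quasi-Newton) curvature estimate. Not the thesis's MAdaGrad algorithm.
import numpy as np


def diagonal_secant_rates(s, y, lr_min=1e-4, lr_max=1.0, eps=1e-12):
    """Per-parameter curvature d_i = y_i / s_i (diagonal secant equation D s = y),
    returned as clipped learning rates alpha_i = 1 / d_i."""
    safe_s = np.where(np.abs(s) > eps, s, eps)   # avoid division by (almost) zero
    d = y / safe_s                               # approximate curvature per parameter
    d = np.clip(d, 1.0 / lr_max, 1.0 / lr_min)   # keep curvature positive and bounded
    return 1.0 / d


def train(w0, grad_fn, steps=50, lr0=1e-2):
    """Toy loop: plain SGD on the first step, per-parameter adaptive rates afterwards."""
    w = w0.astype(float)
    w_prev = g_prev = None
    for _ in range(steps):
        g = grad_fn(w)
        if g_prev is None:
            alpha = np.full_like(w, lr0)                        # warm-up: fixed rate
        else:
            alpha = diagonal_secant_rates(w - w_prev, g - g_prev)
        w_prev, g_prev = w, g
        w = w - alpha * g                                       # per-parameter step
    return w


# Usage: f(w) = 0.5 * w^T A w with an ill-conditioned diagonal A. The diagonal
# secant recovers A's diagonal exactly, so each coordinate gets a rate matched
# to its own curvature.
A = np.diag([1.0, 10.0, 100.0])
print(train(np.array([1.0, 1.0, 1.0]), grad_fn=lambda w: A @ w))
```

On the toy diagonal quadratic, the secant ratios recover the exact per-coordinate curvatures after a single warm-up step, so the adaptive rates cope with ill-conditioning that would force a single fixed learning rate to be very small; on a real network the estimates are noisy, which is where the clipping bounds (and, presumably, the thesis's more careful derivation) matter.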