Performance evaluation of descent CG methods for neural network training

I.E. Livieris and P. Pintelas, Performance Evaluation of Descent CG Methods for Neural Network Training, In Proceedings of the 9th Hellenic European Research on Computer Mathematics & its Applications Conference (HERCMA 2009), vol. 11, pp. 40-46, Athens, 2009.


Abstract - Conjugate gradient methods constitute an excellent choice for efficiently training large neural networks, since they require neither the evaluation of the Hessian matrix nor the impractical storage of an approximation of it. Despite the theoretical and practical advantages of these methods, their main drawback is the use of restarting procedures to guarantee convergence, which abandons second-order derivative information. In this work, we evaluate the performance of a new class of conjugate gradient methods and propose a new algorithm for training neural networks. The presented algorithm preserves the advantages of classical conjugate gradient methods while avoiding the inefficient restarts. Encouraging numerical experiments verify that the presented algorithm provides fast, stable and reliable convergence.
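
To make the discussion concrete, the listing below is a minimal sketch, not the authors' algorithm, of a nonlinear conjugate gradient training loop with the classical Polak-Ribiere update and the restart safeguard the abstract refers to. All names (cg_train, loss_and_grad) are illustrative, and the quadratic at the end is a stand-in for a network error function.

    import numpy as np

    def cg_train(loss_and_grad, w, max_iter=200, tol=1e-6):
        """Minimize with nonlinear CG. loss_and_grad(w) -> (loss, gradient)."""
        loss, g = loss_and_grad(w)
        d = -g  # initial search direction: steepest descent
        for k in range(max_iter):
            if np.linalg.norm(g) < tol:
                break
            # Backtracking line search satisfying a simple Armijo condition.
            alpha, c = 1.0, 1e-4
            while True:
                loss_new, g_new = loss_and_grad(w + alpha * d)
                if loss_new <= loss + c * alpha * g.dot(d) or alpha < 1e-12:
                    break
                alpha *= 0.5
            w = w + alpha * d
            # Polak-Ribiere coefficient: only gradients are needed, so no
            # Hessian evaluation or storage is ever required.
            beta = g_new.dot(g_new - g) / g.dot(g)
            d_new = -g_new + beta * d
            # Classical restart: fall back to steepest descent whenever the
            # new direction fails to be a descent direction. This discards
            # the accumulated second-order information, which is the
            # drawback the abstract refers to.
            if beta < 0 or g_new.dot(d_new) >= 0:
                d_new = -g_new
            g, d, loss = g_new, d_new, loss_new
        return w

    # Example: a simple ill-conditioned quadratic as a stand-in loss.
    A = np.diag([1.0, 10.0])
    f = lambda w: (0.5 * w @ A @ w, A @ w)
    w_star = cg_train(f, np.array([3.0, -2.0]))

The restart branch is exactly what the proposed algorithm aims to avoid: each restart resets the method to plain gradient descent, temporarily losing the conjugacy that makes CG methods fast.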