Monday, June 18, 2012

Deep learning made easier by linear transformations in perceptrons

Tapani Raiko gave a talk entitled "Deep learning made easier by linear transformations in perceptrons" at a HIIT seminar. The talk was based on a paper by Raiko, Harri Valpola and Yann LeCun (New York University), presented at the AISTATS 2012 conference. By overparameterizing the network with shortcut connections, they make the learning problem of deep feedforward networks easier. The shortcut connections aim to separate the problems of learning the linear and nonlinear parts of the overall input-output mapping. The theoretical motivation for the approach is that it makes the Fisher information matrix closer to a diagonal matrix, and thus the standard gradient closer to the natural gradient.

Raiko presented results from three experiments: MNIST classification, CIFAR-10 classification, and an MNIST autoencoder task. He concluded that making parameters more independent should also help variational Bayes and MCMC, and that unsupervised pre-training could be used for further improvement.
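To make the idea concrete, below is a minimal numpy sketch of a one-hidden-layer network with a linear shortcut, roughly in the spirit of the paper: each tanh nonlinearity has its linear component removed (its mean and average slope over the data are driven to zero), so the shortcut weights carry the linear part of the mapping. All dimensions and variable names here are illustrative, and the sketch omits the compensating weight updates the paper uses to keep the overall mapping unchanged when the transformation parameters are adjusted.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy dimensions (hypothetical): 784 inputs, 100 hidden units, 10 outputs.
n_in, n_hid, n_out = 784, 100, 10

A = rng.normal(0.0, 0.01, (n_hid, n_in))   # input -> hidden weights
B = rng.normal(0.0, 0.01, (n_out, n_hid))  # hidden -> output weights
C = rng.normal(0.0, 0.01, (n_out, n_in))   # linear shortcut: input -> output

# Per-unit transformed nonlinearity: f_i(z) = tanh(z) - alpha_i * z - beta_i.
alpha = np.zeros(n_hid)
beta = np.zeros(n_hid)

def forward(X):
    """Forward pass y = B f(A x) + C x: the nonlinear path plus the
    linear shortcut that handles the linear part of the mapping."""
    Z = X @ A.T                        # hidden pre-activations, (batch, n_hid)
    H = np.tanh(Z) - alpha * Z - beta  # nonlinearity with linear part removed
    return H @ B.T + X @ C.T           # (batch, n_out)

def update_transformations(X):
    """Pick alpha, beta so each unit's output has approximately zero mean
    and zero average slope over the data, i.e. E[f(z)] = 0 and E[f'(z)] = 0,
    pushing the linear component of the mapping onto the shortcut C."""
    global alpha, beta
    Z = X @ A.T
    # E[f'(z)] = 0  =>  alpha_i = E[tanh'(z_i)] = E[1 - tanh(z_i)^2]
    alpha = np.mean(1.0 - np.tanh(Z) ** 2, axis=0)
    # E[f(z)] = 0   =>  beta_i = E[tanh(z_i) - alpha_i * z_i]
    beta = np.mean(np.tanh(Z) - alpha * Z, axis=0)

# Example usage on random data:
X = rng.normal(size=(32, n_in))
update_transformations(X)
Y = forward(X)  # shape (32, 10)
```

The intuition for the diagonal-Fisher argument is that once the nonlinear activations are decorrelated from the inputs in this way, the gradients of the shortcut weights and the nonlinear-path weights interfere less with each other, so plain gradient descent behaves more like natural gradient descent.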
