Invited Talks

Classification with Stable Invariants

Stéphane Mallat, Joan Bruna
IHES, École Polytechnique, Paris

Classification often requires reducing variability with invariant representations that are stable to deformations yet retain enough information for discrimination.
Deep convolution networks provide architectures to construct such representations. With adapted wavelet filters and a modulus pooling non-linearity, a deep convolution network is shown to compute stable invariants relative to a chosen group of transformations. This group may correspond to translations, rotations, or a more complex group learned from data. Renormalizing this scattering transform leads to a representation similar to a Fourier transform, but stable to deformations, unlike the Fourier transform. Enough information is preserved to recover signal approximations from their scattering representation. Image and audio classification examples are shown with linear classifiers.
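The wavelet-modulus cascade described in the abstract can be sketched in a few lines. This is a toy illustration, not Mallat's construction: the Gabor filter, filter parameters, and global averaging below are all simplifying assumptions (the actual scattering transform uses a carefully designed wavelet filter bank and local low-pass averaging).

```python
import numpy as np

def gabor_filter(n, xi, sigma):
    """Toy 1-D complex Gabor wavelet (an assumption; the scattering
    transform uses a proper dilated wavelet filter bank)."""
    t = np.arange(n) - n // 2
    return np.exp(1j * xi * t) * np.exp(-t**2 / (2 * sigma**2))

def scattering_layer(signals, filters):
    """One layer: wavelet convolution followed by modulus pooling."""
    out = []
    for x in signals:
        for psi in filters:
            out.append(np.abs(np.convolve(x, psi, mode="same")))
    return out

def scattering(x, filters, depth=2):
    """Cascade of wavelet-modulus layers; each intermediate signal is
    averaged to obtain (here, globally) translation-invariant coefficients."""
    coeffs = [x.mean()]               # zeroth-order coefficient
    layer = [x]
    for _ in range(depth):
        layer = scattering_layer(layer, filters)
        coeffs.extend(u.mean() for u in layer)   # averaging = invariant part
    return np.array(coeffs)
```

With 2 filters and depth 2 this yields 1 + 2 + 4 = 7 coefficients; the modulus discards phase (hence local translation) while the cascade recaptures the high-frequency information that averaging alone would destroy.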

Structured sparsity and convex optimization

Francis Bach

The concept of parsimony is central in many scientific domains. In the context of statistics, signal processing, or machine learning, it takes the form of variable or feature selection problems and is commonly used in two situations. First, to make the model or the prediction more interpretable or cheaper to use: even if the underlying problem does not admit sparse solutions, one looks for the best sparse approximation. Second, sparsity can be used given prior knowledge that the model should be sparse. In both situations, reducing parsimony to finding models with low cardinality turns out to be limiting, and structured parsimony has emerged as a fruitful practical extension, with applications to image processing, text processing, and bioinformatics. In this talk, I will review recent results on structured sparsity as it applies to machine learning and signal processing. (Joint work with R. Jenatton, J. Mairal and G. Obozinski)
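The simplest instance of structured parsimony beyond plain cardinality is the group lasso, which zeroes out entire predefined groups of variables at once. Below is a minimal sketch of its proximal operator, the building block of the convex optimization algorithms the talk covers; the list-of-index-arrays encoding of groups is an illustrative choice, not any particular library's API.

```python
import numpy as np

def prox_group_lasso(w, groups, lam):
    """Proximal operator of the group-lasso penalty lam * sum_g ||w_g||_2.
    Groups whose norm falls below lam are set to zero as a block; the
    rest are shrunk toward zero (block soft-thresholding)."""
    out = w.copy()
    for g in groups:
        norm = np.linalg.norm(w[g])
        out[g] = 0.0 if norm <= lam else (1.0 - lam / norm) * w[g]
    return out
```

For example, with w = [0.1, -0.1, 3.0, 4.0], groups {0,1} and {2,3}, and lam = 0.5, the first group (norm ≈ 0.14) is eliminated entirely while the second (norm 5) is merely shrunk by a factor 0.9, illustrating how structure, not just cardinality, drives the selection.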

Contributed Oral Presentations

Online Incremental Feature Learning with Denoising Autoencoder

Guanyu Zhou, Kihyuk Sohn, Honglak Lee

While determining model complexity is an important problem in machine learning, many feature learning algorithms rely on cross-validation to choose an optimal number of features, which is usually infeasible for online learning from a massive stream of data. In this paper, we propose an incremental feature learning algorithm, based on the denoising autoencoder, that determines the optimal model complexity for large-scale, online datasets. The algorithm is composed of two processes: adding features and merging features. Specifically, it adds new features to minimize the residual of the objective function and merges similar features to obtain a compact feature representation and prevent over-fitting. Our experiments show that the model quickly converges to the optimal number of features in a large-scale online setting, and outperforms the (non-incremental) denoising autoencoder, as well as deep belief networks and stacked denoising autoencoders, on classification tasks. Further, the algorithm is particularly effective at recognizing new patterns when the data distribution changes over time in the massive online data stream.
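The add/merge mechanism in the abstract can be sketched as follows. This is a deliberately simplified, hypothetical rendering: the paper's criteria for when to add features (residual-driven) and merge them, as well as its training schedule, are more elaborate than the fresh random initialization and cosine-similarity merge used here.

```python
import numpy as np

rng = np.random.RandomState(0)

class IncrementalDAE:
    """Minimal sketch of a growing/shrinking denoising-autoencoder
    feature set (hypothetical simplification of the paper's method)."""

    def __init__(self, n_vis, n_hid):
        self.W = 0.1 * rng.randn(n_vis, n_hid)   # encoder weights

    def hidden(self, x):
        """Sigmoid features computed from a (possibly corrupted) input."""
        return 1.0 / (1.0 + np.exp(-x @ self.W))

    def add_features(self, k):
        """Grow the feature set with k freshly initialized hidden units
        (in the paper, new units are trained to fit the residual)."""
        self.W = np.hstack([self.W, 0.1 * rng.randn(self.W.shape[0], k)])

    def merge_similar(self, threshold=0.99):
        """Merge (average) features whose weight vectors are nearly
        collinear, keeping the representation compact."""
        keep = []
        for j in range(self.W.shape[1]):
            w = self.W[:, j]
            merged = False
            for i in keep:
                v = self.W[:, i]
                cos = w @ v / (np.linalg.norm(w) * np.linalg.norm(v) + 1e-12)
                if cos > threshold:
                    self.W[:, i] = 0.5 * (v + w)   # fold j into unit i
                    merged = True
                    break
            if not merged:
                keep.append(j)
        self.W = self.W[:, keep]
```

The key design point the abstract emphasizes survives even in this sketch: growth is driven by unexplained signal, while merging bounds the feature count, so model complexity adapts as the stream's distribution drifts.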

Improved Preconditioner in Hessian Free Optimization

Olivier Chapelle, Dumitru Erhan

We investigate the use of Hessian-free optimization for learning deep autoencoders. One of the critical components of that algorithm is the choice of the preconditioner. We argue in this paper that the Jacobi preconditioner leads to faster optimization, and we show how it can be accurately and efficiently estimated using a randomized algorithm.
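In Hessian-free optimization the curvature matrix is only available through matrix-vector products, so its diagonal (the Jacobi preconditioner) must itself be estimated. A standard randomized diagonal estimator of the kind the abstract alludes to can be sketched as follows; the function name and interface are illustrative, not the paper's.

```python
import numpy as np

def estimate_diagonal(matvec, n, n_samples=200, rng=None):
    """Randomized estimate of diag(A) from matvecs only: for Rademacher
    probes v, E[v * (A v)] = diag(A), since off-diagonal cross terms
    cancel in expectation. Useful when A (e.g. the Gauss-Newton matrix
    in Hessian-free optimization) is never formed explicitly."""
    rng = rng or np.random.RandomState(0)
    est = np.zeros(n)
    for _ in range(n_samples):
        v = rng.choice([-1.0, 1.0], size=n)    # Rademacher probe
        est += v * matvec(v)                   # elementwise v * (A v)
    return est / n_samples
```

Each sample costs one curvature-vector product, the same primitive the conjugate-gradient inner loop of Hessian-free optimization already uses, which is what makes the preconditioner cheap to maintain.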

Contributed Poster Presentations

Gaussian-Bernoulli Deep Boltzmann Machine
Kyunghyun Cho, Tapani Raiko, Alexander Ilin

Deep Learning Made Easier by Linear Transformations in Perceptrons
Tapani Raiko, Harri Valpola, Yann LeCun

Importance of Cross-Layer Cooperation for Learning Deep Feature Hierarchies
Grégoire Montavon, Mikio Braun, Klaus-Robert Müller

On spatio-temporal sparse coding: Analysis and an algorithm
Roland Memisevic

Improving the speed of neural networks on CPUs
Vincent Vanhoucke, Andrew Senior, Mark Mao

Unsupervised Structural Learning of Word Topics and Sentence Topics
Jen-Tzung Chien, Ying-Lan Chang

Reading Digits in Natural Images with Unsupervised Feature Learning
Yuval Netzer, Tao Wang, Alessandro Bissacco, Bo Wu, Adam Coates, Andrew Ng

Stereopsis via Deep Learning
Roland Memisevic, Christian Conrad

Learning Topographic Representations for Linearly Correlated Components
Hiroaki Sasaki, Aapo Hyvärinen, Michael Gutmann, Hayaru Shouno

A Deep Neural Network for Acoustic-Articulatory Speech Inversion
Benigno Uria, Steve Renals

Building Low-Dimensional Features from Learned Kernels
Artem Sokolov, Tanguy Urvoy, Hai-Son Le

Beyond Spatial Pyramids: Receptive Field Learning for Pooled Image Features
Yangqing Jia, Chang Huang

Unsupervised feature learning for electronic nose data applied to Bacteria Identification in Blood
Martin Längkvist, Amy Loutfi

Training Restricted Boltzmann Machines on Word Observations
George Dahl, Ryan Adams, Hugo Larochelle

MDDAG: learning deep decision DAGs in a Markov decision process setup
Djalel Benbouzid, Róbert Busa-Fekete, Balázs Kégl

A new way to learn acoustic events
Navdeep Jaitly, Geoffrey Hinton

On the Applicability of Unsupervised Feature Learning for Object Recognition in RGB-D Data
Manuel Blum, Jost Tobias Springenberg, Jan Wülfing, Martin Riedmiller

Toward the Implementation of a Quantum RBM
Misha Denil, Nando de Freitas

Results from a Semi-Supervised Feature Learning Competition
D. Sculley

Adaptive Hybrid Monte Carlo with Bayesian Parametric Bandits and Predictive Adaptation Measures
Ziyu Wang, Nando de Freitas

Information theoretic learning of robust deep representations
Nicolas Pinchaud

Unsupervised learning of visual invariance with temporal coherence
Will Zou, Andrew Ng, Kai Yu