## Invited Talks

### Classification with Stable Invariants

8:40am

#### Stéphane Mallat, Joan Bruna

IHÉS, École Polytechnique, Paris

Classification often requires reducing variability with invariant representations that are stable to deformations yet retain enough information for discrimination.

Deep convolution networks provide architectures to construct such representations. With adapted wavelet filters and a modulus pooling non-linearity, a deep convolution network is shown to compute stable invariants relative to a chosen group of transformations. This group may correspond to translations, rotations, or a more complex group learned from data. Renormalizing this scattering transform leads to a representation similar to a Fourier transform but, unlike the Fourier transform, stable to deformations. Enough information is preserved to recover signal approximations from their scattering representation. Image and audio classification examples are shown with linear classifiers.
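
The wavelet-then-modulus-then-average cascade described above can be illustrated with a toy first-order scattering in one dimension. This is a rough sketch, not Mallat and Bruna's construction: the Gabor filters, scale choices, and low-pass window below are all simplifications chosen for brevity.

```python
import numpy as np

def gabor_filter(n, xi, sigma):
    """Complex Gabor (Morlet-like) filter of length n, center frequency xi."""
    t = np.arange(n) - n // 2
    return np.exp(1j * xi * t) * np.exp(-t**2 / (2 * sigma**2))

def scattering_1d(x, scales=(4, 8, 16)):
    """First-order scattering sketch: |x * psi_s| averaged by a low-pass phi.
    The modulus discards phase (a pooling non-linearity); the averaging
    makes the output locally invariant to translation."""
    n = len(x)
    phi = np.exp(-np.linspace(-3, 3, n) ** 2)   # crude Gaussian low-pass
    phi /= phi.sum()
    feats = []
    for s in scales:
        psi = gabor_filter(n, xi=np.pi / s, sigma=s)
        u = np.abs(np.convolve(x, psi, mode="same"))    # wavelet modulus
        feats.append(np.convolve(u, phi, mode="same"))  # local averaging
    return np.stack(feats)

x = np.sin(2 * np.pi * 0.05 * np.arange(256))
S = scattering_1d(x)
print(S.shape)
```

A deeper network would iterate the wavelet-modulus step on each `u` before averaging, producing second- and higher-order scattering coefficients.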

### Structured sparsity and convex optimization

4:00pm

#### Francis Bach

INRIA

The concept of parsimony is central in many scientific domains. In statistics, signal processing, and machine learning, it takes the form of variable or feature selection, and is commonly used in two situations. First, to make the model or the prediction more interpretable or cheaper to use: even if the underlying problem does not admit sparse solutions, one looks for the best sparse approximation. Second, sparsity can be imposed given prior knowledge that the model should indeed be sparse. In both situations, reducing parsimony to finding models with low cardinality turns out to be limiting, and structured parsimony has emerged as a fruitful practical extension, with applications to image processing, text processing, and bioinformatics. In this talk, I will review recent results on structured sparsity as it applies to machine learning and signal processing. (Joint work with R. Jenatton, J. Mairal and G. Obozinski.)
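
A standard building block in this line of work is the group-lasso penalty, whose proximal operator zeroes out entire groups of variables at once. The sketch below shows that operator for non-overlapping groups; it is a textbook illustration of structured sparsity, not the specific structured norms of the talk.

```python
import numpy as np

def prox_group_lasso(w, groups, lam):
    """Proximal operator of lam * sum_g ||w_g||_2 for disjoint groups.
    Each group is shrunk toward zero and set exactly to zero when its
    norm falls below lam -- the block-wise analogue of soft thresholding."""
    out = np.zeros_like(w)
    for g in groups:
        ng = np.linalg.norm(w[g])
        if ng > lam:
            out[g] = (1.0 - lam / ng) * w[g]
    return out

w = np.array([3.0, 4.0, 0.1, -0.1])
groups = [np.array([0, 1]), np.array([2, 3])]
print(prox_group_lasso(w, groups, lam=1.0))
```

Here the first group has norm 5 and is shrunk, while the second has norm below `lam` and is discarded entirely; selecting whole groups rather than individual coordinates is what makes the parsimony "structured".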

## Contributed Oral Presentations

### Online Incremental Feature Learning with Denoising Autoencoder

5:25pm

#### Guanyu Zhou, Kihyuk Sohn, Honglak Lee

While determining the model complexity is an important problem in machine learning, many feature learning algorithms rely on cross-validation to choose an optimal number of features, which is usually infeasible for online learning from a massive stream of data. In this paper, we propose an incremental feature learning algorithm to determine the optimal model complexity for large-scale, online datasets based on the denoising autoencoder. This algorithm is composed of two processes: adding features and merging features. Specifically, it adds new features to minimize the residual of the objective function and merges similar features to obtain a compact feature representation and prevent over-fitting. Our experiments show that the model quickly converges to the optimal number of features in a large-scale online setting, and outperforms the (non-incremental) denoising autoencoder, as well as deep belief networks and stacked denoising autoencoders for classification tasks. Further, the algorithm is particularly effective in recognizing new patterns when the data distribution changes over time in the massive online data stream.
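
The two processes in the abstract, adding features to reduce the residual and merging similar features, can be caricatured on a linear feature dictionary. This is a toy sketch of the add/merge idea only; the paper operates on denoising autoencoder features with its own objective and thresholds, and the names and criteria below are illustrative assumptions.

```python
import numpy as np

def cos(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12))

def add_merge_step(W, X, err_thresh=1e-2, sim_thresh=0.95):
    """One add/merge step on a linear feature dictionary W (d x k):
    add the dominant residual direction when reconstruction error is
    high, and merge near-duplicate columns."""
    H = X @ np.linalg.pinv(W).T          # least-squares codes
    R = X - H @ W.T                      # reconstruction residuals
    if np.mean(np.sum(R**2, axis=1)) > err_thresh:
        new = R[np.argmax(np.sum(R**2, axis=1))]
        W = np.column_stack([W, new / (np.linalg.norm(new) + 1e-12)])
    keep = list(range(W.shape[1]))
    for i in range(W.shape[1]):
        for j in range(i + 1, W.shape[1]):
            if j in keep and cos(W[:, i], W[:, j]) > sim_thresh:
                W[:, i] = (W[:, i] + W[:, j]) / 2.0   # merge similar features
                keep.remove(j)
    return W[:, keep]

rng = np.random.default_rng(0)
X = rng.normal(size=(50, 5))
W = rng.normal(size=(5, 2))
W2 = add_merge_step(W, X)
print(W2.shape)
```

In an online setting such a step would run per mini-batch, letting the feature count track the complexity of the incoming data stream.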

### Improved Preconditioner in Hessian Free Optimization

5:42pm

#### Olivier Chapelle, Dumitru Erhan

We investigate the use of Hessian Free optimization for learning deep autoencoders. One of the critical components in that algorithm is the choice of the preconditioner. We argue in this paper that the *Jacobi* preconditioner leads to faster optimization and we show how it can be accurately and efficiently estimated using a randomized algorithm.
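
A Jacobi preconditioner needs the diagonal of the curvature matrix, which in Hessian-free optimization is only available through matrix-vector products. One well-known randomized estimator (Bekas-style) recovers the diagonal from such products alone: for Rademacher vectors `v`, `E[v * (A v)] = diag(A)`. The paper's exact estimator may differ; this is the general trick, sketched on a small explicit matrix.

```python
import numpy as np

def estimate_diagonal(matvec, n, n_samples=200, rng=None):
    """Randomized diagonal estimate from matrix-vector products only:
    average v * (A v) over random sign vectors v, whose expectation
    is diag(A) because off-diagonal terms cancel in expectation."""
    rng = rng or np.random.default_rng(0)
    d = np.zeros(n)
    for _ in range(n_samples):
        v = rng.choice([-1.0, 1.0], size=n)
        d += v * matvec(v)
    return d / n_samples

A = np.array([[4.0, 1.0, 0.0],
              [1.0, 3.0, 0.5],
              [0.0, 0.5, 2.0]])
d_est = estimate_diagonal(lambda v: A @ v, n=3, n_samples=5000)
print(np.round(d_est, 1))
```

The resulting estimate (plus a damping term) would serve as the diagonal preconditioner inside the conjugate-gradient inner loop of Hessian-free optimization.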

## Contributed Poster Presentations

Gaussian-Bernoulli Deep Boltzmann Machine

Kyunghyun Cho, Tapani Raiko, Alexander Ilin

Deep Learning Made Easier by Linear Transformations in Perceptrons

Tapani Raiko, Harri Valpola, Yann LeCun

Importance of Cross-Layer Cooperation for Learning Deep Feature Hierarchies

Grégoire Montavon, Mikio Braun, Klaus-Robert Müller

On spatio-temporal sparse coding: Analysis and an algorithm

Roland Memisevic

Improving the speed of neural networks on CPUs

Vincent Vanhoucke, Andrew Senior, Mark Mao

Unsupervised Structural Learning of Word Topics and Sentence Topics

Jen-Tzung Chien, Ying-Lan Chang

Reading Digits in Natural Images with Unsupervised Feature Learning

Yuval Netzer, Tao Wang, Alessandro Bissacco, Bo Wu, Adam Coates, Andrew Ng

Stereopsis via Deep Learning

Roland Memisevic, Christian Conrad

Learning Topographic Representations for Linearly Correlated Components

Hiroaki Sasaki, Aapo Hyvärinen, Michael Gutmann, Hayaru Shouno

A Deep Neural Network for Acoustic-Articulatory Speech Inversion

Benigno Uria, Steve Renals

Building Low-Dimensional Features from Learned Kernels

Artem Sokolov, Tanguy Urvoy, Hai-Son Le

Beyond Spatial Pyramids: Receptive Field Learning for Pooled Image Features

Yangqing Jia, Chang Huang

Unsupervised feature learning for electronic nose data applied to Bacteria Identification in Blood

Martin Längkvist, Amy Loutfi

Training Restricted Boltzmann Machines on Word Observations

George Dahl, Ryan Adams, Hugo Larochelle

MDDAG: learning deep decision DAGs in a Markov decision process setup

Djalel Benbouzid, Róbert Busa-Fekete, Balázs Kégl

A new way to learn acoustic events

Navdeep Jaitly, Geoffrey Hinton

On the Applicability of Unsupervised Feature Learning for Object Recognition in RGB-D Data

Manuel Blum, Jost Tobias Springenberg, Jan Wülfing, Martin Riedmiller

Toward the Implementation of a Quantum RBM

Misha Denil, Nando de Freitas

Results from a Semi-Supervised Feature Learning Competition

D. Sculley

Adaptive Hybrid Monte Carlo with Bayesian Parametric Bandits and Predictive Adaptation Measures

Ziyu Wang, Nando de Freitas

Information theoretic learning of robust deep representations

Nicolas Pinchaud

Unsupervised learning of visual invariance with temporal coherence

Will Zou, Andrew Ng, Kai Yu