Gibbs sampling. Figure 7 shows a visualization of these samples. To ensure that the filling-in required top-down information, we compared with a control condition where only a
single upward pass was performed.
In the control (upward-pass only) condition, since there
is no evidence from the first layer, the second layer does
not respond to the left side. However, with full Gibbs sampling, the bottom-up inputs combine with the context provided by the third layer, which has detected the object. This
combined evidence significantly improves the second-layer
representation. Selected examples are shown in Figure 7.
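The contrast between the two conditions can be illustrated with a minimal sketch. It assumes a simplified, fully connected binary layer whose conditional activation is a sigmoid of the summed bottom-up input, top-down input, and bias; it is not the convolutional model with probabilistic max-pooling used in our experiments, and it computes only the conditional that one Gibbs step would sample from. All names and weight values (`layer2_activation`, `w_up`, `w_down`, the toy inputs) are hypothetical.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def layer2_activation(h1, h3, w_up, w_down, bias, use_top_down=True):
    """Activation probability of each second-layer unit.

    h1: binary first-layer states (bottom-up evidence)
    h3: binary third-layer states (top-down context)
    w_up[i][j]: weight between first-layer unit i and second-layer unit j
    w_down[j][k]: weight between second-layer unit j and third-layer unit k
    use_top_down=False mimics the upward-pass-only control condition.
    """
    probs = []
    for j in range(len(bias)):
        bottom_up = sum(h1[i] * w_up[i][j] for i in range(len(h1)))
        top_down = 0.0
        if use_top_down:
            top_down = sum(w_down[j][k] * h3[k] for k in range(len(h3)))
        probs.append(sigmoid(bottom_up + top_down + bias[j]))
    return probs


# Toy setup (illustrative values only): first-layer units 0-1 cover the
# occluded left side and are silent; units 2-3 cover the visible right side.
h1 = [0, 0, 1, 1]
h3 = [1]  # the third layer has detected the object
w_up = [[4.0, 0.0], [4.0, 0.0], [0.0, 4.0], [0.0, 4.0]]
w_down = [[3.0], [3.0]]
bias = [-4.0, -4.0]

up_only = layer2_activation(h1, h3, w_up, w_down, bias, use_top_down=False)
full = layer2_activation(h1, h3, w_up, w_down, bias, use_top_down=True)

print("upward pass only:", [round(p, 3) for p in up_only])
print("with top-down context:", [round(p, 3) for p in full])
```

In this toy setting, the second-layer unit covering the occluded side receives no bottom-up input, so only the top-down term can raise its activation probability above the level set by the bias.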
Our method may not be competitive with state-of-the-art face
completion algorithms that use significant prior knowledge
and heuristics (e.g., symmetry). However, we find these
results promising and view them as a proof of concept for
top-down inference.
5. Conclusion
We presented the CDBN, a scalable generative model for
learning hierarchical representations from unlabeled
images, and showed that our model performs well in a variety of visual recognition tasks. We believe our approach
holds promise as a scalable algorithm for learning hierarchical representations from high-dimensional, complex data.
Acknowledgments
We give warm thanks to Daniel Oblinger and Rajat Raina
for helpful discussions. This work was supported by the
DARPA transfer learning program under contract number
FA8750-05-2-0249.
Honglak Lee (honglak@eecs.umich.edu), Computer Science and Engineering
Division, University of Michigan, Ann Arbor, MI.
Roger Grosse (rgrosse@mit.edu), CSAIL,
Massachusetts Institute of Technology,
Cambridge, MA.
Rajesh Ranganath and Andrew Y. Ng
({rajeshr,ang}@cs.stanford.edu), Computer
Science Department, Stanford University,
Stanford, CA.