The goal of the MPLab is to develop systems that perceive and interact with humans in real time using natural communication channels. To this end we are developing perceptual primitives to detect and track human faces and to recognize facial expressions. We are also developing algorithms for robots that develop and learn to interact with people on their own. Applications include personal robots, perceptive tutoring systems, and systems for clinical assessment, monitoring, and intervention.

  • Introduction to the MPLab (PDF)
  • MPLAB 5 Year Progress Report (PDF)

  • NEWS

    • Receding Horizon DDP: Tom Erez gave a version of this talk to Emo’s lab just before Neuroscience. Yuval Tassa said he may be going to work for Emo. It seems like really good work. The setting is a continuous state space. The idea is to use Emo-like methods to start with a trajectory (open-loop policy), iteratively estimate the value function around it, and improve it to a local optimum. Fill the space with a small number of open-loop trajectories; then, for closed-loop control, always figure out which learned trajectory you are near and follow that. They worked with 15- and 30-dimensional continuous spaces, which apparently is quite high. Receding horizon is appropriate where you have an implicit value gradient, or at least where the local value function is in line with the global value function. In practice I would guess that this is quite often the case. This is something I would want to replicate. The application was a swimming world, which we should look into as a test bed for algorithms, etc.
    • Random Sampling of States in Dynamic Programming: Chris Atkeson is apparently a famous control guy at CMU. Tom Erez was very respectful to him. Atkeson’s poster was about discretizing state space using randomly sampled points and some heuristics about whether to keep the points. The points define a Voronoi diagram, and each cell has local linear dynamics and a quadratic value function estimator. This is something I would like to replicate. The applications were inverted pendulum and swing-up problems with up to 4 links. Atkeson claims to be the only one he knows of with such success on 4-link inverted pendulums. This is another system we should test algorithms on.
    • Bayesian Policy Learning with Trans-Dimensional MCMC: “Trans-dimensional” means probability distributions defined on variable-length random variables through statistics of those random variables. These guys were using probability functions defined on trajectories to do trajectory sampling. It seemed like a fancy way to do policy improvement, without much benefit over other methods for doing the same thing.
    • Optimistic Linear Programming gives Logarithmic Regret for Irreducible MDPs: “Regret” means “difference between optimal average reward and obtained average reward.” It is a common way to measure the loss incurred by exploration of the MDP, and is a currently important topic to RL people. Peter Bartlett showed that a good policy that yields low regret is to keep confidence intervals on the value function and to act as if you will receive the upper bound of your confidence interval. Based on your confidence interval and the amount of time spent acting, there are provable regret bounds.
    • The Epoch-Greedy Algorithm for Multi-armed Bandits with Side Information: The “contextual bandit problem” is a weird MDP where actions affect reward but not state transitions (you float around the world among states X, and depending on the current state x you have different reward distributions). Based on the complexity of the learning problem (how well and simply X predicts reward), things can be proven about a strategy that switches between “act randomly (uniform)” and “act greedily” based on how many experiences in this state you have.
    • Learning To Race by Model-Based Reinforcement Learning with Adaptive Abstraction: The Microsoft Applied Games group applied reinforcement learning to a car-driving game. To get it to work, they came up with fancy heuristic ways for discretizing state space on the fly, and combined it with fancy online dynamic programming methods (prioritized sweeping). Eric Wiewiora was very impressed. He said it was one of the first compelling applications of real reinforcement learning that he’d seen. I talked to these guys a bit about applying Emo-style continuous-state-space optimal control to their problem, and they were interested.
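
The "act as if you will receive the upper bound of your confidence interval" idea from the Bartlett talk above can be sketched on a much simpler problem. This is not the optimistic linear programming algorithm for MDPs from the paper; it is the standard UCB1 rule on a plain multi-armed bandit, used here only as an assumed stand-in. The function name, the Bernoulli arms, and the arm means are made up for illustration.

```python
import math
import random

def ucb1(arm_means, steps, seed=0):
    """Run UCB1 on Bernoulli arms; return (pull counts, total pseudo-regret)."""
    rng = random.Random(seed)
    n_arms = len(arm_means)
    counts = [0] * n_arms   # number of pulls per arm
    sums = [0.0] * n_arms   # total reward collected per arm
    best = max(arm_means)
    regret = 0.0
    for t in range(1, steps + 1):
        if t <= n_arms:
            arm = t - 1  # pull every arm once to initialise the estimates
        else:
            # Act as if each arm pays out at the top of its confidence interval.
            arm = max(
                range(n_arms),
                key=lambda a: sums[a] / counts[a]
                + math.sqrt(2.0 * math.log(t) / counts[a]),
            )
        reward = 1.0 if rng.random() < arm_means[arm] else 0.0
        counts[arm] += 1
        sums[arm] += reward
        regret += best - arm_means[arm]  # expected loss of this pull
    return counts, regret

counts, regret = ucb1([0.2, 0.5, 0.8], steps=5000)
```

The confidence bonus shrinks as an arm is pulled more often, so poorly explored arms keep getting tried until their upper bound drops below the best arm's estimate. Regret here is measured as in the notes: optimal average reward minus obtained average reward, summed over time.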

    Check out these papers if you get a chance:

    J. Zico Kolter, Pieter Abbeel, Andrew Ng, “Hierarchical Apprenticeship Learning with Application to Quadruped Locomotion.” Advances in Neural Information Processing Systems. 2007.

    Umar Syed, Robert Schapire, “A Game-Theoretic Approach to Apprenticeship Learning.” Advances in Neural Information Processing Systems. 2007.



    Luis von Ahn gave a fresh talk about computer games whose by-product is that humans label images and sounds for free. Lots of things to learn about what Luis is doing.

  • Here is a Wikipedia link to Luis
  • Here is a link to the ESP game
  • Cool relationship between directed sigmoid belief nets and undirected Boltzmann machines. In particular, the learning rule for sigmoid belief nets becomes the Boltzmann rule if you use an infinite set of directed layers with linked weights.
  • Keeps emphasizing the form of independence that restricted Boltzmann machines are good at: conditional independence of the hidden units given the observable vector. This reverses the standard ICA approach where you have unconditional independence and the Conditional Random Fields. He called this generative conditional random fields and conditional RBMs.
  • Conditional RBMs from Hinton and Sutskever. Designed to deal with dynamical data. Used by Taylor, Roweis, and Hinton 2007 for generating skeletal animation models.
  • NIPS 2007 paper on learning the orientation of a face using deep belief networks combined with Gaussian Processes
    Here is the Paper

  • Bengio et al 2007 has a paper on how to extend RBMs to the exponential family.
  • Hinton paper on Stochastic Embedding Using
  • Lots of deep autoencoders pretrained one layer at a time using the Boltzmann machine algorithm, then fine-tuned using backprop
  • Salakhutdinov and Hinton -> Semantic hash mappings.
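
The conditional independence point in the RBM notes above is easy to see in code: given the visible vector, each hidden unit's probability depends only on its own column of weights. Below is a minimal sketch of a binary RBM with one contrastive-divergence (CD-1) update. The notes do not say which learning rule was shown, so CD-1 is an assumption here, and all function and variable names are made up for illustration.

```python
import math
import random

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def hidden_probs(v, W, b_hid):
    """p(h_j = 1 | v) for every hidden unit j; the units factorise given v."""
    return [
        sigmoid(b_hid[j] + sum(v[i] * W[i][j] for i in range(len(v))))
        for j in range(len(b_hid))
    ]

def visible_probs(h, W, b_vis):
    """p(v_i = 1 | h) for every visible unit i."""
    return [
        sigmoid(b_vis[i] + sum(W[i][j] * h[j] for j in range(len(h))))
        for i in range(len(b_vis))
    ]

def sample(probs, rng):
    """Draw independent binary units from their probabilities."""
    return [1.0 if rng.random() < p else 0.0 for p in probs]

def cd1_step(v0, W, b_vis, b_hid, lr, rng):
    """One CD-1 update, in place, for a single data vector v0."""
    ph0 = hidden_probs(v0, W, b_hid)   # positive phase
    h0 = sample(ph0, rng)
    v1 = sample(visible_probs(h0, W, b_vis), rng)  # one-step reconstruction
    ph1 = hidden_probs(v1, W, b_hid)   # negative phase
    for i in range(len(v0)):
        for j in range(len(ph0)):
            W[i][j] += lr * (v0[i] * ph0[j] - v1[i] * ph1[j])
    for i in range(len(v0)):
        b_vis[i] += lr * (v0[i] - v1[i])
    for j in range(len(ph0)):
        b_hid[j] += lr * (ph0[j] - ph1[j])
```

Because `hidden_probs` computes every unit in one pass with no interaction between units, inference over the hidden layer is a single matrix-vector product followed by a sigmoid. Layer-wise pretraining of a deep autoencoder would repeat this kind of update one layer at a time before backprop fine-tuning.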

    Leon Bottou is pushing slightly sophisticated gradient descent methods over batch methods for the case in which you want to make use of tons of data. He shows that they can be much quicker without loss of accuracy, esp. in the cases people like these days (CRFs, convex optimizations, etc.). It may be worth looking into these current gradient methods.
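
A toy comparison makes the point about data passes concrete. The notes do not say exactly which gradient methods Bottou covered; the sketch below assumes plain stochastic gradient descent (one update per example) against batch gradient descent (one update per full pass) on a one-parameter least-squares problem. The data generator, learning rate, and function names are hypothetical.

```python
import random

def make_data(n, slope=3.0, noise=0.1, seed=0):
    """Synthetic pairs (x, y) with y = slope * x + Gaussian noise."""
    rng = random.Random(seed)
    xs = [rng.uniform(-1.0, 1.0) for _ in range(n)]
    return [(x, slope * x + rng.gauss(0.0, noise)) for x in xs]

def batch_gd(data, passes, lr):
    """Batch gradient descent on mean squared error for y ~ w * x."""
    w = 0.0
    for _ in range(passes):
        grad = sum((w * x - y) * x for x, y in data) / len(data)
        w -= lr * grad  # one update per full pass over the data
    return w

def sgd(data, passes, lr, seed=0):
    """Stochastic gradient descent on the same objective."""
    rng = random.Random(seed)
    w = 0.0
    order = list(range(len(data)))
    for _ in range(passes):
        rng.shuffle(order)
        for i in order:
            x, y = data[i]
            w -= lr * (w * x - y) * x  # one update per example
    return w

data = make_data(2000)
w_batch = batch_gd(data, passes=1, lr=0.1)
w_sgd = sgd(data, passes=1, lr=0.1)
```

With the same single pass over 2000 examples, the batch method has taken one step while the stochastic method has taken 2000 small ones, so `w_sgd` ends up far closer to the true slope of 3. With tons of data the per-pass cost dominates, which is the regime the talk was about.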

  • This paper is useful to justify the multiple-view-based approach to multipose face detection. Logothetis, N. K., Pauls, J., and Poggio, T. (1995). Shape recognition in the inferior temporal cortex of monkeys. Current Biology, 5:552–563. It shows the existence of scale- and location-invariant but pose-dependent object detection neurons in IT.
  • The human brain has as many neurons as 1 million flies.
  • Anterior IT face cells paper: Desimone et al 1984. We should get a copy of this classic paper.
  • Hung, Kreiman, Poggio, and DiCarlo 2005: About 90 msec after stimulus presentation, information for object category detection appears in neurons in IT. Information peaks by 125 msec.
  • Nature Neuroscience 2007 paper by Anzai, Peng, and Van Essen shows modern data on V2. It may justify the idea of orientation histograms.
  • Fabre-Thorpe has very cool experiments on human rapid recognition of visual categories. Link to her lab
  • Knoblich, Bouvrie, and Poggio 2007. Shunting inhibition model of multidimensional Gaussian
  • Link to Poggio’s group PAMI 2007 paper on bio-inspired object recognition
  • Jhuang, Serre, Wolf, and Poggio ICCV paper on recognizing activities. Get a copy.
  • Prefrontal cortex may prime the features that are most important for a particular task. LIP would compute a saliency map.

    Roby Jacobs is doing some really cool work on motor control.

  • He described an experiment in which subjects had to control a joystick under different noise regimes: no noise, signal-proportional noise, and signal-inverse-proportional noise. The optimal trajectories are very different in these 3 cases and humans seem to be quite good at learning these trajectories. Here is a link to the paper
  • He talked about an experiment by Julia something from Germany where the task is reaching under different payoff regimes.
  • He talked about experiments on combination of visual and proprioceptive information. Cited the work by a person at UCSF.
  • He talked about some computational work he is doing in which optimal trajectories are computed as linear combinations of a library of trajectories. This approach may simplify the optimization problem so it becomes a standard gradient-descent-like problem over the combination weights, as opposed to a variational problem.
  • Here is a link to Roby’s work
  • Mike talked about the problem of why we have center-surround ganglion cells if Gabors happen to be the most efficient code. He showed that by adding a very small weight connectivity constraint you end up with ganglion cells instead of Gabors. This raises the question of why we have Gabors in V1.
  • Mike presented a model that attempts to learn a universal dictionary of texture. It is akin to a hierarchical ICA model. Here is a link to the paper:
    karklin-lewicki-nc05-preprint.pdf It may be interesting to generalize this idea to auditory “textures”.
  • Mike presented his cool work on coding auditory signals and Gammatones as optimal encoders. He also seems to have extended the work to learn to detect auditory scenes. You can see his work by following this
    link to Mike’s Research
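
The library-of-trajectories idea from Roby's talk above can be sketched in a few lines: once a trajectory is a weighted sum of fixed library trajectories, the unknowns are just the weights, and ordinary gradient descent applies. Everything specific below is an assumption for illustration and not from the talk: the three library trajectories, the cost (squared miss at the end point plus a roughness penalty), and the step size.

```python
import math

def combine(weights, library):
    """Trajectory as a weighted sum of library trajectories (same length)."""
    steps = len(library[0])
    return [sum(w * traj[t] for w, traj in zip(weights, library)) for t in range(steps)]

def cost(traj, goal, smooth):
    """Hypothetical cost: squared miss at the end plus a roughness penalty."""
    miss = (traj[-1] - goal) ** 2
    rough = sum((traj[t + 1] - traj[t]) ** 2 for t in range(len(traj) - 1))
    return miss + smooth * rough

def cost_gradient(weights, library, goal, smooth):
    """Gradient of the cost with respect to the combination weights."""
    traj = combine(weights, library)
    grad = []
    for basis in library:
        g = 2.0 * (traj[-1] - goal) * basis[-1]
        g += 2.0 * smooth * sum(
            (traj[t + 1] - traj[t]) * (basis[t + 1] - basis[t])
            for t in range(len(traj) - 1)
        )
        grad.append(g)
    return grad

def fit_weights(library, goal, smooth=0.1, lr=0.05, iters=2000):
    """Plain gradient descent over the weights, starting from zero."""
    weights = [0.0] * len(library)
    for _ in range(iters):
        grad = cost_gradient(weights, library, goal, smooth)
        weights = [w - lr * g for w, g in zip(weights, grad)]
    return weights

# Hypothetical three-element library sampled at 21 time points on [0, 1].
TIMES = [i / 20 for i in range(21)]
LIBRARY = [
    [t for t in TIMES],
    [t * t for t in TIMES],
    [math.sin(0.5 * math.pi * t) for t in TIMES],
]

weights = fit_weights(LIBRARY, goal=1.0)
trajectory = combine(weights, LIBRARY)
```

The search is over three numbers rather than over a whole function of time, which is the simplification the notes describe. The price is that the answer can only be as good as the library allows.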
