
**NEWS**

The authors below presented a Kalman filter model of image motion. The amazing part is that the inputs were actual video, i.e., the observables were pixels. They learned the model parameters and then used the model to generate video. I was shocked by the results: there was video of a fountain, and the Kalman filter kept generating video that truly looked like the real thing, using only 4 internal states. We really should look into this as a potential way to model expressions.
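A minimal sketch of the kind of model involved: a linear dynamical system with a 4-dimensional hidden state emitting "pixels." All matrices and noise levels here are made up for illustration, not taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy linear dynamical system: 4 hidden states, 16 "pixels" per frame.
#   x_{t+1} = A x_t + w_t,   y_t = C x_t + v_t
n_states, n_pixels = 4, 16

# Random transition matrix, rescaled so all eigenvalues lie inside the
# unit circle (keeps the generated video stable).
A = rng.normal(size=(n_states, n_states))
A *= 0.95 / np.max(np.abs(np.linalg.eigvals(A)))
C = rng.normal(size=(n_pixels, n_states))  # observation (pixel) matrix

x = rng.normal(size=n_states)
frames = []
for t in range(100):
    x = A @ x + 0.01 * rng.normal(size=n_states)             # state dynamics
    frames.append(C @ x + 0.01 * rng.normal(size=n_pixels))  # observed pixels

video = np.array(frames)
print(video.shape)  # (100, 16): 100 frames of 16 "pixels" each
```

Once `A` and `C` are learned from real video, sampling the same recursion forward is what generates new footage.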

Sajid Siddiqi, Byron Boots, Geoffrey Gordon Download

Ulrik Beierholm, Konrad Kording, Ladan Shams, Wei Ji Ma Download

Michael Mozer, David Baldwin Download

Peter Frazier, Angela Yu Download

Nathaniel Daw, Aaron Courville Download

Robert Peters, Laurent Itti Download

Ali Rahimi, Benjamin Recht Download

Pradeep Ravikumar, Han Liu, John Lafferty, Larry Wasserman Download

Alex Smola, S V N Vishwanathan, Quoc Le Download

J. Zico Kolter, Pieter Abbeel, Andrew Ng Download

Umar Syed, Robert Schapire Download

Peter Bartlett, Elad Hazan, Alexander Rakhlin Download

Winner of the best student paper award. The goal is to see if we can tease out an individual's underlying distribution over objects in a specific category. For instance, if x is a vector of joint angles and limb lengths and c is a category such as giraffe, can we estimate p(x|c)? This paper takes a very original approach using an MCMC method based on Metropolis-Hastings. It turns out that a two-alternative forced choice (2AFC) model called the Luce decision rule, where subjects choose option x over option y with probability p(x)/(p(x) + p(y)), is identical to a particular Metropolis-Hastings acceptance rule due to Barker. Therefore, we can treat a series of successive 2AFC tasks as MCMC with a Metropolis-Hastings update and a Barker acceptance rule. The stationary distribution of this chain will be the underlying distribution we want to recover, p(x|c).

Experiments were first performed on distributions subjects had been trained on: subjects were able to recover Gaussian distributions of various means and variances. Subjects were then asked to form a model of stick figures for specific animal categories. The results show that the inferred underlying distributions look quite similar to the animals in the category.
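The equivalence can be sketched directly: a chain that accepts proposals with Barker's rule p(x')/(p(x) + p(x')) — exactly the Luce choice probability for a 2AFC trial — has p as its stationary distribution. This toy version uses a 1-D Gaussian target standing in for the "distribution in the subject's head"; all numbers are illustrative.

```python
import numpy as np

rng = np.random.default_rng(1)

def barker_accept(p_current, p_proposal):
    """Barker acceptance rule: accept with probability p'/(p + p'),
    which is exactly the Luce choice rule for a 2AFC trial."""
    return rng.random() < p_proposal / (p_current + p_proposal)

# Target density (up to a constant): Gaussian with mean 2.0, std 0.5.
def p(x):
    return np.exp(-0.5 * ((x - 2.0) / 0.5) ** 2)

x = 0.0
samples = []
for t in range(20000):
    x_prop = x + rng.normal(scale=0.5)  # symmetric random-walk proposal
    if barker_accept(p(x), p(x_prop)):
        x = x_prop
    samples.append(x)

burned = np.array(samples[5000:])
print(burned.mean(), burned.std())  # should approach 2.0 and 0.5
```

In the experiment, each "accept/reject" is a subject's 2AFC response, so the sequence of chosen stimuli is a sample from p(x|c).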

Tekkotsu is an open-source educational robotics platform developed at CMU. They design low-cost robot prototypes as well as a well-designed C++ robotics library and working environment, so that students can learn to program a robot by writing very high-level C++ code rather than dealing with vision, motor control, etc. themselves.

I asked the author's opinion about MS Robotics Studio. He replied with two major drawbacks:

(1) closed source

(2) the controller needs to run on a PC, which is not convenient for mobile robots, and communication between the PC and robot may take a substantial amount of time.

Very cool idea. This builds on the work of Andrew Ng, who first introduced the idea of apprenticeship learning. The idea is to learn from a teacher, but instead of simply imitating the teacher (which we can prove will give a similar reward under certain assumptions), we try to do better. If we consider the reward function to be unknown but simply a linear combination of a set of known features of the state, then we can formulate the problem of learning an optimal policy in a game-theoretic framework.

Our goal is to find a policy that maximizes the minimum reward over all possible weights on the state features. It turns out there is an algorithm based on multiplicative weight updates, due to Freund and Schapire, that after a finite number of iterations converges to a stationary distribution over policies that in expectation is as good as the optimal policy.

One cool thing about this work is that we can use it in the setting where we don’t have a teacher to learn from.
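The game-theoretic core can be sketched as a toy zero-sum matrix game solved with multiplicative weights: nature picks feature weights to make the learner look bad relative to the teacher, and the learner best-responds. The payoff matrix `G` below is invented for illustration; the paper's full algorithm works over MDP policies rather than a fixed matrix.

```python
import numpy as np

# Toy zero-sum game: rows are candidate policies, columns are reward features.
# G[i, j] = (feature-j expectation of policy i) - (feature-j expectation of
# the teacher), so positive entries mean we beat the teacher on that feature.
G = np.array([[ 0.5, -0.2],
              [-0.1,  0.4],
              [ 0.2,  0.1]])

n_feats = G.shape[1]
w = np.ones(n_feats) / n_feats       # adversary's weights over features
eta = 0.1
policy_counts = np.zeros(G.shape[0])

for t in range(1000):
    i = int(np.argmax(G @ w))        # learner best-responds to current weights
    policy_counts[i] += 1
    # Multiplicative weights: the adversary upweights features on which
    # the learner's margin over the teacher was small or negative.
    w *= np.exp(-eta * G[i])
    w /= w.sum()

mixed_policy = policy_counts / policy_counts.sum()
print(mixed_policy)  # mixture over policies approaching the minimax solution
```

The time-averaged mixture over best responses is the "stationary distribution over policies" mentioned above.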

Structured prediction is the class of classification problems where the labels can be structures (e.g., trees) rather than binary labels.

The traditional approach to structured prediction is to break the structure into small pieces and then feed these pieces to a classifier. However, such breaks also destroy the structural relations in the data. Structured prediction takes the structure into account and thus achieves somewhat higher accuracy.

In our CERT project, if we treat the whole-face expression as a structure rather than as individual AUs, that might fit into this framework.

software : SVM^{struct}

related model: conditional random fields.
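A minimal illustration of why joint decoding differs from classifying each piece independently: Viterbi decoding over a chain (the simplest structure a CRF or SVM^{struct} handles) combines per-position scores with pairwise transition scores. The numbers below are contrived so the two answers disagree.

```python
import numpy as np

def viterbi(unary, trans):
    """Find the jointly best label sequence.
    unary: (T, K) per-position label scores; trans: (K, K) transition scores."""
    T, K = unary.shape
    score = unary[0].copy()
    back = np.zeros((T, K), dtype=int)
    for t in range(1, T):
        cand = score[:, None] + trans       # (K, K): prev label x next label
        back[t] = np.argmax(cand, axis=0)
        score = cand[back[t], np.arange(K)] + unary[t]
    path = [int(np.argmax(score))]
    for t in range(T - 1, 0, -1):
        path.append(int(back[t][path[-1]]))
    return path[::-1]

# Unary scores alone would label position 1 as class 1 (0.6 > 0.4), but a
# strong penalty for switching labels makes the joint best path all zeros.
unary = np.array([[1.0, 0.0], [0.4, 0.6], [1.0, 0.0]])
trans = np.array([[0.0, -2.0], [-2.0, 0.0]])
print(viterbi(unary, trans))  # [0, 0, 0]
```

Classifying each position independently would return [0, 1, 0]; the joint decode keeps the "structural relation" (label smoothness) that piecewise classification throws away.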

Multiple Instance Active Learning

- key idea: we have a series of bags that are either positive or negative. Each bag contains a series of examples. We know that each negative bag contains no positives and each positive bag contains at least one positive. We assume the bag-level labels are easy to obtain. This work gives several strategies for selecting individual examples in the positive bags to query for labels. These strategies are more or less heuristic, but the results are strong. This is the same setup as the weakly supervised object detection problem.
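One simple member of that family of heuristic query strategies — not necessarily the paper's exact rule — is uncertainty sampling restricted to positive bags: query the instance whose current classifier score is closest to 0.5. A sketch, with made-up scores:

```python
# Multiple-instance setup: each positive bag is a list of instance scores
# (current classifier's probability of "positive"). Negative bags need no
# queries, since every instance in them is known to be negative.
def most_uncertain_instance(positive_bags):
    """Return (bag index, instance index) of the score closest to 0.5."""
    best = None
    for b, scores in enumerate(positive_bags):
        for i, s in enumerate(scores):
            u = abs(s - 0.5)
            if best is None or u < best[0]:
                best = (u, b, i)
    _, bag, idx = best
    return bag, idx

bags = [[0.9, 0.1, 0.2],   # bag 0: one confident positive
        [0.45, 0.05],      # bag 1: a near-boundary instance
        [0.8, 0.52]]       # bag 2: an even closer one
print(most_uncertain_instance(bags))  # (2, 1): score 0.52 is closest to 0.5
```

After the oracle labels the queried instance, the classifier is retrained and the loop repeats.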

Learning Monotonic transforms:

- really cool. Simultaneously learns an SVM and a monotonic transformation of the features. These monotonic transforms can model saturation effects and other nonlinearities.

Variational inference for network failures:

- interesting application of variational inference. Very similar to the idea of predicting a set of diseases from a set of observed symptoms. The system is an expert system in that it uses a noisy-OR model for the expression of a symptom given a disease, where the noisy-OR parameters are given. Some additional tricks are used, such as putting beta priors on the individual diseases.
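For reference, the noisy-OR likelihood itself is tiny: each present disease d independently fails to trigger the symptom with probability (1 - q_d), and a leak term covers unexplained causes. The parameter values below are invented for illustration.

```python
import numpy as np

def noisy_or(present, q, leak=0.01):
    """P(symptom = 1 | diseases).
    present: boolean array over diseases; q: per-disease activation probs;
    leak: probability the symptom fires with no disease present."""
    fail = (1.0 - leak) * np.prod(1.0 - q[present])
    return 1.0 - fail

q = np.array([0.8, 0.6, 0.3])            # P(symptom | only disease d present)
present = np.array([True, False, True])  # diseases 0 and 2 are present
print(noisy_or(present, q))  # ≈ 0.861
```

Exact inference over all disease configurations is exponential, which is what motivates the variational approximation in the paper.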

Learning Thresholds for Cascades using Backward Pruning (Work with Paul Viola)

A cool idea for picking thresholds. Train a classifier on all examples. At the end, select all positives that are above a certain threshold and then train a series of cascades. The threshold selected at each level of the cascade should guarantee that none of the positives that would survive to the end are removed.
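The threshold-setting step can be sketched in a few lines, assuming we have each positive example's cumulative score after every cascade stage (the arrays below are invented): first decide which positives must survive the full cascade, then set each stage's threshold to the minimum stage score among those positives, so none of them is ever rejected early.

```python
import numpy as np

def stage_thresholds(stage_scores, final_scores, final_threshold):
    """Backward pruning: thresholds that keep every positive that would
    pass the final threshold.
    stage_scores: (n_pos, n_stages) cumulative scores per positive example."""
    keep = final_scores >= final_threshold
    return stage_scores[keep].min(axis=0)

stage_scores = np.array([[0.2, 0.5, 1.1],
                         [0.1, 0.3, 0.9],
                         [0.4, 0.6, 1.5]])
final_scores = stage_scores[:, -1]
print(stage_thresholds(stage_scores, final_scores, final_threshold=1.0))
# [0.2 0.5 1.1]: example 1 (final score 0.9) is allowed to be pruned early
```

Anything the final classifier would have rejected anyway is free to be cut at the cheapest possible stage, which is where the cascade's speedup comes from.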

Audio tags for music: similar to work at UCSD except it uses a discriminative instead of a generative framework. Also, they test on a much harder dataset and use the tags to reproduce a notion of artist similarity induced by collaborative filtering. The people who did this work are aware of the work at UCSD.

Language recognition from phones:

They put a phone recognizer as a front end to an n-gram model for predicting which language the speech is from (multiclass: e.g., English, Spanish, German). A pruning algorithm is used to prevent combinatorial explosion in the number of features. Just thinking out loud, but is this a possible explanation for the loss of discrimination of certain phones that are not part of your native language?
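The n-gram back end is the familiar part: score each language by smoothed n-gram counts over the recognized symbol sequence. The sketch below substitutes character bigrams over text for phone bigrams over recognizer output, and the training strings are made up; a real system would work on phone lattices and apply the feature pruning mentioned above.

```python
import math
from collections import Counter

def bigrams(s):
    return [s[i:i + 2] for i in range(len(s) - 1)]

def train(texts):
    """Bigram counts pooled over a language's training strings."""
    return Counter(b for t in texts for b in bigrams(t))

def score(model, text, vocab_size=1000):
    """Add-one-smoothed log-likelihood of text under the bigram model."""
    total = sum(model.values())
    return sum(math.log((model[b] + 1) / (total + vocab_size))
               for b in bigrams(text))

models = {
    "english": train(["the cat sat", "this is the thing"]),
    "spanish": train(["el gato es", "esto es el"]),
}
test = "the thing"
print(max(models, key=lambda lang: score(models[lang], test)))  # english
```

Classification is just the argmax over per-language log-likelihoods, exactly as in the multiclass setup described above.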
