# NIPS 2007: A Game Theoretic Approach to Apprenticeship Learning

Very cool idea. This builds on the work of Andrew Ng, who first introduced the idea of apprenticeship learning. The idea is to learn from a teacher, but instead of simply imitating the teacher (which we can prove will give a similar reward under certain assumptions) we try to do better. If we consider the reward function to be unknown but simply a linear combination of a set of known features of the state, then we can formulate the problem of learning an optimal policy in a game-theoretic framework.
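To spell out the linearity assumption, here is one way to write it. The notation ($\phi$, $w$, $\mu$, $\gamma$) is mine, chosen for illustration, and I'm assuming a discounted infinite-horizon setting:

$$
R(s) = w \cdot \phi(s), \qquad
V(\pi) = \mathbb{E}\left[\sum_{t=0}^{\infty} \gamma^t R(s_t) \,\middle|\, \pi\right] = w \cdot \mu(\pi), \qquad
\mu(\pi) = \mathbb{E}\left[\sum_{t=0}^{\infty} \gamma^t \phi(s_t) \,\middle|\, \pi\right]
$$

So the value of any policy is linear in the unknown weights $w$, which is what lets us treat the choice of $w$ as a move by an adversary.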

Our goal is to find a policy that maximizes the minimum reward over all possible weights on the state features. It turns out there is an algorithm based on multiplicative weight updates, due to Freund and Schapire, that after a finite number of iterations converges to a stationary distribution over policies that in expectation is as good as the optimal policy.
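To get a feel for the multiplicative-weights idea, here is a minimal sketch on a toy problem. It is not the paper's algorithm: I'm assuming a small finite set of candidate policies whose per-feature values are already known and scaled to [0, 1], stored in a hypothetical table `M[i][j]` (value of policy `j` on feature `i`). In the real setting the "best response" step would mean solving an MDP rather than a table lookup. The names `mw_maxmin` and `worst_case_value` are made up for this example.

```python
import math


def mw_maxmin(M, T):
    """Approximate a max-min mixture over policies with multiplicative weights.

    M[i][j] is the (assumed, hypothetical) value of policy j on feature i,
    scaled to [0, 1]. Returns the empirical distribution over the policies
    chosen during T rounds.
    """
    k, n = len(M), len(M[0])
    # Step size in the style of Freund and Schapire's analysis.
    beta = 1.0 / (1.0 + math.sqrt(2.0 * math.log(k) / T))
    w = [1.0 / k] * k          # adversary's weights over features
    counts = [0] * n           # how often each policy was the best response
    for _ in range(T):
        # Value of each policy under the adversary's current weights.
        values = [sum(w[i] * M[i][j] for i in range(k)) for j in range(n)]
        j_best = max(range(n), key=values.__getitem__)
        counts[j_best] += 1
        # The adversary shrinks the weight of features on which the chosen
        # policy did well, shifting mass toward features where it did badly.
        w = [w[i] * beta ** M[i][j_best] for i in range(k)]
        total = sum(w)
        w = [x / total for x in w]
    return [c / T for c in counts]


def worst_case_value(M, mix):
    """Minimum over features of the mixture's expected value."""
    return min(sum(row[j] * mix[j] for j in range(len(mix))) for row in M)


# Toy game: each policy is great on one feature and useless on the other,
# so the best guarantee comes from mixing the two.
M = [[1.0, 0.0],
     [0.0, 1.0]]
mix = mw_maxmin(M, T=1000)
worst = worst_case_value(M, mix)
print(mix, worst)
```

Note that the output is a distribution over policies, not a single policy: in this toy game either policy alone has a worst-case value of 0, while the mixture guarantees close to 0.5 on both features.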

One cool thing about this work is that we can use it in the setting where we don’t have a teacher to learn from.