Mike Johanson

About Me

I am a Research Scientist in DeepMind's Edmonton office. Lately, I've been working on multiagent RL and on environments shared by humans and agents via virtual reality. My most recent paper is on the emergence of microeconomic behaviour in populations of RL agents, in an environment where they learn to produce, consume, and exchange goods. The agents are generic RL agents used in other environments, and they learn this behaviour from scratch, through trial and error, with no domain knowledge coded into them. In ways familiar to any Microeconomics 101 student, we can study the effects of supply or demand shifts, and we find that the agents often respond in the ways we predict: if one resource is made rarer in the environment, then the agents converge to a higher price for it. If we cause an increase in the price of a good, then agents respond by trying to produce more of it and consume less of it.

I defended my Ph.D. at the University of Alberta in the Department of Computing Science in Edmonton, Canada. I study artificial intelligence and machine learning, applied to the types of problems that humans find challenging and intriguing, such as competitive games of skill. I find ways to program (or more accurately, train) computers to perform as well as or better than the best human experts. Human experts can spend years studying a game like poker or chess and play it at a high level of skill, and the clearly defined rules and goals of such games allow us to make computer programs that can compete against them. By pitting human intelligence against artificial intelligence, we can directly measure the progress of our research towards producing computer agents that make good decisions.


Since 2006, I have been a member of the University of Alberta Computer Poker Research Group, where I have developed techniques for creating world-class poker agents. This work is largely driven by the Counterfactual Regret Minimization (CFR) algorithm, a self-play technique. Instead of directly programming the computer with how to act in each situation, we set it up to repeatedly play games against itself. At each decision, it estimates the value of taking each action, and then improves its strategy a little by choosing the better actions more often in future games. Over billions of games (which takes days or weeks), the program improves, steadily limiting how much it could lose to a perfect adversary. We can then use its strategy to play against human experts, as we did in our 2007 and 2008 Man-vs-Machine Poker Competitions. In 2008, our program Polaris defeated a team of top human experts at Heads-Up Limit Texas Hold'em, marking the first time a computer program had defeated human professionals in a meaningful poker match.
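The update loop described above, estimating each action's value and then shifting probability toward the actions with positive regret, is called regret matching, and it is the step that CFR repeats at every decision point of the game tree. As a minimal illustrative sketch (not our group's actual implementation, and with a toy game and a hypothetical fixed opponent standing in for full self-play), here is regret matching learning to exploit a rock-heavy opponent in rock-paper-scissors:

```python
def regret_matching(regrets):
    """Turn cumulative regrets into a strategy: actions with positive
    regret are played in proportion to that regret."""
    positives = [max(r, 0.0) for r in regrets]
    total = sum(positives)
    if total > 0:
        return [p / total for p in positives]
    return [1.0 / len(regrets)] * len(regrets)  # no signal yet: play uniformly

# Row player's payoffs in rock-paper-scissors (order: rock, paper, scissors).
PAYOFF = [[0, -1, 1],
          [1, 0, -1],
          [-1, 1, 0]]

# A fixed, exploitable opponent who over-plays rock (an assumption for the demo).
OPPONENT = [0.5, 0.25, 0.25]

def train(iterations=10000):
    regrets = [0.0] * 3
    strategy_sum = [0.0] * 3
    for _ in range(iterations):
        strategy = regret_matching(regrets)
        strategy_sum = [s + p for s, p in zip(strategy_sum, strategy)]
        # Expected value of each pure action, and of the current mixed strategy.
        action_values = [sum(o * PAYOFF[i][j] for j, o in enumerate(OPPONENT))
                         for i in range(3)]
        ev = sum(p * v for p, v in zip(strategy, action_values))
        # Regret: how much better each action would have done than the mix played.
        regrets = [r + v - ev for r, v in zip(regrets, action_values)]
    total = sum(strategy_sum)
    return [s / total for s in strategy_sum]  # average strategy over training

avg = train()
# The average strategy converges to the best response: (almost) always play paper.
```

In CFR proper, both players run this update simultaneously against each other, and it is the average strategy over all iterations, not the final one, that converges toward an unbeatable equilibrium.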

In January 2015, we used a new algorithm called CFR+ to solve the game of heads-up limit Texas hold'em, and our program Cepheus is now essentially unbeatable by any human or computer opponent, even by a perfect adversary with a copy of Cepheus' strategy and unlimited computation. This was the first human-scale imperfect information game to be solved, and the result was published in Science.

Copies of my research papers and short summaries can be found on my Publications page. If you're looking for one good summary of our work, I'd suggest my PhD thesis [PDF] from 2016. This paper-based thesis covers seven core papers that took us from our Polaris agent, which beat human pros in the Second Man-vs-Machine Poker Championship in 2008, to our Cepheus agent, which solved heads-up limit Texas hold'em in 2015. If you'd like to keep up to date with our group's progress, follow our group's Twitter feed, or follow me on Twitter.


May 02022: Multi-agent Economics

My most recent paper, on multi-agent economics, just hit arXiv: [Local link]. I've been working on this since around January 2018, shortly after joining DeepMind, and after four years I'm excited to share it.

July 02017: DeepMind

I'm now a Research Scientist at DeepMind's new Edmonton office.

March 02017: DeepStack

Our latest Science paper was just released, on DeepStack: a game-changing algorithm for tackling large imperfect information games. In December 2016, it became the first program to beat human pros at heads-up no-limit Texas hold'em. [DeepStack.ai]

September 02016: Joined Cogitai

I'm now working at Cogitai.

January 02016: PhD defended!

I defended my PhD this afternoon (January 14th)! Details and thesis are here: [HTML].