Documentation ¶
Overview ¶
Package cfr implements functions and structures for calculating epsilon Nash equilibrium of an abstracted poker game using counterfactual regret minimization.
References:
Martin Zinkevich, Michael Johanson, Michael Bowling, and Carmelo Piccione. Regret mini-mization in games with incomplete information. In NIPS07, 2008. Annotation: Technical proofs of minimization bounds and horrible pseudo code. M.B. Johanson, Robust strategies and counter-strategies: Building a champion level computer poker player, Master’s thesis, University of Alberta, 2007 Annotation: Good general overview and optimization details. M. Osborne and A. Rubenstein. A Course in Game Theory. The MIT Press, Cambridge, Massachusetts, 1994. Annotation: Explanation of a finite, extensive game with imperfect information. J. Rubin, I. Watson. Computer poker: A review. In Artificial Intelligence 175 (2011) 958–987. Annotation: General overview and mention of several bucketing strategies.
Index ¶
Examples ¶
Constants ¶
This section is empty.
Variables ¶
This section is empty.
Functions ¶
func CFR ¶
CFR returns the counterfactual regret for each action in a node given the expected value for each action, the node's current strategy, and the probability of the opponent reaching the node.
Example ¶
actEV := []float64{-3, 6, 9} strat := []float64{1.0 / 3, 1.0 / 3, 1.0 / 3} fmt.Println(CFR(actEV, strat, 0.5))
Output: [-3.5 1 2.5]
func NewStrategy ¶
NewStrategy calculates a regret minimizing new strategy given a set of cumulative counterfactual regret.
Example ¶
fmt.Println(NewStrategy([]float64{-3.5, 1, 2.5}))
Output: [0 0.2857142857142857 0.7142857142857143]
Types ¶
type Bucket ¶
type Bucket struct {
Classes [1]interface{}
}
A Bucket node nodes represents where information about the cards is observed. It has a child node (an opponent or player node) for each different class that could be observed at that point.
type Opponent ¶
type Opponent struct {
Actions [3]interface{}
}
An Opponent nodes represents where the opponent takes an action. It has a child node for each action.
type Player ¶
type Player struct { Actions [3]interface{} Regret [3]float64 // Accumulative regret wrt each action. Strat [3]float64 // The probability of taking each action. }
A Player node represents where the current player takes an action. It contains the average regret with respect to each action, the total probability for each action until this point, and a child node for each action (either an Opponent, Bucket, or Terminal node). There is an implicit information set associated with this node, which we will write as I(n).
type Terminal ¶
type Terminal float64
A Terminal node is a node where the game ends due to someone folding or a showdown. Given the probability of a win, loss, and tie, it has sufficient information to compute an expected utility for the node that was reached.
In other words, the Terminal node holds the size of the pot when the game ends.