BB-Player: a HMM Planning Strategy for Blackjack

Description

We propose a new strategy for blackjack, BB-Player, which leverages Hidden Markov Models (HMMs) in online planning to sample a normalized predicted deck distribution for a partially-informed distance heuristic. Viterbi learning is applied to the most-likely sampled future sequence in

We propose a new strategy for blackjack, BB-Player, which leverages Hidden Markov Models (HMMs) in online planning to sample a normalized predicted deck distribution for a partially-informed distance heuristic. Viterbi learning is applied to the most-likely sampled future sequence in each game state to generate transition and emission matrices for this upcoming sequence. These are then iteratively updated with each observed game on a given deck. Ultimately, this process informs a heuristic to estimate the true symbolic distance left, which allows BB-Player to determine the action with the highest likelihood of winning (by opponent bust or blackjack) and not going bust. We benchmark this strategy against six common card counting strategies from three separate levels of difficulty and a randomized action strategy. On average, BB-Player is observed to beat card-counting strategies in win optimality, attaining a 30.00% expected win percentage, though it falls short of beating state-of-the-art methods.

Date Created
2023-05
Agent