FarkleBot: An experiment in tabletop game AI

Six six-sided dice
(Photo credit: Flickr user philtoselli cc-by-nd)

A favorite activity for my partner and me is visiting brewery taprooms; often we’ll combine a trip with some kind of casual game (usually Odin’s Ravens 2e), but we don’t always haul that out with us. A few years ago, we grabbed a game called Farkle off the shelf at Denver Beer Company and gave it a try.

It’s quite straightforward: the components are only 6d6, and players take turns rolling all of them, drafting dice out of the most recent roll that meet scoring criteria, then either rolling again or banking their points and passing to the next player. There’s a cool press-your-luck angle: if a throw contains no scoreable dice, the player has hit the titular “Farkle”, and their turn ends with zero points[1].
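To make that concrete, here’s a minimal sketch of a scoring check in Ruby. I’m assuming the common basic rules (each 1 is worth 100, each 5 is worth 50, three-of-a-kind is worth 100x the face value, with three 1s worth 1000); this is illustrative, not the exact code in the repo.

```ruby
# Returns the best score obtainable by drafting out of `roll`,
# or 0 if the roll is a Farkle.
def best_score(roll)
  counts = roll.tally # e.g. [1, 5, 5, 3] => {1=>1, 5=>2, 3=>1}
  score = 0
  counts.each do |face, count|
    if count >= 3
      # Three-of-a-kind: 100x face, except three 1s score 1000.
      score += (face == 1 ? 1000 : face * 100)
      count -= 3
    end
    # Leftover single 1s and 5s score individually.
    score += count * 100 if face == 1
    score += count * 50  if face == 5
  end
  score
end

best_score([1, 5, 5, 3, 2, 6]) # => 200 (one 1, two 5s)
best_score([2, 3, 4, 6, 6, 2]) # => 0, a Farkle
```

A roll is a Farkle exactly when this returns 0, which is all a turn loop needs to know to end the turn.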

A conversation I encounter with some regularity in the board game space (and rightly so) is the notion of automated testing — codifying all of a design’s rules into software, and running repeated simulations to ferret out mathematical edge cases that playtesters might miss. In fact, from a certain viewpoint, it sometimes feels absurd that this isn’t a more common practice.

On the other hand, something has always made me skeptical of the utility of doing so: correct, working code isn’t exactly easy to write, and in-progress board game rules are usually in continuous flux during testing. Not to mention, playing games well requires a lot of abstract thought, so a simulation that plays well is harder still. And this type of stochastic simulation reveals nothing to a designer about the actual fun, so its overall utility is limited to finding math issues.

With all this in my head during our most recent Farkle throwdown, I realized that this game could be an easy way to try it out — each turn is discrete, so there’s no interaction between players, and within a turn, each roll of the dice is discrete too, because drafted scoring dice don’t combine with previously-drafted scoring dice[1].

BravoBot doesn’t fuck around with scores less than 500.

I coded it up one morning, just to see how much work it would take; in only a few hours I had not only the game rules built, but also a harness for running various ‘Bots’ with distinct strategies, each making drafting and passing choices based on a given dice roll.
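As an illustration of what a Bot strategy boils down to, here’s a hypothetical sketch: the class and method names are my invention (the repo’s readme has the real interface), and the drafting logic is simplified to just 1s and 5s.

```ruby
# A press-your-luck strategy: draft scoring dice, and keep rolling
# until the turn's running total clears a fixed threshold.
class ThresholdBot
  def initialize(threshold = 300)
    @threshold = threshold
  end

  # Which dice to draft out of the current roll.
  # (Simplified: only single 1s and 5s; real logic would handle triples.)
  def draft(roll)
    roll.select { |die| die == 1 || die == 5 }
  end

  # Bank or press on? Roll again while below the threshold
  # and there are dice left to throw.
  def roll_again?(turn_score, dice_remaining)
    turn_score < @threshold && dice_remaining > 0
  end
end
```

The harness only needs to call these two decision points each roll; everything else (scoring, Farkle detection, banking) lives in the game rules.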

The source is up on GitHub; it needs a recent version of Ruby to run, and I haven’t tested it on other systems, but it has no external dependencies (yet), so I suspect it’s not hard to get running. There are instructions in the readme for building additional Bots, and if anyone opens a pull request with some interesting Bot behavior, I’m happy to review and merge it.

Currently, when running a bot, the final outcome is a count of how many turns it took to achieve a winning score of 10,000 (Farkle is decidedly ‘multiplayer solitaire’, given that players don’t impact each other at all, so this seems like a fair proxy for how “good” a Bot is[1]). Next, I might set up a runner that collects stats from multiple games, because the random nature of the game has a massive impact even for a hypothetically optimal strategy. With that, I could rank Bots by some combination of the mean and variance of their turns-to-victory, which I think (my stats skills are weak…) would make it possible to say “Strategy X beats Strategy Y N percent of the time”. I might need help with that part 🙂
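The stats half of that is simple enough to sketch: given a list of turn counts from many simulated games, compute the mean and (population) variance. Feeding it results from the harness is left as the hypothetical part.

```ruby
# Summarize many games' turns-to-victory for one Bot.
def turn_stats(turn_counts)
  n = turn_counts.size.to_f
  mean = turn_counts.sum / n
  variance = turn_counts.sum { |t| (t - mean)**2 } / n
  { mean: mean, variance: variance }
end

turn_stats([12, 15, 11, 14, 13]) # => {:mean=>13.0, :variance=>2.0}
```

Comparing two Bots head-to-head (“X beats Y N percent of the time”) would mean pairing up sampled games and counting wins, which is where my stats knowledge runs out.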

Overall, given that I was on the fence before about the value of building a code-based simulation of a tabletop game for testing purposes, this exercise has only entrenched my stance: it was pretty easy to do in this case, but the ruleset is almost laughably trivial, as is analyzing state to choose actions. So I’m still firmly on the fence for more complex cases, i.e. any other game.

You can go check out the code on GitHub; as I was developing, I was pretty intentional with my commits, so if you’re new to programming and want to see how someone who generally passes for a senior engineer starts a project and iterates complexity into it, the commit history might be interesting to you as well.


[1] Yes, I’m aware that there are other rule variants: some with more complex scoring patterns, some that create player interaction, and some that preserve state between rounds. The first few times we played, the rules we found didn’t include these, so that’s not how we play. PRs are open on the repo if you feel compelled to add those rules.