Puzzle Dork: RDRQ

Thursday, December 13, 2012

Preventing Score Forgery on Online Leader Boards

Hey guys,

This semester I took a class on computer security with a completely open-ended final project. Since I somehow trick all of my final projects into being about games, the problem my group decided to look at and try to solve was that of fake scores on leader board services such as the iOS Game Center, Kongregate, and others.

As outlined in this article, cheating can be pretty easy and prevalent on these leader boards, particularly on the iOS Game Center. Speaking as someone who was playing Super Hexagon way too much earlier this semester primarily because of competition at the top of the leader boards, I know firsthand how polluted leader boards can negatively affect a game. For example, here's the leader board from when I managed to reach number two in the world:

My Pride

And here it is now:

As you can imagine, seeing a score five orders of magnitude higher than what seems humanly possible can be a real deterrent for people trying to play legitimately.

With that in mind, what did we do to try to fix this problem? If you want to read a big technical document, check out the paper I wrote on the project. Otherwise, here's the condensed version.

Most exploits work by just communicating with the leader board server pretending to be the game and saying "I got [X] Score!" In most systems that we researched, the server will simply say "Okay!" and that's it. Way too easy.

An obvious attempt to prevent this is to have the game somehow prove that it's the one talking to the server (as opposed to some shady cheater). This is usually done by having the game cryptographically sign score postings with a secret it shares with the server, an approach taken by Newgrounds, Adultswim.com, and others. The problem with this approach is that the secret shared by the game and the server has to be stored on the player's machine. If the player is relatively resourceful, it's not too hard to peek into memory or use a decompiler to find the secret, leak it to the internet, and then we're back to square one.
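To make that weakness concrete, here's a minimal Python sketch of the shared-secret approach. It is only an illustration: the function names, the message format, and the choice of HMAC-SHA256 are my assumptions, not how any of the sites above actually sign their score postings.

```python
import hashlib
import hmac

# Hypothetical shared secret. In a real game it ships inside the client,
# which is exactly the weakness described above.
SHARED_SECRET = b"not-actually-secret"


def sign_score(player: str, score: int, secret: bytes = SHARED_SECRET) -> str:
    """Client side: sign a score posting with the shared secret."""
    message = f"{player}:{score}".encode("utf-8")
    return hmac.new(secret, message, hashlib.sha256).hexdigest()


def server_accepts(player: str, score: int, signature: str,
                   secret: bytes = SHARED_SECRET) -> bool:
    """Server side: recompute the signature and compare in constant time."""
    expected = sign_score(player, score, secret)
    return hmac.compare_digest(expected, signature)


# A forger who doesn't know the secret gets rejected...
forged_ok = server_accepts("cheater", 999_999_999, "0" * 64)

# ...but once the secret has been pulled out of the client, any score
# can be signed and the server has no way to tell the difference.
leaked_ok = server_accepts("cheater", 999_999_999,
                           sign_score("cheater", 999_999_999))
```

The signature only proves that the sender knows the secret, not that anyone actually played the game.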

For our project, we built a system that verifies claimed scores based on a simple assumption. We assumed that a player can be said to have "earned" a score if they are able to produce a series of inputs that will reproduce the score when run on a trusted version of the game's code. This assumption isn't always valid, particularly if players are cheating by playing the game with bots, but it seems to be reasonable enough to prevent straight-up forgeries.

Using this assumption, we built a library for the Flixel game engine that makes use of Flixel's replay system (originally a debugging feature) to record input logs and send them to the score server where they can be verified by a trusted version of the game code.
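Here's a rough sketch of the verification idea in Python rather than Flixel's ActionScript. The toy game, the input log format, and every name in it are made up for illustration; the real library relies on Flixel's replay recordings instead.

```python
from typing import List, Tuple

# Hypothetical input log: (frame number, key pressed on that frame).
InputLog = List[Tuple[int, str]]

# Toy game: the player walks along a line and each coin is worth 10 points.
COIN_POSITIONS = {2, 3, -1}


def simulate(inputs: InputLog, total_frames: int) -> int:
    """Trusted, deterministic copy of the game: replay the inputs, return the score."""
    keys = dict(inputs)
    remaining = set(COIN_POSITIONS)
    position = 0
    score = 0
    for frame in range(total_frames):
        key = keys.get(frame)
        if key == "R":
            position += 1
        elif key == "L":
            position -= 1
        if position in remaining:
            remaining.remove(position)
            score += 10
    return score


def verify_submission(claimed_score: int, inputs: InputLog,
                      total_frames: int) -> bool:
    """Server side: accept a score only if the replayed inputs reproduce it."""
    return simulate(inputs, total_frames) == claimed_score


# Walking right for three frames picks up the coins at positions 2 and 3.
honest_log = [(0, "R"), (1, "R"), (2, "R")]
honest_ok = verify_submission(20, honest_log, total_frames=5)

# The same log can't back up a made-up score.
forged_ok = verify_submission(2_000_000, honest_log, total_frames=5)
```

This only works if the game is deterministic for a given input log, so anything random would have to be seeded identically on both sides. It also does nothing against a bot that generates a valid log, which is the limitation mentioned above.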

The server hard at work

The system seems to work pretty well. Depending on the game, the logs sent to the server grow at somewhere between 30 KB and 1 MB per hour, and there's room for some heavy optimization. Also, we were able to modify my game Roller Derby Riot Queens (sadly very vulnerable to score forgery) to use the system with fewer than 10 additional lines of code. Not too bad!

Other game engines can easily be extended to provide this protection. If you're interested in learning more, I'd recommend checking out the full paper.

Thanks for reading!

Wednesday, December 5, 2012

Making Failure Fun and Why it Matters

Here's a lesson I had to learn the hard way about making failure in games entertaining.

It came to me while I was watching players play both Conway's Inferno and Roller Derby Riot Queens, a game that I made in a week over the summer, which was kind of a failure itself.

The main idea is that players will be more tolerant of failure when it's entertaining. Kind of obvious when stated like that, but it wasn't something I thought about much before witnessing it. To illustrate, here is a common pattern I observed whenever I watched a new player play Conway's Inferno. Below is the first level from the game.

When confronted with this level and minimal instructions, players would almost always do one of two things. Some players would place a fire on the cells in the center like so:

which of course is the "solution" and results in the player winning the first level and gaining an understanding of the game's mechanics.

The other major thing the players would do is set one of the trees on fire like this:

Which would lead to a big forest fire:

And ultimately the game's minimal failure screen:

But despite failing, the players who chose the second option would often just laugh and reset the game. My takeaway is that even though they failed, at least they got to burn a forest down, and burning stuff down is at least marginally entertaining.

In contrast, in RDRQ, when the player fails:

The player disappears and a failure screen pops up. Now that's not particularly entertaining, but at least it's better than RDRQ's other failure screen...

Which is the "the level has become unsolvable, but the game doesn't tell you that or limit your moves or anything, so shucks!" screen. Dang, this is basically an embarrassing example of the worst kind of failure a game can present you with. Not only is it not entertaining, it's not obvious at all!

After releasing this thing, I watched players play the same level for ages without realizing there was no hope in the universe of solving it and that they should just hit reset. Needless to say, they weren't too enthused to keep playing after encountering this style of failure.

Now you might ask why I let a game with such an obvious, glaring flaw into the light of day. Well, the answer is that I didn't really do much play-testing for RDRQ. As I mentioned before, it was made in a week, but what I didn't mention before is that it was made in a work week. So yep, after working on this thing every day in my spare time after work, I was totally sick of it and wanted to shove it out the door immediately. Also, unfortunately, of the play-testers I did have, most had watched me develop the game, so they kinda intuitively knew when they had fubar'd the level. Oh well, at least it was an entertaining failure for me (kinda).

Anyways, one takeaway from these two experiences is that it's usually a good idea to make failure both entertaining and obvious when designing a game, particularly if your game involves a lot of failure. This is something games like Spelunky, Super Meat Boy, and even Angry Birds do a great job with.

My other takeaway is that I basically always always always always need to test my games. Yup.

Thanks for reading!