Unpacking Encrypted Game Files
In the last post I described how to unpack the contents of a proprietary file format. That format was fairly straightforward, with neither encryption nor compression, so it was possible to grab the files stored inside without even touching a debugger (we used one, though it was not necessary). This time, however, things will get a little more interesting. Today we will tackle a game from the Crazy Chicken series, more precisely Crazy Chicken Kart 2 (Moorhuhn Kart 2 in the original).
Getting started
First we need to inspect the files in the game’s installation directory to see where the assets are stored. There is a folder named data which contains a file named mhk2-00.dat. It is the largest file, taking up about 140 MB. This will be our target.
When we open the file in a hex editor we can see this:
Let’s guess
We can try to guess the structure of the file without using any debugger. Imagine you are a game developer and you are tasked with writing a parser that will unpack and load required files to the game memory. What kind of information is needed?
- File count - it is possible to create a file format without it, for example you can put the file header at the end of the content, and then iterate over elements until you meet EOF (end of file), but generally the file count is a part of a file format
- Location of entry name/id - There must be a way to identify each entry. It can be, for example, a numeric ID or a filename, just like a regular file on disk
- A way to obtain the location of the entry content and its length
There are plenty of ways such information can be stored. Let’s see what we can assume just by looking at the previous image. The very first 16 bytes form the string Moorhuhn Kart 2. We can treat it as a file header that confirms we are dealing with the right file. Multiple filenames can be seen. From the beginning of one filename to the beginning of the next there are exactly 0x80 bytes. The same holds for the following entries when we scroll down. For now we can assume that this space is dedicated to describing a single file entry. Inside each such fragment there are probably our two missing elements: the offset of the entry content and its length. There are two 32-bit integers. One points somewhere much further into the file, while the other is a much smaller integer, so the first is most likely the data offset and the second the data length.
Now, just after the Moorhuhn Kart 2 string we can see the integer 0x456. For now, assume this is the file count. The first file entry starts at offset 0x40. Each entry is 0x80 bytes long, and there are 0x456 files. So if 0x456 really is the file count, the last file entry ends at offset 0x40 + 0x456 * 0x80 = 0x22B40. Let’s check! And indeed, 0x22B40 looks like the beginning of the data and at the same time the end of the file entries. To be even more sure, let’s look at the data offset of the first file entry: it is also 0x22B40! This is even better proof that 0x456 is indeed the file count and that we are reading the data offset correctly.
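The arithmetic behind this check is easy to reproduce. The snippet below uses only the three values read from the hex dump; the constant names are mine, chosen for illustration.

```python
# Values read from the hex dump of mhk2-00.dat
HEADER_SIZE = 0x40   # offset of the first file entry
ENTRY_SIZE = 0x80    # distance between two consecutive filenames
FILE_COUNT = 0x456   # integer found right after the "Moorhuhn Kart 2" string

# End of the entry table = start of the packed data
table_end = HEADER_SIZE + FILE_COUNT * ENTRY_SIZE
print(hex(table_end))  # 0x22b40
```

If the guess about the file count were wrong, this value would land in the middle of an entry or inside file data instead of matching the first entry’s data offset.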
Unpacking
Summarizing all the information obtained above, we can write a simple Python script that iterates over all the file entries, then extracts and saves their content under the appropriate names.
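A minimal sketch of such a script could look like the one below. The header size, entry size and file count come from the hex dump. The exact position of the count (COUNT_OFFSET), the length of the name field (NAME_SIZE), the placement of the offset/length pair directly after the name, and little-endian byte order are assumptions made for illustration, so adjust them to what your hex editor shows.

```python
import struct
from pathlib import Path

HEADER_SIZE = 0x40    # first entry starts here (from the hex dump)
ENTRY_SIZE = 0x80     # size of one file entry (from the hex dump)
COUNT_OFFSET = 0x10   # ASSUMPTION: count sits right after the 16-byte string
NAME_SIZE = 0x68      # ASSUMPTION: NUL-padded name fills the start of an entry
# ASSUMPTION: two little-endian uint32 fields (offset, length) follow the name


def read_entries(blob: bytes):
    """Yield (name, offset, length) for every entry in the table."""
    (count,) = struct.unpack_from("<I", blob, COUNT_OFFSET)
    for i in range(count):
        base = HEADER_SIZE + i * ENTRY_SIZE
        raw_name = blob[base:base + NAME_SIZE].split(b"\0", 1)[0]
        offset, length = struct.unpack_from("<II", blob, base + NAME_SIZE)
        yield raw_name.decode("latin-1"), offset, length


def unpack(archive_path: str, out_dir: str) -> int:
    """Write every entry below out_dir and return the number of files written."""
    blob = Path(archive_path).read_bytes()
    root = Path(out_dir).resolve()
    written = 0
    for name, offset, length in read_entries(blob):
        # Stored names may use Windows separators; normalise them
        target = (root / name.replace("\\", "/")).resolve()
        if root not in target.parents:
            continue  # skip names that would escape the output directory
        target.parent.mkdir(parents=True, exist_ok=True)
        target.write_bytes(blob[offset:offset + length])
        written += 1
    return written
```

Calling unpack("data/mhk2-00.dat", "unpacked") would then recreate the directory tree stored in the archive. Reading the whole 140 MB file into memory keeps the code short; a streaming version with seek() would work just as well.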
It looks easy, doesn’t it? So far it might be even easier than the format from the previous blog post. However, at the beginning of this post I promised that this one would be more interesting, and I didn’t lie.
What is inside?
Inside the root directory there are two items: config.txt and a directory mk2. Let’s take a look inside mk2:
- items - Contains different textures and presumably 3D objects of different game items
- karts - Animations of all the playable characters in the game
- lensflares - Textures of flares
- level0X - Configuration, textures, music, 3D objects and animations for different levels
- menu - Music and textures to be displayed in the main menu
- misc - Fonts and HUD textures
- settings - Encrypted configuration of different karts. Looks interesting in terms of modding
- sfx - Different sound effects: collision, engine etc.
- text.csv - Translation of subtitles in different languages - interesting if you want to translate the game
Examining the results
When we look at the unpacked data, it looks almost right. We can see the images and hear the sound effects. However, when we open a file with the txt extension, we are presented with gibberish:
It looks like the authors of the game decided to obfuscate the content of the text files; otherwise the text could easily be replaced directly in the .dat file, even without unpacking. As there are no checksums in the .dat file, this could lead to cheating (presumably the .txt files contain the configuration of the speed of different vehicles etc.). To read the real content of the file we need to take another approach. Without knowing the algorithm used to decipher the content, it is almost impossible to progress further. Even if the algorithm were known, we would still need to obtain the decryption key. Remember, though, that the game must be able to read and understand the file content, which implies that it knows the deciphering procedure. Since we have access to the game executable, we can discover this procedure too.
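One quick way to confirm which unpacked files are obfuscated is to measure how much of each file is printable ASCII. This is only a heuristic sketch and not part of the original workflow; the function name is hypothetical.

```python
import string

# Byte values that normally appear in a plain-text configuration file
PRINTABLE = set(string.printable.encode("ascii"))


def printable_ratio(data: bytes) -> float:
    """Return the fraction of bytes in data that are printable ASCII."""
    if not data:
        return 1.0
    return sum(b in PRINTABLE for b in data) / len(data)
```

A readable text file scores close to 1.0, while obfuscated content usually scores noticeably lower, which makes it easy to list the files that need the extra decryption step.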
Reversing the text file decryption method
By observing the unpacked files we can see that there is a directory for each game level. Inside each of them there are 4 folders: music, objects, settings and textures8bit. Going deeper, inside settings we can find 3 more folders: display, misc and objects. Each of them contains a .txt file whose name corresponds to the directory name, so inside display you can find display.txt, and so on. As mentioned previously, all the txt files are encrypted.
Right now we are not interested in the way the game parses mhk2-00.dat; we already know it. By making a list of all string references we can spot references to objects.txt.
This is probably used to grab the configuration file of the level we want to play. Let’s put breakpoints on all the references and run the game.
After we hit a breakpoint, let’s put another one on the ReadFile WinAPI call. It is possible that the whole file was read into memory at program start, but loading all the level data at once would be a waste of precious RAM (especially in 2003, when the game was created), so it is more likely that levels are read from disk as needed. When we resume execution we hit the breakpoint on ReadFile, and the buffer is filled with the encrypted data, just as desired.
Interestingly, the first objects.txt to load is from the level06 directory. Nevertheless, we continue our journey to discover the decryption routine. To do so, we need to put a hardware breakpoint on access at the beginning of the buffer filled with the encrypted data.
And boom, we landed in a function that performs XORs and shift operations; this looks like some kind of decryption routine. Since at first glance it is hard to judge what exactly this function does, it is a good idea to let a decompiler do the work for us. After opening the game executable in IDA and navigating to the same address (0x0450384) that we saw in the debugger, we are presented with this view:
Not really clean, but after renaming some variables and changing their types it looks much better:
Let’s add this function to our Python script. Take a look at the last if statement in the script:
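A minimal sketch of that final step is shown below, under loudly stated assumptions. The real routine is the one recovered from the decompiler. The toy_decrypt function here is a hypothetical stand-in (XOR with a rotating 8-bit key) and is not the game’s algorithm. It only illustrates the shape of the hook, where content is passed through the decryption function when the entry name ends with .txt.

```python
from typing import Callable


def toy_decrypt(data: bytes, key: int = 0x5A) -> bytes:
    """HYPOTHETICAL stand-in cipher, NOT the game's routine.

    XORs every byte with an 8-bit key that is rotated left by one bit
    after each byte, just to have some XOR/shift-based transform to plug in.
    """
    out = bytearray()
    for b in data:
        out.append(b ^ key)
        key = ((key << 1) | (key >> 7)) & 0xFF  # rotate key left by 1 bit
    return bytes(out)


def postprocess(name: str, data: bytes,
                decrypt: Callable[[bytes], bytes] = toy_decrypt) -> bytes:
    """Return the bytes to save for an entry, decrypting only text files."""
    if name.lower().endswith(".txt"):
        data = decrypt(data)
    return data
```

In the unpacker, the bytes read for each entry would go through postprocess(name, data, decrypt=real_routine) right before being written to disk, where real_routine is a Python port of the decompiled function.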
And check how it works now:
As you can see our effort was worth it! Now the text file is decrypted and we are able to read its content.
Conclusion
Depending on the method used by the game manufacturer, sometimes it is not possible to unpack data files just by guessing the file structure. When we are dealing with obfuscation and/or encryption we need to reverse engineer the executable to obtain the decryption method. Thanks for reading.