Hi, and welcome back to the second part of the data format design series. In the first part I talked about the definition of data-driven design and some of its aspects such as format, readability and versioning. In this part I will show you how to keep nosy people out of your resources, how to bake content, and how to make it interoperable between programs and platforms.

Security

Some game programmers want to keep nosy people out of their game's guts. They may hide files in archives or compile them into executables. This kind of obfuscation will keep out many "sneak thieves", but real hackers and modders won't be bothered by it. Free programs like ResHacker or 7-Zip will open the doors for them. If you want to store your data in archives anyway, you can protect the archives with passwords.

A better way to protect game assets is encryption. Be warned: the runtime cost of decrypting the data at load time may be overkill for most of your files. Know what you are doing when you use encryption. It is a complex subject with many standards, possibilities and pitfalls, so I won't go into detail here. If you are really interested in encrypting sensitive game data, you should understand the basics, use established libraries (e.g. Crypto++ for C++) and think twice about what you are doing. Otherwise you will waste many man-hours and much CPU time on secrecy that can be broken nonetheless.

One last hint for people who want their games to be moddable: modding is the exact opposite of security and information hiding. It makes no sense to safeguard game data against inspection and changes when it is supposed to be inspected and changed. Think about which way you want to go when you do the initial data design.

Baking

Baking is an optimization technique which prepares data to be loaded faster. The following XML snippet defines the details of an airship hull in Nordenfelt:

<hull>
  <name>Copper Drakken</name>
  <price>5000</price>
  <hitpoints>8</hitpoints>
  <slots>
    <slot x="65" y="170">weapon</slot>
    <slot x="185" y="170">weapon</slot>
    <slot x="125" y="150">engine</slot>
  </slots>
</hull>

The file is easy for humans to read thanks to its tags. It has 217 characters (including whitespace), all of which have to be parsed by the loader code. Most of them carry no game data and are thrown away by the parser. When we strip off the tags we get this:

Copper Drakken
5000
8
65 170 weapon
185 170 weapon
125 150 engine

The new file has 66 characters left. That is a size reduction of about 70%, which theoretically triples the load speed if we assume that load time is proportional to file size. The values are still readable but lack the hints for their meaning. The structure of the slot list is also gone, which may lead to problems. Imagine two separate slot lists: there would be no clue where the first list ends and the second starts. One solution is to store the size of each list before its first entry.
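The list-size idea can be sketched in a few lines. This is only an illustration in Python, not Nordenfelt's actual loader: the names `Slot`, `Hull` and `parse_stripped_hull`, the field order, and the extra count line before the slots are all assumptions made for this example.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Slot:
    x: int
    y: int
    kind: str


@dataclass
class Hull:
    name: str
    price: int
    hitpoints: int
    slots: List[Slot]


def parse_stripped_hull(text: str) -> Hull:
    """Parse the tag-free format; the slot list is prefixed with its length."""
    lines = iter(text.strip().splitlines())
    name = next(lines)
    price = int(next(lines))
    hitpoints = int(next(lines))
    slot_count = int(next(lines))  # assumed extension: list size before the entries
    slots = []
    for _ in range(slot_count):
        x, y, kind = next(lines).split()
        slots.append(Slot(int(x), int(y), kind))
    return Hull(name, price, hitpoints, slots)


# The stripped file from above, with the slot count (3) added before the list.
STRIPPED = """Copper Drakken
5000
8
3
65 170 weapon
185 170 weapon
125 150 engine
"""

hull = parse_stripped_hull(STRIPPED)
```

Because the loader knows how many entries follow, a second list could be appended with its own count line without any ambiguity.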

The loader code will open the reduced file, parse its values and interpret them. Values like the hull name will be stored as strings, while price and hit points will be stored as integers. The loader has to convert the digit characters into numerical values. We can save this step too by storing the numbers as they will appear in RAM. This binary format is the fastest format for serializing data but lacks readability as well as platform independence. The last drawback is very important if you plan to bake your data for multiple machine architectures. The key word here is endianness: the byte order of multi-byte numbers differs between architectures.
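One way around the endianness problem is to fix the byte order in the file format instead of using whatever the current machine uses. The following Python sketch does this with the standard `struct` module. The byte layout, the `SLOT_KINDS` id table and the function names are assumptions for illustration, not the format Nordenfelt uses.

```python
import struct

# Assumed layout (little-endian, no padding):
#   uint16 name length, name bytes (UTF-8),
#   uint32 price, uint16 hitpoints, uint16 slot count,
#   then per slot: uint16 x, uint16 y, uint8 kind id
SLOT_KINDS = ["weapon", "engine"]  # hypothetical id table


def bake_hull_binary(name, price, hitpoints, slots):
    """Serialize a hull into bytes; "<" forces little-endian on every platform."""
    name_bytes = name.encode("utf-8")
    out = struct.pack("<H", len(name_bytes)) + name_bytes
    out += struct.pack("<IHH", price, hitpoints, len(slots))
    for x, y, kind in slots:
        out += struct.pack("<HHB", x, y, SLOT_KINDS.index(kind))
    return out


def load_hull_binary(data):
    """Read the hull back; no digit characters have to be converted."""
    (name_len,) = struct.unpack_from("<H", data, 0)
    offset = 2
    name = data[offset:offset + name_len].decode("utf-8")
    offset += name_len
    price, hitpoints, slot_count = struct.unpack_from("<IHH", data, offset)
    offset += struct.calcsize("<IHH")
    slots = []
    for _ in range(slot_count):
        x, y, kind_id = struct.unpack_from("<HHB", data, offset)
        offset += struct.calcsize("<HHB")
        slots.append((x, y, SLOT_KINDS[kind_id]))
    return name, price, hitpoints, slots


SLOTS = [(65, 170, "weapon"), (185, 170, "weapon"), (125, 150, "engine")]
baked = bake_hull_binary("Copper Drakken", 5000, 8, SLOTS)
```

A truly native dump would skip even the explicit byte order and copy structures straight from memory. That is faster still, but then the baked file only works on machines with the same byte order and structure layout.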

Games often have to load many small files. Each file must be opened, parsed and interpreted. Opening files can be a significant part of the overall load time. We can save time by concatenating small files into one big file and loading it in one go. A common example is the texture atlas, where many small textures are combined into one large texture. That is nothing new: old games already organized their sprites this way. The same concept can be used for other file formats like binary unit definitions or level files. The important thing is to combine files which will be loaded together, for example the different animations of a single enemy type:

[Image: Splatter House sprites]
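The concatenation idea works for arbitrary binary files, too. Here is a minimal Python sketch of a pack file with an index in front of the payloads. The layout and the names `pack_files` and `unpack_files` are made up for this example; real pack formats usually add alignment, compression or checksums.

```python
import io
import struct


def pack_files(files):
    """Concatenate {name: bytes} into one blob: entry count, index, payloads."""
    index = io.BytesIO()
    payload = io.BytesIO()
    for name, data in files.items():
        name_bytes = name.encode("utf-8")
        # Index entry: name length, name, payload offset, payload size
        index.write(struct.pack("<H", len(name_bytes)) + name_bytes)
        index.write(struct.pack("<II", payload.tell(), len(data)))
        payload.write(data)
    return struct.pack("<I", len(files)) + index.getvalue() + payload.getvalue()


def unpack_files(blob):
    """Split a blob produced by pack_files back into {name: bytes}."""
    (count,) = struct.unpack_from("<I", blob, 0)
    offset = 4
    entries = []
    for _ in range(count):
        (name_len,) = struct.unpack_from("<H", blob, offset)
        offset += 2
        name = blob[offset:offset + name_len].decode("utf-8")
        offset += name_len
        start, size = struct.unpack_from("<II", blob, offset)
        offset += 8
        entries.append((name, start, size))
    base = offset  # the payload section begins right after the index
    return {name: blob[base + start:base + start + size]
            for name, start, size in entries}


# Hypothetical animation files of one enemy type, packed into a single blob.
animations = {"walk.anim": b"\x01\x02", "attack.anim": b"\x03\x04\x05"}
blob = pack_files(animations)
```

In a game the blob would be written to disk once and later read with a single open call, so the per-file overhead is paid only one time.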

There are many more ways to speed up game data loading. The basic principle is to prepare the data in a form that is as close as possible to what the loader needs natively. Baking can also be called compilation, because programs are nothing but baked source code. Nordenfelt can load human-readable resources (XML) as well as binary formats. This way I avoided an explicit baking/compiling step and still got fast load times. The baking happens when the game loads an XML file for the first time and there is no corresponding binary file available, or when the XML file has changed. A kind of JIT compiler, so to speak.
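The bake-on-first-load logic could look roughly like the following Python sketch. It is not Nordenfelt's code: the `.bin` naming scheme, the comparison of modification times as the change check, and `pickle` as a stand-in for the binary format are all assumptions.

```python
import os
import pickle
import xml.etree.ElementTree as ET


def parse_xml(xml_path):
    """Slow path: read the human-readable file (flat child elements only)."""
    root = ET.parse(xml_path).getroot()
    return {child.tag: child.text for child in root}


def load_baked(xml_path):
    """Use the baked file if it is up to date, otherwise parse the XML and re-bake."""
    bin_path = xml_path + ".bin"  # hypothetical naming scheme
    if (os.path.exists(bin_path)
            and os.path.getmtime(bin_path) >= os.path.getmtime(xml_path)):
        with open(bin_path, "rb") as f:
            return pickle.load(f)  # fast path; stand-in for a real binary loader
    data = parse_xml(xml_path)
    with open(bin_path, "wb") as f:
        pickle.dump(data, f)  # bake for the next start
    return data
```

Comparing timestamps is cheap, but it can miss an edit that happens within the timestamp resolution of the file system; storing a hash of the XML file inside the baked file would be more robust. Also note that `pickle` should only be used for files you trust.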

Interoperability

Complex games often have editors for levels, characters, vehicles, etc. The editors create game data which will be loaded by the game. In this situation we have at least two independent programs: the editor(s) and the game. Therefore we can easily use different languages for the game code and the editor code. This is a common case because different languages have different advantages (e.g. C++ is the mighty all-rounder, C# has good GUI libraries, etc.). The exchanged game asset files have to be understood by both programs, so we need serialization code for the shared format in all participating programs. When we use XML, we have a wide range of libraries in different languages which provide loading, saving, validation and even stub code generation. This speeds up writing the serialization code. The platform-independent nature of XML makes the game data sharable across platforms, programs and networks. Think about modders sharing their creations between Linux, Windows and Mac. Imagine saved games which work everywhere.
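To give an impression of how little serialization code is needed with an existing XML library, here is a sketch that loads and saves the hull definition from the Baking section with Python's standard `xml.etree.ElementTree` module. The function names and the dictionary layout are assumptions for this example; an editor written in another language would use its own XML library against the same file.

```python
import xml.etree.ElementTree as ET

HULL_XML = """<hull>
  <name>Copper Drakken</name>
  <price>5000</price>
  <hitpoints>8</hitpoints>
  <slots>
    <slot x="65" y="170">weapon</slot>
    <slot x="185" y="170">weapon</slot>
    <slot x="125" y="150">engine</slot>
  </slots>
</hull>"""


def load_hull_xml(xml_text):
    """Turn the XML text into a plain dictionary."""
    root = ET.fromstring(xml_text)
    return {
        "name": root.findtext("name"),
        "price": int(root.findtext("price")),
        "hitpoints": int(root.findtext("hitpoints")),
        "slots": [(int(s.get("x")), int(s.get("y")), s.text)
                  for s in root.iter("slot")],
    }


def save_hull_xml(hull):
    """Write the dictionary back to XML text (without pretty-printing)."""
    root = ET.Element("hull")
    ET.SubElement(root, "name").text = hull["name"]
    ET.SubElement(root, "price").text = str(hull["price"])
    ET.SubElement(root, "hitpoints").text = str(hull["hitpoints"])
    slots = ET.SubElement(root, "slots")
    for x, y, kind in hull["slots"]:
        slot = ET.SubElement(slots, "slot", {"x": str(x), "y": str(y)})
        slot.text = kind
    return ET.tostring(root, encoding="unicode")


hull_data = load_hull_xml(HULL_XML)
```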

Nordenfelt has a built-in editor. This way it can use the same code base for serializing the edited files. Another way to save work is to wrap the whole serialization code in a shared library and reuse it through language bindings. You can use the SWIG tool to generate such bindings.

Conclusion

I hope this short article series gave you some ideas on how to improve your game data. A lot of problems can be avoided by simply thinking about the data in advance. XML became my weapon of choice when it comes to proprietary file formats. The size overhead may scare some optimization geeks, but it is a small price for the many advantages you gain from this format.

 

Cheers,
Thomas

 
