Zelda 64 Code Spelunking with Ben Villalobos

Long time, no Codex! Need something to listen to while you read? Give some classic Zelda 64 music a listen.

I love this sort of content, and I always tune in when videos or series like this crop up on YouTube, so I had to share this one.

For some background, the Zelda 64 decompilation project has been complete for some time. Essentially, the game was written in C before being compiled to machine code and copied onto the game cartridge that we know and love. The decompilation project was a community effort to take that machine code and translate it as closely as possible back into the original C code that it was written in. Compiling the C should produce a byte-for-byte copy of the original ROM.

The newest episode of Ben’s series digging into this code has him working through scene tables, which showcases an interesting use of the C preprocessor that I hadn’t encountered before.

In the ROM, there are a boatload of variables defined that contain information about the 100 or so scenes in the game. These variables, along with other information, are ultimately stored in a SceneTableEntry array named gSceneTable[]. Glancing at the code makes it seem like this is done dynamically during run-time, but it’s actually all done statically using some clever C preprocessing to bake everything directly into the ROM.

The definitions for these scenes are stored in the file scene_table.h. This file starts out like so:

/**
 * Scene Table
 *
 * DEFINE_SCENE should be used for all scenes (with or without a title card, see argument 2)
 *    - Argument 1: Name of the scene segment in spec
 *    - Argument 2: Name of the title card segment in spec, or `none` for no title card
 *    - Argument 3: Enum value for this scene
 *    - Argument 4: Scene draw config index
 *    - Argument 5: ? (Unknown)
 *    - Argument 6: ? (Unknown)
 */
/* 0x00 */ DEFINE_SCENE(ydan_scene, g_pn_06, SCENE_DEKU_TREE, SDC_DEKU_TREE, 1, 2)
/* 0x01 */ DEFINE_SCENE(ddan_scene, g_pn_08, SCENE_DODONGOS_CAVERN, SDC_DODONGOS_CAVERN, 1, 3)
/* 0x02 */ DEFINE_SCENE(bdan_scene, g_pn_07, SCENE_JABU_JABU, SDC_JABU_JABU, 1, 4)
/* 0x03 */ DEFINE_SCENE(Bmori1_scene, g_pn_01, SCENE_FOREST_TEMPLE, SDC_FOREST_TEMPLE, 2, 5)

This is all well and good, and not very interesting on its own. What really caught my eye was the definition of DEFINE_SCENE(). Or rather, the definitions of DEFINE_SCENE(), which we can find by taking a look at z_scene_table.c [1]:

#define DEFINE_SCENE(name, title, _2, _3, _4, _5) \
    DECLARE_ROM_SEGMENT(name)                     \
    DECLARE_ROM_SEGMENT(title)

#include "tables/scene_table.h"

#undef DEFINE_SCENE

// Scene Table definition
#define DEFINE_SCENE(name, title, _2, drawConfig, unk_10, unk_12) \
    { ROM_FILE(name), ROM_FILE(title), unk_10, drawConfig, unk_12, 0 },

// Handle `none` as a special case for scenes without a title card
#define _noneSegmentRomStart NULL
#define _noneSegmentRomEnd NULL

SceneTableEntry gSceneTable[] = {
#include "tables/scene_table.h"
};

#undef _noneSegmentRomStart
#undef _noneSegmentRomEnd

#undef DEFINE_SCENE

And for some final context, I’ll share the DECLARE_ROM_SEGMENT() and ROM_FILE() definitions from segment_symbols.h and romfile.h respectively:

// segment_symbols.h
#define DECLARE_ROM_SEGMENT(name)         \
    extern u8 _##name##SegmentRomStart[]; \
    extern u8 _##name##SegmentRomEnd[];

// romfile.h
#define ROM_FILE(name) \
    { (uintptr_t)_##name##SegmentRomStart, (uintptr_t)_##name##SegmentRomEnd }

I just found this to be such a clever trick. What’s going on here, as is discussed in Ben’s newest video, is that when the DEFINE_SCENE() macro is first defined, it’s defined as such:

#define DEFINE_SCENE(name, title, _2, _3, _4, _5) \
    DECLARE_ROM_SEGMENT(name)                     \
    DECLARE_ROM_SEGMENT(title)

Then the file includes scene_table.h, every line of which invokes the macro we just defined. What does that macro do? It invokes yet another macro, DECLARE_ROM_SEGMENT(), once for the scene’s name and once for the scene’s title, and ignores the remaining arguments (hence the placeholder parameter names _2 through _5). OK, so what does that macro do?

#define DECLARE_ROM_SEGMENT(name)         \
    extern u8 _##name##SegmentRomStart[]; \
    extern u8 _##name##SegmentRomEnd[];

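Well, it’s declaring two extern arrays for each and every name and title, of course: one marking where that segment starts in the ROM and one marking where it ends. Note that these are only declarations. Nothing is defined here, so the symbols themselves have to come from somewhere else in the build.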
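To see the first pass in isolation, here is a minimal sketch that compiles on its own. The two macros have the same shape as the ones quoted above, but everything prefixed with demo or DEMO is made up for illustration, and the hand-written array definitions are an assumption standing in for symbols that the real build supplies from elsewhere.

```c
#include <stddef.h>

typedef unsigned char u8;

/* Same shape as the macros quoted above: ## pastes the argument
 * into the middle of a symbol name. */
#define DECLARE_ROM_SEGMENT(name)         \
    extern u8 _##name##SegmentRomStart[]; \
    extern u8 _##name##SegmentRomEnd[];

#define DEFINE_SCENE(name, title, _2, _3, _4, _5) \
    DECLARE_ROM_SEGMENT(name)                     \
    DECLARE_ROM_SEGMENT(title)

/* Expands to four extern declarations. DEMO_ID and DEMO_DRAW are never
 * defined anywhere: this pass discards those arguments entirely. */
DEFINE_SCENE(demo_scene, demo_title, DEMO_ID, DEMO_DRAW, 1, 2)

#undef DEFINE_SCENE

/* ASSUMPTION: hypothetical stand-in definitions so this sketch links
 * by itself. The sizes are arbitrary. */
u8 _demo_sceneSegmentRomStart[16];
u8 _demo_sceneSegmentRomEnd[1];
u8 _demo_titleSegmentRomStart[8];
u8 _demo_titleSegmentRomEnd[1];

/* Use the pasted names to show that they refer to real objects. */
size_t demo_scene_bytes(void) { return sizeof(_demo_sceneSegmentRomStart); }
size_t demo_title_bytes(void) { return sizeof(_demo_titleSegmentRomStart); }
```

One caveat: file-scope identifiers that begin with an underscore are formally reserved in standard C. Mainstream compilers accept them, and the sketch keeps them only to mirror the naming in the quoted macros.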

So that’s cool and a neat trick, but I haven’t addressed the remainder of the code:

#undef DEFINE_SCENE

// Scene Table definition
#define DEFINE_SCENE(name, title, _2, drawConfig, unk_10, unk_12) \
    { ROM_FILE(name), ROM_FILE(title), unk_10, drawConfig, unk_12, 0 },

// Handle `none` as a special case for scenes without a title card
#define _noneSegmentRomStart NULL
#define _noneSegmentRomEnd NULL

SceneTableEntry gSceneTable[] = {
#include "tables/scene_table.h"
};

#undef _noneSegmentRomStart
#undef _noneSegmentRomEnd

#undef DEFINE_SCENE

What’s going on here?

Well, first it undefines the existing DEFINE_SCENE() macro, before providing a new definition that makes use of yet another macro, ROM_FILE(), defined as such:

#define ROM_FILE(name) \
    { (uintptr_t)_##name##SegmentRomStart, (uintptr_t)_##name##SegmentRomEnd }

Alright, so this is taking every instance of ROM_FILE(name) and replacing it with { (uintptr_t)_nameSegmentRomStart, (uintptr_t)_nameSegmentRomEnd } – thus making use of those variables that we just declared.
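The `none` special case works through the same mechanism. After ## pastes the tokens together, the preprocessor rescans the result, so if the pasted name happens to be a macro itself (as _noneSegmentRomStart is), it gets replaced again. Here is a small sketch of that, with the caveat that DemoRomFile, DemoEntry, and the demo_map segment are hypothetical stand-ins rather than the project’s real types or data:

```c
#include <stddef.h>
#include <stdint.h>

typedef unsigned char u8;

/* ASSUMPTION: simplified stand-ins for the project's real structs. */
typedef struct {
    uintptr_t start;
    uintptr_t end;
} DemoRomFile;

typedef struct {
    DemoRomFile scene;
    DemoRomFile title;
} DemoEntry;

/* Same shape as the ROM_FILE() macro quoted above. */
#define ROM_FILE(name) \
    { (uintptr_t)_##name##SegmentRomStart, (uintptr_t)_##name##SegmentRomEnd }

/* Hypothetical hand-written stand-ins for one segment's symbols. */
u8 _demo_mapSegmentRomStart[16];
u8 _demo_mapSegmentRomEnd[1];

/* `none` has no segment, so the names that ROM_FILE(none) pastes
 * together are defined as macros that expand to NULL. */
#define _noneSegmentRomStart NULL
#define _noneSegmentRomEnd NULL

/* ROM_FILE(none) becomes { (uintptr_t)NULL, (uintptr_t)NULL }. */
const DemoEntry gDemoEntry = { ROM_FILE(demo_map), ROM_FILE(none) };

#undef _noneSegmentRomStart
#undef _noneSegmentRomEnd
```

Because the initializer consists only of addresses and constants, the compiler and linker can lay the whole struct out ahead of time, and no code has to run to fill it in. Casting an address to an integer inside a static initializer is accepted by GCC and Clang, though strictly speaking the C standard does not require every compiler to allow it.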

In summary, z_scene_table.c takes the single header scene_table.h and includes it twice, each time with a different definition of DEFINE_SCENE(). The first pass declares the segment symbols named by each line, and the second pass puts those symbols, along with the other arguments, inside of an array of SceneTableEntrys here:

SceneTableEntry gSceneTable[] = {
#include "tables/scene_table.h"
};

To give an example of what this looks like by the end, I’ll use these first 3 scenes:

/* 0x00 */ DEFINE_SCENE(ydan_scene, g_pn_06, SCENE_DEKU_TREE, SDC_DEKU_TREE, 1, 2)
/* 0x01 */ DEFINE_SCENE(ddan_scene, g_pn_08, SCENE_DODONGOS_CAVERN, SDC_DODONGOS_CAVERN, 1, 3)
/* 0x02 */ DEFINE_SCENE(bdan_scene, g_pn_07, SCENE_JABU_JABU, SDC_JABU_JABU, 1, 4)

And after we’ve run z_scene_table.c through the preprocessor, we end up with code that looks something like this [2]:


extern u8 _ydan_sceneSegmentRomStart[];
extern u8 _ydan_sceneSegmentRomEnd[];

extern u8 _g_pn_06SegmentRomStart[];
extern u8 _g_pn_06SegmentRomEnd[];

extern u8 _ddan_sceneSegmentRomStart[];
extern u8 _ddan_sceneSegmentRomEnd[];

extern u8 _g_pn_08SegmentRomStart[];
extern u8 _g_pn_08SegmentRomEnd[];

extern u8 _bdan_sceneSegmentRomStart[];
extern u8 _bdan_sceneSegmentRomEnd[];

extern u8 _g_pn_07SegmentRomStart[];
extern u8 _g_pn_07SegmentRomEnd[];

SceneTableEntry gSceneTable[] = {
    {
        { 
            (uintptr_t)_ydan_sceneSegmentRomStart,
            (uintptr_t)_ydan_sceneSegmentRomEnd
        },
        { 
            (uintptr_t)_g_pn_06SegmentRomStart,
            (uintptr_t)_g_pn_06SegmentRomEnd
        },
        1,
        SDC_DEKU_TREE,
        2,
        0,
    },

    {
        { 
            (uintptr_t)_ddan_sceneSegmentRomStart, 
            (uintptr_t)_ddan_sceneSegmentRomEnd 
        },
        { 
            (uintptr_t)_g_pn_08SegmentRomStart, 
            (uintptr_t)_g_pn_08SegmentRomEnd 
        },
        1,
        SDC_DODONGOS_CAVERN,
        3,
        0,
    },

    {
        {
            (uintptr_t)_bdan_sceneSegmentRomStart,
            (uintptr_t)_bdan_sceneSegmentRomEnd
        },
        {
            (uintptr_t)_g_pn_07SegmentRomStart,
            (uintptr_t)_g_pn_07SegmentRomEnd
        },
        1,
        SDC_JABU_JABU,
        4,
        0,
    },
};

You may be saying to yourself, surely there’s a better way that the game could’ve initialized this? And that may be true! But, for the compiled ROM to match byte-for-byte, the code being compiled must look virtually identical to the final code block above. And how can this be achieved in a way that is in native C, much easier on the eyes (and the margins!), and arguably more accurately represents the intent of the code? By using a whole lotta C preprocessing!
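This general pattern, one list of entries expanded several times under different macro definitions, is commonly known as an X-macro. As a rough sketch of how it can be reused outside of a decompilation, here is a single-file variant. Instead of re-including a header, it passes the per-row macro in as a parameter of a list macro, which is a different flavor of the same idea. Every name in it is hypothetical.

```c
#include <stddef.h>

/* HYPOTHETICAL table with one row per scene. The blog's version keeps
 * rows like these in a header and re-includes it; a list macro keeps
 * this sketch in one file. */
#define DEMO_SCENE_TABLE(X)              \
    X(DEMO_SCENE_FOREST, "forest", 1, 2) \
    X(DEMO_SCENE_CAVERN, "cavern", 1, 3) \
    X(DEMO_SCENE_LAKE,   "lake",   2, 5)

/* Pass 1: keep only the first column to build an enum. */
#define AS_ENUM(id, name, a, b) id,
typedef enum { DEMO_SCENE_TABLE(AS_ENUM) DEMO_SCENE_MAX } DemoSceneId;
#undef AS_ENUM

typedef struct {
    const char* name;
    int unkA;
    int unkB;
} DemoSceneEntry;

/* Pass 2: keep the remaining columns to build the data table. */
#define AS_ENTRY(id, name, a, b) { name, a, b },
const DemoSceneEntry gDemoSceneTable[] = { DEMO_SCENE_TABLE(AS_ENTRY) };
#undef AS_ENTRY
```

Since both the enum and the array come from the same rows, gDemoSceneTable[DEMO_SCENE_CAVERN] always lines up with the cavern row, and adding a scene means editing exactly one line.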

All you really need to know is what the array of SceneTableEntrys, gSceneTable[], looks like after it’s filled with data, and you can reference scene_table.h for an overview of how each scene is defined. The remaining nitty-gritty details might be confusing at a glance, and they aren’t essential to understand. But once you do understand them, assuming Ben or I explained this well enough, you must admit that it’s a clever solution!

Anyhow, I just wanted to talk a bit about that. Obviously I didn’t write the code, but actively working through it and thinking about it helps me internalize its lessons far better than passively consuming a YouTube video. And, if you think this is as cool as I do, I would recommend that you clone the git repo and do the same!

git clone https://github.com/zeldaret/oot

Be sure to check out Ben’s YouTube series. This is a link to the playlist; everything I discussed above is from Episode 5.


  1. Both neovim’s markdown formatter and my website’s seem to break at different points while previewing this code. I have some suspicions about why that is but I haven’t been able to find a way to fix it. So, sorry about that :( ↩︎

  2. I took some liberties with the formatting, since the post-preprocessed code isn’t normally meant to be human-readable; not that it was completely illegible. ↩︎