Systems Intent

Dialogue can be the focus of many games. If robust dialogue is the foremost feature, it’s important to create or use a system that can account for the plethora of data and logistics behind even simple game dialogue.

Some games, like Morrowind, use dialogue for linear quest interactions, and Morrowind houses a healthy number of quests. In this scenario the game needs to track the progress of multiple quests, but that’s the end of it: each quest is linear state. OpenMW, a fan rewrite of the Morrowind engine, has fairly legible quest code available here.

Telltale’s “The Walking Dead” uses a branching-path story in a hierarchical structure: one choice always leads to a new choice, never repeating. The hierarchy inherently provides the structure used to create branches and to reduce branches back into a linear path.

For our team’s game “Vessels”, we had a finite cast of characters and a spider-web of dialogue for the player to unravel. We used Twinery to draft our story, as its node-based system and robust state machine handled our cyclical, evolving story. In Unreal we developed our own virtual machine to handle dialogue, sewing programming and dialogue into one text file. With a virtual machine like this we could create branching paths and out-of-order execution for player progression.

When designing any system it’s important to only pay for what you use. Keeping scope firmly low means saying no to some requests; it means building a system that simply can’t do everything. By design, the limitations can bring benefits: you can make assumptions about what designers are trying to do and automatically react or throw proper errors. It’s very easy to believe you’ll need the most robust system, but this rarely reduces time spent for either the tools developer or the content creators.

It’s very important to make sure your system can be debugged, even if it doesn’t use a virtual machine or seems simple. Error checking is little work when automated and a major headache if errors are left invisible.

Vessels and Virtual Machines

Vessels originally planned for an open-world approach. The tool ink seemed promising, but didn’t fit our proposed design requirements. Looking at it now, we seem to have implemented most of ink’s features and architecture with more legible syntax; if it had worked with Unreal and I had dug deeper, I’m sure we would’ve picked this tool. Twinery is another tool, which we used for our initial draft. Twinery doesn’t export to an easy-to-parse format, nor does it work out of the box with Unreal, but it was useful for finding out what features we would need.

I didn’t look into Rumor at the time, but it’s a fruitful dialogue system with fantastic, simple syntax.

The benefit of a virtual machine like ink or the “Airlock Dialogue File” (ADF) is that it allows designers immense control over pathways and state changes. It’s important to cater only to those two objectives; too many features can bog down your dialogue system’s ease of use or overcomplicate simple functions. ADF, for example, only stores boolean flags: no integers or math allowed.

When scope is restrained and potential downsides are curbed, virtual machines flourish: ADF handles 6,923 lines of dialogue with 7,258 written operators for flow control. Without this system that would’ve been over 7,258 nodes in Blueprint; the text format eases reading and writing, not to mention cutting out Unreal’s atrocious boot time.

VM - Data Structures

A virtual machine operates on programmer-defined instructions. A virtual machine generally runs slower than raw machine code, yet it can be much faster than most interpreted languages if the instructions are small and specific.

We will define game- or dialogue-related instructions. ADF has no mathematics built in; every operation is built to aid in dialogue cosmetics and flow. Code samples like those below are shortened, and UE4 types are replaced with STL variants.

As reference, here is a sample of dialogue that ADF parses and uses in our game.

# Question_Suicide
Yes... I am. And... I've asked you to not talk to me about this.
    ~name: Esme
Peyton, please... I respect the subjects you don't want to discuss.
Please do not bring Marv up with me.
This subject seems to upset her most...
    ~name: Entity
    ~interest: Esme_Upset by Marv's suicide
We can use this.
    ~link: ASPEYTON_QUESTION

Each piece of dialogue and function (the ~ part) is stored as a “byte” in the machine. “Byte” is used loosely here, as we keep the whole string in this bytecode rather than referencing it elsewhere in memory.

struct Byte
{
    enum EType_t
    {
        // COSMETIC
        SAID_TEXT,
        CHOICE_TEXT,
        NAME,

        // CONDITIONALS
        CONDITIONAL,
        INVERSE_CONDITIONAL,
        OR_CONDITIONAL,
        ALREADY_READ,
        AS_CREW,

        // MUTATORS
        SET,
        UNSET,
        INTEREST,
        SPECIAL,

        // FLOW CONTROL
        ALTER_START,
        LINK_TO,
        BREAK_TO,
    } function;

    std::string text;

    unsigned originLine;
    unsigned indent;
};

So we create each byte with two main variables for our machine to operate on. Each EType_t tells the VM what to do with the byte’s text variable. For mere display, a SAID_TEXT operator will print the text variable on-screen. The SET function will store a value of true in the VM, with text as the key, for later state retrieval.

To help understand how we craft our Byte, I’ll take the Question_Suicide sample and write it as an array in C++ using the struct Byte.

Byte Question_Suicide[] = {
    {Byte::SAID_TEXT, "Yes... I am. And... I've asked you to not talk to me about this."},
    {Byte::NAME,      "Esme"},
    {Byte::SAID_TEXT, "Peyton, please... I respect the subjects you don't want to discuss."},
    {Byte::SAID_TEXT, "Please do not bring Marv up with me." },
    {Byte::SAID_TEXT, "This subject seems to upset her most..."},
    {Byte::NAME,      "Entity"},
    {Byte::INTEREST,  "Esme_Upset by Marv's suicide"},
    {Byte::SAID_TEXT, "We can use this."},
    {Byte::LINK_TO,   "ASPEYTON_QUESTION"},
};

If we track the last-read SAID_TEXT operator and re-feed the list from that point, SAID_TEXT becomes the “end” of an iteration. We can use this pattern to wait for user input (“click to continue”) before iterating to the next SAID_TEXT.

This code conversion is mostly accurate, with the caveat that we can’t create a custom-named Question_Suicide array for something declared in a text file. We opted to use a std::map <std::string, std::list <Byte>> to achieve this run-time array creation, where the std::string key is “Question_Suicide” and the value is the byte-string in linked-list form.

In an actual implementation I’d use std::unordered_map and std::forward_list, which keep insertion cheap: average constant time for the map, and constant time at the front of the list.
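
To make the run-time lookup concrete, here is a minimal sketch of that container, assuming the std::unordered_map and std::forward_list pairing. The aliases ByteString and Dialogue, and the helpers build_sample() and find_byte_string(), are names made up for illustration; Byte is trimmed to the two members the sketch needs.

```cpp
#include <cassert>
#include <forward_list>
#include <string>
#include <unordered_map>

// trimmed-down stand-in for the article's Byte
struct Byte
{
    enum EType_t { SAID_TEXT, LINK_TO } function;
    std::string text;
};

// hypothetical aliases, not part of ADF
using ByteString = std::forward_list <Byte>;
using Dialogue = std::unordered_map <std::string, ByteString>;

// builds one named byte-string the way a file reader might
Dialogue build_sample()
{
    Dialogue dialogue;

    // operator[] creates the empty byte-string the first time a name is seen
    ByteString & target = dialogue ["Question_Suicide"];

    // forward_list only inserts cheaply at the front, so push in file
    // order and reverse once the byte-string is complete
    target.push_front (Byte {Byte::SAID_TEXT, "We can use this."});
    target.push_front (Byte {Byte::LINK_TO, "ASPEYTON_QUESTION"});
    target.reverse();

    return dialogue;
}

// returns nullptr when no byte-string has that name, instead of throwing
const ByteString * find_byte_string (const Dialogue & dialogue, const std::string & name)
{
    const auto found = dialogue.find (name);
    return found == dialogue.end() ? nullptr : &found->second;
}
```

Returning a pointer from the lookup is one design choice among several; it lets the caller decide whether a missing name is an error or just a dead end.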

Byte-strings can take the form of many data structures. Arrays will be faster to operate on but potentially slower to build. Linked lists will be easy to build and slower to operate on. For this reason our dialogue system builds linked lists at the start of the game and operates over them during dialogue. Maps are surprisingly fast for how feature rich they are; we use maps to title and track the dialogue byte-strings.

Dialogue will be running our virtual machine sparsely, only when the player clicks through text, so optimizing the VM’s run-time performance is typically wasted effort.

VM - Operation

The simplest virtual machine is a function taking some container to operate over; I’ll demonstrate the implementation that is easiest to write and hardest to work with.

// data_lookup() is assumed to return the byte-string stored under a name
bool run_string (const std::forward_list <Byte> & data)
{
    for (auto & i: data)
    {
        switch (i.function)
        {
        case Byte::SAID_TEXT:
            std::cout << i.text << std::endl;
            break;
        case Byte::LINK_TO:
            run_string (data_lookup (i.text));
            return true;
        default:
            std::cout << "unhandled function!" << std::endl;
            break;
        }
    }
    return true;
}

This function may be enough for you, but I’d like to leverage objects to make some relationships between our dialogue and actors. First, any data from our byte-strings will be lost without somewhere to keep it. Second, this runs an entire byte-string at a time; while that’s useful for more programmatic cases, we need to stop and wait for the player to continue reading. By making a class to house and run our dialogue, we can attach it to actors and directly associate text with them.

class VM
{
public:
    // loads text file into our "dialogue" variable
    VM (const std::string & filename);

    using byte_itr = std::forward_list <Byte>::const_iterator;

    // operates based on the playHead
    void run_head();
private:
    // variables can be anything!
    std::unordered_set <std::string> flags;

    // what a mouthful!
    std::unordered_map <std::string, std::forward_list <Byte>> dialogue;

    // track dialogue progression
    byte_itr playHead, playHeadEnd;
};
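
The flags member is what SET and UNSET act on. Here is a minimal sketch of that bookkeeping, pulled out of the class so it stands alone. FlagStore, apply(), and is_set() are hypothetical names, and Byte is trimmed to what the sketch needs.

```cpp
#include <cassert>
#include <string>
#include <unordered_set>

// trimmed-down stand-in for the article's Byte
struct Byte
{
    enum EType_t { SAID_TEXT, SET, UNSET } function;
    std::string text;
};

// hypothetical flag storage: a flag is "true" while its key is present,
// so there is no room for integers or math
struct FlagStore
{
    std::unordered_set <std::string> flags;

    // applies a mutator byte, returns false when the byte isn't a mutator
    bool apply (const Byte & in)
    {
        switch (in.function)
        {
        case Byte::SET:
            flags.insert (in.text);
            return true;
        case Byte::UNSET:
            flags.erase (in.text);
            return true;
        default:
            return false;
        }
    }

    bool is_set (const std::string & key) const
    {
        return flags.count (key) > 0;
    }
};
```

Storing only the keys that are true means an unknown flag reads as false, which is convenient but also hides misspelled flag names; that is one more thing worth validating at load time.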

In UE4, finding a place for persistent data may be daunting. A basic C++ “Game Instance” class with accessible data will persist through the game; just remember to reset it on a game over or save/load. With this, your VM function will pull and push data from UE4’s global variables, like the game instance object.

For this document I’m going to continue with the custom VM class definition.

When managing dialogue you’ll likely want to display text, change state, and move the play head forward. We benefit from separating this into at least two functions, since you might want to display text more than once, or skip ahead via a fast-forward key.

It’s also important to find out what data needs to come out of your VM operations. For displaying text it’s usually just a std::string, while running could return full state, an error code, or nothing! My advice here is to try and catch your errors while loading dialogue.

Conditional statements probably operate the same when running or reading, so we can move this to its own function as well.

class VM
{
public:
    VM (const std::string & filename);

    using byte_itr = std::forward_list <Byte>::const_iterator;

    // returns if the run was successful
    bool run_head();

    // returns text to display, may be empty
    std::string read_head() const;
private:
    // returns a change in state
    bool byte_state (bool inState, const Byte & in) const;

    std::unordered_set <std::string> flags;

    std::unordered_map <std::string, std::forward_list <Byte>> dialogue;

    byte_itr playHead, playHeadEnd;
};
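
byte_state() is only declared here, so as a rough sketch, this is one way the conditional bytes might fold into a running state. The combining rules (CONDITIONAL as AND, INVERSE_CONDITIONAL as AND NOT, OR_CONDITIONAL as OR) are my assumption for illustration and not ADF’s documented behavior; the function is written as a free function taking the flags explicitly, and Flags is a hypothetical alias.

```cpp
#include <cassert>
#include <string>
#include <unordered_set>

// trimmed-down stand-in for the article's Byte
struct Byte
{
    enum EType_t { SAID_TEXT, CONDITIONAL, INVERSE_CONDITIONAL, OR_CONDITIONAL } function;
    std::string text;
};

using Flags = std::unordered_set <std::string>;

// free-function stand-in for VM::byte_state(); the combining rules are
// an assumption for illustration, not ADF's actual rules
bool byte_state (const Flags & flags, bool inState, const Byte & in)
{
    const bool isSet = flags.count (in.text) > 0;

    switch (in.function)
    {
    case Byte::CONDITIONAL:
        return inState && isSet;
    case Byte::INVERSE_CONDITIONAL:
        return inState && !isSet;
    case Byte::OR_CONDITIONAL:
        return inState || isSet;
    default:
        // non-conditional bytes leave the state untouched
        return inState;
    }
}
```

Because the function takes the previous state and returns the next one, both run_head() and read_head() could call it in a loop over the bytes attached to a line without sharing any extra member variables.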

Implementing these functions is very similar to the standalone function I wrote before. The ADF structure requires byte-strings to start with SAID_TEXT; we specify this so we can use SAID_TEXT as a sentinel to halt on. When the VM picks back up we can guarantee playHead is either a SAID_TEXT or playHeadEnd.

Re-writing the standalone run_string() example function for our class certainly looks larger, but the logic now checks for our sentinel and holds the play head properly.

bool VM::run_head()
{
    if (playHead == playHeadEnd)
        return false;

    // run_head() always starts on a SAID_TEXT (or the end), so step past it
    playHead++;

    // run every function byte until the next SAID_TEXT, our sentinel;
    // displaying that text is left to read_head()
    while (playHead != playHeadEnd and playHead->function != Byte::SAID_TEXT)
    {
        switch (playHead->function)
        {
        case Byte::LINK_TO:
            {
                const auto headText {playHead->text};

                // be wary, unordered_map::at() will throw
                // if headText isn't a valid key!
                playHead = dialogue.at (headText).begin();
                playHeadEnd = dialogue.at (headText).end();

                return true;
            }
        default:
            break;
        }

        playHead++;
    }

    return true;
}
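
read_head() is declared above but not shown. Here is one way it might look, together with a loop that imitates a player clicking through a byte-string. This is a sketch under assumptions: the functions are free functions rather than VM members, next_said() and click_through() are hypothetical helpers, and function bytes are skipped instead of executed.

```cpp
#include <cassert>
#include <forward_list>
#include <string>
#include <vector>

// trimmed-down stand-in for the article's Byte
struct Byte
{
    enum EType_t { SAID_TEXT, NAME } function;
    std::string text;
};

using byte_itr = std::forward_list <Byte>::const_iterator;

// stand-in for VM::read_head(): returns the text under the play head,
// or an empty string when there is nothing to show
std::string read_head (byte_itr playHead, byte_itr playHeadEnd)
{
    if (playHead == playHeadEnd || playHead->function != Byte::SAID_TEXT)
        return {};
    return playHead->text;
}

// moves the play head to the next SAID_TEXT or to the end, skipping the
// function bytes in between (a real VM would execute them)
byte_itr next_said (byte_itr playHead, byte_itr playHeadEnd)
{
    if (playHead == playHeadEnd)
        return playHead;

    ++playHead;
    while (playHead != playHeadEnd && playHead->function != Byte::SAID_TEXT)
        ++playHead;
    return playHead;
}

// imitates "click to continue" until the byte-string runs out;
// assumes the byte-string starts with SAID_TEXT, as ADF requires
std::vector <std::string> click_through (const std::forward_list <Byte> & bytes)
{
    std::vector <std::string> shown;
    for (auto head = bytes.cbegin(); head != bytes.cend(); head = next_said (head, bytes.cend()))
        shown.push_back (read_head (head, bytes.cend()));
    return shown;
}
```

Keeping the read side free of state changes is what makes it safe to call more than once per click, for example when the UI redraws.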

VM - File Reader

Constructing our byte-strings by reading text files first requires a format. I really like markdown, so I based ADF around it. A line-based format makes for easy programming, but can interfere with stylized/rich text formats like RTF.

When creating ADF’s formatting we only had to consider two parts: how to assign byte-strings a name for later reference, and how to specify the type per byte. Naming byte-strings was easy; any special character(s) that make a line or block stand out will do. Specifying the byte type needs to be a key-value pair, with some shorthands for common functions.

# Esme_Success
I'll be right there.
    ~name: Esme
Esme is on her way.
    ~name: Entity
    ~set: dl_someone_enroute
    ~set: dl_esme_spoke
    ~special: sound_off

# Rakesh_Success
Patience, if you please.
    ~name: Rakesh
Rakesh is on his way.
    ~name: Entity
    ~set: dl_someone_enroute
    ~set: dl_rakesh_spoke
    ~special: sound_off

This example shows we used ‘#’ to mark the start of a new byte-string. We write byte types as ~function: and treat anything afterwards as the byte text. Lines without a ~function: at the start are treated as SAID_TEXT, our shorthand.

Reading our line-based files will look like this. Proper error checking will massively expand this constructor, but it’s well worth it.

VM::VM (const std::string & filename)
{
    std::ifstream infile (filename);
    if (!infile.is_open())
        throw std::runtime_error {"couldn't open file! " + filename};

    std::string line;
    std::forward_list <Byte> * writingTo = nullptr;
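    while (std::getline (infile, line))
    {
        line = trim_whitespace (line);

        // skip blank lines, they only separate byte-strings
        if (line.empty())
            continue;

        switch (line [0])
        {
        case '#':
            // we create our lists backwards with push_front()
            if (writingTo != nullptr)
                writingTo->reverse();

            // name new byte-string, dropping the '#' and the space after it
            writingTo = &dialogue [trim_whitespace (line.substr (1))];
            break;
        case '~':
            if (writingTo != nullptr)
            {
                // is function of name:
                const auto colonPoint {line.find (':')};
                if (colonPoint == std::string::npos)
                    throw std::runtime_error {"missing ':' in " + line};

                const auto functionName {line.substr (1, colonPoint-1)};
                // skip past the ':' itself before trimming
                const auto functionText {trim_whitespace (line.substr (colonPoint+1))};

                writingTo->push_front (Byte {functionName, functionText});
            }
            break;
        default:
            // is function SAID_TEXT
            if (writingTo != nullptr)
                writingTo->push_front (Byte {Byte::SAID_TEXT, line});
            break;
        }
    }

    // the final byte-string still needs reversing
    if (writingTo != nullptr)
        writingTo->reverse();
}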

This format relies on the first character of each line, which makes it easy to expand with more shorthand or unique functions. Keep in mind this sample code doesn’t check for potentially serious errors, like reading the same # name twice. Validating function inputs will be the bulk of your error checking; it’s very important to highlight any potential syntax or logic errors.
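
As a rough idea of what that checking could look like, here is a sketch of two load-time checks: rejecting a repeated # name, and confirming every link points at a byte-string that exists. declare_byte_string() and check_links() are hypothetical helpers, the error message wording is made up, and Byte is trimmed down but keeps originLine so messages can point designers at the offending line.

```cpp
#include <cassert>
#include <forward_list>
#include <string>
#include <unordered_map>
#include <vector>

// trimmed-down stand-in for the article's Byte
struct Byte
{
    enum EType_t { SAID_TEXT, LINK_TO } function;
    std::string text;
    unsigned originLine;
};

using Dialogue = std::unordered_map <std::string, std::forward_list <Byte>>;

// registers a new byte-string name and reports a repeat;
// emplace() leaves the map untouched when the key already exists
bool declare_byte_string (Dialogue & dialogue, const std::string & name,
                          std::vector <std::string> & errors, unsigned line)
{
    const bool inserted = dialogue.emplace (name, std::forward_list <Byte> {}).second;
    if (!inserted)
        errors.push_back ("line " + std::to_string (line) + ": duplicate name '" + name + "'");
    return inserted;
}

// post-load pass: every LINK_TO must name a known byte-string
std::vector <std::string> check_links (const Dialogue & dialogue)
{
    std::vector <std::string> errors;
    for (const auto & entry : dialogue)
        for (const auto & byte : entry.second)
            if (byte.function == Byte::LINK_TO && dialogue.count (byte.text) == 0)
                errors.push_back ("line " + std::to_string (byte.originLine)
                                  + ": link to unknown name '" + byte.text + "'");
    return errors;
}
```

Collecting errors into a list rather than throwing on the first one lets a designer fix a whole file in one pass. The link check has to run after the whole file is read, since a link may point at a byte-string declared further down.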

I’ll write out the trim_whitespace() function and the string-based Byte constructor for completeness’ sake.

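// strips leading whitespace only, trailing whitespace is kept
inline std::string trim_whitespace (const std::string & in)
{
    std::size_t index {0};

    // cast first, std::isspace is undefined for negative char values
    while (index < in.length() and std::isspace (static_cast <unsigned char> (in [index])))
        index++;

    // substr() yields an empty string when the whole line was whitespace
    return in.substr (index);
}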

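Byte::Byte (std::string type, std::string value)
    : text (value)
{
    // shortened table, only the functions used in the samples above
    struct
    {
        std::string key;
        EType_t value;
    } static const tostr[] = {
        {"set", SET},
        {"special", SPECIAL},
        {"name", NAME},
        {"interest", INTEREST},
        {"link", LINK_TO},
    };

    // tolower the entire input
    std::transform (type.begin(), type.end(), type.begin(), ::tolower);

    for (auto & i : tostr)
        if (i.key == type)
        {
            function = i.value;
            return;
        }

    // never leave function uninitialized, flag the misspelling instead
    throw std::invalid_argument {"unknown function: " + type};
}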

Conclusion

When designing a system it’s of utmost importance to consider the following:

  1. Error Checking
  2. Legibility
  3. Expandability

While programming in general, the ability to debug code should be in the back of your head. While designers are using your system, it should be a forced, implicit part of the system. At any point, stop and think “How could this function be misused?” or “What if this is misspelled?” and try to flag that case. Your system should check designers’ code vigorously.

Writing your own language puts you in the unique situation of having to teach it. Do yourself a favor and make it simple and transparent; use as much English as possible. “Syntactic sugar” should be avoided: too many enigmatic percent signs and asterisks only cause confusion and a trip to the manual, even for yourself. I’d recommend making use of braces or parentheses if applicable, as white space can be difficult to debug and program for.

More functionality will always be around the corner, so be ready to quickly try out ideas. Make sure you can account for functions with multiple parameters. Add an escape character in case your special characters are needed in dialogue.
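
Both of those last points can be handled in one small pass over a function’s text. This is a sketch, assuming ',' as the parameter separator and '\' as the escape character; neither is part of ADF as described here, and split_parameters() is a hypothetical helper.

```cpp
#include <cassert>
#include <string>
#include <vector>

// splits a function's text into parameters on ',' while letting an
// escaped character such as "\," through as literal text.
// whitespace is kept as written, and a lone trailing '\' is dropped
std::vector <std::string> split_parameters (const std::string & text)
{
    // always at least one parameter, possibly empty
    std::vector <std::string> out (1);
    bool escaped = false;

    for (const char c : text)
    {
        if (escaped)
        {
            // whatever follows the escape character is taken literally
            out.back() += c;
            escaped = false;
        }
        else if (c == '\\')
            escaped = true;
        else if (c == ',')
            out.emplace_back();
        else
            out.back() += c;
    }

    return out;
}
```

With this, a line like ~interest: Esme,Upset would yield two parameters, while a dialogue line that really needs a comma inside a parameter can write \, instead.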

Creating a virtual machine can be all the fun of making your own language without all the hassle of compiler architecture. Good luck, try to make the most of it!