Some thoughts on dynamic music in games

(Originally posted to the Cartrdge blog.)

A long time ago, in the halcyon days of the early 90s, a young Robby (then, Robert) came to realize something while playing Super Mario World. It was a simple detail, yet something that would go on to shape how I view game audio today: when Mario is on Yoshi’s back, you hear jungle drums. Not just a sound effect, or an ambient tone – a new, synced part of the arrangement made a simple change to the feel of the music, and there was no way to jump onto Yoshi to make it sound wrong or out of time. The game’s music adapted to one circumstance of gameplay.
What was happening? As simply as I can explain it, the jungle drums were always there as one channel of the music, but that channel was muted. Jump on Yoshi, that channel turns on. Jump off, mute channel. This was possible then because of the way sample synthesis works on the SNES. Once games moved away from MIDI-style music triggering, memory space and processing power became major limiting factors for multi-track music, but with today’s technology we can achieve the same kind of interactivity with even greater fidelity and flexibility!
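To make the mute-and-unmute idea concrete, here is a minimal sketch in Python. It is not how the SNES or any real engine does it; `LayeredTrack`, its channel names, and the tick counter are all made up for illustration. The point it tries to show is that every channel shares one playback position, so unmuting the drums can never put them out of time.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class LayeredTrack:
    """Hypothetical multi-channel track: all channels advance together,
    and muting only changes which ones are audible."""

    channels: dict[str, bool] = field(default_factory=dict)  # name -> muted?
    position: int = 0  # shared playback position, in ticks

    def tick(self) -> None:
        # Every channel moves forward, whether it is muted or not.
        self.position += 1

    def set_muted(self, name: str, muted: bool) -> None:
        self.channels[name] = muted

    def audible(self) -> list[str]:
        return sorted(name for name, muted in self.channels.items() if not muted)


track = LayeredTrack({"melody": False, "bass": False, "jungle_drums": True})
track.tick()
track.set_muted("jungle_drums", False)  # the player jumps on the mount
print(track.audible(), "at tick", track.position)
```

Because muting never touches `position`, the drums come in exactly where the rest of the arrangement already is.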
So what does that mean for us, as composers and game designers today? It means everything. If we’re writing music with only the same linear mindset as when composing for film, we aren’t capitalizing on the creative opportunity we’ve been handed. Games are interactive, so we should use every tool we have available to make our music enhance the players’ experience.
Here is a more recent example of dynamic music, using a slightly different approach: Destiny.
When fighting a boss, you want the player to be hit with all the right beats at all the right moments. The problem is, one group might take 5 minutes to beat it while another takes 15. That’s too wide a margin to write a linear piece of music that will match the action: you have little to no chance that the climax of the piece will line up with the boss’ final moments. The solution, in the case of Destiny, is checkpoints. At the beginning of the fight “section A” plays, and if you don’t make any progress those 16-32 bars of music will loop indefinitely. Once you’ve beaten the first wave, drums are added (section B). Once the boss is down to half health, the next loop switches to section C, and a swell and horns are added at 25% health. Once the boss falls, the next loop switches to a victory fanfare and conclusion. With this system, the length of the music scales to the player’s experience, no matter how long it takes them to beat the boss.
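The checkpoint logic described above can be sketched as a small function that is only consulted when the current loop ends. This is my own simplification, not Destiny's actual implementation: the function names, the `first_wave_beaten` flag, and the idea of passing boss health as a fraction between 0.0 and 1.0 are all assumptions.

```python
def next_section(first_wave_beaten: bool, boss_health: float) -> str:
    """Choose which section to play when the current loop finishes.

    boss_health is a fraction from 1.0 (full) down to 0.0 (defeated).
    """
    if boss_health <= 0.0:
        return "victory"
    if boss_health <= 0.5:
        return "C"
    if first_wave_beaten:
        return "B"
    return "A"


def horns_enabled(boss_health: float) -> bool:
    """Extra swell-and-horns layer for the last quarter of the fight."""
    return 0.0 < boss_health <= 0.25


# Hypothetical game state, sampled at each loop boundary.
states = [(False, 1.0), (False, 1.0), (True, 0.9), (True, 0.45), (True, 0.2), (True, 0.0)]
playlist = [next_section(wave, health) for wave, health in states]
print(playlist)  # ['A', 'A', 'B', 'C', 'C', 'victory']
```

Checking the state only at loop boundaries is what keeps every transition on the beat, however long the fight runs.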
If you’ve read Winifred Phillips’ book “Composer’s Guide to Game Music,” you are already aware of a few different ways music can be interactive in games. The two most straightforward and satisfying methods I’ve recently used are ‘vertically adaptive’ music and ‘horizontally adaptive’ music. Vertical adaptivity is like Mario World, where layers of instruments are added, or track volumes adjusted, in response to gameplay. Horizontal adaptivity is a system like the one I described in Destiny, where game variables dictate which section of the music is played next and when to progress. These two techniques are often used in conjunction with each other to create some very interesting variations, sometimes even ways of playing your music that you hadn’t initially considered!
When I was composing the soundtrack for Fate Tectonics, I saw a major opportunity to make the music dynamically follow the player’s progress. I set out to make each element of the game a member of an ensemble, and let the gameplay dictate how those parts would be heard. Each character has a voice (Penelope is a flute, Hogweed is a bassoon, etc.), and they are arranged in such a way that the music will always sound like it belongs together. You only hear certain instruments when their associated character or element is present in your game. When you use grass tiles a lot, you’ll hear a string accompaniment. If you use water tiles most often, the accompaniment will be clarinets. This layering is the “vertical” adaptivity of the music. Composing in this kind of system takes a lot of time and forethought, but I feel it was definitely worth it.
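As a rough illustration of that mapping from game state to instruments, here is a short Python sketch. The two lookup tables use the examples mentioned above, but `active_layers` and the way tiles are counted are hypothetical; the real game's mixer worked differently and handled many more voices.

```python
from __future__ import annotations

# Examples taken from the text; a real table would cover every character and tile.
VOICE_BY_CHARACTER = {"Penelope": "flute", "Hogweed": "bassoon"}
ACCOMPANIMENT_BY_TILE = {"grass": "strings", "water": "clarinets"}


def active_layers(characters_present: list[str], tile_counts: dict[str, int]) -> set[str]:
    """Return the instrument layers that should be audible right now."""
    layers = {
        VOICE_BY_CHARACTER[name]
        for name in characters_present
        if name in VOICE_BY_CHARACTER
    }
    # Only tiles with a known accompaniment and at least one placement count.
    counted = {
        tile: n
        for tile, n in tile_counts.items()
        if tile in ACCOMPANIMENT_BY_TILE and n > 0
    }
    if counted:
        most_used = max(counted, key=counted.get)  # ties go to the first tile seen
        layers.add(ACCOMPANIMENT_BY_TILE[most_used])
    return layers


print(active_layers(["Penelope"], {"grass": 3, "water": 9}))
```

In this sketch a layer is simply on or off; scaling each layer's volume by how much its tile is used would be a natural next step.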
The arc of the game is separated into three “acts.” For each act, I composed 5 different sections of music, labeled A to E, all of which looped and could transition nicely to any other section. Originally they were played in-game using preset forms, like A-B-A-C-A-D-E (to anyone who is classically trained, this is not a new concept). Eventually the developer used those forms to create a system that procedurally determined which section should play next (this is called a Markov chain). This is the “horizontal” adaptivity of the music. Progressing from one act to the next served as a major checkpoint, and all the while the volumes of up to 16 instruments change on the fly. I only really composed 15-20 minutes of music (if all instruments are at full volume), but because of the dynamic system it’s “performed” a little differently every time you play.
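For anyone curious what a Markov chain built from preset forms might look like, here is a minimal sketch. It is an assumption on my part about the general technique, not the developer's actual code: it counts which section followed which in the example form, then picks the next section at random from those observed followers. The function names and the seed are arbitrary.

```python
import random
from collections import defaultdict


def build_transitions(forms):
    """Record, for each section, every section that followed it in a form."""
    transitions = defaultdict(list)
    for form in forms:
        for current, following in zip(form, form[1:]):
            transitions[current].append(following)
    return dict(transitions)


def pick_next(transitions, current, rng):
    """Pick the next section; if nothing ever followed it, keep looping it."""
    options = transitions.get(current)
    if not options:
        return current
    return rng.choice(options)


forms = ["ABACADE"]  # the preset form A-B-A-C-A-D-E
transitions = build_transitions(forms)
print(transitions)  # {'A': ['B', 'C', 'D'], 'B': ['A'], 'C': ['A'], 'D': ['E']}

rng = random.Random(7)  # seeded only so a run can be repeated
section = "A"
sequence = [section]
for _ in range(8):
    section = pick_next(transitions, section, rng)
    sequence.append(section)
print("-".join(sequence))
```

Feeding in more forms adds more observed transitions, and a follower that appears several times becomes proportionally more likely to be chosen.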
That may be starting to sound pretty complicated, and at first glance you might feel a bit overwhelmed, but there are many games that push the envelope of dynamic audio even further. For starters, you can check out Disasterpeace’s January to catch snowflakes and hear a tune at your own pace. Dyad’s whole music system is based on an in-game sequence of instrument samples. In Sound Shapes, the level is a 1:1 visual representation of the audio sequencer. There are a ton of creative ways to make music dynamically performed in-game (here is a cool look at some others). “Vertical and horizontal adaptivity” are just terms we use to try to describe the ways we make our music tie in more closely with gameplay. At the end of the day, my point is to make you consider how we can do this more effectively and maybe inspire you to come up with new and interesting ways yourself.
Try a simple test: instead of bouncing a whole track as a completed mix from your DAW, export the percussion parts separately from the rest of the music (make sure they are exactly the same length). Play both of these files at the same time in the game, and then adjust the volume of the drum track depending on the game state or proximity to an enemy. It’s possible to set up this kind of interactivity in an engine like Unity, or middleware like FMOD, without too much of a learning curve (here are some tutorials). For Fate Tectonics the developers had already made a custom mixer system in their engine, so I made changes from an external XML file. The implementation differs from one kind of tech to another, but in the end it’s the same result: dynamic music.
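The proximity part of that test boils down to a little arithmetic, sketched here in plain Python rather than any particular engine's API. The function names and the `near`, `far`, and `max_step` values are placeholders I picked for illustration; in practice you would feed the result into whatever volume control your engine or middleware exposes.

```python
def drum_volume(distance: float, near: float = 2.0, far: float = 20.0) -> float:
    """Map distance to the nearest enemy onto a 0.0-1.0 drum volume.

    Full volume at `near` or closer, silent at `far` or beyond,
    and a straight-line fade in between.
    """
    if far <= near:
        raise ValueError("far must be greater than near")
    t = (far - distance) / (far - near)
    return max(0.0, min(1.0, t))


def step_toward(current: float, target: float, max_step: float) -> float:
    """Move the volume toward its target a little each frame to avoid jumps."""
    if abs(target - current) <= max_step:
        return target
    return current + max_step if target > current else current - max_step


volume = 0.0
for distance in (25.0, 11.0, 11.0, 3.0):  # hypothetical per-frame distances
    volume = step_toward(volume, drum_volume(distance), max_step=0.25)
    print(f"distance={distance:>4} -> volume={volume:.2f}")
```

The smoothing step matters more than it looks: snapping the drums straight to a new volume tends to sound like a glitch, while a short ramp sounds like an arrangement choice.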
The more you can connect the elements of your music to the interactive elements of gameplay, the more your music will become a part of the gameplay. Don’t be afraid to have multiple variations of a piece based on a player’s choice; file sizes are becoming less and less of a problem. You don’t have to go completely granular, but if you add even a small layer of interactivity, your music and your game will benefit immensely from it. I guarantee it. Let’s use the tools we’ve been handed to compose better gameplay experiences!