Guide:Balancing a Game's Loudness

From VNDev Wiki

The aim of this guide is to explain what to look out for when balancing the audio loudness of your game and which techniques you can use to ensure that it is neither too loud nor too quiet. This guide is based on the Visual;Conference talk I gave in 2021. You can watch the full talk here: https://www.youtube.com/watch?v=T6mcdRp0OdM

Intro

I’m Tim Reichert. I’ve been working as an audio guy in the VN realm for over half a decade, covering jobs such as composition, sound design and audio direction. As such I have to concern myself with all things audio in video games, and one of those concerns is „loudness“. Note that when I’m talking about a game’s loudness I mean everything you hear while playing the game, not just the music or any other elements individually.

You’ve probably had this experience before: You start a game and the next thing you do is lower the volume. If not right away, then maybe after a while because your ears got tired. These graphs from a questionnaire by Jay Fernandes, a professional composer and sound designer, indicate that a game being too loud is usually a bigger problem than a game being too quiet.

Figure 1: Results of a questionnaire about game loudness. [1]

The questions that have to be answered to understand game loudness are:

1. How is loudness quantified?

2. What loudness levels should you aim for?

3. How do you measure the loudness of your game?

4. How do you adjust the loudness of your game?

5. Is it really that straightforward?

How is loudness quantified?

Nowadays the unit of measurement for perceived loudness is „LUFS“ (Loudness Units relative to Full Scale; sometimes also called LKFS, which is effectively the same). You might also have heard of „RMS“ before, but that one’s a bit dated, so best to just stick to LUFS. There are several ways to measure LUFS: momentary, short-term and integrated. Integrated is the one that I, just like most articles on this topic, will focus on: It measures the average loudness from the beginning of the measurement period to the end. LUFS values are usually negative, and the more negative the number is, the lower the loudness is. That means, for example, that -14 LUFS is quieter than -12 LUFS.

What loudness levels should you aim for?

Here’s a big problem: As opposed to television, video games don’t have a set standard yet. In the case of TV you can get into a lot of trouble for not adhering to the norms. For video games, though, you have way more freedom. Look at this analysis of the loudness of some AAA games by Stephen Schappler, where you can see how big the loudness gap can be: BioShock Infinite sits at -12 LUFS while Skyrim sits at -26 LUFS.

Figure 2: Loudness in different AAA titles. [2]

For long form broadcast TV content the standard is -23 LUFS for the whole program in Europe[3] and -24 LUFS for the „Anchor Element“ in America.[4] There are further restrictions in place for ads: in America the whole ad has to be measured, similar to Europe’s long form content model,[5] and in Europe there is a maximum momentary loudness.[6] But you don’t need to worry about those, since a video game would of course fall under long form content. Here is the definition of Anchor Element from the official American guidelines.

„Anchor Element – The perceptual loudness reference point or element around which other elements are balanced in producing the final mix of the content, or that a reasonable viewer would focus on when setting the volume control.“[7]

So Anchor Element means the element which a viewer or player would concentrate on the most when setting their volume, usually the dialogue. If there’s no dialogue in your game, then in the case of visual novels that might be the music.

Sony, Nintendo and Microsoft, through the Game Audio Network Guild, also recommend -24 LUFS for console games and -16 LUFS for portable games[8] (at first -18 LUFS for the PS Vita[9]). The value for portable games is higher since they can be expected to be played outside, where the noise level is higher. The director at Crytek also noted that they had great experiences with -23 LUFS.[10] With such big players in the industry stating that they’d like to adopt the TV loudness norms, there is a lot speaking for it. Note that there’s usually some leeway of about 2 LU, so if your computer or console game measures -26 LUFS or -22 LUFS, that’s not such a big deal. As long as you’re within that range around -24 LUFS, you’re all good. If you’re developing a handheld or mobile game, you can take -16 LUFS as a reference. The Audio Engineering Society recommends not going beyond -16 LUFS, so it makes sense to scratch the „+“ of the ± 2 LU grace range that the Game Audio Network Guild recommends.[11]

Aside from loudness there are also recommendations regarding peak levels: They shouldn’t go beyond -1 dB. The peak of an audio file is the highest amplitude of the waveform. If this value goes beyond 0 dB, this might cause distortion and/or strain on the ears. Many game developers overlook this, since they don’t necessarily realize that multiple sound sources, like music, SFX and voice acting, can stack up and push the level beyond this threshold. For some reason the recommendations don’t say anything about a True Peak for mobile and handheld titles, but I’d still recommend sticking to a True Peak of -1 dB (a change of 1 dB equals a change of 1 LU).[12]
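To illustrate how stacked sources can push the level past the threshold, here is a minimal Python sketch. The peak values are made up for illustration, and it assumes the worst case in which all sources hit their peaks at the same instant with the same polarity. In a real game the sum will usually be lower, so treat this as an upper bound rather than a measurement.

```python
import math


def db_to_linear(level_db: float) -> float:
    """Convert a level in dB (relative to full scale) to a linear amplitude."""
    return 10 ** (level_db / 20)


def linear_to_db(amplitude: float) -> float:
    """Convert a linear amplitude back to dB (relative to full scale)."""
    return 20 * math.log10(amplitude)


def worst_case_peak_db(peaks_db: list[float]) -> float:
    """Peak level if every source peaks at the same moment, in phase."""
    return linear_to_db(sum(db_to_linear(p) for p in peaks_db))


# Hypothetical peak levels of three assets playing at the same time.
layers = {"music": -3.0, "sfx": -6.0, "voice": -6.0}

stacked = worst_case_peak_db(list(layers.values()))
print(f"Worst-case stacked peak: {stacked:.2f} dB")
print("Above the -1 dB threshold:", stacked > -1.0)
```

Each asset on its own stays well below -1 dB here, yet the worst-case sum lands several dB above 0 dB, which is why the peak should be checked on the game's combined output and not only on individual files.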

On top of that, a Loudness Range of 10-15 LU is recommended for portable games. This means that the distance between the quietest and loudest parts should be between 10 and 15 LU (or dB). I’m not sure why there is a minimum loudness range, but that’s what the Game Audio Network Guild says.[13] The maximum loudness range makes sense, since that way you can make sure that you don’t have many extremely quiet or loud sections. The lack of justification in their document is a bit off-putting in general, but that gives you all the more freedom to stray from their path if you believe that doing things differently enhances the experience of your game.
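The recommendations from this section can be collected into a small checker. This is only a sketch: the `Target` class and the `check` function are names I made up for illustration, the numbers are the ones quoted above, and the True Peak limit for portable games is my own suggestion and not part of the published recommendation.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Target:
    integrated_lufs: float      # recommended integrated loudness
    tolerance_up: float         # acceptable deviation above the target (LU)
    tolerance_down: float       # acceptable deviation below the target (LU)
    max_true_peak_db: float     # highest acceptable True Peak
    loudness_range_lu: Optional[Tuple[float, float]] = None  # (min, max)


# Console/PC: -24 LUFS with a grace range of 2 LU in both directions.
CONSOLE = Target(-24.0, 2.0, 2.0, -1.0)
# Portable: -16 LUFS, no upward leeway, Loudness Range of 10-15 LU.
# The -1 dB True Peak here is an assumption (the guide's suggestion).
PORTABLE = Target(-16.0, 0.0, 2.0, -1.0, (10.0, 15.0))


def check(target: Target, integrated_lufs: float, true_peak_db: float,
          loudness_range_lu: Optional[float] = None) -> list[str]:
    """Return a list of human-readable problems (empty if all is fine)."""
    problems = []
    low = target.integrated_lufs - target.tolerance_down
    high = target.integrated_lufs + target.tolerance_up
    if not low <= integrated_lufs <= high:
        problems.append(
            f"integrated loudness {integrated_lufs} LUFS is outside "
            f"{low} to {high} LUFS")
    if true_peak_db > target.max_true_peak_db:
        problems.append(
            f"True Peak {true_peak_db} dB exceeds {target.max_true_peak_db} dB")
    if target.loudness_range_lu is not None and loudness_range_lu is not None:
        lra_min, lra_max = target.loudness_range_lu
        if not lra_min <= loudness_range_lu <= lra_max:
            problems.append(
                f"Loudness Range {loudness_range_lu} LU is outside "
                f"{lra_min} to {lra_max} LU")
    return problems


# Hypothetical readings from a loudness meter.
print(check(CONSOLE, integrated_lufs=-25.3, true_peak_db=-1.5))
print(check(PORTABLE, integrated_lufs=-15.0, true_peak_db=-0.2,
            loudness_range_lu=18.0))
```

Since the recommendations are not binding, the tolerances are kept as plain data so you can swap in your own targets.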

How do you measure the loudness of your game?

Before you start measuring your loudness, you’ll have to make sure that you do proper volume balancing of the individual assets first. This includes making sure that the voice lines don’t get drowned out by the music or that the SFX don’t jump scare the player (unless on purpose).

Things are starting to get a bit more vague here. I’ve mentioned that for TV, Europe wants you to measure the full programme while America wants you to measure the loudness of the Anchor Element. Most articles that deal with loudness in video games lean towards Europe’s model. Ideally you would measure the loudness of a full playthrough, but since that would be pretty inefficient if your game is long, most say that you should measure at the very least half an hour of gameplay, preferably 1-2 hours or even more if you have the time, to get a more accurate number. Be careful, though: measuring only part of the game means you could miss peaks that go beyond -1 dB.

The gameplay snippet or snippets you’re measuring should include quiet and loud scenes in proportion, meaning that if you have a lot of quiet puzzle scenes, your measurement should feature more of those than loud, action-heavy scenes. Note that audio signals below a certain threshold are usually ignored by loudness metering software, so no worries if your game has scenes without any sounds: those will automatically be ignored and not factored into the average loudness. Visual novels are in many cases more consistent in their loudness than action games anyway, so you might not need to balance the snippets that carefully, and you might not have to measure for quite as long.

If you’re afraid that someone would still want to turn their game volume up, aim for your target loudness while defaulting all volume knobs to 75% or 50%. That way players can still easily turn up the volume.

Regarding the tool that you can use to measure the loudness: I recommend the Youlean Loudness Meter 2.

1. Download the free version of the Youlean Loudness Meter 2 and install the application. You won’t need to install the rest.

2. Open the software, go to „File“->„Preferences“, change the Driver Type to System Audio. Under Output Device you have to choose your audio device.

3. Start the game.

4. Press the red X at the bottom left („Clear all measurements“) in Youlean Loudness Meter 2 to restart the audio capture.

5. Play the game for a few hours.

6. Check the integrated LUFS value and the True Peak Max in Youlean Loudness Meter 2. Additionally, check the loudness range if your game is a portable game.
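To give an idea of why silent scenes don’t drag the integrated value down, here is a rough Python sketch of gated averaging. It assumes the gating scheme described in ITU-R BS.1770 (an absolute gate at -70 LUFS and a relative gate 10 LU below the loudness of the remaining blocks), which is not spelled out in this guide. The block loudness values are made up. A real meter derives them from short, overlapping, filtered blocks of the signal, which this sketch skips entirely.

```python
import math

ABSOLUTE_GATE_LUFS = -70.0   # blocks at or below this level count as silence
RELATIVE_GATE_LU = -10.0     # second gate, relative to the pre-gated loudness


def energy_mean_lufs(blocks: list[float]) -> float:
    """Average block loudness values on the energy scale, not the dB scale."""
    mean_energy = sum(10 ** (b / 10) for b in blocks) / len(blocks)
    return 10 * math.log10(mean_energy)


def integrated_loudness(block_lufs: list[float]) -> float:
    """Gated average over per-block loudness values (in LUFS)."""
    audible = [b for b in block_lufs if b > ABSOLUTE_GATE_LUFS]
    if not audible:
        return float("-inf")  # nothing but silence was measured
    relative_gate = energy_mean_lufs(audible) + RELATIVE_GATE_LU
    kept = [b for b in audible if b > relative_gate]
    return energy_mean_lufs(kept)


# Hypothetical session: half of the blocks are music at -20 LUFS,
# the other half is a silent scene at -90 LUFS.
session = [-20.0] * 10 + [-90.0] * 10

print(f"Ungated average: {energy_mean_lufs(session):.2f} LUFS")
print(f"Gated (integrated): {integrated_loudness(session):.2f} LUFS")
```

In this example the ungated average comes out about 3 LU lower than the gated one, because the silent half of the session is counted in. With gating, the silent blocks are dropped and the result is the loudness of the music alone.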

How do you adjust the loudness of your game?

Now if you’ve found that your LUFS value is actually way higher or lower than what you were aiming for (don’t forget the 2 LU grace range), it’s time to adjust the loudness of your audio assets. Just import them into an audio editor and lower or increase the volume of all assets by the same amount. Physics fortunately makes this easier than one would think: If you lower all audio assets by 1 dB (which is the same as 1 LU), then the loudness of your product drops by 1 LU. So if your measured gameplay session is at -16 LUFS and you want to bring it down to -23 LUFS, all you have to do is drop the volume of all audio assets by 7 dB. One caveat: this only holds as long as you’re not using middleware in your engine that puts effects, a limiter or a compressor on the audio, which might skew the measurements, but I don’t imagine there are many here who do that.
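The arithmetic can be written down as a short Python sketch. The function names are my own, and the linear factor is only relevant if your tool or engine expects a volume multiplier instead of a dB value.

```python
def required_gain_db(measured_lufs: float, target_lufs: float) -> float:
    """Gain (in dB) to apply to every asset to move the mix to the target."""
    return target_lufs - measured_lufs


def gain_to_factor(gain_db: float) -> float:
    """Convert a gain in dB to a linear amplitude multiplier."""
    return 10 ** (gain_db / 20)


# The example from the text: measured -16 LUFS, aiming for -23 LUFS.
gain = required_gain_db(measured_lufs=-16.0, target_lufs=-23.0)
print(f"Apply {gain:+.1f} dB to all assets")
print(f"As a linear factor: {gain_to_factor(gain):.3f}")
```

A gain of -7 dB works out to a linear factor of a bit under 0.45, so the samples are scaled to less than half their original amplitude. It is worth measuring again after the change to confirm that the result actually lands on the target.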

If your audio is peaking beyond the threshold of -1 dB but the LUFS is at a good level, you’ll have to either find out which sound effect or voice line made it go beyond the threshold and lower it, or compress the assets that are causing trouble to decrease their dynamic range. Personally I’ve heard, though, that the -1 dB threshold that is applied to audio assets in general is a bit overkill and that -0.5 dB is good enough as well. That’s more anecdotal evidence, though, so it might be safer to stick to a true peak maximum of -1 dB.

Is it really that straightforward?

The short answer is no. Loudness isn’t a straightforward topic, neither in music, nor in television, and especially not in video games. Games are an interactive medium. Some players will spend more time in louder action sequences while others will spend more time in quiet puzzle sequences. The integrated LUFS value of the play sessions of these different kinds of players will look different. In VNs, though, the variation usually consists of reading speed and choices. These are usually less impactful compared to games with more interaction, so maybe less for you to worry about. Though then again: You could have a loud route and a quiet route. Some players will only ever see one of the routes and opinions on whether the game was loud or quiet could differ.

Connected to that is the difference in play session lengths. Measuring the loudness of the whole program makes sense for TV, whether that’s an episode or a whole movie, because you usually watch it in one sitting. A game, on the other hand, is played over multiple play sessions. So if your game has one longer loud sequence that’s, for example, 2 hours long and your player starts their play session at exactly that place, their ears are more likely to get tired after those two hours than if they had started their play session before that and ended it in the middle of the loud section. So some play sessions might seem loud to the player while others will seem fine.

Another problem is the difference between measuring the integrated loudness of a full play session and measuring an „Anchor Element“. Like I said, most suggest measuring the loudness of a full play session. In television, on the other hand, some complaints have been raised against that approach, which is maybe the reason why American television standards want you to base the measurement on the „Anchor Element“. A sitcom that focuses on dialogue has little dynamic range, while an action series that switches between dialogue and loud explosions has a bigger one. Normalized to the same program loudness, the dialogue of the sitcom ends up louder than the dialogue of the action series, since the loud parts push the LUFS value of the action series upwards.[14] And since understanding the dialogue is pretty important for most, they’ll usually turn up the volume on the action series.

So there are also arguments for using the Anchor Element method if you want to go for that. There’s a lot of room for experimentation and finding out what’s best for your game. I personally like the Anchor Element option, but it brings its own problems, like deciding what the Anchor Element is or deciding whether all individual Anchor Element items should be equally loud.

Loudness targets aren’t the whole story either. Dynamic range is one example: Even if your average loudness is in a good range, if there are scenes or sounds that are so quiet that the player barely hears them and at another point sounds that hurt their ears, then you’ve still missed your goal: The goal of creating a good listening experience. Another one would be the loudness consistency of individual assets, like making sure that all voice acting is about the same loudness, so all voice clips are easy to understand and don’t hurt your ears, no matter whether they are whispered or screamed. I don’t have enough time to go into detail on those topics, but these are definitely things to keep in mind as well.
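To put rough numbers on the sitcom versus action series comparison, here is a small Python sketch. All levels and runtime shares are invented for illustration, and the calculation is a plain energy average that ignores the gating a real loudness meter applies, so treat the result as a ballpark figure only.

```python
import math


def program_loudness(segments: list[tuple[float, float]]) -> float:
    """Energy-weighted average of (share_of_runtime, level_in_lufs) pairs."""
    total_share = sum(share for share, _ in segments)
    energy = sum(share * 10 ** (level / 10) for share, level in segments)
    return 10 * math.log10(energy / total_share)


def dialogue_after_normalisation(segments: list[tuple[float, float]],
                                 dialogue_lufs: float,
                                 target_lufs: float = -23.0) -> float:
    """Dialogue level once the whole program is shifted to the target."""
    offset = target_lufs - program_loudness(segments)
    return dialogue_lufs + offset


# Hypothetical sitcom: dialogue at -23 LUFS for the whole runtime.
sitcom = [(1.0, -23.0)]
# Hypothetical action series: 70% dialogue at -23 LUFS,
# 30% explosions that are 10 LU louder.
action = [(0.7, -23.0), (0.3, -13.0)]

for name, segments in (("sitcom", sitcom), ("action", action)):
    level = dialogue_after_normalisation(segments, dialogue_lufs=-23.0)
    print(f"{name}: dialogue ends up at {level:.1f} LUFS")
```

With these made-up numbers both programs hit the same full-program target, yet the dialogue of the action series lands several LU below that of the sitcom. That gap is what the Anchor Element approach tries to avoid.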

One goal of finding a common loudness level is to make sure that players don’t have to change their volume when switching to another game. Now you might think: „But if I adhere to these tips but the developer of another game doesn’t and the player switches from my game to theirs or the other way around: Won’t my game feel too quiet?“ Here’s where ear fatigue comes in: If your game is quiet compared to games by developers that haven’t dealt with the topic of game loudness, then that’s definitely still your win. Having to turn a game up is way less of a painful experience than having to turn it down or having to stop playing after a while. And like I mentioned, you can default the volume sliders to 75% so players can turn them up if they want to. Another advantage is that audio usually has to be compressed to reach a higher loudness level, meaning dynamics get squashed and the listening experience possibly suffers.

Outro

To sum things up: Norms work best if all or most people actually stick to them. But even if you’re one of the few people in the VN scene who end up following the tips that I’ve outlined in my presentation, you’ll still come out as a winner, seeing how many other advantages having your game be quieter gives you. There’s a lot of uncertainty when it comes to loudness in video games and you might find different solutions that make more sense for your specific game. But hopefully I was able to give you the tools and information necessary to create a good listening experience for your game.

Appendix: Loudness in different VNs