Level Four: Narrativity

Process Phase Level Four

Beat 1. Arrangement

Contrast & Congruence

As discussed, there are four perceptual domains. For any given dramatic unit, there will never be fewer than two domains involved. Visual content and auditory content are always on board, even if they’re absent. In a video game, the absence of audiovisual content is the content (utter silence, for example, or a black screen while the player character is in complete darkness or blindfolded).

As visual, auditory, kinesthetic, and mythological content can all be employed for a single dramatic unit, knowledge about how these domains work together, relate to each other, reinforce each other, or interfere with each other will strongly affect design decisions in the Narrativity territory.

Naturally, most people assume that visual content has the right of way, but this is not what has been observed. There’s been quite a lot of research into this, and for the mood that is experienced in a dramatic unit, sound almost always overrides even the strongest visual expression. The most exuberantly joyful amusement park ride will leave you in tears when set to Barber’s Adagio for Strings and the most innocuous family vacation slide show will deliver you a glimpse of the apocalypse when accompanied by Ligeti’s Requiem or Penderecki’s Polymorphia. If you’re still in doubt, watch the last minute of Maximilian Dood’s YouTube review of the universally panned Aliens: Colonial Marines, which combines action footage from that game with the “Yakety Sax” theme from The Benny Hill Show to the most ignominious effect.

This, already, is an important design tool for this territory. Playing with incongruity by using sound that doesn’t match the visual or narrative subject matter can achieve effects similar to the Kuleshov montage effect. When two different pictures are juxtaposed, Kuleshov showed, people interpret the meaning of the first in dramatically different ways depending on the nature of the second. That way, the second picture can “define” the first by providing cues for interpretation and resolve ambiguities (or create them in the first place, as the case may be). Indeed, as a study by Baranowski and Hecht on “The Auditory Kuleshov Effect” indicates, juxtaposing visual content with different auditory content does the exact same thing. The Kuleshov effect works equally well when sound takes the role of the second picture.

Does that mean that visual content is the runner-up in the domain hierarchy, right behind auditory content? No, it doesn’t. Narrativity related to mythological content can, and does, strongly alter the way visual content is perceived.

To explain the reason for this and its practical use for game design, we must take a brief detour into media psychology and look at agenda-setting and priming. Both work in tandem and are related to memory and how we recall information. It’s a three-step mechanism. First, certain aspects of a topic are highlighted over others. Then, these aspects are more easily recalled. Finally, with regard to this topic, decisions or certain forms of behavior in general are primarily guided by our opinions about these highlighted aspects, and not by our opinions about other possible aspects or by the entirety of our knowledge about that topic. While priming effects have been milked for all they’re worth in scientific studies, not all of them, especially those with more headline-grabbing results, can be reproduced in follow-up studies. But the foundation is solid, and the first study on intentional priming by Iyengar, Peters, and Kinder in 1982, on how media audiences evaluated the presidential performance of Jimmy Carter, has held up to scrutiny pretty well.

Perception itself is also associated with memory. In the so-called sensory memory, a kind of ultra-short-term memory, the vast majority of sensory information is discarded within time spans ranging from 15 milliseconds to 2 seconds, or 3 seconds at most for auditory content, according to current research. Only those pieces of information that we deem valuable are processed toward less volatile types of memory, and we deem valuable what we either are focused on or can easily focus on. Unsurprisingly, these will more likely be things attached to other things that we already know and can easily recall!

That’s the abridged version, of course, but it’s all we need. Here’s the point. You can use your mythological content from setting, location/environment, backstory, and lore, or hand-picked parts of it, to prime your player toward what they should recall, expect, and focus on. It won’t override your auditory elements, most of the time, but it will influence how the player perceives your visual content, how they make decisions, and how they act on the basis of these perceptions. We will meet agenda-setting and priming again in Level Six: Integral Perspectives II.

In relative terms, we now have a hierarchy with the auditory domain on top, followed by the mythological domain, followed in turn by the visual domain. But where is kinesthetic content located within that hierarchy? That, alas, isn’t altogether clear, partly because this specific question hasn’t been studied in depth, and partly because we have yet to unlock the true potential of kinesthetic content for video games, which will most likely be advanced by AR or VR games. Also, the power of the camera in games is somewhat diminished when compared with the power of the camera in movies. There, striking narrative qualities or properties can be established through special angles and special movement, including zero movement: think of Sergio Leone’s extreme close-up Italian shot, Yasujiro Ozu’s low-height Tatami shot, or Stanley Kubrick’s interminable tracking shot. Outside of cutscenes, this seems very hard to accomplish in video games without cutting back on player agency. Thus, for all we know at this point, kinesthetics in games seems to have the weakest influence among our four perceptual domains.

To wrap it up, the domain hierarchy with regard to player perception is: auditory, mythological, visual, and kinesthetic content, from top to bottom. Not always, certainly, but in most cases.
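As a rough illustration of how this hierarchy might be consulted at design time, here is a minimal Python sketch. The names `Domain` and `likely_dominant` are hypothetical and not taken from any engine or library, and the strict ordering is a simplification of a rule that, as noted, holds in most cases but not always.

```python
from enum import IntEnum


class Domain(IntEnum):
    """Perceptual domains; a lower value means a stronger influence on
    player perception (assumed strict ordering, a simplification)."""
    AUDITORY = 0
    MYTHOLOGICAL = 1
    VISUAL = 2
    KINESTHETIC = 3


def likely_dominant(signals: dict[Domain, str]) -> tuple[Domain, str]:
    """Return the (domain, mood) pair most likely to win when the
    moods carried by different domains conflict in one dramatic unit."""
    if not signals:
        raise ValueError("a dramatic unit needs at least one domain signal")
    strongest = min(signals)  # the smallest value is the highest-ranked domain
    return strongest, signals[strongest]


# A joyful amusement park ride set to mournful music.
unit = {Domain.VISUAL: "joyful", Domain.AUDITORY: "mournful"}
domain, mood = likely_dominant(unit)
print(domain.name, mood)
```

A sketch like this is only a planning aid: it flags which signal will probably carry the mood, so that dissonance between domains is a deliberate choice rather than an accident.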

You need to know this hierarchy if you want to work with elements that are in dissonance with each other within a dramatic unit: contrasting, conflicting, or even contradicting signals; incongruity, ambiguity, or uncertainty; and any other type of friction with the potential to create cognitive dissonance. Knowing this hierarchy, you can mix and match your domain elements with much better precision to produce your intended effect.

But you also need to know this hierarchy if you want to have your elements work in consonance with each other in a dramatic unit. Consonance doesn’t sound as exciting as dissonance, but there are two important applications: enhancement and clarification. Let’s look at both of these, starting with enhancement.

You might have heard about what has become known as the “misattribution of arousal” effect, an effect attested to by various studies. The most widely known of these studies is Donald G. Dutton & Arthur P. Aron’s “bridge” experiment. If you haven’t heard about it already, its title will give you a hint: “Some Evidence for Heightened Sexual Attraction Under Conditions of High Anxiety.” An attractive “interviewer” met unwitting test subjects on a suspension bridge with what the authors call “many arousal-inducing features” (think Indiana Jones and the Temple of Doom, with a 230-foot drop and wobbling and tilting and so on). The equally unwitting test subjects from the control group met their “interviewer” on a much lower, solid wood bridge. As you can probably guess, the test subjects from the suspension bridge were significantly more attracted to the interviewer than those from the control group, and this result was reinforced by other, methodologically more rigorous (but less spectacular) tests within the same study.

In general terms, strong emotions of one kind can increase the attraction to something completely different. That’s the “misattribution of arousal.” This works not only with fear, as in the bridge example. Evidence from other studies suggests that misattribution of arousal also works with emotions like euphoria or anger, and even physical exertion. What’s more, the misattribution not only supports positive attraction, but also its opposite—negative attraction.

As you can see, all this opens up a vast array of possibilities. You can enhance any effect toward attraction or repulsion with clever layers of consonant narrative qualities or properties. For example, you can strengthen the bond between the player and a non-player character, a place, or an item. Or, you can make the player despise that character, place, or item much more strongly than without misattributed arousal!

Now, to tackle clarification as the second major application of consonance, we need to recall our psycho-philosophical model from this level’s Opening. Owing to the abstraction processes that go into it in all four perceptual domains, the narrative qualities or properties of an artistic expression can never be unequivocal, never be straightforward, with respect to their meaning or interpretation. They’re ambiguous, uncertain, even elusive, and that’s why people love to ask the artist to explain the “meaning” of their work. But for a work of art, any meaning and any interpretation can only be temporary and contextual. (That’s one of the reasons why works of art can always surprise us with something new and completely unexpected, even after thousands of years.) But in your game, more often than not, you want your artwork to be understood! So you have to clarify, and you can use the narrative qualities or properties from one or more perceptual domains to clear up ambiguities in another perceptual domain. Again, the hierarchy is important. Music in particular can “help” the player interpret certain characters or events correctly, clarify intentions, assess trustworthiness, rank importance, and much more. Employing this tool, you can give the player a much clearer picture without having to resort to terrible explanation techniques, particularly speech. Never abuse your non-player characters as WikiBears.

Fig. 4.34 Mixing Perceptual Domains

Perhaps, during the course of this beat, some mischievous thoughts have tiptoed into your mind. With all these powerful tools at your disposal, from dissonance to consonance, from enhancement and arousal to clarification and interpretation: is it okay to use narrative qualities or properties to mislead and deceive the player?

The answer depends on an unassuming detail: whether the misleading element is diegetic or non-diegetic. (These terms are discussed in depth in Level Three: Plurimediality. As a general rule: if it’s accessible to characters in the game world, it’s diegetic. If it’s only accessible to the player, it’s non-diegetic.) Let’s go through all four domains.

To fool or mislead, kinesthetic elements are not very useful, so we can leave them aside.

Mythological elements (setting, location/environment, backstory, lore) can be diegetic or non-diegetic. The player will have been confronted with some elements during gameplay (diegetic) and with some elements through marketing material, accompanying information, extended universes, and the like (non-diegetic). The former are allowed to mislead; the latter are never allowed to mislead. If there is a certain piece of lore that seems consistent and plausible but is in fact dead wrong, and the player has learned this piece of lore from the characters within the game world, that’s perfectly okay. If that piece of lore was printed on the box or came with the game description on Steam, that’s not okay.

Elements from the visual domain that are diegetic may of course mislead. They do it all the time. But where would non-diegetic visual content come from, if not from one of the interfaces? Misleading players through their interface will certainly not endear you to your target audience. (Breaking the fourth wall, as discussed in Level Three: Plurimediality, is an exception. Whatever is referenced, the menu, the hardware, the microphone, and so on, becomes part of the game world and is therefore by definition diegetic.)

The same rule applies to auditory content. Music, foley, speech, or silence that are diegetic and part of the game world are most certainly allowed to deceive the player. When some bouncy tavern tunes and a group of people singing cheerfully along lure the player into a den of vicious murderers, that’s okay. If a smooth and relaxing non-diegetic track from the game’s score provides the player with a false sense of peace and tranquility while the colossal equivalent of Tom and Jerry’s Spike towers behind them, that’s not okay.

Think about it—if the player is misled or deceived with the help of non-diegetic elements, the player is literally misled or deceived by the game designer! Which is you, personally. It is very hard to come up with a hypothetical case where that would be conducive to improving the playing experience, except in experimental and/or comedic games.

Meta-diegetic elements, finally, are complicated. Dreams and hallucinations that deceive or mislead are certainly okay. Misleading or deceiving voice-over narration or soliloquies, in contrast, can backfire badly. Games have employed unreliable narrators, and these were excellent games (no spoilers here). But it’s still a bit tricky and you’d better know what you’re doing.
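The rule of thumb from this beat can be condensed into a small decision helper. The following Python sketch is an illustration only: the names `Diegesis`, `Verdict`, and `may_mislead` are hypothetical, the three-way verdict is an assumption made for the sake of the example, and the exception for experimental or comedic games is deliberately left out.

```python
from enum import Enum


class Diegesis(Enum):
    DIEGETIC = "diegetic"            # accessible to characters in the game world
    NON_DIEGETIC = "non-diegetic"    # accessible only to the player
    META_DIEGETIC = "meta-diegetic"  # dreams, hallucinations, narration


class Verdict(Enum):
    OK = "ok"
    CAUTION = "caution"
    AVOID = "avoid"


def may_mislead(level: Diegesis, *, narration: bool = False) -> Verdict:
    """Rough verdict on whether an element may mislead the player.

    `narration` marks voice-over narration or soliloquies, which the
    text treats as the risky meta-diegetic case.
    """
    if level is Diegesis.DIEGETIC:
        # Includes fourth-wall breaks: whatever is referenced becomes
        # part of the game world and is therefore diegetic.
        return Verdict.OK
    if level is Diegesis.NON_DIEGETIC:
        # Here the deception comes from the designer, not the game world.
        return Verdict.AVOID
    # Meta-diegetic: dreams and hallucinations are fine, narration is tricky.
    return Verdict.CAUTION if narration else Verdict.OK


print(may_mislead(Diegesis.NON_DIEGETIC).value)
print(may_mislead(Diegesis.META_DIEGETIC, narration=True).value)
```

A check like this could sit in a content review checklist, with the caution verdict serving as a prompt for discussion rather than a prohibition.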