Monday, August 20, 2012
Robi Kauker’s Planet of Sound
Jeffrey Fleming: I'm interested in the way that the experimental background you come from influences the work that you do at EA.
Robi Kauker: I grew up in the Memphis music scene and went to the University of Memphis for a music degree in composition: classical, traditional, and orchestral, while at the same time doing electronic music. Very early on I had a great teacher in third grade whose husband also happened to be a keyboard player; he had a Minimoog and modular synths around at various times when he was doing projects.
I got to play with those and that got me into electronic work. And electronics kind of drove me more to the experimental music early. While Kraftwerk, Tangerine Dream, and all of the New Wave music was great, at the same time Morton Subotnick and John Cage's electronic pieces were also made available to me. So I transitioned from the pop music world to a classical composition degree.
The Bay Area multimedia world was exploding at the time. So I came out here to go to graduate school at Mills College, and basically networked, made my connections there. The guys I went to Mills College with are at Electronic Arts, they're at Activision, and they're at Skywalker Sound. It was a great meeting ground for people who were headed in the same direction I was.
It's a uniquely creative program where there are limited facilities. You're not going to see a large format mixing board and you're not going to see the latest and greatest computer. What you have, as the graduate student, are very good classmates who want to do interesting things.
At the time when I was there, there was a programming language called Hierarchical Music Specification Language [HMSL], a text-based programming language built on FORTH that we did interactive music with. All the generative stuff at the time was MIDI-based, but we had these cool programs. George Lewis was also teaching there. He is a brilliant composer, performer, and trombonist, now the director of the Center for Jazz Studies at Columbia University and a member of the amazing Instant Composers Pool ensemble.
So there was this great creative energy there and incredibly talented professors. And all of that drove us to wild creativity that wasn’t bound by any rules or convention. That was very attractive to everybody that was there. Nobody was there because they wanted to play in an orchestra. Nobody was there because they had to have the latest, coolest gadget. Everybody was there because they had a unique vision and some idea of what they were doing.
You had to bust your ass to figure out how to do it; you had to invent it, because nobody gave it to you. That directly leads you to where the game industry was in 1995, '96, and '97. '97 is when I joined Maxis.
JF: Do you feel like you have the freedom to experiment, or is there a push to keep momentum going on a franchise?
Kauker: I am the audio director for the whole Sims label, so I make certain to put the pressure in the right place because we have games to ship. We are a business. We are about shipping games.
At the same time, there are always limits to push on for new products. Second-generation titles are the most fun because the tech is pretty well set, so you can start to push on it really hard. On this year's MySims Kingdom title we really pushed how far we could go, and hopefully we pushed it even further.
JF: You've got all of these titles spread across a lot of different hardware, from the Nintendo DS to PC. How does it work in creating audio for all of that?
Kauker: It doesn't always work. We grew up as a PC studio, so we like to do things with a methodology that's not always healthy for consoles. But the fundamental philosophies, from the biggest PC game to the DS game, are exactly the same.
So, when I or one of my people switches titles or jumps in to help somebody else, philosophically they know what they're doing. It might be a different tool set—DS and PC are never going to use the same tool set per se—but the idea that the sound is triggered because of this state or this animation or this condition, or the fact that if you see a tree in the world the birds come with it, is carried across all platforms. It's a philosophical issue.
There are lots of technical issues about formats, engine limitations, and platform limitations, but we work in philosophical abstraction. We do the most that we can inside that buffer.
I just finished a DS game where I had 512K for audio, and I had to fit music in that too. The same ambience-system concept that I was doing for The Sims 3 was done at the DS level.
And in The Sims 3 world, it might be a two-, three-, or four-second bird call or bird song that builds the ambience. On the DS, it's a tenth of a second, and I have to create the bird call through programmatic, data-driven means, not just preset samples. In a certain way, the DS titles are probably going to have more interesting ambience – at least for the geeky.
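The idea of building a bird call programmatically from almost no sample data can be sketched in a few lines. This is purely illustrative, not EA's DS code: a frequency-swept sine with a jittered pitch and length, so no two calls play back identically.

```python
import math
import random

SAMPLE_RATE = 22050  # a low rate, in the spirit of a tight handheld budget

def chirp(base_hz=3000.0, sweep_hz=1500.0, dur=0.1, rng=random):
    """Synthesize one short bird chirp as a list of floats.

    A frequency-swept sine under a smooth amplitude envelope; pitch
    and length are jittered so repeats never sound identical.
    (Illustrative sketch only -- not the actual DS synthesis code.)
    """
    f0 = base_hz * rng.uniform(0.9, 1.1)           # jitter the pitch
    n = int(SAMPLE_RATE * dur * rng.uniform(0.8, 1.2))  # jitter the length
    out = []
    phase = 0.0
    for i in range(n):
        t = i / n
        freq = f0 + sweep_hz * t                   # upward frequency sweep
        phase += 2 * math.pi * freq / SAMPLE_RATE
        env = math.sin(math.pi * t)                # fade in and out
        out.append(env * math.sin(phase))
    return out

call = chirp(rng=random.Random(1))
print(len(call))
```

Seeding the generator makes a call reproducible for testing, while the default shared `random` state gives the endless variation the ambience needs.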
JF: So those limitations don’t bother you?
Kauker: When you learn to be a composer, one of the things you learn really early on, especially in this day and age, is that anything is possible. So how do you do anything?
The answer is that you have to define your limits and parameters. And the beauty of a system like the DS is that those limits and parameters are in your face the second you start. With a PS3, you don't have those limits in your face, and there's a tendency to wander. We build a lot of stupid, big, great tools that are solving little, itty-bitty, simple problems. Not just at EA. At every game company I've worked with or heard of, there's always somebody's version of a dream tool.
JF: What do you think of the middleware approach to solving problems?
Kauker: Middleware is like defining your orchestra. Here are your limits; this is what your middleware does. You picked it, now run with it. I think that middleware solutions are absolutely necessary. As a small developer, you are never going to be able to develop a PS3 title if you have to write, debug, and fix all of that code yourself. If you're spending your cycles doing that...we don't do that at EA. For Sims 3, I'm taking the best of what Spore offered audio-wise and reapplying it to Sims 3 with very few changes.
SimCity 4 took the best of what SimCity 3000 had to offer. Because that tech was five years old, they refactored it, made it better, made it smarter, and added a lot of really cool stuff to it so they could do their game. Actually, The Sims 2 then took from SimCity 4. It's a constant cycle of development and growth. I don't reinvent the wheel; I steal tech from the Need for Speed team, or ideas from Medal of Honor, or compare the Harry Potter recording chain against the Sims recording chain. They're different tools, but the ideas are similar to each other.
JF: How did you take advantage of some of the technology that Spore was using?
Kauker: Going back all the way to SimCity 4—actually going back to SimCity 3000—we started working with data driven models for how sound worked. Spore has made tremendous use of that. From their music, to their sound effects, to the voices, to the way their creatures interact with the world. Their data-driven model of “this is what’s happening in the world that needs to be translated into what's happening in the game” is not the interactive math model where something happens and it triggers something. It's these ten conditions that are happening and then these ten other conditions...
JF: Is that an aesthetic choice or are there practical reasons?
Kauker: It's practical because there's really no other way to make it interesting. We could play a big loop, and the ambience would go [hums] all the time. That's lovely and it works for some types of games, but for our games, which are user-developed, we have to vary the world constantly. With Spore and The Sims, you don't know what the world looks like beforehand. You don't know what's going to be in the world beforehand. The only thing you know is that there is a world! That makes it different by its very nature.
Going after those models, tapping into that constant stream of information in an effective way, is where Spore is, where The Sims is. Philosophically, it's where we've been. Kent Jolly, who is down at Spore, and I have been fighting for that continual execution going all the way back to our very beginning together on SimCity 3000: letting the world tell us what it sounds like. We'll make bird chirps for ten different types of trees. If you drop these ten trees in, it sounds one way. If you drop in all of one type of tree, it sounds a different way. Time of day, water, context, beaches...
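That "let the world tell us what it sounds like" model can be sketched as a tiny data-driven mix. The tree names, bird names, and daylight window below are invented for illustration, but the shape of the idea is the same: the objects the player placed, not hand-authored triggers, determine which ambience layers play and at what level.

```python
# Hypothetical mapping from tree type to its bird layer; the names
# are illustrative, not from any actual Sims data files.
BIRDS_BY_TREE = {"oak": "robin", "pine": "chickadee", "palm": "gull"}

def ambience_mix(trees, hour):
    """Derive an ambience mix from what's actually in the world.

    `trees` is the list of tree-type strings the player dropped in;
    each bird layer's level is proportional to how many of its trees
    are present, and the birds go quiet outside daylight hours.
    """
    if not (6 <= hour <= 20):   # time-of-day context gates the layer
        return {}
    counts = {}
    for t in trees:
        bird = BIRDS_BY_TREE.get(t)
        if bird:
            counts[bird] = counts.get(bird, 0) + 1
    total = sum(counts.values())
    return {bird: n / total for bird, n in counts.items()}

print(ambience_mix(["oak", "oak", "pine"], hour=10))
```

Ten oaks and no pines gives a robin-only mix; a mixed forest gives a blended one, with no scripting per lot.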
Music concepts, where things are going well for you or things are going badly for you, have always kind of been there. It's actually the hardest thing. Spore, I think, is the first game that starts to represent it well.
JF: How does Max/MSP work into your production process?
Kauker: MSP came out when I was still in graduate school, and it was the first step to digital signal processing without a big, expensive system. So, going back again to that creative sandbox that Mills fostered for us, that's where we started playing with it.
Now, in The Sims world, and in the Spore world as well (Kent Jolly should get all the credit for kicking this off in both worlds), we took these problems that we were having with recorded voice. The Sims voice-recording problem is this: we have 12,000 to 15,000 animations that need to have dialogue recorded for them. That dialogue cannot be directly repetitive and it needs to be emotional, so we need voice actors. The other problem is we can't tell the player what the game's about, because our players define the game.
So that's why Simlish comes into being. In a standard digital audio workstation session, it takes a minute to load up a video, name a track, and get it set up to record; then you hit record and record a three-second take. You record five variations of that. OK, you've spent 15 seconds recording and a minute setting up. We looked at this problem, and we went, "Yuck."
So, we solved it by using Max/MSP to build apps. We built a recording app and a companion editing app that scripts drive. It's very, very fast, it's very efficient, and it takes advantage of all the techniques that Max/MSP lays out for you. It's fairly simple. Jitter was the video component of it. That was the key that enabled us to build the tools that we did.
So it makes our recording pipeline very fast, very efficient, and very customizable. And what we did, the Tiger Woods PGA Tour team took and modified. They literally took out a huge chunk of what we did and put in a simpler recording model.
The fundamental problem remained the same: a way to record quickly through a script, without the huge issues of using an alien interface designed for general-purpose work.
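A script-driven session of the kind described, where naming and setup are automated and the performer only hits record, might look like this in outline. The function names and file layout are hypothetical, and the actual audio capture is stubbed out; this is a sketch of the workflow idea, not the Max/MSP tool itself.

```python
from pathlib import Path

def record_take(seconds):
    """Stand-in for the real audio-capture call (hypothetical)."""
    return b"\x00" * int(22050 * seconds)  # silent placeholder audio

def run_session(script, variations, out_dir):
    """Drive a recording session from a script, not from a DAW UI.

    Each entry in `script` is (animation_id, seconds). Track naming,
    setup, and take counting are all automatic, so the only human
    time spent per line is the performance itself.
    """
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    files = []
    for anim_id, seconds in script:
        for take in range(1, variations + 1):
            path = out / f"{anim_id}_take{take:02d}.raw"
            path.write_bytes(record_take(seconds))
            files.append(path.name)
    return files
```

With 12,000-plus animations and multiple variations per line, shaving the per-take setup from a minute to nothing is where the whole saving lives.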
In The Sims 2, the voice of the robot, who is essentially a character in the game, is generated with a plug-in I wrote with Max/MSP, as a VST/Audio Unit. Nothing fancy, not rocket science. But it was something I could make that could be used in a VST or AU audio chain.
We use it all the time now to mock up prototypes of any sort of game scenario we want. We have USB game controllers mapped to it, and then we go to that setup any time we need a prototype.
JF: Does that work in the runtime or do you have a different system?
Kauker: It can work in the runtime. I don't do that in The Sims world. I know some of the other teams are mapping it into runtime because it becomes a great mixer for it. You can lay out your custom mixer, your virtual layout of the mixer through an Ethernet cable or however you like to work.
I tend to build a very simple implementation pipeline, so those tools become kind of overkill for The Sims world. And by simple, it is very complex on the engineering side but not on the implementation or the content side. That's what I see as the conundrum of video games. Where do you put the complexity—do you put it on the designer or do you put it on the engine? Well, engineers are usually smarter than designers [Laughs].
We use it for prototyping a whole lot. The new MySims character dialogue is actually a simulation. We made a plug-in that simulates the pitch-shifting algorithm we use in game, and we made it into a VST/AU plug-in for prototyping and demoing. Every actor's voice that we work with also has to work through this very weird pitch-shifting algorithm. We have great actors that I'd love to work with who audition for us, and we are not able to use them because their voices break up in this highly efficient pitch-shifting algorithm. We have to know that quickly. We can't wait to get that content in game. So we've modeled it with this pitch-shifting plug-in.
And that stuff pays off for me not just in game development on the front end when I'm auditioning talent but also when I'm doing marketing later on. I have all this content that needs to be matched up for marketing and that marketing may not be done by us, it may be done by an advertising agency.
If we have external developers working on websites or something like that, we give them this plug-in: here's the sheet, and these numbers correspond to these voices in the game. That pitch shifting has actually become a character in the game. We have eight actors in that game, but we have forty different character voices. Just that simple bit of technology pays huge dividends for us.
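The actors-to-characters multiplication he describes can be sketched with the cheapest possible pitch shifter: resampling with linear interpolation. The character names and ratios below are made up, and this is not the shipped algorithm, but it shows both why the approach is efficient and why extreme ratios can make a voice "break up."

```python
def pitch_shift(samples, ratio):
    """Naive pitch shift by resampling with linear interpolation.

    A `ratio` above 1.0 raises the pitch (and shortens the clip),
    below 1.0 lowers it. Cheap enough to run anywhere, but harsh on
    some voices at extreme ratios -- illustrative sketch only.
    """
    n = int(len(samples) / ratio)
    out = []
    for i in range(n):
        pos = i * ratio
        j = int(pos)
        frac = pos - j
        a = samples[j]
        b = samples[min(j + 1, len(samples) - 1)]
        out.append(a + (b - a) * frac)   # interpolate between neighbors
    return out

# One recorded actor, many characters: a parameter sheet maps each
# character name to a pitch ratio (names and numbers invented here).
VOICE_SHEET = {"mayor": 0.8, "kid": 1.5, "robot": 1.2}

def character_voice(samples, character):
    return pitch_shift(samples, VOICE_SHEET[character])
```

Handing external partners the same parameter sheet means a website or ad can reproduce a character's voice from the same source recording.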
JF: How do you see game audio developing in the future?
Kauker: I think that what really matters is to keep pushing the tools so that the production of game audio reaches the same level as audio for film. Nobody thinks about how to do sound for film, they know how to do film and sound. But we're still thinking about solving basic issues in audio for games.
We've been taking these little steps in audio, to a certain extent, because we've only seen so much of the CPU set aside for us. But the reality is now the machines are as powerful as they should be. Now we need to get the tools and the pipelines efficient enough to where we don't have to think about them. Getting the right sound is not hard; we have great tools for that. Implementing a great sound is still more difficult than it needs to be, and implementing that great sound in a way that grows and changes, and reflects what the player is doing is really hard. The only thing we should be thinking about is how that great sound affects the player and we're not to that point.
I look at so many game audio teams and they're typing numbers and they have to look at everything as a spreadsheet or a numeric breakdown. As a musician and as a sound person, I'm used to feeling sound and I don't feel a number. Zero to 1024 means nothing to me. I can tell you minus three dB at 100 hertz, but that means nothing to the engine. That gap has got to be bridged so that you can really expose the talent of these people.
Even further, the physical world has to be bridged so that things move naturally. Right now we hold a microphone in space and whatever it picks up—that's what the player hears. No film in the world is shot like that. You would have lav mics hanging on the collars of every shirt, and a boom mic mixing in the ambience that's going by randomly. We don't have any of that at any level of detail at this point for audio in games. Or worse, we’re not making very good use of it yet.
For visuals, we're spending a huge amount of time making sure shadows are right. We're making sure that flares on the lens are right. In audio is reverb the equivalent of a lens flare? Probably. Is it the equivalent of a shadow? No. That's getting into filtering—where every object in the world affects the way a sound moves in the world. How many hours have people spent on making shadows work right versus how many hours have they spent making sure sound moving in a room is right?
JF: Let me ask you about avant-garde music, noise, and how those influences might work their way into games. I tend to like abstract music, and I don't get that very much in video games.
Kauker: I kind of look at it in the same way I look at it in film. If you take Stravinsky from 1920, that was considered avant-garde at that time. Now Stravinsky is pretty standard in film music. You can hear all the influences of Stravinsky in film music today. It took a little while—Stravinsky even did film music at one point.
So it's a natural evolution. What we are doing today on the bleeding edge of avant-garde noise or whatever it is we call it now has a potential for these mass entertainment products and some of it is going to apply in great ways. I think if you play a game like BioShock you’ll hear the influence of Charles Ives. You have two or three little pieces of music playing at the same time, playing different melodies, taking you to different parts of your memory. I think that is a really cool thing.
You're going to hear a lot more of that sort of juxtaposition and layering. You hear a lot of minimalists already. Again, Spore, is a perfect example that goes straight back to Steve Reich, Terry Riley, and Philip Glass.
It will come. It is an appropriate thing. Basically, you need a game space that allows you to explore the strength of that style. When you say avant-garde music, it is very strong stuff. It's not designed to be hidden in a video game, it's designed to grab your throat and get your attention. As composers, and as designers, and as listeners, we have to find the things that allow us to connect with the game. I can totally see doing a war-based game, a space-based game using the music of Harry Partch or the music of some noise artist.
We've already co-opted all the electronic music early on for sci-fi and things like that, totally co-opted rock and roll for all these driving and adventure games. Techno—you can't do an adventure scene that doesn't have some homage to techno in it, even if it's the orchestra playing it.
It'll come, and it will come in doses, and it will come from independent developers, and it will come from interesting game spaces. I know that it's coming, just like it is coming through the dance world or it's coming into opera, or it's coming into television or film. It just takes a little time to find its spot where it's effective. It will come much faster than it took with Stravinsky.
JF: Have you ever used audio to subvert things on a game a little bit?
Kauker: Oh yeah. I had a producer come to me once with a concern about the Medical Research Center in SimCity 3000. When you click on it, there's the sound of a dentist drill and a monkey screaming, and the concern was that animal rights people would be upset about it. It was funny to me, because at the time I was a vegetarian.
We always subvert things; we always find ways. When you play The Sims, if you play the piano at the highest skill levels, it goes to Liszt or Rachmaninoff at level nine, and skill level ten is pseudo-Cecil Taylor and you're slamming your forearm into the piano!
We were always doing things like that. The Sims Online has way out experimental electronic music couched under Ambiance. So we're always messing with whatever we can. There's really some extremely cool stuff being done under the name of Ambiance, especially in the space games and horror games, where they get to play around a lot more. But yes, subversion, that's half the entertainment, right!
Just to give you a different example, Chris Brown, who is a really well-known composer on John Zorn's record label [Tzadik], makes crazy, upside-down avant-garde stuff. He wrote a lot of really cool jazz for me for The Sims 2 and the University expansion pack. The big band jazz is all done by players who are better known for their free jazz work than their bebop chops. So I like to give these guys things like that, and then they give me the cutting edge of how far I let them go. It's a lot of fun!
JF: One of the major things about The Sims that I think sets it apart is that it appeals to men and women equally.
Kauker: Worldwide. It's a culture, worldwide, men and women. From the very beginning, working at Maxis I didn't realize the game industry was so male-dominated until I went to a Game Developers Conference. Because the people I worked with were mixed all over the place. Designers, engineers, artists—the audio team now at the Sims label is half and half, men and women in all roles. There is no role dominated by one or the other.
JF: That has got to make a difference.
Kauker: It does. It makes two differences, actually. One is that when we make a sound it's taken into account that the effectiveness of the sound plays both ways. Men and women hear differently. They think differently. They react to different things. Music is a different animal. Having that gender balance—not even intentionally, it just naturally evolved—continually adds to it.
Secondly, it makes the workplace a lot different and a lot more relaxed. I have worked with all male teams in record labels, the record industry, and music. Mostly the music and audio world is male-dominated and there is no reason other than history. Physiology says women hear better than men, and for a longer period. The music, the art, the craft of sound is better when it's not just a whole bunch of guys beating their chests.
I've never tried to put my finger on it. I just hire the best people for my team, and it's just worked out that it's grown that way. Maybe it's because of that experimental music background that most of my people have; all of them have an interest in it. They continue to come through, and I just happen to get the cream of the crop, and the cream of the crop just happens to be gender neutral. I don't know what it is. It's a great joy. Gaming across the board would benefit from taking a wider perspective on things.
JF: At Game Developer magazine we do our yearly salary survey of developers and it’s always disappointing to see how females make up such a small percentage of the talent side.
Kauker: It's a very odd thing in the gaming industry, because the rest of the art world doesn't reflect that. The film world doesn't really reflect that as much, although preproduction film maybe does. It's a very odd thing and a very old thing. It's almost like our version of the suit and tie: we're hanging on to this because that's the way it has always been. Or maybe we're so narrow-minded that we're not attracting the best talent. We're only attracting half of the best talent. Maybe we're not attracting the best talent at all; we're only attracting the talent that's interested in what we already are. It's an interesting problem and a developing problem, because we're reaching mainstream status. We're no longer a niche market. To reach mainstream status you have to reach everybody.
– Jeffrey Fleming
This is an expanded version of an article originally published 4.13.09 on Gamasutra. You can read the original at: Planet of Sound: Talking Art, Noise, and Games with EA's Robi Kauker.