Joe Letteri is a four-time Academy Award-winning visual effects guru and a Partner at Weta Digital. Joe recently led his team in creating the groundbreaking effects in Matt Reeves’ “Dawn of the Planet of the Apes,” a film that has amazed and delighted audiences on its way to earning more than $500 million worldwide. I recently had a chance to speak with Joe about Caesar, acting, and evolution.
What was your creative process like in working with Matt Reeves?
Early on it was trying to work out some of the design ideas for the apes, because we had this idea that the story is going to be 10 years or so into the future, and we established in the first film that they’re taking the drug that’s making them more intelligent. So the first question was: how much of the effect of that are we going to see when we open the film? What will the changes be? So that was leading us into an aspect of the story that we needed to explore, how much were the apes going to be speaking in this film? That was really one of the big initial creative considerations, how do we bring audiences into that aspect of the story?
We played with a couple of small design changes, with Caesar we made him a little bit older, a little bit grayer, he’s put on a little bit more weight than he had in the first film since he’s 10 years older. We looked at some dialogue tests for the apes and we specifically did a test scene between Caesar and Koba to figure out how they would talk. We hit on the idea that talking doesn’t come naturally to them and so you have this situation where you almost have to draw the words out of them, as if their brains are kind of outracing the physical evolution of their vocal cords to be able to conceive the words and to utter them.
We did the tests and then crafted them into the storyline where you begin the film and you’re just seeing the apes in their community, they have the sign language that they were learning at the end of the last film and you can see that they’ve become proficient in it, they’re communicating with each other but they don’t vocalize until the humans arrive. When the humans arrive, suddenly the need to vocalize is there because Caesar has to communicate with the humans and that sets off a whole chain of events where the apes now suddenly have to communicate more with themselves and the sign language starts to give way to more vocal communication. So there was this arc that drove this whole dynamic and it’s reflected in how they speak. So the question we had to solve was how do we make apes speak without having it look like a man in a suit? We wanted that sense of this being new to them but it had to come out of them, it had to be driven by events and it had to be supported by the character designs.
What are some of the unique challenges of shooting outside?
Motion capture outdoors and performance capture outdoors is kind of a new field that we’re entering into. The evolution of that started with “The Lord of the Rings,” when we decided that we wanted to try to use Andy Serkis’ performance directly to drive Gollum’s performance. We got him on a motion capture stage and had to recreate the performances that he was doing with the actors that they were filming, and that worked well for us. We were able to use that throughout “Rings” for a lot of the sequences, though there was still a lot of keyframing involved, and we keyframed his face for all of “Rings.” Then when we came to do “Kong” we wanted to see if it was possible to capture the face and translate that, so we came up with a technique for actually capturing his face by using motion capture markers on his face and working out how to do the solves and the translation into Kong’s character. On “Avatar,” Jim Cameron wanted to combine these ideas and have the actors wear a head rig to capture the face information. That way it gives them more mobility to move on the stage and throughout the virtual world, which we were shooting with a virtual camera.
When it came time to do “Rise of the Planet of the Apes,” we thought, wouldn’t it be great if we could now actually capture Andy in the moment? He’s giving us these great performances on set with the other actors, so can we capture it? You have the challenge then of motion capture cameras needing their own set of lights to track the markers on the actor, and these lights can interfere with the film camera and vice versa. So we came up with the idea of using active LEDs, infrared LEDs, that we would illuminate out of phase with the film cameras so we could actually capture the action and not interfere with it. When you’re doing this kind of performance capture there are a lot of computers, cameras and other equipment that can be fairly sensitive if you want accurate data. We also had a few scenes where we had to take it outdoors, and that worked surprisingly well for us.
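To make the "out of phase" idea concrete, here is a minimal sketch of one way such timing could be worked out. It assumes a film camera with a rotary-style shutter described by a frame rate and shutter angle, and it schedules each LED pulse inside the part of the frame when the shutter is closed. The function names, the 24 fps / 180 degree figures and the 5 ms pulse length are illustrative assumptions, not details of Weta Digital's system.

```python
def shutter_open_time(fps, shutter_angle_deg):
    """Seconds per frame during which the film shutter is open."""
    frame_period = 1.0 / fps
    return frame_period * shutter_angle_deg / 360.0


def led_pulse_schedule(fps, shutter_angle_deg, pulse_s, n_frames):
    """Return (start, end) times for one LED pulse per frame.

    Hypothetical scheme: the shutter is assumed to open at the start of
    each frame, and the pulse is centred in the closed interval that
    follows, so the marker light never overlaps the film exposure.
    """
    period = 1.0 / fps
    open_s = shutter_open_time(fps, shutter_angle_deg)
    closed_s = period - open_s
    if pulse_s > closed_s:
        raise ValueError("pulse does not fit in the shutter-closed interval")
    offset = open_s + (closed_s - pulse_s) / 2.0
    return [(i * period + offset, i * period + offset + pulse_s)
            for i in range(n_frames)]


# Example: 24 fps, 180 degree shutter, 5 ms pulses over three frames.
schedule = led_pulse_schedule(24, 180, 0.005, 3)
for start, end in schedule:
    print(f"LED on from {start * 1000:.2f} ms to {end * 1000:.2f} ms")
```

In practice the motion capture cameras would also need to be synchronized to the same pulses; that part is left out of the sketch.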
On this film, on “Dawn,” Matt really wanted to go out into the forest. He wanted the naturalism, the rain, the mud, just being far away from civilization. That meant that we had to come up with a whole way of essentially weatherproofing everything: making everything standardized, ruggedized, wireless, and impervious to the elements as much as possible. The accuracy of the data is really important; if you don’t have it then you might as well not go to the trouble of bringing all of the gear out there. On the capture side there was just a lot of investment in the technology to be able to move from the studio out to location. It’s similar to what happened years ago when film cameras became lighter, lenses and film stocks became faster, and you could start moving films from the soundstage to any kind of location. That’s the kind of transformation we were looking for here: wherever we needed the ape characters to be, we needed to be able to go and record their performance as part of the filmmaking process. That’s actually one of the breakthroughs of this whole idea of performance capture technology, starting with what we did on “Rise,” because filmmaking has not changed in a hundred years. You have all of the same departments: cameras, grips, make-up, costume. And then on “Rise,” the crews made room for a new department, which was our performance capture department. They had to be there working with everyone else, making sure everything worked and was ready to go when it was time to roll cameras. This really worked well for us on “Dawn”; this integration of performance capture as a new unit is what made the film work.
How were you able to use the active LEDs to capture the performances under suboptimal light conditions such as in a dark, rainy forest?
Because we’re using the active LEDs we’re not relying on reflected light. Traditional motion capture cameras have a ring light around the camera and a reflective marker, so they’re looking for light that bounces directly back into the lens. With our active LEDs, the LEDs themselves send the light out and the cameras are filtered to just look for that spectrum, the IR spectrum. It’s a little less robust on the face of it, because you do get a lot of spurious signals; reflected sunlight off of water and things like that will tend to confuse it. A lot of it also depends on how well you calibrate the gear and how well you solve the information after you’ve captured it.
How did you deal with the issue of wet fur?
Wet fur was done through an additional level of simulation, so we would run our water sims through the fur and then just try to get the weight of the water and gravity pulling the fur down. We were really just trying to simulate what the water would do: getting the droplets to run down the fur, some get left behind, the fur will clump together as the water is running down it. Really everything you observe in physical fur we try to do as a simulation technique on top of the fur dynamics that have to happen naturally because of the motion of the characters.
What’s the difference between motion capture (mocap) and performance capture (pcap)?
When we were doing it back on “The Lord of the Rings” we were calling it motion capture because what we were doing was putting reflective markers on the actor’s body and recording the motions of those markers through space. It was a very mechanical process. What you’re really looking at is joint rotations and joint angles; you’re capturing the skeleton.
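As a small illustration of what "joint angles" means here, the sketch below computes the angle at a joint from three marker positions, for example shoulder, elbow and wrist. It is a textbook vector calculation, not Weta Digital's solver, and the marker coordinates are made up for the example.

```python
import math


def joint_angle(a, b, c):
    """Angle in degrees at marker b, formed by markers a and c.

    Each marker is an (x, y, z) position. The angle is measured between
    the vectors b->a and b->c, e.g. upper arm and forearm at the elbow.
    """
    ba = [a[i] - b[i] for i in range(3)]
    bc = [c[i] - b[i] for i in range(3)]
    dot = sum(u * v for u, v in zip(ba, bc))
    norm_ba = math.sqrt(sum(u * u for u in ba))
    norm_bc = math.sqrt(sum(v * v for v in bc))
    if norm_ba == 0.0 or norm_bc == 0.0:
        raise ValueError("markers must not coincide with the joint")
    # Clamp to guard against rounding slightly outside [-1, 1].
    cos_theta = max(-1.0, min(1.0, dot / (norm_ba * norm_bc)))
    return math.degrees(math.acos(cos_theta))


# Hypothetical marker positions for one frame: shoulder, elbow, wrist.
shoulder = (0.0, 1.0, 0.0)
elbow = (0.0, 0.0, 0.0)
wrist = (1.0, 0.0, 0.0)
print(joint_angle(shoulder, elbow, wrist))
```

Repeating this for every joint in every frame gives the skeleton's motion over time. A production solve would recover full three-dimensional rotations rather than a single angle per joint.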
When we started capturing the face, the mechanical nature of it isn’t enough, because with the muscles of the face—other than the jaw—you’re not looking at a mechanical structure, you’re looking at layers of muscle that all interact across the face in a very complex, three-dimensional way. And so what we really had to do there was to analyze frame-by-frame the patterns that the skin was making on the face and from there try to infer what the muscle activations are, and how the muscle activations work through time and space and then translate those activations into the characters’ performances.
We started talking more and more about performance and when it came to doing this for “Avatar,” we were doing it as a completely integrated solution of body and face and translating that directly to the character. It was Jim Cameron who said, “well really what you’re doing here is performance capture” and we just started calling it that because it was no longer a piecemeal kind of solution, it was trying to get the whole performance together.
Will actors one day become unnecessary?
Filmmaking in general is a very collaborative art, and so to create a character that exists only in the realm of the imagination you need to bring a lot of talent to bear on that.
Working with actors gives you the spontaneity, the choices in the moment, the drama, the interaction, all of the things that great actors bring to their craft. We do have to supplement that with what the animators can also bring to the character, because we are not creating human characters, it’s not a 1-to-1 translation. And so at some point the animators have to take over the performance and guide it through the final stages of giving it the reality of the character that you see on the screen. Some of those are subtle, mechanical adjustments like a human trying to walk on all fours like an ape—you have to reposition the skeleton, you have to change the pose but you have to keep the intent of the performance. Sometimes you have to do things where there’s no actor involved, such as some very dangerous stunt action. Sometimes the animator just has to keyframe it. So it’s all part of the crafting of the performance but the ability to work with actors gives us a richness to the performance that’s hard to come by any other way. Can you create really believable, engaging characters in other ways? Of course, you can do that in animation all the time where you can just record the actor’s voice and then have animators keyframe the performance on top of that, but if you think about that, there’s still an actor involved, the way the actor chooses to deliver those lines. It gives the animators clues about what the characters’ performance needs to be to support those lines. That kind of actor-to-character translation to the screen exists on a spectrum and it just really depends what the story is that needs to be told and what style you’re trying to tell it, so much of it is relying on the talent of the actors because if we’re doing photorealistic work and setting it in a live-action film as with Apes then we rely heavily on the actors to ground that performance in reality.
What was your workflow (modeling, rendering, compositing) and how did filming in 3-D factor in?
We used a mix of tools. We use Maya as our standard 3-D platform but we’ve written a lot of plugins to customize it for things like our puppets and especially our creature rigs. We do all of our muscle solves with a system we call Tissue, one that we wrote starting back on “Avatar.” We wrote it to do muscle solves, for finite element solves, trying to solve the fiber bundles in the muscle. And then for the skin and fur dynamics there’s another piece of software that we wrote, called Synapse.
For rendering we primarily used RenderMan, but we’ve written our own renderer that we call Manuka. It’s experimental so we can look at new rendering technologies outside of production constraints. We were able to try some shots with it on “Dawn” and it worked out really well for us, so it’ll be a bigger part of production for us with “The Hobbit.”
We used Nuke for compositing and we still obviously keep pushing what we can do with deep compositing because that’s been a really important tool for us ever since we developed that for “Avatar.” It’s primarily to deal with the issues of stereo compositing, dealing with these complex scenes of apes running through a forest filmed from two cameras, and making them tractable. You really couldn’t composite shots like that if you didn’t have deep compositing because you can’t break it out into two-dimensional slices, the scene is too complex for that. We pushed all of these things further into physical reality and then we try to overlay that onto the reality of what we capture.
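The core idea of deep compositing can be sketched in a few lines. Instead of one color per pixel, each pixel keeps a list of samples, each with its own depth. Separately rendered elements can then be merged by pooling their samples and compositing them front to back, with no need to pre-slice the scene into flat layers. The sketch below is a simplified illustration that assumes premultiplied colors and point samples with no volumes. The data layout and function names are hypothetical and are not Nuke's or Weta Digital's API.

```python
def merge_deep_pixels(*pixels):
    """Pool the samples of several deep pixels into one list."""
    return [sample for pixel in pixels for sample in pixel]


def flatten_deep_pixel(samples):
    """Composite deep samples front to back with the 'over' operation.

    Each sample is (depth, (r, g, b), alpha) with premultiplied color.
    Returns the flattened (r, g, b) and alpha for the pixel.
    """
    out_rgb = [0.0, 0.0, 0.0]
    out_alpha = 0.0
    for _depth, rgb, alpha in sorted(samples, key=lambda s: s[0]):
        remaining = 1.0 - out_alpha  # how much is still visible behind
        out_rgb = [o + remaining * c for o, c in zip(out_rgb, rgb)]
        out_alpha += remaining * alpha
    return tuple(out_rgb), out_alpha


# Hypothetical pixel: a semi-transparent ape sample between two forest samples.
ape = [(5.0, (0.5, 0.0, 0.0), 0.5)]
forest = [(2.0, (0.0, 0.25, 0.0), 0.25), (10.0, (0.0, 0.0, 1.0), 1.0)]
rgb, alpha = flatten_deep_pixel(merge_deep_pixels(ape, forest))
print(rgb, alpha)
```

Because the depth ordering is resolved per sample, the ape lands correctly between the foreground leaf and the background even though the two elements were rendered separately.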
The last sequence of the film, the big fight on the tower, was all completely digital; we had no live action in any of that. You couldn’t physically build any of that. We had to build the unfinished skyscraper because the fight was happening on all different levels. We only used performance capture to get all the key beats, to work with the actors and the stunt team to work out the physicality of the fight, and then we had to put everything into a big virtual world and just do it all digitally.
When you first watched the film with an audience, what were some of your thoughts and feelings?
I remember thinking that it was very easy to get into the world of the film. I thought it was brilliant what Matt did by opening on a close-up of the eyes so you’re right there with Caesar’s thoughts and you’re pulling back and you’re revealing his world. In that moment, I got the impression from the audience around me that everyone was drawn into that world and they were right there with the story. I thought ‘that’s a great way to make a film’, because that’s exactly what you want with a story like this where you’re bringing people to a new fantastic world, you want them immediately with you on that journey.
Are there any particular sequences of which you are most proud?
There are many. Just after the opening, the sequences where the apes are all sort of just sitting around, socializing, and you’re getting a sense of their life, it just felt very intimate and quiet and you were just drawn into their world. I loved Caesar’s performance because he was the dramatic heart of the film and the way you saw his conflict and the way he tried to resolve it in a very noble fashion was just terrific. I loved Toby Kebbell’s performance as Koba, he just had some fantastic moments there. Everyone sort of had their turn to not only shine but to help you understand the world and the dynamics of the story.
Are we at the point now where we can do anything with VFX?
We’re at the point where we can do pretty much anything that you imagine. It’s really dependent on context: what is the story that you want to tell, what is the effect that you want to have, and how much of it do you need? In the case of our story, it required us to enter into this whole world and create dozens and dozens of apes that would actually hold up to that level of realism and that level of performance that close to camera. So that becomes a much bigger effort, but again it’s dictated by what the story requires. That’s really what drives this: what is the story we’re trying to tell on the screen? I mentioned earlier how the apes talk. They could have been chattering the whole time, but dramatically that wasn’t what the story needed, and so we tailor what we’re trying to achieve to what the film needs.