Motion Capture System Using Xbox Kinect
Kanglei Fang

Kanglei Fang, Senior iOS Engineer

VR is definitely one of the hottest trends in tech right now, and Prolific is no stranger to working with new technology.

Over the last few months, Prolific has been working hard on a VR experience. What started as an internal hack-day project turned into a low-cost VR game development pipeline within just a few weeks.

The Team

Due to the combination of the team size and the project requirements, we had to establish a special production process. To begin with, here is what we had in terms of resources:

  • 2 Artists
  • 1 iOS Engineer
  • No dedicated equipment
  • Limited budget

With such limited resources, we didn’t want to reinvent the wheel, and we couldn’t pull off a VR experience without a very efficient pipeline. Our process had to use only the resources it truly needed and produce high-quality animations with as few man-hours and as little budget as possible.

Tech Choice

Traditional VR game development processes cost quite a bit. Since we didn’t want to take too big a first step, our goal was a setup that was simple but sufficient, one that would let us move forward without worrying about cost.

Platform – Google Cardboard

As a mobile-driven company, we know how to design a good product for a mobile device, so Google Cardboard seemed an excellent choice for our first VR product platform. Right now, Google is switching to the Daydream platform, which sets a higher hardware standard for supported devices and comes with some cool accessories and a more comfortable headset. However, at this point we think Google Cardboard reaches a wider audience, including VR-interested iPhone users.

Engine – Unity

Unity is a relatively low-cost, designer-friendly game engine that is well suited to fast prototype validation. With some drag-and-drop in the Google Cardboard SDK demo project and a few modifications, we had the VR camera set up within an hour.

Motion Capture System – Kinect

This is the most interesting part: the project is animation-based, but hand-made animation was nearly impossible on our timeline. A decent motion capture (mocap) system would save us months of work. However, traditional mocap systems and services are prohibitively expensive. A two-camera Vicon system with one software license costs $12,500, and the software license alone is $4,200 per seat. So, we decided to build our own. Luckily, we have an Xbox Kinect in our game room. It is a device that tracks people’s skeletons and motions, and with some modifications it can be repurposed as a motion capture device.


Microsoft has released a Unity SDK to support Kinect development in Unity. With the SDK, we can read the user’s input as a skeleton map and a list of joint points at 30 fps. Given an existing humanoid model, we can use that data to drive the model’s corresponding joints. After some tuning of the built-in noise filtering, we had a simple motion tracking system ready to use!
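
To give a feel for what noise filtering on joint data does, here is a minimal sketch in Python. It is not the Kinect SDK's built-in filter: it assumes a plain exponential moving average, and the `JointSmoother` class, the joint names, and the `alpha` parameter are all hypothetical names chosen for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, Tuple

Vec3 = Tuple[float, float, float]


@dataclass
class JointSmoother:
    """Exponential moving average over per-joint positions (hypothetical helper)."""

    alpha: float = 0.5  # 1.0 = no smoothing; closer to 0 = heavier smoothing, more lag
    _state: Dict[str, Vec3] = field(default_factory=dict)

    def update(self, frame: Dict[str, Vec3]) -> Dict[str, Vec3]:
        """Blend one incoming frame with the previous smoothed frame."""
        smoothed: Dict[str, Vec3] = {}
        for joint, pos in frame.items():
            # A joint seen for the first time passes through unchanged.
            prev = self._state.get(joint, pos)
            smoothed[joint] = tuple(
                self.alpha * p + (1.0 - self.alpha) * q for p, q in zip(pos, prev)
            )
        self._state.update(smoothed)
        return smoothed
```

Calling `update` once per captured frame, 30 times a second in our case, trades a little latency for steadier joints. A lower `alpha` calms shaky hands and feet but makes fast gestures feel sluggish, which is why this kind of filter needs tuning.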

Don’t get me wrong: while this is an easy-to-use, extremely budget-friendly mocap system, it has certain drawbacks you can’t get away from. Here’s a simple table comparing aspects of motion capture systems I’ve played with before:

Animation Post-Pipeline

The mocap system’s lack of accuracy is fixed in the post-pipeline. For animations captured by a high-accuracy mocap system, this is typically done by dragging the misplaced rigging points in the timeline. For Kinect, the broken points in a captured animation are often clustered at the hands and feet, making the arms and legs look shaky and unstable. Since we didn’t have the manpower to fix those frame by frame, we had to address the problem with a different approach.

Inverse Kinematics

Unity’s inverse kinematics (IK) feature comes in handy here. It lets us move the character’s hands and feet at runtime and have the rest of the body follow naturally on top of the original animation layer. This is especially useful when the animation involves interacting with other objects, and for subtle movements that are too small for Kinect to track accurately.
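
To illustrate the idea behind inverse kinematics, here is a minimal planar sketch in Python. It is not Unity's implementation: it assumes a two-bone limb (say, upper arm and forearm) rooted at the origin in 2D and solves the joint angles with the law of cosines. The function names `two_bone_ik` and `tip_position` are hypothetical.

```python
import math
from typing import Tuple


def two_bone_ik(upper: float, lower: float,
                target: Tuple[float, float]) -> Tuple[float, float]:
    """Return (shoulder_angle, elbow_bend) in radians so that the tip of a
    planar two-bone limb rooted at the origin reaches `target`.
    Targets out of reach are clamped to the nearest reachable distance."""
    tx, ty = target
    dist = math.hypot(tx, ty)
    # Clamp the distance to what the two bones can physically span.
    dist = max(abs(upper - lower), min(upper + lower, dist))
    dist = max(dist, 1e-9)  # avoid division by zero for a target at the root

    # Law of cosines: interior angle at the elbow.
    cos_elbow = (upper ** 2 + lower ** 2 - dist ** 2) / (2 * upper * lower)
    elbow_interior = math.acos(max(-1.0, min(1.0, cos_elbow)))
    elbow_bend = math.pi - elbow_interior  # 0 means a fully straight limb

    # Angle between the upper bone and the line from root to target.
    cos_inner = (upper ** 2 + dist ** 2 - lower ** 2) / (2 * upper * dist)
    inner = math.acos(max(-1.0, min(1.0, cos_inner)))
    shoulder = math.atan2(ty, tx) - inner
    return shoulder, elbow_bend


def tip_position(upper: float, lower: float,
                 shoulder: float, elbow_bend: float) -> Tuple[float, float]:
    """Forward kinematics: where the tip ends up for the given angles."""
    ex = upper * math.cos(shoulder)
    ey = upper * math.sin(shoulder)
    return (ex + lower * math.cos(shoulder + elbow_bend),
            ey + lower * math.sin(shoulder + elbow_bend))
```

Given only where the hand should be, the solver works backwards to the shoulder and elbow angles. That is what makes it practical to pin a hand to an object even when the captured arm animation is noisy.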

Combine Animation Layers

Unity provides some well-made animations in its Asset Store for free. While they are too limited to cover any specific purpose on their own, they work really well as replacements, in this case for the flawed parts of our Kinect-tracked animations.

Kinect has a very limited tracking area, which makes movements like walking really hard to capture. In the beginning, we tried to track walking by having the actor walk in place. Then we realized that you can never fake natural walking without actually moving forward. So, we captured the best take we could, and then swapped out the lower body using a Unity animation mask.
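
The lower-body swap can be pictured as a per-joint mask. The Python sketch below is an illustration only, not Unity's API: it assumes a pose is a dictionary of joint rotations, and the joint names, the `LOWER_BODY` set, and the `apply_masked_layer` function are hypothetical.

```python
from typing import Dict, Set, Tuple

# Euler angles in degrees, used here purely for illustration.
Rotation = Tuple[float, float, float]

# Hypothetical joint names covered by the lower-body mask.
LOWER_BODY: Set[str] = {"left_leg", "right_leg", "left_foot", "right_foot"}


def apply_masked_layer(base: Dict[str, Rotation],
                       override: Dict[str, Rotation],
                       mask: Set[str]) -> Dict[str, Rotation]:
    """Return a pose where joints in `mask` come from `override`
    and every other joint keeps its rotation from `base`."""
    return {
        joint: override[joint] if joint in mask and joint in override else rotation
        for joint, rotation in base.items()
    }
```

With the Kinect capture as `base` and a stock clip as `override`, the upper body keeps the actor's performance while the legs follow the cleaner stock motion.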

Doing this inevitably introduces some extra states in the animation, but with a few state machine behavior scripts, the transition triggers could be sorted out cleanly.
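
As a rough picture of how triggers move an animation between states, here is a minimal sketch in Python. It is not one of our Unity scripts: the state names, trigger names, `TRANSITIONS` table, and `AnimationStateMachine` class are all hypothetical.

```python
from typing import Dict, Tuple

# (current_state, trigger) -> next_state; every name here is made up.
TRANSITIONS: Dict[Tuple[str, str], str] = {
    ("idle", "start_walk"): "walk",
    ("walk", "stop"): "idle",
    ("idle", "wave"): "wave",
    ("wave", "clip_finished"): "idle",
}


class AnimationStateMachine:
    """Tiny table-driven state machine for animation states."""

    def __init__(self, initial: str = "idle") -> None:
        self.state = initial

    def fire(self, trigger: str) -> str:
        """Apply a trigger; triggers with no matching transition are ignored."""
        self.state = TRANSITIONS.get((self.state, trigger), self.state)
        return self.state
```

Keeping every transition in one table makes it easy to see which triggers are valid in which state, which matters once masked layers add extra states.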


To summarize, our development pipeline had the following steps:

  1. Design the story scripts and break them down into separate, reusable movements, segmented by head, torso and legs if necessary
  2. Set up motion capture sessions in our game room with Kinect and the list of necessary movements
  3. Assemble the movements using Unity’s Animation Layers and inverse kinematics
  4. The rest is just traditional game development using those animations

Because code took care of much of the animation adjustment that is typically done by hand, we were able to spend more time on the creative process and iterate faster. In the end, we were proud to achieve a VR animation production pipeline suited to a small team with a limited budget.