ICM Powering Medical Imaging Applications

Some really cool videos hit the web last week. Unfortunately, they are in Portuguese, so we English speakers won’t get all the details. What you can see, though, is a very cool application of Activate3D’s ICM technology and PictureViewer sample. An evaluator in Brazil made some awesome extensions to that application to support medical imaging formats and deployed it in operating rooms at Hospital Evangélico de Londrina. Surgeons can preview scan data without breaking the sterile field using Kinect and ICM!

YouTube Preview Image

Here’s a link to a longer version.

Thanks to Daniel for sharing this back with us and making such a cool use of ICM!

The Activate Team.

ICM 1.2 Released!

ICM 1.2 is now available from the Activate3D website.

Notable features:

  • Improves motion recognition and discrimination, particularly when the actor root rotates.
  • Adds support for Kinect for Windows SDK.
  • Adds built libraries and full support for Visual Studio 2010.
  • Newest versions of all ICM demos and showcases:
    • NUI for Netflix – Control Netflix with gestures.
    • NUI for Angry Birds – Control Angry Birds with gestures.
    • Virtual Vacation – Composite yourself into exotic scenes.
    • Picture Viewer – Updated with new hand stabilization and better on-screen touch indicators.
    • ICM Demo – Updated version of our Unity sample sandbox with a new avatar.
    • Kinematic Actor – Drive an augmented reality experience with a physically interactive video feed.

Release Notes:
  • There is a minor bug in the Activate3D Unity package. SplashScreen.cs is missing, but it can be found in %ICM_SDK%\..\Samples\UnitySample\

Download it now, and let us know if you have any questions by visiting our forum!

The Activate3D Team

Downloadable Apps and Videos

We’ll be working on our website in the coming days to improve our downloads section. For now, though, we wanted to share some of the things that have already gone up on the OpenNI Arena.

Although it’s the second demo we’ve posted to the Arena, we figured we should mention this demo first. This demo shows a small interactive environment we assembled in Unity that has bars, ropes, ziplines, and a pit full of dynamic balls. It’s a great demonstration of what ICM can do to enable physics interactions in a virtual world powered by a depth camera like Kinect. Plus, it’s the number one rated application on the OpenNI Arena right now!

You can download the demo here, and there’s a brief video below.

YouTube Preview Image

We’ve also released our Picture Viewer application. This little utility shows a touchless multitouch interface for manipulating pictures via a depth camera. Imagine using it to organize a slideshow in your living room! Right now, the feature set is fairly small, but we think there are great opportunities for it to grow. We’d love to see what people think. Check it out and let us know what you would add to it.

YouTube Preview Image

Let us know what you think.

The Mouse is Not Dying

Johnny Lee posted a really great talk about user experience design the other day that we thought was worth sharing and commenting on. His original post is here, although we’ve embedded the video below.

As people working on motion-controlled software, we viewed this talk from a certain perspective. There’s a lot to take away just for thinking about motion gaming and gesture design. Even though it’s probably true that the mouse won’t die, as Lee asserts, there’s also more room for gesture than he posits.

Gesture Works for Larger Displays and Primarily for Consumption

About two minutes in, there are some great graphs about what interaction modes work best for different size display devices. Lee’s core idea that gesture works best with larger devices makes sense. You can be farther away which gives you room to move. However, his example line stops short of tablets when it should have a longer tail. We’ll see gesture showing up in tablets as it makes a lot of sense when you don’t want to touch the screen. Cooking might leave you with dirty hands and a need to scroll. In fact, Elliptic Labs showcased gesture tech for the iPad at trade shows recently.

The other set of graphs showed a breakdown between production and consumption. Again, the graphs are illustrative but define things a bit too narrowly. It’s easy to agree that the mouse and keyboard dominate production. They allow pinpoint selection of GUI items and the ability to generate text that far exceeds other input options. If you expand your idea of production and your idea of content to include more items, however, production can and will occur with larger displays. We look at ICM very much as a way to create motions in real-time. We’re taking your user input in the form of motion in front of Kinect and synthesizing a new output motion on screen. As applications progress and emerge, more expressive motion content could be produced that’s not just used for real-time. Create your own dance in an MMO, or add your own kicks and punches to a fighting game.

User Experience Design Should Consider Failure Costs

Another very interesting point comes up around 3:40. The overall user experience needs to consider the cost of an interface’s failure to capture intent. With motion and gesture interfaces, this is such a critical point. Consider the failure rate and cost with a keyboard. If you get a ‘g’ instead of an ‘f’, it’s very clear that you hit the wrong key. It’s hard to get a false positive or negative unless your keyboard is actually malfunctioning. It’s also easy to correct with backspace.

Now consider failure cases with gesture and motion. False positives become much more of a reality since so many gestures are similar. Is your intent to swipe and scroll the screen, or are you just talking with your hands to someone else in the room? This is an actual issue we encounter with gesture UI apps. We typically disable a lot of functionality when the hands drop below a certain level, which forces a user to raise their hands and engage. It’s also very difficult to convey in a motion or gesture interface why a gesture failed to match. This difficulty makes it hard to correct for a false negative. Why didn’t that swipe work? Am I moving my hand wrong? It’s critical with motion and gesture to make sure the user knows the modes of interaction at any moment and gets good feedback on them.
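
The hand-drop gating described above can be sketched roughly as follows. This is only an illustration in Python, not ICM code: the function name, the choice of hip height as the reference level, and the margin values are all assumptions made for the example.

```python
def is_engaged(hand_y, hip_y, was_engaged, raise_margin=0.15, drop_margin=0.05):
    """Decide whether a hand counts as engaged with the UI.

    Heights are in meters with y pointing up. All names and thresholds here
    are hypothetical. Two margins are used so the state does not flicker
    when the hand hovers right at the cutoff (hysteresis).
    """
    height_above_hip = hand_y - hip_y
    if was_engaged:
        # Already engaged: stay engaged until the hand clearly drops.
        return height_above_hip > drop_margin
    # Not engaged yet: require a clearly raised hand before enabling gestures.
    return height_above_hip > raise_margin


# A hand resting at hip height is ignored; a raised hand engages.
print(is_engaged(1.0, 1.0, was_engaged=False))
print(is_engaged(1.2, 1.0, was_engaged=False))
```

Using a higher threshold to engage than to disengage is one way to keep a user who is mid-gesture from dropping out of the interaction by accident.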

As we solve some of these problems in the motion control space, we’ll expand ourselves to more user experiences. As Lee says, the mouse won’t die, but the space for motion and gesture is pretty big too.

1.1 Free Community Edition Released

In February, we released a community edition of our 0.8 release. Although we had a 1.0 release of ICM in April, we did not ship an updated community version at that time. Today, we’re excited to share that we’ve released a new 1.1 version including a refresh for the Free Community Edition for Windows. If you’re impatient, you can download it here.

For the Kinect community at large, the newest and coolest feature is the Unity 3 (Free or Pro) integration. The 0.8 version of the product was much harder to pick up and play with because it really required you to modify the code. There wasn’t a level editor, and there wasn’t even a good file format for modifying a scene. With the Unity integration, that problem has been greatly alleviated. Users can now drag and drop features into their level to create a world they can explore and interact with using Kinect and OpenNI.

We’ve tried to expose a lot of the functionality to the GUI layer in Unity. For the things you’re unable to do through the GUI, we’ve exposed a great deal of our API to .Net via SWIG. The 1.1 community edition also includes our native and managed binding layer code so that if you need to expose additional things or need to do something only available in our native C++ API you can take advantage of our existing SWIG code to wrap your new functionality, instead of writing your own wrapper layer or SWIG interface from scratch.

The new 1.1 version of ICM solves many of the problems OpenNI users encounter and more. All of these ICM features are usable out of the box with a couple of mouse clicks:

  • Skeleton stabilization and retargeting – get the ICM filtered skeleton applied to a character in Unity through a simple GUI. This includes further refined and stabilized hand positions for GUI work.
  • Gesture/Pose detection – A variety of detectors for picking up poses and full body motions. There’s a locomotion action, walk and run, included in the package.
  • Several grasping solutions for engagement/disengagement with physical objects.
  • Full integration with PhysX allowing for foot planting, physical object collision, and physical interaction of the avatar with the world.
  • Sample interactive features:
    • Bars/Poles
    • Ropes
    • Floors/Walls
    • Water
    • Ledges
    • Dynamic Boxes/Spheres
    • Jump Paths (Provide assistance to make sure players reach the next ledge or feature.)
    • Ziplines

Here’s a quick shot of a soldier on a zipline crashing through a bunch of crates, all while Nick makes his legs kick out with 1:1 mapping.
And here’s a short action sequence.

YouTube Preview Image

As noted above, you can download ICM 1.1 FCE here. Tell us what you think in the comments or the user’s forum. We’re anxious to hear. We know that there’s plenty more for us to do, and we’d love the community’s input as we move forward.

We’re very excited about this release and what the community can create. Keep your eyes here for more updates and videos over the next week or two.

The A3D Team

ICM Presentation in Unity3D

We recently gave a presentation to a company interested in using ICM for some projects. We decided to give this presentation inside a virtual world powered by ICM using Unity3D. Here’s a brief video that shows off some of the concepts.

There are a number of really cool things here:

  • ICM data can be mapped to any character in Unity using a simple GUI with full physical simulation.
    • You’ll get feet accurately remapped to the floor, for example.
  • Locomotion and other ICM actions are supported allowing you to walk around and interact with the world.
  • The demo shows both 3D and 2D interactions using ICM-stabilized hands and skeleton.
  • Vision-based grasping and gestures used for UI interactions.

We’re hoping to release an updated community edition of ICM very soon which would include the Unity3D integration, this sample, and more. Keep your eyes here for more information. We’ll post more videos as the community edition gets ready for release.

Do the Truffle Shuffle to Start

Preface

The first time I stepped in front of a depth camera was almost a year ago now. We had a reference version of a PrimeSense camera that is heavily related to the final hardware that went into Kinect. The first thing I got to do was make a stick figure guy move around on the screen. It was very captivating to see him match my movements, even with the occasional arm through my chest Kali-Ma style.

Those first days were filled with lots of experimentation because everything was new to us in this world of full body motion gaming. Which reminds me…if you ever want to see a cool effect, go grab a large mirror and hold it in front of a depth camera at an angle; now you’re really playing with portals!

Introduction

With all the time I’ve spent around these cameras, I wanted to capture some thoughts on the problems of developing games and software driven by full body motion input.

Unnatural User Input

Lately I’ve come to find the statement “Natural User Input” a bit of a misnomer. There are still many technological and human hurdles that have to be overcome with time and good ideas before the interaction is truly natural. The problem with natural is that it’s different for everyone, which generally forces you to make it unnatural for some group of people. Also with the limitations of the current technology you will often find yourself making unnatural concessions to make something work.

A great example of this is getting detected by the camera, often referred to in the office as “Doing the Truffle Shuffle”. Some skeleton SDKs require a pose or gesture before you are detected as an active user. For example, OpenNI has the “Psi” pose. Some ask you to wave your hand. Some, like Kinect, just work, but even so, many games layer their own logic on top about when a user can join. That logic is highly varied and currently unnatural because there isn’t one consistent way yet.

Another good example of this is turning. If I asked you to turn, how would you do it?

Q: Would you naturally turn, away from the TV?
A: No, then you couldn’t see the TV.

Q: So would you turn your whole body and continue to face the TV, or just your shoulders?
A: If you turn your body naturally you’ll occlude half of your body, making it harder to detect other actions simultaneously (Walk + Turn). Also many skeleton SDKs have varying levels of success tracking shoulder angle and occlusion of the shoulder usually causes them to move around.

Q: What about if we let the hands determine turning, moving them left to move left, right to move right?
A: Good in some contexts, like skiing and horseback riding. It’s very unnatural when walking around. It also prevents you from using the hands to do other things at the same time. It’s also very hard to hold for long periods of time if you have to keep them there.

Q: How about leaning left to turn left, leaning right to turn right?
A: It’s great from a technological standpoint. It won’t ever occlude any part of the body. Very easy to do for all users. Very easy to hold for long durations. Can be combined with many other actions. However, it’s completely unnatural.

The best advice I can give here is to get people to test out your ideas. I can’t tell you how many times I thought I had solved a problem only to see a tester or coworker break it almost immediately. If you can help it, find new people to try out the game. We refer to them as untrained users around the office. For these systems, you’ll find that over time the system trains you back. You learn just the right movements without thinking about it, which will lead to a false sense of improvement in your gesture detection code.

I haven’t seen it happen yet, but I suspect many motion games in the future will actually ship with multiple ways of handling the same input, and users will select the one they prefer, in the same way we have inverted controls and different control schemes today.

Noise

The cameras are not perfect, and they’re mapping a physical space to some finite number of pixels. Surfaces that poorly reflect infrared, other infrared sources (like the sun), and even the manner in which the cameras define a contiguous surface can cause variations from frame to frame, leading to lots of jaggy, shifting edges on objects. This jitteriness influences the volume of an object, so the calculated positions of bones in a skeleton shift too.

So you’ve got to find a way to smooth out the data without adding lag to the propagation of player movements onto the character. The best way we’ve found is with a predictive filter. These filters average old frames in with the current frame’s data, but simultaneously predict N (usually 1) frames forward in time. The only drawback is that they end up overshooting and undershooting the actual curve of motion, because they predict that the motion is going to continue in the same direction. Luckily, this largely goes unnoticed by users.
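
To make the idea concrete, here is a minimal sketch of one common predictive filter, double exponential smoothing, applied to a single joint coordinate. The post does not say which filter we actually use, so treat the choice of algorithm, the class name, and the alpha and beta values as assumptions for illustration.

```python
class PredictiveFilter:
    """Double exponential smoothing with forward prediction (illustrative).

    alpha blends each new sample into the smoothed position, beta blends the
    frame-to-frame change into the trend, and predict_frames is the N
    mentioned above. All parameter values are hypothetical defaults.
    """

    def __init__(self, alpha=0.5, beta=0.5, predict_frames=1):
        self.alpha = alpha
        self.beta = beta
        self.predict_frames = predict_frames
        self.level = None  # smoothed position
        self.trend = 0.0   # smoothed per-frame velocity

    def update(self, sample):
        if self.level is None:
            # First frame: nothing to average with yet.
            self.level = sample
            return sample
        prev_level = self.level
        # Average the new sample with where the old frames said we would be.
        self.level = self.alpha * sample + (1 - self.alpha) * (prev_level + self.trend)
        # Update the trend from how far the smoothed position just moved.
        self.trend = self.beta * (self.level - prev_level) + (1 - self.beta) * self.trend
        # Extrapolate forward to hide the lag that the averaging introduced.
        return self.level + self.predict_frames * self.trend


# A hand that jumps from 0 to 1 and then stops: the output overshoots 1.
f = PredictiveFilter()
print([f.update(x) for x in [0.0, 0.0, 1.0, 1.0]])
```

In the small run above, the last output lands past the position where the hand actually stopped, which is the overshoot described in the paragraph before this sketch.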

Generally Avoid

The amount you should avoid each of these varies across cameras and skeleton SDKs, but generally speaking this is my own list of things you should try to avoid.

  • Small Motions – Detecting them is very difficult because they are very easy to confuse with noise.
  • Holding hard poses – It’s hard to hold your arms out for extended periods of time.
  • Motions near the body – Occlusion problems, bone loss.
  • Fast motions – Most consumer-grade depth cameras currently run at 30 FPS. It’s very easy to move farther between frames than the segmentation / skeleton prediction code is willing to bet you’ve moved, in which case it will happily ignore your motion.
  • Extreme poses – Poses most people would have trouble making. Not just because people have trouble making them, but because most of the skeleton SDKs are not trained for unusual body positions.
  • Sitting – It is generally not handled well across skeleton SDKs. The overall skeleton becomes a lot less trustworthy.

That’s Normal Right?

All the skeleton SDKs I’ve used so far generally return nothing other than the rawest of the raw bone positions. That is generally a good thing; you wouldn’t want them to hide the raw data from you. However, it tends to result in moments when your hand will penetrate your chest, your knee will flip backwards, and you’ll have your leg behind your back.

So it becomes important to try to avoid these events by using joint constraints. Even though the skeleton SDKs usually have bone confidence numbers, that confidence isn’t based on how a normal human can move. It’s based on whether they can clearly see something they think is a body part. If they can, they will report things like being 100% confident your leg has driven itself up into your chest.
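
As a rough illustration of a joint constraint, the sketch below measures the bend at a joint from three raw bone positions and rejects samples outside a plausible range. The function names and the angle limits are assumptions made for the example, not values from any skeleton SDK.

```python
import math


def joint_angle(parent, joint, child):
    """Angle in degrees at `joint` between the bones toward parent and child.

    Each argument is an (x, y, z) position. Returns None when a bone has
    zero length and the angle is undefined.
    """
    a = [p - j for p, j in zip(parent, joint)]
    b = [c - j for c, j in zip(child, joint)]
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(x * x for x in b))
    if norm == 0.0:
        return None
    # Clamp to guard against rounding pushing the cosine just outside [-1, 1].
    cos_angle = max(-1.0, min(1.0, dot / norm))
    return math.degrees(math.acos(cos_angle))


def accept_sample(parent, joint, child, min_deg=30.0, max_deg=180.0):
    """Return True if the bend at the joint is within the assumed limits."""
    angle = joint_angle(parent, joint, child)
    return angle is not None and min_deg <= angle <= max_deg


# Hip, knee, ankle for a straight leg, then an ankle folded up onto the hip.
print(accept_sample((0, 1, 0), (0, 0.5, 0), (0, 0, 0)))
print(accept_sample((0, 1, 0), (0, 0.5, 0), (0, 1, 0.01)))
```

When a sample is rejected, one option is to hold the last accepted pose for that joint. Note that this check only looks at how far the joint is bent; catching a knee that bends backwards would also need a reference direction, such as which way the body is facing.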

Time

Timing is very difficult. The user has to predict how long he is going to take to move, while at the same time accounting for how long the avatar will take to move, plus how long the gesture detection will take to detect his action. This makes it very hard for him to predict when he has to jump, duck, or move to the side.

In these situations, feedback that he has done the right thing, as well as how long he has left to do the right thing, can be important. One handy trick when compressing timespans to play back animation is Bullet-Time. Imagine a player running and jumping hurdles. There’s an unknown zone in front of each hurdle: once the player has entered it, there will not be enough time to play back the jump animation without it looking bizarrely sped up. However, with Bullet-Time, if you detect the gesture just in time, you can slow down time long enough to play back the animation and also indicate to the user, “Hey, you almost missed that one”. Bullet-Time is also handy for just giving the user more time to make a split-second decision; as soon as they’ve made it, speed back up.
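
One way to work out how much to slow the world down is sketched below. The function and parameter names are hypothetical, and the lower bound on the time scale is an assumed design choice, not something from our SDK.

```python
def bullet_time_scale(time_to_event, animation_length, min_scale=0.25):
    """Return a world time scale that lets an animation finish before an event.

    time_to_event is how many seconds remain, at normal speed, before the
    avatar reaches the hurdle. animation_length is how long the jump
    animation takes to play at normal speed. A result of 1.0 means no
    slowdown is needed.
    """
    if animation_length <= 0.0 or time_to_event >= animation_length:
        return 1.0
    # Slow the world so the remaining time stretches to the animation length.
    scale = time_to_event / animation_length
    # Never slow down beyond min_scale; past that the animation itself
    # would have to be sped up to make up the difference.
    return max(min_scale, scale)


print(bullet_time_scale(1.0, 0.5))   # plenty of time, no slowdown
print(bullet_time_scale(0.25, 0.5))  # gesture detected late, world slows down
```

Any value below 1.0 is also a natural trigger for the “you almost missed that one” feedback.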

Just a Little Bit Closer…

Depth perception is another frustrating problem. Users are really poor at judging how far their avatar is from the objects it can interact with. Luckily, there are many ways around this problem.

  • Depth Cues – Shadows do a great job of helping to show distance as you get closer to an object.
  • UI Visual Cues – Visual feedback that you can now interact with the object is important. If I’m playing a volleyball game, changing the halo around the ball from red to green to indicate I can now jump and hit it can be valuable feedback, because it’s hard estimating how high my character can jump, or when they can jump.
  • Camera angle is everything. Having the right angle to the object can make it much easier to tell depth.
  • Audio Cues – I don’t see these get used very often, but sound is a great way to indicate action is required, or success or failure on the user’s part.
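
As a small illustration of the UI visual cue from the volleyball example, the sketch below picks a halo color from a simple reachability test. The function name, the coordinate convention, and the reach and jump values are all assumptions made for the example.

```python
import math


def halo_color(raised_hand_pos, ball_pos, reach=1.0, jump_height=0.6):
    """Return "green" when a jump could reach the ball, otherwise "red".

    Positions are (x, y, z) in meters with y up. raised_hand_pos is where the
    avatar's hand is when standing with the arm raised. reach is how far the
    avatar can cover horizontally, jump_height how much height a jump adds.
    """
    dx = ball_pos[0] - raised_hand_pos[0]
    dz = ball_pos[2] - raised_hand_pos[2]
    close_enough = math.hypot(dx, dz) <= reach
    low_enough = ball_pos[1] - raised_hand_pos[1] <= jump_height
    return "green" if close_enough and low_enough else "red"


print(halo_color((0, 2, 0), (0.5, 2.4, 0)))  # within reach of a jump
print(halo_color((0, 2, 0), (3.0, 2.4, 0)))  # too far away
```

The point of the cue is that the game does the distance estimate the player finds hard, and shows the answer directly on the object.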

Commentary: Medical Imaging, Hadoukens, and More

We’ve obviously been following what’s been going on in the hobbyist community around Kinect and we thought that it might be valuable to share links more often and add our own commentary to what is going on. With that in mind, we’re serving up a fresh batch of links and commentary this morning.

Kinect Based Medical Image Exploration:
YouTube Preview Image
http://kinecthacks.net/kinect-based-medical-image-exploration

This is definitely one of the more polished 2D user interfaces we’ve seen on Kinect and makes for a pretty compelling experience. Their calibration phase is pretty interesting. They have to do the “psi” pose to be recognized by OpenNI, but afterwards they have to do two things:

  • Point from the top left corner of the screen to bottom right with their hand. We presume this is to define the “working window” in 3D space to map to the onscreen window dimensions
  • Hold their hand open for 1 second and then closed. This is likely so that they can build some sort of shape matching benchmark for open and closed hands. We’d be interested to see if they were doing any machine learning under the covers like we are or if their system is purely analytical (i.e., size of hand in depth map).
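
If our guess about the first calibration step is right, the mapping from the calibrated “working window” to the screen could look something like the sketch below. This is our own illustration, not their code: the function name and the use of only the x and y coordinates of the hand are assumptions.

```python
def hand_to_screen(hand, top_left, bottom_right, screen_w, screen_h):
    """Map a hand position inside a calibrated window to pixel coordinates.

    hand, top_left and bottom_right are (x, y) positions in camera space,
    where top_left and bottom_right are the two corners the user pointed at
    during calibration. Hands outside the window are clamped to its edge.
    """
    width = bottom_right[0] - top_left[0]
    height = bottom_right[1] - top_left[1]
    if width == 0 or height == 0:
        raise ValueError("calibration corners must differ on both axes")
    # Normalize to 0..1 across the window. Dividing by the signed extent
    # handles camera space having y up while screen space has y down.
    u = (hand[0] - top_left[0]) / width
    v = (hand[1] - top_left[1]) / height
    u = min(1.0, max(0.0, u))
    v = min(1.0, max(0.0, v))
    return (round(u * (screen_w - 1)), round(v * (screen_h - 1)))


# Calibrated corners in meters, mapped onto a 1920x1080 display.
print(hand_to_screen((0.4, -0.3), (-0.4, 0.3), (0.4, -0.3), 1920, 1080))
```

A smaller working window means less arm travel per pixel, so the size the user sweeps out during calibration effectively sets the cursor sensitivity.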

Two of the more unique aspects of their system were the use of eye tracking and pointing to do the picking (described at 2:08 in the video) and the region-of-interest selection (shown at 2:33 in the video).

Microsoft’s Official Kinect SDK
YouTube Preview Image
We’ve been expecting Microsoft to eventually release their SDK on PC. With Kinect Hacks getting so much attention, it just makes perfect sense for them to do this. The most interesting thing to us is that this initial SDK will be for noncommercial use only. This points to some desire to commercialize the SDK in the future, which Microsoft hasn’t really done before. The Windows SDKs, DirectX, etc. are all freely available without any noncommercial restrictions. Our hope is that this gets quickly resolved, because we love working with the Kinect SDK on Xbox 360 and would be interested in using it on PC as well.

Kinect Hack: Street Fighter 4
YouTube Preview Image
http://kinecthacks.net/street-fighter-iv-motion-kinect-edition/
This was an inevitability. When you first saw motion control, you can’t tell me that you didn’t want to make Ryu do a Hadouken! This uses the popular FAAST toolkit, which seems to have a suite of analytical gesture recognizers that can be mapped to custom application verbs. By analytical gesture recognizers, we mean that each recognizer is a custom algorithm based on joint angles and positions with a specific tolerance. We do similar things in our SDK as well as more generalized gesture recognition. A very cool project and we wish them great success in coming up with custom moves per character!
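
To show what we mean by an analytical recognizer, here is a minimal sketch of a two-handed forward push, loosely in the spirit of a Hadouken. It is not FAAST’s or ICM’s actual code: the function name, the coordinate convention (z is the distance from the camera, with the player facing it), and the tolerance values are all assumptions.

```python
import math


def is_two_hand_push(torso, left_hand, right_hand, max_hand_gap=0.2, min_push=0.4):
    """Detect both hands held together and pushed out in front of the torso.

    Positions are (x, y, z) in meters, where z is the distance from the
    camera, so a smaller z means closer to the camera. max_hand_gap and
    min_push are the tolerances of this hypothetical recognizer.
    """
    hands_together = math.dist(left_hand, right_hand) <= max_hand_gap
    left_forward = torso[2] - left_hand[2] >= min_push
    right_forward = torso[2] - right_hand[2] >= min_push
    return hands_together and left_forward and right_forward


torso = (0.0, 1.2, 2.5)
# Hands together and half a meter in front of the torso.
print(is_two_hand_push(torso, (-0.05, 1.2, 2.0), (0.05, 1.2, 2.0)))
# Hands hanging at the sides.
print(is_two_hand_push(torso, (-0.3, 0.9, 2.5), (0.3, 0.9, 2.5)))
```

Each recognizer of this kind is cheap and easy to reason about, but the tolerances have to be tuned by hand, which is where testing with untrained users pays off.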

Razer Hydra Brings Motion Control to the Masses
YouTube Preview Image
http://www.gizmag.com/razer-hydra-motion-controller-pc-games/18472/
For those of you that still cry out for buttons with your motion gaming, the Razer Hydra motion controller looks like an interesting bet, especially since they have the attention of Valve. For $139.99, you’ll be able to purchase a base station with two controllers and a special edition of Portal 2. Several of us just completed Portal 2 in the office and we’re definitely curious about what new goodness this version brings. The levels that they showed at CES had stretchable blocks to create custom-sized bridges or shields which was pretty cool.

ICM 1.0 Released!

We’re very excited to share more information about the release of Intelligent Character Motion 1.0. We’ve put together a brief product introduction video:

In addition to the features in our previous release, we’ve added some amazing new things many of which are shown in the video above:

  • Full Xbox 360 support including our new Rooftop Race sample.
  • Support for multiple mimic targets per virtual world object. e.g., Walking and running associated with floors.
  • A new model for limiting the strength of the user when hanging and engaging with the virtual world.
  • Scene file format and example export pipeline from Max using FBX.
  • New visualizations for debugging.
  • Improved stability and multi-user support for our inference-based grasping solution.
  • Numerous other improvements, too many to list here.

Interested developers should contact us to set up an eval.

Also, if you’re at the East Coast Game Conference this week, come talk to us about ICM!

Recent Articles

We’ve been busy at Activate3D with a number of press releases coming out recently, and we had the release of our free community edition. As a result, a number of sites have linked to us and our videos which is exciting. It’s always nice to have your work recognized. Here are the big ones that had some interesting comments:

Our favorite quote was on the Joystiq article:

“Dear ::insert major publishing/development studio here::,
Hire this guy and make something legendary!
Sincerely, Potential Kinect Customer”

We agree! We’d just change the quote a bit to say:

“Dear all developers and publishers,
ICM is available for licensing. Talk to us at GDC and make something legendary!
Sincerely, The Activate Engineering Team.”

The momentum is really picking up. It’s going to be an exciting few months, and it’s all starting with GDC. Stay tuned for more announcements as the show closes in on us.