The Imaginative Universal

Studies in Virtual Phenomenology -- by @jamesashley, Kinect MVP and author

Emgu, Kinect and Computer Vision

monalisacv

Last week saw the announcement of the long awaited OpenCV 3.0 release, the open source computer vision library originally developed by Intel that allows hackers and artists to analyze images in fun, fascinating and sometimes useful ways. It is an amazing library when combined with a sophisticated camera like the Kinect 2.0 sensor. The one downside is that you typically need to know how to work in C++ to make it work for you.

This is where EmguCV comes in. Emgu is a .NET wrapper library for OpenCV that allows you to use some of the power of OpenCV on .NET platforms like WPF and WinForms. Furthermore, all it takes to make it work with the Kinect is a few conversion functions that I will show you in the post.

Emgu gotchas

The first trick is just doing all the correct things to get Emgu working for you. Because it is a wrapper around C++ classes, there are some not so straightforward things you need to remember to do.

1. First of all, Emgu downloads as an executable that extracts all its files to your C: drive. This is actually convenient since it makes sharing code and writing instructions immensely easier.

2. Any CPU isn’t going to cut it when setting up your project. You will need to specify your target CPU architecture since C++ isn’t as flexible about this as .NET is. Also, remember where your project’s executable is being compiled to. For instance, an x64 debug build gets compiled to the folder bin/x64/Debug, etc.

3. You need to grab the correct OpenCV C++ library files and drop them in the appropriate target project file for your project. Basically, when you run a program using Emgu, your executable expects to find the OpenCV libraries in its root directory. There are lots of ways to do this such as setting up pre-compile directives to copy the necessary files. The easiest way, though, is to just go to the right folder, e.g. C:\Emgu\emgucv-windows-universal-cuda 2.4.10.1940\bin\x64, copy everything in there and paste it into the correct project folder, e.g. bin/x64/Debug. If you do a straightforward copy/paste, just remember not to Clean your project or Rebuild your project since either action will delete all the content from the target folder.

4. Last step is the easiest. Reference the necessary Emgu libraries. The two base ones are Emgu.CV.dll and Emgu.Util.dll. I like to copy these files into a project subdirectory called libs and use relative paths for referencing the dlls, but you probably have your own preferred way, too.

WPF and Kinect SDK 2.0

I’m going to show you how to work with Emgu and Kinect in a WPF project. The main difficulty is simply converting between image types that Kinect knows and image types that are native to Emgu. I like to do these conversions using extension methods. I provided these extensions in my first book Beginning Kinect Programming about the Kinect 1 and will basically just be stealing from myself here.

I assume you already know the basics of setting up a simple Kinect program in WPF. In MainWindow.xaml, just add an image to the root grid and call it rgb:

<Grid> 
    <Image x:Name="rgb"></Image> 
</Grid> 

 

Make sure you have a reference to the Microsoft.Kinect 2.0 dll and put your Kinect initialization code in your code behind:

KinectSensor _sensor;
ColorFrameReader _rgbReader;


private void InitKinect()
{
    _sensor = KinectSensor.GetDefault();
    _rgbReader = _sensor.ColorFrameSource.OpenReader();
    _rgbReader.FrameArrived += rgbReader_FrameArrived;
    _sensor.Open();
}

public MainWindow()
{
    InitializeComponent();
    InitKinect();
}

 

protected override void OnClosing(System.ComponentModel.CancelEventArgs e)
{
    if (_rgbReader != null)
    {
        _rgbReader.Dispose();
        _rgbReader = null;
    }
    if (_sensor != null)
    {
        _sensor.Close();
        _sensor = null;
    }
    base.OnClosing(e);
}

 

Kinect SDK 2.0 and Emgu

 

You will now just need the extension methods for converting between Bitmaps, Bitmapsources, and IImages. In order to make this work, your project will additionally need to reference the System.Drawing dll:

static class extensions
{

    [DllImport("gdi32")]
    private static extern int DeleteObject(IntPtr o);


    public static Bitmap ToBitmap(this byte[] data, int width, int height
        , System.Drawing.Imaging.PixelFormat format = System.Drawing.Imaging.PixelFormat.Format32bppRgb)
    {
        var bitmap = new Bitmap(width, height, format);

        var bitmapData = bitmap.LockBits(
            new System.Drawing.Rectangle(0, 0, bitmap.Width, bitmap.Height),
            ImageLockMode.WriteOnly,
            bitmap.PixelFormat);
        Marshal.Copy(data, 0, bitmapData.Scan0, data.Length);
        bitmap.UnlockBits(bitmapData);
        return bitmap;
    }

    public static Bitmap ToBitmap(this ColorFrame frame)
    {
        if (frame == null || frame.FrameDescription.LengthInPixels == 0)
            return null;

        var width = frame.FrameDescription.Width;
        var height = frame.FrameDescription.Height;

        var data = new byte[width * height * PixelFormats.Bgra32.BitsPerPixel / 8];
        frame.CopyConvertedFrameDataToArray(data, ColorImageFormat.Bgra);

        return data.ToBitmap(width, height);
    }

    public static BitmapSource ToBitmapSource(this Bitmap bitmap)
    {
        if (bitmap == null) return null;
        IntPtr ptr = bitmap.GetHbitmap();
        var source = System.Windows.Interop.Imaging.CreateBitmapSourceFromHBitmap(
        ptr,
        IntPtr.Zero,
        Int32Rect.Empty,
        System.Windows.Media.Imaging.BitmapSizeOptions.FromEmptyOptions());
        DeleteObject(ptr);
        return source;
    }

    public static Image<TColor, TDepth> ToOpenCVImage<TColor, TDepth>(this ColorFrame image)
        where TColor : struct, IColor
        where TDepth : new()
        {
            var bitmap = image.ToBitmap();
            return new Image<TColor, TDepth>(bitmap);
        }

    public static Image<TColor, TDepth> ToOpenCVImage<TColor, TDepth>(this Bitmap bitmap)
        where TColor : struct, IColor
        where TDepth : new()
    {
        return new Image<TColor, TDepth>(bitmap);
    }

    public static BitmapSource ToBitmapSource(this IImage image)
    {
        var source = image.Bitmap.ToBitmapSource();
        return source;
    }
   
}

Kinect SDK 2.0 and Computer Vision

 

Here is some basic code to use these extension methods to extract an Emgu IImage type from the ColorFrame object each time Kinect sends you one and then convert the IImage back into a BitmapSource object:

void rgbReader_FrameArrived(object sender, ColorFrameArrivedEventArgs e)
{
    using (var frame = e.FrameReference.AcquireFrame())
    {
        if (frame != null)
        {
            var format = PixelFormats.Bgra32;
            var width = frame.FrameDescription.Width;
            var height = frame.FrameDescription.Height;
            var bitmap = frame.ToBitmap();
            var image = bitmap.ToOpenCVImage<Bgr,byte>();

            //do something here with the IImage 
            //end doing something


            var source = image.ToBitmapSource();
            this.rgb.Source = source;

        }
    }
}

 

face_capture

You should now be able to plug in any of the sample code provided with Emgu to get some cool CV going. As an example, in the code below I use the Haarcascade algorithms to identify heads and eyes in the Kinect video stream. I’m sampling the data every 10 frames because the Kinect is sending 30 frames a second while the Haarcascade code can take as long as 80ms to process. Here’s what the code would look like:

int frameCount = 0;
List<System.Drawing.Rectangle> faces;
List<System.Drawing.Rectangle> eyes;

void rgbReader_FrameArrived(object sender, ColorFrameArrivedEventArgs e)
{
    using (var frame = e.FrameReference.AcquireFrame())
    {
        if (frame != null)
        {
            var format = PixelFormats.Bgra32;
            var width = frame.FrameDescription.Width;
            var height = frame.FrameDescription.Height;
            var bitmap = frame.ToBitmap();
            var image = bitmap.ToOpenCVImage<Bgr,byte>();

            //do something here with the IImage 
            int frameSkip = 10;
            //every 10 frames

            if (++frameCount == frameSkip)
            {
                long detectionTime;
                faces = new List<System.Drawing.Rectangle>();
                eyes = new List<System.Drawing.Rectangle>();
                DetectFace.Detect(image, "haarcascade_frontalface_default.xml", "haarcascade_eye.xml", faces, eyes, out detectionTime);
                frameCount = 0;
            }

            if (faces != null)
            {
                foreach (System.Drawing.Rectangle face in faces)
                    image.Draw(face, new Bgr(System.Drawing.Color.Red), 2);
                foreach (System.Drawing.Rectangle eye in eyes)
                    image.Draw(eye, new Bgr(System.Drawing.Color.Blue), 2);
            }
            //end doing something


            var source = image.ToBitmapSource();
            this.rgb.Source = source;

        }
    }
}

HoloCoding resources from Build 2015

darren! 

The Hololens team has stayed extremely quiet over the past 100 days in order to have a greater impact at Build. Alex Kipman was the cleanup batter on the first day keynote at Build with an amazing overview of realistic Hololens scenarios. This was followed by Hololens demos as well as private tutorials on using Hololens with a version of Unity 3D. Finally there were sessions on Hololens and a pre-recorded session on using the Kinect with the brand new RoomAlive Toolkit.

darren?

Here are some useful links:

 

There were two things I found particularly interesting in Alex Kipman’s first day keynote presentation.

microsoft-hololens-build-anatomy

The first was the ability of the onstage video to actually capture what was being shown through the hololens but from a different perspective. The third person view of what the person wearing the hololens even worked when the camera moved around the room. Was this just brilliant After Effects work perfectly synced to the action onstage? Or were we seeing a hololens-enabled camera at work? If the latter – this might be even more impressive than the hololens itself.

mission_impossible_five

Second, when demonstrating the ability to pin movies to the wall using Hololens gestures, why was the new Mission Impossible trailer used for the example? Wouldn’t something from, say, The Matrix been much more appropriate.

Perhaps it was just a licensing issue, but I like to think there was a subtle nod to the inexplicable and indirect role Tom Cruise has played in the advancement of Microsoft’s holo-technologies. Minority Report and the image of Cruise wearing biking gloves with his arms raised in the air, conductor-like, was the single most powerful image invoked with Microsoft first introduced the Kinect sensor. As most people know by now, Alex Kipman was the man responsible not only for carrying the Kinect nee Natal Project to success, but now for guiding the development of the Hololens. Perhaps showing Tom Cruise onstage at Build was a subtle nod to this implicit relationship.

The HoloCoder’s Bookshelf

WP_20150430_06_43_49_Pro

Professions are held together by touchstones such as as a common jargon that both excludes outsiders and reinforces the sense of inclusion among insiders based on mastery of the jargon. On this level, software development has managed to surpass more traditional practices such as medicine, law or business in its ability to generate new vocabulary and maintain a sense that those who lack competence in using the jargon simply lack competence. Perhaps it is part and parcel with new fields such as software development that even practitioners of the common jargon do not always understand each other or agree on what the terms of their profession mean. Stack Overflow, in many cases, serves merely as a giant professional dictionary in progress as developers argue over what they mean by de-coupling, separation of concerns, pragmatism, architecture, elegance, and code smell.

Cultures, unlike professions, are held together not only by jargon but also by shared ideas and philosophies that delineate what is important to the tribe and what is not. Between a profession and a culture, the members of a professional culture, in turn, share a common imaginative world that allows them to discuss shared concepts in the same way that other people might discuss their favorite TV shows.

This post is an experiment to see what the shared library of augmented reality and virtual reality developers might one day look like. Digital reality development is a profession that currently does not really exist but which is already being predicted to be a multi-billion dollar industry by 2020.

HoloCoding, in other words, is a profession that exists only virtually for now. As a profession, it will envelop concerns much greater than those considered by today’s software developers. Whereas contemporary software development is mostly about collecting data, reporting on data and moving data from point A to points B and C, spatial software development will be more concerned with environments and will have to draw on complex mathematics as well as design and experiential psychology. The bookshelf of a holocoder will look remarkably different from that of a modern data coder. Here are a few ideas regarding what I would expect to find on a future developer’s bookshelf in five to ten years.

 

1. Understanding Media by Marshall McLuhan – written in the 60’s and responsible for concepts such as ‘the global village’ and hot versus cool media, McLuhan pioneered the field of media theory.  Because AR and VR are essentially new media, this book is required reading for understanding how these technologies stand side-by-side with or perhaps will supplant older media.

2. Illuminations by Walter Benjamin – while the whole work is great, the essay ‘The Work of Art in the Age of Mechanical Reproduction’ is a must read for discussing how traditional notions about creativity fit into the modern world of print and now digital reproduction (which Benjamin did not even know about). It also deals at an advanced level with how human interactions work on stage versus film and the strange effect this creates.

3. Sketching User Experiences by Bill Buxton – this classic was quickly adopted by web designers when it came out. What is sometimes forgotten is that the book largely covers the design of products and not websites or print media – products like those that can be built with HoloLens, Magic Leap and Oculus Rift. Full of insights, Buxton helps his readers to see the importance of lived experience when we design and build technology.

4. Bergsonism by Gilles Deleuze – though Deleuze is probably most famous for his collaborations with Felix Guattari, this work on the philosophical meaning of the term ‘’virtual reality’, not as a technology but rather as a way of approaching the world, is a gem.

5. Passwords by Jean Baudrillard – what Deleuze does for virtual reality, Baudrillard does for other artifacts of technological language in order to show their place in our mental cosmology. He also discusses virtual reality along the way, though not as thoroughly.

6. Mathematics for 3D Game Programming and Computer Graphics by Eric Lengeyl – this is hardcore math. You will need this. You can buy it used online for about $6. Go do that now.

7. Linear Algebra and Matrix Theory by Robert Stoll – this is a really hard book. Read the Lengeyl before trying this. This book will hurt you, by the way. After struggling with a page of this book, some people end up buying the Manga Guide to Matrix Theory thinking that there is a fun way to learn matrix math. Unfortunately, there isn’t and they always come back to this one.

8. Phenomenology of Perception by Maurice Merleau-Ponty – when it first came out, this work was often seen as an imitation of Heiddeger’s Being and Time. It may be the case that it can only be truly appreciated today when it has become much clearer, thanks to years of psychological research, that the mind reconstructs not only the visual world for us but even the physical world and our perception of 3D spaces. Merleau-Ponty pointed this out decades ago and moreover provides a phenomenology of our physical relationship to the world around us that will become vitally important to anyone trying to understand what happens when more and more of our external world becomes digitized through virtual and augmented reality technologies.

9. Philosophers Explore the Matrix – just as The Matrix is essential viewing for anyone in this field, this collection of essays is essential reading. This is the best treatment available of a pop theme being explored by real philosophers – actually most of the top American philosophers working on theories of consciousness in the 90s. Did you ever think to yourself that The Matrix raised important questions about reality, identity and consciousness? These professional philosophers agree with you.

10. Snow Crash by Neal Stephenson – sometimes to understand a technology, we must extrapolate and imagine how that technology would affect society if it were culturally pervasive and physically ubiquitous. Fortunately Neal Stephenson did that for virtual reality in this amazing book that combines cultural history, computer theory and a fast paced adventure.