The Imaginative Universal

Studies in Virtual Phenomenology -- @jamesashley

Kinect Application Project Template

June 18
by James Ashley 18. June 2013 13:21

Over the past year, every time I started a new Kinect for Windows project, I basically just copied the infrastructure code from a previous project.  The starting point was the code my friend Jarrett Webb and I wrote for our book Beginning Kinect Programming with the Microsoft Kinect SDK, but I’ve made incremental improvements to this code as needed and based on pointers I’ve found in various places.  I finally realized that I’d made enough changes that it was time to turn this base code into a project template for myself and my colleagues at work.  Since there wasn’t a Kinect Application project template available yet on the Visual Studio Gallery, I uploaded it there as well.

The cool thing about templates uploaded to the gallery is that anyone with Visual Studio can install them from the IDE.  If you select Tools | Extension Manager … and then search for “Kinect” under the Online Gallery, you should see something like this.  From here you can install the Kinect Application project template to your computer.

[Image: Kinect Application Project Template]

If you then create a new project and look under C# | Windows, you will be able to build a Kinect WPF application with a bit of a head start.  Here are some key features:

1. Initialization Code

All the initialization code and Kinect stream event handlers are stubbed out in the InitSensor method.  All you need to do is uncomment the streams you want to use.  The event handler code is also stubbed out with the proper pattern for opening and disposing of frame objects.  Whatever you need to do with the image, depth and skeleton frames can be done inside those using statements.  This code also follows the latest agreed-upon best practices for efficiently managing streamed data as of the 1.7 SDK.

void sensor_ColorFrameReady(object sender
    , ColorImageFrameReadyEventArgs e)
{
    using (ColorImageFrame frame = e.OpenColorImageFrame())
    {
        if (frame == null)
            return;

        if (_colorBits == null) _colorBits = 
            new byte[frame.PixelDataLength];
        frame.CopyPixelDataTo(_colorBits);

        throw new NotImplementedException();
    }
}

2. Disposal Code

Whatever you enable in the InitSensor method you will need to disable and dispose of in the DeInitSensor method.  Again, this just requires uncommenting the appropriate lines.  The DeInitSensor method also implements a disposal pattern that is somewhat popular now: the sensor is shut down on a background thread rather than on the main thread.  I’m not sure if this is a best practice as such, but it resolves a problem many C# developers were running into when shutting down their Kinect-enabled applications.
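
As a rough illustration of the background-thread shutdown idea, here is a minimal sketch.  It is not the template’s actual DeInitSensor code: the SensorShutdown class and its ShutDown method are hypothetical names, and the two delegates stand in for whatever event unsubscription and sensor.Stop() call your own code uses.

```csharp
using System;
using System.Threading.Tasks;

// Hypothetical helper for illustration; not part of the template or the Kinect SDK.
public static class SensorShutdown
{
    public static Task ShutDown(Action detachHandlers, Action stopSensor)
    {
        if (stopSensor == null)
            throw new ArgumentNullException("stopSensor");

        // Unsubscribe on the calling thread first so that no frame events
        // are handled while the sensor is going down.
        if (detachHandlers != null)
            detachHandlers();

        // Run the stop call on a thread-pool thread so the UI thread
        // is not held up while the sensor shuts down.
        return Task.Factory.StartNew(stopSensor);
    }
}
```

From a DeInitSensor method, the first delegate would remove the frame-ready handlers you subscribed in InitSensor and the second would call sensor.Stop().  The returned Task can be ignored, or waited on if the application must not exit before the sensor has stopped.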

3. Status Changed Code

The Kinect can actually be disconnected in mid-process or simply not be on when you first run an application.  It is also surprisingly common to forget to plug the Kinect’s power supply in.  Generally, your application will just crash in such situations.  If you properly handle the KinectSensors.StatusChanged event, however, your application will just start up again when you get the sensor plugged back in.  A pattern for doing this was first introduced in the KinectChooser component in the Developer Toolkit.  A lightweight version of this pattern is included in the Kinect Application Project Template.

void KinectSensors_StatusChanged(object sender
    , StatusChangedEventArgs e)
{
    if (e.Status == KinectStatus.Disconnected)
    {
        if (_sensor != null)
        {
            DeInitSensor(_sensor);
        }
    }

    if (e.Status == KinectStatus.Connected)
    {
        _sensor = e.Sensor;
        InitSensor(_sensor);
    }
}
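
One thing to keep in mind is that StatusChanged fires only when a sensor’s status changes, so a Kinect that is already plugged in when the application starts will not trigger it.  A common companion step is to scan KinectSensor.KinectSensors once at startup.  The sketch below is an assumption about how you might do that, not code taken from the template; SensorDiscovery and FindConnected are hypothetical names, and the code needs a reference to Microsoft.Kinect.dll from the 1.x SDK.

```csharp
using System.Linq;
using Microsoft.Kinect;

// Hypothetical helper for illustration; not part of the template.
public static class SensorDiscovery
{
    // Returns the first sensor that is already connected, or null if none is.
    public static KinectSensor FindConnected()
    {
        return KinectSensor.KinectSensors
            .FirstOrDefault(s => s.Status == KinectStatus.Connected);
    }
}
```

At startup you would subscribe to KinectSensor.KinectSensors.StatusChanged and then, if FindConnected() returns a sensor, pass it to InitSensor just as the handler does.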
 

4. Extension Methods

While most people were working on controls for the Kinect for Windows SDK, Clint Rutkas and the Coding4Fun guys brilliantly came up with the idea of developing extension methods for handling the various Kinect streams.

The extension methods included with this template provide lots of conversions from bitmaps and byte arrays to BitmapSource types (useful for WPF image controls) and vice versa.  This makes it easy to do something like display a color stream, which can otherwise be rather hairy.  The snippet below assumes there is an image control in the MainWindow named canvas.

using (ColorImageFrame frame = e.OpenColorImageFrame())
{
    if (frame == null)
        return;

    if (_colorBits == null) _colorBits = 
        new byte[frame.PixelDataLength];
    frame.CopyPixelDataTo(_colorBits);

    // new line
    this.canvas.Source = 
        _colorBits.ToBitmapSource(PixelFormats.Bgr32, 640, 480);
}
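
For readers curious about what a conversion like this involves, the sketch below builds a BitmapSource from a raw Bgr32 buffer using WPF’s BitmapSource.Create.  It is an illustration under assumptions, not the template’s ToBitmapSource implementation; the PixelBufferSketch class and the ToBgr32Source name are hypothetical.

```csharp
using System;
using System.Windows.Media;
using System.Windows.Media.Imaging;

// Hypothetical helper for illustration; not the template's extension method.
public static class PixelBufferSketch
{
    public static BitmapSource ToBgr32Source(this byte[] pixels, int width, int height)
    {
        if (pixels == null)
            throw new ArgumentNullException("pixels");

        PixelFormat format = PixelFormats.Bgr32;

        // Stride is the number of bytes in one row: 4 bytes per pixel for Bgr32.
        int stride = width * ((format.BitsPerPixel + 7) / 8);
        if (pixels.Length < stride * height)
            throw new ArgumentException("Buffer is too small for the given size.", "pixels");

        // 96 DPI maps one image pixel to one WPF device-independent unit.
        return BitmapSource.Create(width, height, 96, 96, format, null, pixels, stride);
    }
}
```

Inside the frame handler you could pass frame.Width and frame.Height rather than hard-coding 640 and 480, so the call keeps working if you enable the color stream at a different resolution.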

More in line with the original Coding4Fun Toolkit, the extension methods also make some very difficult scenarios trivial – for instance background subtraction (also known as green screening), skeleton drawing and player masking.  These methods should make it easier to quickly mock up a demo or even show off the power of the Kinect in the middle of a presentation using just a few lines of code.

private void InitSensor(KinectSensor sensor)
{
    if (sensor == null)
        return;

    sensor.ColorStream.Enable();
    sensor.DepthStream.Enable();
    sensor.SkeletonStream.Enable();
    sensor.Start();
    this.canvas.Source = sensor.RenderActivePlayer();
}

Again, this code assumes there is an image control in MainWindow named canvas.  To create a simple background subtraction image, enable the color, depth and skeleton streams and then call the RenderActivePlayer extension method.  You’ll want to put this code in the InitSensor method to ensure that it is called again if your Kinect sensor accidentally gets dislodged.  By stacking another image beneath the canvas image, I create an effect like this:

[Image: me on tatooine]

Here are some overloads of the RenderActivePlayer method and the effects they create.  I’ve removed Tatooine from the background in the following samples.

 

canvas.Source = sensor.RenderActivePlayer(System.Drawing.Color.Blue);

[Image: blue_man]

 

canvas.Source = sensor.RenderActivePlayer(System.Drawing.Color.Blue
                , System.Drawing.Color.Fuchsia);

[Image: blue_fuschia]

 

canvas.Source = sensor.RenderActivePlayer(System.Drawing.Color.Transparent
                , System.Drawing.Color.Fuchsia);

[Image: trasnparent_fuschia]

 

And so on.  There’s also this one:

canvas.Source = sensor.RenderPredatorView();

[Image: predator]

 

… as well as this oldie but goodie:

canvas.Source = sensor.RenderPlayerSkeleton();

[Image: skeleton_view]

The base method uses the colors (and quite honestly most of the code) from the Kinect Toolkit that goes with the SDK.  As with the RenderActivePlayer extension method, however, there are lots of overloads so you can change all the colors if you wish to.

canvas.Source = sensor.RenderPlayerSkeleton(System.Drawing.Color.Turquoise
    , System.Drawing.Color.Indigo
    , System.Drawing.Color.IndianRed
    , trackedBoneThickness: 1
    , jointThickness: 10);

[Image: balls]

 

Finally, you can also layer all these different effects:

canvas.Source = sensor.RenderActivePlayer();
canvas2.Source = sensor.RenderPlayerSkeleton(System.Drawing.Color.Transparent);

[Image: everything]

Tags:

Kinect

PRISM, Xbox One Kinect, Privacy and Semantics

June 07
by James Ashley 7. June 2013 12:54

[Image: loose lips]

It’s interesting that at one time getting people to keep quiet was a priority for the government.  During World War II the government promoted a major advertising campaign to remind people that “loose lips sink ships.”  During war time (back when wars were temporary affairs), it was standard practice to suppress the flow of information and censor personal letters to ensure that useful information would not fall into enemy hands.  In a sense, privacy and national security were one.

Recent leaks about the NSA’s PRISM program suggest that things have dramatically changed.  We’ve realized for several years now that our cell phone service providers, our social networks, and our search engines are constantly tracking our physical and digital movements and mining that data for marketing.  We basically have traded our privacy for convenience in the same way that we accept ads on TV and on the Internet in exchange for free content. 

The dark side of all this is that the information gets passed along to third parties we didn’t even know about, until we start getting junk mail in our inboxes for products we have no interest in.

What we only suspected, until now, was that the infrastructure that has been built to support these transactions of personal information for services was also of interest to our government and that we are sharing our identifying information not only with content providers, service providers, spammers and junk mailers but also with the United States security apparatus.  Now that all that information has been collected, the government wants to mine it also.

We don’t live in a police state today.  I belong to neither the far right wing nor the far left wing – I’m neither an occupier nor a tea partay kind of guy – so I also don’t believe we are even close to slipping into a police state in the near future.  I’m not concerned that the government is using or ever will use this information to track me down, and I am pretty confident that all this data mining will mainly be used only to track down terrorists and to send me unwanted emails.  And yet, it bugs me on a visceral level that people are going through my stuff, whatever that ethereal stuff actually is.

[Image: Gears of War]

The main argument against this cooties feeling about my privacy is that only metadata is being inspected and not actual content.  Unfortunately, this seems like a porous boundary to me.  To paraphrase Hegel’s overarching criticism of Kant, whenever we draw a line we also necessarily have to cross over it at the same time.  From everything I know about software, the only way to gather metadata is to inspect the content in order to generate metadata about it.  For instance, when a government computer system listens to phone traffic in order to pick out key words and constellations of words, it still has to listen to all the other words first in order to pick out what it is interested in. 

Moreover, according to Slate, the data mining being done by PRISM is incredibly broad:

It appears the National Security Agency’s sweeping surveillance is not something only Verizon customers should be concerned about. The agency has also reportedly obtained access to the central servers of major U.S. Internet companies as part of a secret program that involves the monitoring of emails, file transfers, photos, videos, chats, and even live surveillance of search terms.

The semantics of Privacy today, as defined under the regime of the NSA, doesn’t mean no one is listening to what you are saying – it just means no one cares.  The best way to protect one’s privacy today is to simply be boring.

At the same time that all these revelations about PRISM were coming out (in fact on the very same day), Microsoft released a brief about privacy concerns around the new Xbox One’s Kinect peripheral.  Here’s an attempted explanation of the brief on Windows Phone Central I found particularly fascinating:

A lot of people feared that the Kinect would be able to listen to you when the Xbox One was off. Apparently, when off, the Xbox One is only listening for one command in its low-power state: “Xbox On”. It’s nice to know that you’re in control when the Kinect is on, off or paused. Some games though will require Kinect functionality (again, at the discretion of the game developers/publisher). That’s up to you to play or not play those games.

 

The author’s reassurance is based on a semantic sleight-of-hand.  The Kinect is not listening to you, according to the author, because it “is only listening for one command.”  This is an honest mistake, but a dangerous one.  In fact, in order to listen for one command, the Kinect has to have that microphone turned on and listening to everything anyone is saying.  What it is actually doing is only acting on one command – and hopefully throwing away everything else.  Additionally, I do have a bit of experience with Microsoft’s speech recognition technology both on the Kinect and on the PC, and the “low-power state” modifier doesn’t particularly make sense.  It takes a similar amount of effort to identify insignificant data as it does to identify significant data, AFAIK. (There’s always the possibility that the Xbox Kinect has an on-board language processor just to listen for this one command that is separate from the rest of its speech recognition processing chain – but I haven’t heard about anything like that so far.)

[Image: Halo IV]

The original Microsoft brief called Privacy by Design, upon which I assume the Windows Phone Central post is based, doesn’t play this particular semantic game – though it plays another.  At the same time, it also seems particularly and intentionally vague about certain points.

The semantic game in Microsoft’s Privacy post is around the term ‘design’.  Does design  here refer to the hardware design, the software architecture, the usability design or the marketing campaign?  These are all things that are encompassed by the term design and, in the linked article, privacy could be referring to any of them.  If it refers to the marketing campaign and UX, as it probably does, this doesn’t actually provide me any guarantees of privacy.  All it tells me is that Microsoft doesn’t initially intend to use the new Kinect sitting in my living room to collect random conversations.  ‘Design’ may refer to the initial software architecture, but this doesn’t provide us with any particular guarantees since any post-release software update can change the way the software works.

To put this another way, the article describes Microsoft’s intent but doesn’t provide any guarantees.  Is there anything in the hardware that will prevent speech data from being mined in the future?  Probably not.  In that case, is there anything in the licensing that prevents Microsoft from mining this data?  Microsoft’s privacy brief doesn’t even touch on this.

So should you be concerned?  Totally – and here’s why.  In its pursuit of security, the NSA has instituted an infrastructure that performs better and better the more information it is fed.  Do terrorists play Xbox?  I have no idea.  Would the NSA want all that data anyways? 

[Image: Call of Duty]

Hypothetically, the new Xbox One and the Kinect can collect this information on us.  Here’s how.  According to recent Microsoft announcements, the Xbox One must be connected to the Internet once every 24 hours in order to play games on it.  The new Kinect is designed to always be on and I am obligated to have it (I can’t buy an Xbox One without it).  Even when my Xbox One is off, my Kinect is still on, listening for a command to turn the console on.  The infrastructure is there and the NSA’s PRISM project is a monster that is hungry for it.

To be clear, I don’t think Microsoft is particularly interested in collecting this data.  Microsoft has no particular use for the typically rather boring conversations I have in my living room.  They won’t be gleaning any particularly useful marketing information from my conversations either. 

Nevertheless, I think it would be extremely forward looking of Microsoft to explain what they have put in place to prevent the government from ever issuing a request for this data and getting it the way they have already gotten other data, so far, from Verizon, AT&T, Microsoft, Yahoo, Google, Facebook, AOL, Skype, YouTube, and Apple.

Has Microsoft designed a mechanism, either through hardware or through a customer agreement they won’t/can’t rescind in the future, that will future-proof my privacy?

Tags:

Kinect | Spies

Helpful vs Creepy Face Recognition

February 13
by James Ashley 13. February 2013 14:03

[Image: mr_mall]

One of the interesting potential commercial uses for the Kinect for Windows sensor is as a realtime tool for collecting information about people passing by.  The face detection capabilities of the Kinect for Windows SDK lend themselves to these scenarios.  Just as Google and Facebook currently collect information about your browsing habits, Kinects can be set up in stores and malls to observe you and determine your shopping habits.

There’s just one problem with this.  On the face of it, it’s creepy.

To help parse what is happening in these scenarios, there is a sophisticated marketing vocabulary intended to distinguish “creepy” face detection from the useful and helpful kind.

First of all, face detection on its own does little more than detect that there is a face in front of the camera.  The face detection algorithm may go even further and break down parts of the face into a coordinate system.  Even this, however, does not turn a particular face into a token that can be indexed and compared against other faces.

Turning an impression of a face into some sort of hash takes us to the next level and becomes face recognition rather than merely detection.  But even here there is parsing to be done.  Anonymous face recognition seeks to determine generic information about a face rather than specific, identifying information.  Anonymous face recognition provides data about a person’s age and gender – information that is terribly useful to retail chains. 

Consider that today, the main way retailers collect this information is by placing a URL at the bottom of a customer’s receipt and asking them to visit the site and provide this sort of information when the customer returns home.  The fulfillment rate on this strategy is obviously horrible.

Being able to collect this information unobtrusively would allow retailers to better understand how inventory should be shifted around seasonally and regionally to provide customers with the sorts of retail items they are interested in.  Power drills or perfume?  The Kinect can help with these stocking questions.

But have we gotten beyond the creepy factor with anonymous face recognition?  It actually depends on where you are.  In Asia, there is a high tolerance for this sort of surveillance.  In Europe, it would clearly be seen as creepy.  North America, on the other hand, is somewhere between Europe and Asia on privacy issues.  Anonymous face recognition is non-creepy if customers are provided with a clear benefit from it – just as they don’t mind having ads delivered to their browsers as long as they know that getting ads makes other services free.

Finally, identity face recognition in retail would allow custom experiences like the virtual ad delivery system portrayed in the mall scene from Minority Report.  Currently, this is still considered very creepy.

At work, I’ve had the opportunity to work with NEC, IBM and other vendors on the second kind of face recognition, the anonymous kind.  The surprising thing is that getting anonymous face recognition working correctly is much harder than getting full face recognition working.  It requires a lot of probabilistic logic as well as a huge database of faces to get any sort of accuracy when it comes to demographics.  Even gender is surprisingly difficult.

Identity face recognition, on the other hand, while challenging, is something you can have in your living room if you have an Xbox and a Kinect hooked up to it.  This sort of face recognition is used to log players automatically into their consoles and can even distinguish different members of the same family (for engineers developing facial recognition software, it is an irritating quirk of fate that people who look alike also tend to live in the same house).

If you would like to try identity face recognition, you can experiment with the Luxand Face SDK.  Luxand provides a 30-day trial license which I tried out a few months ago.  The code samples are fairly good.  While Luxand does not natively support Kinect development, it is fairly straightforward to turn data in the Kinect’s RGB stream into images which can then be compared against other images using Luxand.

I used Luxand’s SDK to compare anyone standing in front of the Kinect sensor with a series of photos I had saved.  It worked fairly well, but unfortunately only if one stood directly in front of the sensor, about a foot or two away (which wasn’t quite what we needed at the time).  The heart of the code is provided below.  It simply takes color images from the Kinect and compares them against a directory of photos to see if a match can be found.  It could be used as part of a system for unlocking a computer when the proper user stands in front of it (though you can probably think of better uses – just try to avoid being creepy).

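void _sensor_ColorFrameReady(object sender
    , ColorImageFrameReadyEventArgs e)
{
    using (var frame = e.OpenColorImageFrame())
    {
        // OpenColorImageFrame returns null when the frame is no longer available.
        if (frame == null)
            return;

        var image = frame.ToBitmap();
        this.image2.Source = image.ToBitmapSource();

        // LookForMatch disposes the bitmap once it has finished with it.
        LookForMatch(image);
    }
}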
 
// GDI function used to release the handle returned by Bitmap.GetHbitmap.
[System.Runtime.InteropServices.DllImport("gdi32.dll")]
private static extern bool DeleteObject(IntPtr hObject);

private bool LookForMatch(System.Drawing.Bitmap currentImage)
{
    if (currentImage == null)
        return false;

    IntPtr hBitmap = currentImage.GetHbitmap();
    try
    {
        FSDK.CImage image = new FSDK.CImage(hBitmap);
        FSDK.SetFaceDetectionParameters(false, false, 100);
        FSDK.SetFaceDetectionThreshold(3);
        FSDK.TFacePosition facePosition = image.DetectFace();

        // A width of zero means no face was found in this frame.
        if (facePosition.w == 0)
            return false;

        FaceTemplate template = new FaceTemplate();
        template.templateData =
            ExtractFaceTemplateDataFromImage(image);

        // Threshold for a false acceptance rate of 0.01.  It is retrieved for
        // reference only; the match test below uses _targetSimilarity.
        float threshold = 0.0f;
        FSDK.GetMatchingThresholdAtFAR(0.01f, ref threshold);

        bool match = false;
        FaceTemplate t1 = new FaceTemplate();
        FaceTemplate t2 = new FaceTemplate();   // best-matching template so far
        float best_match = 0.0f;
        float similarity = 0.0f;
        foreach (FaceTemplate t in faceTemplates)
        {
            // Copy to a local because the loop variable cannot be passed by ref.
            t1 = t;
            FSDK.MatchFaces(ref template.templateData
                , ref t1.templateData, ref similarity);

            if (similarity > best_match)
            {
                this.textBlock1.Text = similarity.ToString();
                best_match = similarity;
                t2 = t1;
                if (similarity > _targetSimilarity)
                    match = true;
            }
        }
        return match && !_isPlaying;
    }
    finally
    {
        // Release the GDI handle and the bitmap even if matching throws.
        DeleteObject(hBitmap);
        currentImage.Dispose();
    }
}
 
private byte[] ExtractFaceTemplateDataFromImage(FSDK.CImage cimg)
{
    // cimg is owned by the caller and is not disposed here.
    var facePosition = cimg.DetectFace();
    if (0 == facePosition.w)
        return null;

    Luxand.FSDK.TPoint[] facialFeatures;
    try
    {
        facialFeatures =
            cimg.DetectEyesInRegion(ref facePosition);
    }
    catch (Exception)
    {
        // Eyes could not be located: fall back to the whole face region.
        return cimg.GetFaceTemplateInRegion(ref facePosition);
    }

    // Eyes were found, so build the template from them.
    return cimg.GetFaceTemplateUsingEyes(ref facialFeatures);
}
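
The matching loop above boils down to a best-score search with an acceptance threshold.  Stripped of the Luxand calls, the idea can be sketched as follows; BestMatch and IndexOfBest are hypothetical names, and the similarity scores are assumed to have been computed already (for example by FSDK.MatchFaces).

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper for illustration; not part of the Luxand SDK.
public static class BestMatch
{
    // Returns the index of the highest score that exceeds the threshold,
    // or -1 when no candidate is similar enough.
    public static int IndexOfBest(IList<float> similarities, float threshold)
    {
        if (similarities == null)
            throw new ArgumentNullException("similarities");

        int bestIndex = -1;
        float best = threshold;
        for (int i = 0; i < similarities.Count; i++)
        {
            if (similarities[i] > best)
            {
                best = similarities[i];
                bestIndex = i;
            }
        }
        return bestIndex;
    }
}
```

The threshold could be a fixed value like _targetSimilarity, or the one reported by FSDK.GetMatchingThresholdAtFAR if you would rather tune by false acceptance rate.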
 

Tags:

Kinect