The Imaginative Universal

Studies in Virtual Phenomenology -- by @jamesashley, Kinect MVP and author

Top 21 HoloLens Ideas

[Image: holo-green]

The image above is a best guess at the underlying technology being used in Microsoft’s new HoloLens headset. It’s not even that great a guess since the technology appears to still be in the prototype stage. On the other hand, the product is tied to the Windows 10 release date, so we may be seeing a consumer version – or at the very least a dev version – sometime in the fall.

Here are some things we can surmise about HoloLens:

a) the name may change – HoloLens is a good product name but isn’t quite in a league with Kinect, Silverlight or Surface for branding genius. In fact, Surface was such a good name, it was taken from one product group and simply given to another in a strange twist on the build vs. buy vs. borrow quandary. On the other hand, HoloLens sounds more official than the internal code name, Baraboo – isn’t that a party hippies throw themselves in the desert?

[Image: johnny mnemonic]

b) this is augmented reality rather than virtual reality. Facebook’s Oculus Rift, which is an immersive, fully digital experience, is an example of virtual reality. Other fictional examples include The Oasis from Ernest Cline’s Ready Player One, the Metaverse from Neal Stephenson’s Snow Crash, William Gibson’s Cyberspace and the VR simulation from The Lawnmower Man. Augmented reality involves overlaying a digital experience on top of the real world. This can be accomplished using holography, transparent displays, or projectors. A great example of projector-based AR is the RoomAlive project by Hrvoje Benko, Eyal Ofek and Andy Wilson at Microsoft Research. HoloLens uses glasses or a head-rig – depending on how generous you feel – to implement AR. Magic Leap – with heavy investment from Google – appears to be doing the same thing. The now defunct Google Glass was neither AR nor VR, but was instead a heads-up display.

[Image: kgirl]

c) HoloLens uses Kinect technology under the plastic covers. While the depth sensor in the Kinect v2 has a field of view of 70 degrees by about 60 degrees, the depth capability in HoloLens is reported to have a field of view of 120 degrees by 120 degrees. This indicates that HoloLens will be using the time-of-flight technology found in the Kinect v2 rather than the structured-light approach of the Kinect v1. This setup requires an IR emitter as well as a depth camera, combined with sophisticated timing and phase-measurement technology to calculate depth efficiently and relatively inexpensively.
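
The time-of-flight principle itself is simple to state: the sensor measures the phase shift between the modulated IR it emits and the reflection it receives, and that phase shift encodes distance. A minimal sketch of the arithmetic (the modulation frequency below is illustrative, not a published HoloLens or Kinect spec):

```python
import numpy as np

C = 299_792_458.0   # speed of light, m/s
F_MOD = 80e6        # illustrative IR modulation frequency, Hz

def tof_depth(phase_shift_rad):
    """Distance from the phase shift between emitted and reflected IR.

    The light travels out and back, hence the factor of 2 hidden in
    the round trip: d = c * phi / (4 * pi * f_mod).
    """
    return C * phase_shift_rad / (4 * np.pi * F_MOD)

# A phase shift of pi/2 at 80 MHz corresponds to roughly 0.47 m.
print(tof_depth(np.pi / 2))
```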

[Image: hands]

d) the depth camera is being used for multiple purposes. The first is gesture detection. One of the issues that faced both Oculus and Google Glass was that they were primarily display technologies. But a computer monitor is useless without a keyboard or mouse. Similarly, Oculus and Glass needed decent interaction metaphors. Glass relied primarily on speech commands and tapping and clicking. Oculus had nothing until their recent acquisition of NimbleVR. NimbleVR provides a depth camera optimized for hand and finger detection over a small range. This can be mounted on the front of the Oculus headset. Conceptually, this allows people to use hand gestures and finger manipulations in front of the device. A virtual hand can be created as an affordance in the virtual world of the Oculus display, allowing users to interact with virtual objects and virtual interactive menus in vitro.

The depth sensor in HoloLens would work in a similar way except that instead of a virtual hand as affordance, it’s just your hand. You will use your hand to manipulate digital objects displayed on the AR lenses or to interact with AR menus using gestures.

An interesting question is how many IR sensors are going to be on the HoloLens device. From the pictures that have been released, it looks like we will have a color camera and a depth sensor for each eye, for a total of two depth cameras and two RGB cameras located near the joint between the lenses and the headband.

[Image: holo_minecraft]

e) HoloLens is also using depth data for 3D reconstruction of real-world surfaces. These surfaces are then used as virtual projection surfaces for digital textures. Finally, the blitted image is displayed on the transparent lenses.
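
The first step in any such reconstruction is back-projecting the depth image into a 3D point cloud, which can then be meshed and textured. A rough sketch of that step, assuming pinhole camera intrinsics (the focal lengths and principal point below are Kinect-class placeholders, not actual HoloLens values):

```python
import numpy as np

# Illustrative pinhole intrinsics; real values come from calibration.
FX, FY = 365.0, 365.0   # focal lengths in pixels
CX, CY = 256.0, 212.0   # principal point

def depth_to_points(depth):
    """Convert an HxW depth image (meters) into an Nx3 point cloud."""
    h, w = depth.shape
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    z = depth
    x = (u - CX) * z / FX
    y = (v - CY) * z / FY
    points = np.stack([x, y, z], axis=-1).reshape(-1, 3)
    return points[points[:, 2] > 0]   # drop pixels with no depth reading

# points = depth_to_points(depth_frame)  # then mesh and texture them
```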

[Image: ra1]

[Image: ra_2]

This sort of reconstruction is a common problem in projection-mapping scenarios. A great example of applying it can be found in the Halloween edition of Microsoft Research’s RoomAlive project. In the first image above, you are seeing the experience from the correct perspective. In the second, the scene is captured from a different perspective than the one being projected for. From the incorrect perspective, it becomes clear that the image is actually being projected on multiple surfaces – the various planes of the chair as well as the wall behind it – but foreshortened digitally and even color-corrected to make the image appear cohesive to a viewer sitting at the correct position. One or more Kinects must be used to calibrate the projections appropriately against these multiple surfaces. If you watch the full video, you’ll see that Kinect sensors are used to track the viewer as she moves through the room, and the foreshortening and skewing occur dynamically to adjust to her changing position.
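
Conceptually, the correction is a two-pass trick: render the scene from the tracked viewer’s eye, then work out, for each known 3D point on the room’s surfaces, which pixel of that rendering should land there and where the projector must draw to put it there. A simplified sketch, with the two projection matrices standing in for real calibration data:

```python
import numpy as np

def project(P, point_3d):
    """Apply a 3x4 projection matrix and perspective-divide to pixel coords."""
    p = P @ np.append(point_3d, 1.0)
    return p[:2] / p[2]

def warp_for_viewer(surface_points, viewer_P, projector_P, viewer_image):
    """For each known 3D point on the room's surfaces, look up the color the
    viewer should see there, and pair it with the projector pixel that must
    be lit to place that color on the surface."""
    h, w, _ = viewer_image.shape
    drawing_commands = []
    for pt in surface_points:
        vx, vy = project(viewer_P, pt)          # where the viewer sees pt
        if 0 <= vx < w and 0 <= vy < h:
            color = viewer_image[int(vy), int(vx)]
            px, py = project(projector_P, pt)   # where the projector hits pt
            drawing_commands.append(((px, py), color))
    return drawing_commands
```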

The Minecraft AR experience being used to show the capabilities of HoloLens requires similar techniques. The depth sensor is required not only to calibrate and synchronize the digital display to line up correctly with the table and other planes in the room, but also to constantly adjust the display as the player moves around the room.

[Image: eye-tracking]

f) are the display lenses stereoscopic or holographic? At this point no one is completely sure, though indications are that this is something more than the stereoscopic display technique used in the Oculus Rift. While a stereoscopic display will create the illusion of depth and parallax by creating a different image for each lens, something holographic would actually be creating multiple images per lens and smoothly shifting through them based on the location of each pupil staring through its respective lens and the orientation and position of the player’s head.

One way of achieving this sort of holographic display is to have multiple layers of lenses pressed against each other, using interference to shift the light projected into each pupil as the pupil moves. It turns out that the average person’s pupils move around rapidly in saccades, mapping and reconstructing images for the brain, even though we do not realize this motion is occurring. Accurately capturing these motions and shifting the digital projections to compensate would create a highly realistic experience typically missing from stereoscopic reconstructions. It is rumored in the industry that Magic Leap is pursuing this type of digital holography.

On the other hand, it has also been reported that HoloLens is equipped with eye-tracking cameras on the inside of the frames, apparently to aid with gestural interactions. It would be extremely interesting if Microsoft’s route to achieving true holographic displays involved eye-tracking combined with a high display refresh rate rather than coherent light interference display technology as many people assume. Or, then again, it could just be stereoscopic displays after all.
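
For comparison, the stereoscopic baseline is easy to state: render the scene twice, from two cameras offset by half the interpupillary distance. A holographic display would go further and re-derive these matrices continuously from the tracked pupils. A toy sketch (the IPD is a population average and the sign convention is illustrative):

```python
import numpy as np

IPD = 0.063  # average adult interpupillary distance in meters (illustrative)

def eye_view_matrices(head_view):
    """Given a 4x4 head-centered view matrix, return (left, right) view
    matrices offset by half the IPD along the head's x axis. A holographic
    display would re-derive these continuously from tracked pupil
    positions instead of using a fixed offset."""
    left = np.eye(4)
    right = np.eye(4)
    left[0, 3] = IPD / 2     # shifting the world right moves the eye left
    right[0, 3] = -IPD / 2   # and vice versa
    return left @ head_view, right @ head_view
```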

[Image: occlusion]

g) occlusion is generally considered a problem for interactive experiences. For augmented reality experiences, however, it is a feature. Consider a physical-to-digital interaction in which you use your finger/hand to manipulate a holographic menu. The illusion we want to see is of the hand coming between the player’s eyes and the digital menu. The player’s hand should block and obscure portions of the menu as he interacts with it.

The difficulty with creating this illusion is that the player’s hand isn’t really between the menu and the eyes. Really, the player’s hand is on the far side of the menu, and the menu is being displayed on the HoloLens between the player’s eyes and his hand. Visually, the hologram of the menu will bleed through and appear on top of the hand.

In order to re-establish the illusion of the menu being located on the far side of the hand, we need depth sensors to accurately map an outline of the hand and arm and then cut a hand-and-arm shape out of the menu where the hand should be occluding it. This process has to be repeated in real time as the hand moves, and it’s kind of a hard problem.
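
In depth-buffer terms, the core of the fix is a per-pixel comparison: wherever the sensed depth of the real hand is nearer than the depth the hologram is supposed to have, suppress the hologram. A minimal sketch, assuming the sensed depth map has already been registered to the display’s viewpoint:

```python
import numpy as np

def occlude_hologram(hologram_rgba, hologram_depth, sensed_depth):
    """Zero out hologram pixels wherever a real object (the hand) is
    closer to the eye than the hologram is supposed to be.
    All three arrays are HxW-aligned; depths are in meters."""
    mask = sensed_depth < hologram_depth   # real object in front
    out = hologram_rgba.copy()
    out[mask] = 0                          # cut the hand shape out
    return out
```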

[Image: borg]

h) misc. sensors: best guess is that in addition to depth sensors, color cameras and eye-tracking cameras, we’ll also get a directional microphone, gyroscope, accelerometer and magnetometer. Some sort of 3D sound has been announced, so it makes sense that there would be a directional microphone or microphone array to complement it – something available on both the Kinect v1 and Kinect v2. The gyroscope, accelerometer and magnetometer are also guesses, but the Oculus hardware has them to track quick head movements, head position and head orientation. It makes sense that HoloLens will need them too.
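
A standard way to fuse a gyroscope with an accelerometer for head tracking is a complementary filter: integrate the fast-but-drifting gyro, then nudge the result toward the slow-but-absolute gravity reading. A toy single-axis version (the blend coefficient and sample rate are illustrative):

```python
def complementary_filter(angle, gyro_rate, accel_angle, dt, alpha=0.98):
    """Fuse one axis of orientation: trust the gyro over short time
    scales (fast but drifts) and the accelerometer over long ones
    (noisy but doesn't drift)."""
    return alpha * (angle + gyro_rate * dt) + (1 - alpha) * accel_angle

# e.g. at a 200 Hz sample rate:
# pitch = complementary_filter(pitch, gyro_x, pitch_from_accel, 1 / 200)
```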

[Image: bono]

i) the current form factor looks a little big – bigger than the Magic Leap is supposed to be but smaller than the current Oculus dev units. The goal – really everyone’s goal, from Microsoft to Facebook to Google – is to continue to bring down the size of the sensors so we can eventually have light-weight glasses rather than heavy head gear.

j) like all innovative technologies, the success of HoloLens will depend primarily on what people use it for. The myth of the killer app is probably not very useful anymore, but the notion that you need an app store to sell a device is a generally accepted universal constant. The success of HoloLens will depend on what developers build for it and what consumers can imagine doing with it.

 

Top 21 Ideas

Many of these ideas are borrowed from other VR and AR technology. In most cases, HoloLens will simply provide a better way to implement these notions. These ideas come from movies, from art installations, and from many years working at an innovative marketing agency where we prototyped these ideas day in and day out.

 

1. Shopping

[Image: shopping]

Amazon made one-click shopping make sense. Shopping, and the psychology of shopping, changes when we make it more convenient, effectively turning instant gratification into a marketing strategy. Using HoloLens AR, we can remodel a room with virtual furniture and then, when we find the configuration we want, purchase all the pieces from an interactive menu floating in the air in front of us. We can try on and buy virtual clothes. With a wave of the hand we can stock our pantry, stock our refrigerator … wait, come to think of it, with decent AR, do we even need furniture and clothes anymore?

2. Gaming

[Image: illumiroom]

IllumiRoom was a Microsoft project that never quite made it to product but was a huge hit on the web. The notion was to extend the Xbox One console with projections that reacted to what was occurring in the game and could extend the visuals of the game into the entire living room. IllumiRoom (which I was fortunate enough to see live the last time I was in Redmond) also uses a Kinect sensor to scan the room in order to calibrate projection mapping onto surfaces like bookshelves, tables and potted plants. As you can guess, this is the same team that came up with RoomAlive. A setup that includes a $1,500 projector and a Kinect is a bit complicated, especially when a similar effect can now be created with a single HoloLens unit.

[Image: hud]

The HoloLens device could also be used for in-game heads-up notifications or even as a second screen. It would make a lot of sense if Xbox integration were on the roadmap, and it would set Xbox apart as the clear leader in the console wars.

3. Communication

[Image: sw]

‘nuff said.

4. Home Automation

[Image: clapper]

Home automation has come a long way, and you can now easily turn lights on and off with your smartphone from miles away. Turning your lights on and off from inside your own house, however, may still involve actually touching a light switch. Devices like the Kinect have the limitation that they can only sense a portion of a room at a time. Many ideas have been thrown out for better gesture-recognition sensors in the home, including using wifi signals that pass through walls to detect gestures in other rooms. If you actually wore a gestural device around with you, this would no longer be a problem. Point at a bulb, make a fist, and “put out the light, and then put out the light,” to quote the Bard.
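
Resolving which bulb you are pointing at reduces to a ray test: cast a ray from the hand through the fingertip and pick the registered device nearest that ray. A sketch, where the device registry and coordinate setup are hypothetical:

```python
import numpy as np

def pointed_device(hand, fingertip, devices, max_angle_deg=10):
    """Return the name of the device closest to the pointing ray, or None.

    `devices` maps names to 3D positions in the same room coordinates the
    headset's depth sensor tracks the hand in (hypothetical setup)."""
    ray = fingertip - hand
    ray /= np.linalg.norm(ray)
    best, best_angle = None, np.radians(max_angle_deg)
    for name, pos in devices.items():
        to_dev = pos - hand
        to_dev /= np.linalg.norm(to_dev)
        angle = np.arccos(np.clip(ray @ to_dev, -1.0, 1.0))
        if angle < best_angle:
            best, best_angle = name, angle
    return best

# pointed_device(hand_xyz, fingertip_xyz, {"lamp": np.array([2.0, 1.5, 3.0])})
```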

5. Education

[Image: Microsoft-future-vision]

While cool visuals will make education more interesting, the biggest benefit of HoloLens for education is simple access. Children in rural areas of the US have to travel long distances to get a decent education. Around the world, the problem of rural education is even worse. What if educators could be brought to the children instead? This is one of the stated goals of Facebook’s purchase of Oculus Rift, and HoloLens can do the same job just as well and probably better.

6. Medical Care

[Image: bodyscan]

Technology can be used for interesting diagnostic and rehabilitation functions, and the depth sensors that come with HoloLens will no doubt eventually be used in these ways. But as with education, one of the great problems in medical care right now is access. If we can’t bring the patient to the doctor, let’s bring the GP to the patient and do regular checkups.

7. Holodeck

[Image: matrix-i-know-kung-fu]

The RoomAlive project points the way toward building a Holodeck. All we have to do is replace Kinect sensors with HoloLens sensors and projectors with holographic displays, and then try not to break the HoloLens strapped to our heads as we learn Kung Fu.

8. Windows

[Image: window]

Have you ever wished you could look out your window and be somewhere else? HoloLens can make that happen. You’ll have to block out natural light by replacing your windows with sheetrock, but after that HoloLens can give you any view you want.

[Image: fifteen-million-merits]

But why stop at windows? You can digitize all your walls if you want, and HoloLens’ depth technology will take care of the rest.

9. Movies and Television

[Image: vr-cinema-3d]

Oculus Rift and Samsung Gear VR have apps that let you watch movies in your own virtual theater. But wouldn’t it be more fun to watch a movie with your entire family? With HoloLens we can all be together on the couch but watch different things. They can watch Barney on the flatscreen while I watch an overlay of Meet the Press superimposed on the screen. Then again, with HoloLens maybe I could replace my expensive 60” plasma TV with a piece of cardboard and just watch that instead.

10. Therapy

[Image: whitenoise]

It’s commonly accepted that white noise and muted colors relax us. Controlling our environment helps us to regulate our inner states. Behavioral psychology is based on such ideas and the father of behavioral psychology, B. F. Skinner, even created the Skinner box to research these ideas – though I personally prefer Wilhelm Reich’s Orgone box. With 3D audio and lenses that extend over most of your field of view, HoloLens can recreate just the right experience to block out the world after a busy day and just relax. shhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhh.

11. Concerts

[Image: burning-man-festival-nevada]

Once a year in the Nevada desert a magical music festival is held called Baraboo. Or, I don’t know, maybe it’s held in Tennessee. In any case, getting to festivals is really hard and usually involves being around people who aren’t wearing enough deodorant, large crowds, and buying plastic bottles of water for $20. Wouldn’t it be great to have an immersive festival experience without all the things that get in the way? Of course, there are those who believe that all that other stuff is essential to the experience. They can still go and be part of the background for me.

12. Avatars

[Image: avatar-body]

Gamification is a huge buzzword at digital marketing agencies. Undergirding the hype is the realization that our digital and RL experiences overlap, and that it can sometimes be hard to find the seams.

[Image: fast times]

13. Blocking Other People’s Avatars

14. Augmented Media

15. Terminator Vision

16. Wealth Management

17. Clippit

18. Impossible UIs

19. Wiki World

20. Belief Circles

21. Theater of Memory

Congrats to NimbleVR

I had the opportunity to meet Rob Wang, Chris Twigg and Kenrick Kin of 3Gear several years ago when I was in San Francisco demoing retail experiences using the Microsoft Kinect and Surface Table at the 2011 Oracle OpenWorld conference. I had been following their work on stereoscopic finger and hand tracking using dual Kinects, and sent them what was basically a fan letter; they were kind enough to send me an invitation to their headquarters.

At the time, 3Gear was sharing office space with several other companies in a large warehouse. Their finger-tracking technology blew me away, and I came away with the impression that these were some of the smartest people I had ever met working in computer vision and with the Kinect. After all, they’re basically all PhDs with backgrounds at companies like Industrial Light & Magic and Pixar.

I’ve written about them several times on this blog and nominated them for the Kinect v2 preview program. I was extremely excited when Chris agreed to present at the ReMIX conference some friends and I organized in Atlanta a few years ago for designers and developers. Here is a video of Chris’s amazing talk.

Bringing 'Minority Report' to your Desk: Gestural Control Using the Microsoft Kinect - Chris Twigg from ReMIX South on Vimeo.

Since then, 3Gear has worked on the problem of finger and hand tracking on various commercial devices in multiple configurations. In October of 2014 the guys at 3Gear launched a Kickstarter project for a sensor they had developed called Nimble Sense. Nimble Sense is a depth sensor built from commodity components that is intended to be mounted on the front of an Oculus Rift headset. It addresses the difficult problem of providing a good input device for a VR system that, by its nature, prevents you from seeing your own hands.

The solution, of course, is to represent the interaction controller – in this case the user’s hands – in the virtual world itself. Leap Motion, which produces another cool finger-tracking device, is also working on a solution for this. The advantage the 3Gear people have, of course, is that they have been working on this exact problem, with expertise in gesture tracking – rather than merely finger tracking – as well as visualization.


After exceeding their original goal in pledges, 3Gear abruptly cancelled their Kickstarter on December 11th, and the official 3Gear.com website I had been going to for news updates about the company was replaced.

This is actually all good news. Nimble VR, a rebranding of 3Gear for the Nimble Sense project, has been purchased by Oculus (which in turn, you’ll recall, was purchased by Facebook several months ago for around $2 billion).

For me this is a Cinderella story. 3Gear / Nimble VR is an extremely small team of extremely smart people who have passed on much more lucrative job opportunities in order to pursue their dreams. And now they’ve achieved their much deserved big payday.

Congratulations Rob, Chris and Kenrick!

Projecting Augmented Reality Worlds

[Image: WP_20141105_11_05_56_Raw]

In my last post, I discussed the incredible work being done with augmented reality by Magic Leap. This week I want to talk about implementing augmented reality with projection rather than with glasses.

To be more accurate, many varieties of AR experience are projection based. The technical differences depend on which surface is being projected on. Google Glass projects on a surface centimeters from the eye. Magic Leap is reported to project directly on the retina (virtual retinal display technology).

AR experiences being developed at Microsoft Research, which I had the pleasure of visiting this past week during the MVP Summit, are projected onto pre-existing rooms without the need to rearrange the room itself. Using fairly common projection mapping techniques combined with very cool technology such as the Kinect and Kinect v2, the room is scanned and appropriate distortions are created to make projected objects look “correct” to the observer.

An important thing to bear in mind as you look through the AR examples below is that they are not built using esoteric research technology. These experiences are all built using consumer-grade projectors, Kinect sensors and Unity 3D. If you are focused and have a sufficiently strong desire to create magic, these experiences are within your reach.

The most recent work created by this group (led by Andy Wilson and Hrvoje Benko) is a special version of RoomAlive they created for Halloween called The Other Resident. Just to prove I was actually there, here are some pictures of the lab, along with the Kinect MVPs amazed that we were being allowed to film everything, given that most of the MVP Summit involves NDA content we are not allowed to repeat or comment on.

[Image: WP_20141105_004]

[Image: WP_20141105_016]

[Image: WP_20141105_013]

 

IllumiRoom is a precursor to the more recent RoomAlive project. The basic concept is to extend the visual experience on the gaming display or television with content that responds dynamically to what is seen onscreen. If you think it looks cool in the video, please know that it is even cooler in person. And if you like it and want it in your living room, then comment on this thread or on the YouTube video itself to let them know it is definitely a minimum viable product for the Xbox One, as the big catz say.

The RoomAlive experience is the crown jewel at the moment, however. RoomAlive uses multiple projectors and Kinect sensors to scan a room and then uses it as a projection surface for interactive, procedural games: in other words, augmented reality.

A fascinating aspect of the RoomAlive experience is how it handles appearance-preserving, point-of-view-dependent visualizations: the way objects need to be distorted in order to appear correct to the observer. In the Halloween experience at the top, you’ll notice that the animation of the old crone looks like it is positioned in front of the chair she is sitting on, even though the projection surface actually extends partially in front of the chair back and, for the shoulders and head, several feet behind it. In the RoomAlive video just above, you’ll see the view-dependent distortion occurring with the running soldier changing planes at about 2:32.

 

You would think that these appearance-preserving PDV techniques would fall apart anytime there is more than one person in the room. To address this problem, Hrvoje and Andy worked on another project that plays with perception and physical interactions to integrate two overlapping experiences in a wizard-battle scenario called Mano-a-Mano or, more technically, Dyadic Projected Spatial Augmented Reality. The globe visualization at 2:46 is particularly impressive.

My head is actually still spinning following these demos and I’m still in a bit of a fugue state. I’ve had the opportunity to see lots of cool 3D modeling, scanning, virtual experiences, and augmented reality experiences over the past several years and felt like I was on top of it, but what MSR is doing took me by surprise, especially when it was laid out sequentially as it was for us. A tenth of the work they have been doing over the past two years could easily be the seed of an idea for any number of tech startups.

In the middle of the demos, I leaned over to one of the other MVPs and whispered in his ear that I felt like Steve Jobs at Xerox PARC seeing the graphical user interface and mouse for the first time. He just stroked his beard and nodded. It was a magic moment.