Object Detection using Microsoft's HoloLens

Animation for Computer Games (COMP 477) - Final Project

Introduction

The primary objective of this project was to detect a real-world object in the user's surroundings and project a holographic marker indicating that the object had been detected, using Microsoft's HoloLens.

HoloLens

Microsoft HoloLens

A key feature of the HoloLens is its native depth and surface detection functionality. This provided us with the tools necessary to place objects in the real world and have the HoloLens map them into memory, so that they could be interacted with in their holographic form. The Vuforia API gave us the ability to model holographic shapes for live interaction with the user, while also providing basic support for recognising objects in a 3D environment. Given these abilities, we were able to continuously poll the system for target objects to be detected in the real world. Once detected, an object would be marked with a holographic indicator, providing a visual cue for the user. In short, this application allows the user to interact with the real world in augmented reality.
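
To make this detect-and-mark loop concrete, the sketch below shows how such an indicator could be driven by Vuforia's trackable events in Unity. It is an illustrative sketch rather than the project's exact code: TargetMarkerHandler and markerPrefabInstance are hypothetical names, and the API shown follows the Vuforia 6.x Unity SDK that was current at the time.

```csharp
using UnityEngine;
using Vuforia;

// Sketch: toggle a holographic marker when a Vuforia target is detected.
// Attach to a GameObject carrying a TrackableBehaviour (e.g. an ImageTarget).
public class TargetMarkerHandler : MonoBehaviour, ITrackableEventHandler
{
    public GameObject markerPrefabInstance; // holographic indicator to show/hide

    private TrackableBehaviour trackable;

    void Start()
    {
        trackable = GetComponent<TrackableBehaviour>();
        if (trackable != null)
            trackable.RegisterTrackableEventHandler(this);
    }

    // Called by Vuforia whenever the tracking status of the target changes.
    public void OnTrackableStateChanged(TrackableBehaviour.Status previousStatus,
                                        TrackableBehaviour.Status newStatus)
    {
        bool detected = newStatus == TrackableBehaviour.Status.DETECTED ||
                        newStatus == TrackableBehaviour.Status.TRACKED ||
                        newStatus == TrackableBehaviour.Status.EXTENDED_TRACKED;

        // Show the holographic indicator only while the target is visible.
        if (markerPrefabInstance != null)
            markerPrefabInstance.SetActive(detected);
    }
}
```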

Microsoft and Unity have collaborated to provide developers of Mixed Reality applications with the tools to efficiently design and create for the HoloLens [1]. Given this compatibility and the lack of other options, using Unity to develop our Augmented Reality application was a necessary decision.

Objectives

The opportunity and challenge of learning to use a new technology like the HoloLens is what motivated the team to choose this idea for the project. The HoloLens is paving the way for mixed reality applications, and the team was excited to have the opportunity to design an application that showcases its functionalities.

Since none of the team members had any experience with or knowledge of the interaction between the Unity engine and Microsoft's HoloLens, they agreed that taking on this challenge would be a very interesting and educational experience. By creating an interactive HoloLens application that integrates the real world and the virtual world, they hoped to gather knowledge and experience that would make them better programmers and software engineers. To that end, the team set the following objectives:


  1. Configure and calibrate HoloLens controls
  2. Install and configure HoloLens-Unity development tools
  3. Research the HoloLens' native depth perception and surface detection capabilities
  4. Research HoloLens functionality to track points in 3D world space
  5. Research object detection functionality
  6. Integrate library & assets (to complement native HoloLens functionality)
  7. Implement basic skeleton HoloLens application (complete basic pipeline)
  8. Implement object detection and hologram placement with HoloLens
  9. Create application that tracks object in 3D world space with HoloLens (and dynamically mark it with a holographic indicator)
  10. Research how to record a playthrough video of what a HoloLens user is experiencing

Report

The first step in the project was getting familiar with the HoloLens. This was a significant challenge given the very recent launch of the HoloLens in 2016 and the 2017 update of Windows 10 to include Windows Holographic [3]. Given the continuous development of the API and frequent updates to the system, configuring and calibrating the controls was a difficult and ongoing task. Because access was restricted to one device per team, the team also had to research a way to emulate the HoloLens on each member's computer. This was of the utmost importance given the time restrictions and the constant need for retesting after the device's frequent updates.


HoloLens-emulator

HoloLens Emulator [4]


The team opted to install and configure a recently released HoloLens emulator on each workstation. Another challenge was the HoloLens' strong dependence on the Windows 10 operating system, which meant the entire team had to upgrade their working computers to the latest version of Windows. This included setting up a lab computer at school in order to meet, test, and later demo the application. As this was a school computer lab, there were challenges associated with bureaucracy, as well as limitations associated with a large public network. One notable example is that a device in developer mode cannot connect directly to the Concordia network, as it is not considered a secure device [4]. Thus, it was necessary to pair the HoloLens to a computer on a private network so that applications could be deployed directly to the device through a device portal [7].

The installation and configuration of the HoloLens-Unity development tools streamlined the collaborative work, giving the entire team a medium in which to collaborate on the project simultaneously and remotely. The primary difficulty was the numerous configurations necessary to have a working interface that emulated the HoloLens functionality. Given the team's inexperience with the Unity development software and the various configuration settings required, producing a working setup that functioned on the HoloLens was a daunting task, and it required familiarity with the terms and notions packaged with the device's development process. It should be noted that while the emulator was invaluable, making it possible to work without the HoloLens itself, it was also a source of new difficulties. For example, the deployment process for the HoloLens [5] was not identical to that of the emulator [6], so both deployment processes had to be implemented for the development pipeline to function.
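
As a sketch of what the Unity side of that pipeline involves, the editor script below exports the Visual Studio (UWP) solution that is then deployed from Visual Studio to either the device or the emulator. The scene path, output folder, and menu entry are hypothetical; the API names follow the Unity 5.x-era editor API in use at the time.

```csharp
using UnityEditor;
using UnityEngine;

// Sketch: export the UWP/Visual Studio solution for HoloLens from the editor.
// The generated solution is then deployed from Visual Studio to the physical
// HoloLens or to the emulator.
public class HoloLensBuild
{
    [MenuItem("Build/Export HoloLens Solution")]
    public static void Export()
    {
        EditorUserBuildSettings.wsaSDK = WSASDK.UWP; // target the Universal Windows Platform

        string[] scenes = { "Assets/Scenes/Main.unity" };  // hypothetical scene path
        BuildPipeline.BuildPlayer(scenes,
                                  "Builds/HoloLens",       // output folder for the VS solution
                                  BuildTarget.WSAPlayer,
                                  BuildOptions.None);
        Debug.Log("Exported UWP solution; deploy from Visual Studio to Device or Emulator.");
    }
}
```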


device-portal

HoloLens Device Portal [7]


Once this initial barrier had been surpassed, the next task was to research how to access the HoloLens' native depth perception and surface detection functionality. Although the documentation was clearly written, a recent update had left it outdated, so it merely provided the team with a direction and little else to work with. This meant many hours of searching on forums for answers to the subtleties of the system; thankfully, the community was very helpful. The first test was to follow a tutorial provided by Microsoft that allowed the user to interact with a series of holograms and place them on real-world surfaces by interacting with the virtual mesh mapped to the surroundings. One noticeable problem was the HoloLens' technical limitation in capturing proper detail in unfavorable settings, such as dark surfaces in less-than-ideal lighting. These hardware limitations made it apparent that the team would have to approach their goals with a different outlook on what would be achievable.
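
The surface-placement technique from that tutorial can be summarized in a short script: cast a ray along the user's gaze and move a hologram to wherever the ray hits the spatial-mapping mesh. The sketch below is a simplified illustration, not the tutorial's code; the "SpatialMapping" layer name is an assumption about project setup.

```csharp
using UnityEngine;

// Sketch: place a hologram on a real-world surface by raycasting the user's
// gaze against the spatial-mapping mesh. Attach to the hologram to be placed.
public class TapToPlace : MonoBehaviour
{
    void Update()
    {
        // The HoloLens head pose is exposed through the main camera's transform.
        Vector3 headPosition = Camera.main.transform.position;
        Vector3 gazeDirection = Camera.main.transform.forward;

        RaycastHit hit;
        int spatialLayer = 1 << LayerMask.NameToLayer("SpatialMapping"); // assumed layer
        if (Physics.Raycast(headPosition, gazeDirection, out hit, 10.0f, spatialLayer))
        {
            // Snap the hologram onto the detected surface, oriented to its normal.
            transform.position = hit.point;
            transform.up = hit.normal;
        }
    }
}
```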

The next step involved researching the ability to track a basic 2D image in real time using the HoloLens. This involved identifying feature points: features unique to each image, typically consisting of high-contrast edges. It is important that these feature points are not significantly affected when viewed from different angles. After researching how to store images to be identified in the world on the HoloLens, the team managed to scan several images and store their feature points and related information in a database, making them accessible for target recognition. This provided the ability to identify the target images in the world.


feature-points

Feature Points (Photo Credit: Anastasiia Bobeshko)
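
In Unity, a database of stored targets like this is typically activated at runtime through Vuforia's tracker API. The following sketch shows the general pattern under the Vuforia 6.x Unity SDK; the dataset name "OurTargets" is hypothetical, and the exact on-disk path can vary by SDK version.

```csharp
using UnityEngine;
using Vuforia;

// Sketch: load and activate a Vuforia target database at runtime so that its
// stored feature points become available for recognition.
public class LoadTargetDatabase : MonoBehaviour
{
    void Start()
    {
        ObjectTracker tracker = TrackerManager.Instance.GetTracker<ObjectTracker>();

        DataSet dataSet = tracker.CreateDataSet();
        if (dataSet.Load("OurTargets"))          // hypothetical dataset name
        {
            tracker.Stop();                      // the tracker must be stopped to swap datasets
            tracker.ActivateDataSet(dataSet);    // make the stored feature points available
            tracker.Start();
        }
        else
        {
            Debug.LogError("Failed to load target database.");
        }
    }
}
```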


Using DroidCam [8], an Android smartphone application that emulates a webcam, together with the HoloLens Emulator, the team was able to emulate the HoloLens' camera and capture a live feed for the application's feature-point detection. This was especially useful for working on the project without direct access to the HoloLens and was an important factor in speeding up development and testing. Several regression tests were formulated, with the lab machine kept on the most recent stable version as a fallback for debugging odd behaviours that were not always easy to reproduce. After much testing with DroidCam, the team came to hypothesize that there was a limited distance at which objects are consistently and reliably recognized. Several tests on all four smartphone devices and on the HoloLens itself confirmed this hypothesis. This is likely a direct result of the lighting within the room and the limited resolution of the HoloLens' camera.
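
As an illustration of how such a distance limit could be quantified (not necessarily the team's exact method), a small script can log the distance between the HoloLens camera and a target whenever the target is active; the "target" reference is a hypothetical field pointing at the tracked object's transform.

```csharp
using UnityEngine;

// Sketch: log the camera-to-target distance while a target is being tracked,
// to help characterize the range at which recognition remains reliable.
public class TrackingDistanceLogger : MonoBehaviour
{
    public Transform target;   // hypothetical: the tracked target in the scene

    void Update()
    {
        // Vuforia deactivates the target GameObject when tracking is lost, so
        // activeInHierarchy doubles as a "currently tracked" check here.
        if (target != null && target.gameObject.activeInHierarchy)
        {
            float distance = Vector3.Distance(Camera.main.transform.position,
                                              target.position);
            Debug.Log("Tracking distance: " + distance.ToString("F2") + " m");
        }
    }
}
```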

The final stages consisted of transferring the knowledge gained in detecting a 2D image and applying it to detect 3D objects in the world, allowing for more interesting interactions between the HoloLens and the real world. This meant the team needed a way to scan a 3D object and store its feature points. After several hours of trial and error, the team was able to capture the necessary still images of the targets, identify the appropriate feature points, and store the related data in a database for later use in recognizing those targets. Using the many distinct features identified in those pictures, an adequate mapping of the points could be produced, which gave a means to detect the object in real time.

Showcase

The team opted to use Block Tech, a Lego clone, to develop object targets with robust features. The capacity to rapidly build a new object target and test how well the application recognized it was immensely useful. After several hours of researching, building, and testing different object targets, the team found that objects with continuous colors or repeating patterns did not make good test cases, because their feature points resembled one another or did not offer enough distinct edges to build upon. Thus, although the aesthetics of the object targets were initially a priority, this clashed with the need for robust feature points. In the end, the object targets with the most robust feature points were selected, as seen in the image below:


targets

Targets: Tank (left) and Castle (right)


As a stretch goal, the team decided to create a form of interaction between the user and the 3D objects in the real world. This was done by having a holographic indicator appear over an object, highlighting it and giving the user access to abilities for playing out a scenario of a tank laying siege on a castle. Making use of some Unity tutorials, the team applied rudimentary physics to simulate an instantiated projectile accelerating toward a target and colliding with it for further interaction. The castle's position can be tracked in the 3D world using the HoloLens' native abilities, and the tank can rotate and fire towards it using basic transformations. Finally, basic holographic indicators above the two objects display their health, allowing for a winning scenario for the user.
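
As an illustration of how the voice-driven firing shown in the demos below can be wired up, the sketch here combines Unity's built-in KeywordRecognizer (available on Windows/UWP, and hence on the HoloLens) with a Rigidbody projectile. TankFireController, projectilePrefab, and muzzle are hypothetical names, not the project's actual code.

```csharp
using UnityEngine;
using UnityEngine.Windows.Speech;

// Sketch: fire a physics-driven projectile on the voice command "Fire".
public class TankFireController : MonoBehaviour
{
    public GameObject projectilePrefab;  // shell prefab with a Rigidbody and Collider
    public Transform muzzle;             // spawn point at the end of the barrel
    public float launchForce = 300f;

    private KeywordRecognizer recognizer;

    void Start()
    {
        recognizer = new KeywordRecognizer(new[] { "Fire" });
        recognizer.OnPhraseRecognized += OnPhraseRecognized;
        recognizer.Start();
    }

    private void OnPhraseRecognized(PhraseRecognizedEventArgs args)
    {
        if (args.text == "Fire")
        {
            // Instantiate a shell and let Unity's physics accelerate it toward
            // the target; collisions are handled by the target's own collider.
            GameObject shell = Instantiate(projectilePrefab, muzzle.position, muzzle.rotation);
            shell.GetComponent<Rigidbody>().AddForce(muzzle.forward * launchForce);
        }
    }

    void OnDestroy()
    {
        if (recognizer != null && recognizer.IsRunning)
            recognizer.Stop();
    }
}
```

The same pattern extends to the "Reset Scene" command described in the gameplay demo: register a second keyword and reload or reposition the scene objects in its handler.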


Castle

Video recorded through the HoloLens Mixed Reality Device Portal, featuring the castle rotating and the HoloLens tracking its feature points in real time.


Tank

Video recorded through the HoloLens Mixed Reality Device Portal, featuring the tank rotating and the HoloLens tracking its feature points in real time.


Castle Shooting Tank

Video recorded through the HoloLens Mixed Reality Device Portal, featuring the castle shooting at the tank and the tank's health depleting to 0.


Gameplay Demo

Video recorded through the HoloLens Mixed Reality Device Portal, featuring the tank shooting back at the castle on user voice command "Fire", the castle's health depleting to 0 and the user resetting the scene using voice command "Reset Scene".

Team

Members                     Roles
Jhayzle Ruth Arevalo        Web Development
Marc-Antoine Jetté-Léger    Application Development
Jordan Sen                  Research & Development
Jennifer Sunahara           Hardware Configuration