Understanding and assessing AR headsets

Dirk Songuer
10 min read · Jan 28, 2020

A couple of weeks ago I wrote an article to help businesses and organizations understand the terminology and challenges of creating digital product configurators. It was well received, and readers started reaching out to discuss other digital trends. A lot of questions were about assessing the potential of new Augmented Reality technologies, specifically headsets, probably driven by the huge number of announcements during CES 2020.

So let’s dive into AR a little bit to understand the terminology and challenges around the topic, with the goal of better assessing the potential of individual solutions.

First, the usual disclaimer: While the views in this article are mine, please be aware that I work at Microsoft on IoT and Mixed Reality things.

What are we talking about?

As the field of Augmented Reality is very large, I first want to set some boundaries for this article to keep it manageable. When I use the term “Augmented Reality” here, I mean that a user is provided computer-generated information alongside their perception of reality. In other words: “I am experiencing the world and I not only sense my environment, but also additional computer-generated information related to that environment.”

Humans experience the world via their senses, and in principle every one of those can be digitally augmented: screens & glasses augment the visual sense, headphones the audible sense, gloves the haptic sense, and so on. While every type of augmentation is interesting, I want to focus here on the visual (and to an extent the audible) sense, specifically using Augmented Reality headsets. I will not talk about smartphone-based AR (not a headset) or Virtual Reality (not AR).

Understanding device classes and modes

One challenge when talking about Augmented Reality is that devices can take very different approaches to showing data to a user. These can be roughly aggregated into three categories, or modes: annotation, unification, and replacement.

Currently these approaches manifest themselves as device classes, as they have to deal with different technical challenges (see below). But really they are operational modes that achieve different things in different situations and scenarios. So it’s important to note that no device class / mode is strictly better than the others; they just have different optimal use cases.

I believe that in the future these device classes will converge and one device will be able to utilize every mode, based on which provides the best user experience in a given situation. However, until the technology has caught up, they will remain separate.

Mode 1: Second screens aka Heads-up Displays aka Annotated Reality

The simplest mode of AR shows information alongside physical reality. One way to think about this is a smartphone screen permanently floating in the viewport of the user.

While the visual output is separated from the physical environment, the key is some form of contextual awareness. A system used for navigation might utilize GPS and a compass to know where it is and provide relevant information about the physical environment on the screen (“You are here and this is what’s around you”). That way, although the screen is separated from the environment, there is a contextual connection between the two.

Get Directions [through Google Glass], Google
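To make the contextual-but-not-spatial nature of mode 1 concrete, here is a minimal sketch in Python. It assumes a hypothetical navigation HUD that only knows the user’s GPS position and compass heading and decides which labels to print onto a fixed screen; none of this reflects the API of any particular product.

```python
import math
from dataclasses import dataclass

# Hypothetical mode-1 sketch: no spatial anchoring, just a fixed screen that
# shows labels for things roughly in front of the user, based on GPS position
# and compass heading.

@dataclass
class PointOfInterest:
    name: str
    lat: float
    lon: float

def bearing_to(user_lat: float, user_lon: float, poi: PointOfInterest) -> float:
    """Approximate compass bearing (degrees) from the user to a point of interest."""
    d_lon = math.radians(poi.lon - user_lon)
    lat1, lat2 = math.radians(user_lat), math.radians(poi.lat)
    x = math.sin(d_lon) * math.cos(lat2)
    y = math.cos(lat1) * math.sin(lat2) - math.sin(lat1) * math.cos(lat2) * math.cos(d_lon)
    return math.degrees(math.atan2(x, y)) % 360

def hud_annotations(user_lat, user_lon, heading_deg, pois, fov_deg=40):
    """Return the labels the heads-up display would show right now."""
    visible = []
    for poi in pois:
        # Signed angle between where the user looks and where the POI is.
        offset = (bearing_to(user_lat, user_lon, poi) - heading_deg + 180) % 360 - 180
        if abs(offset) <= fov_deg / 2:
            visible.append(f"{poi.name} ({offset:+.0f}°)")
    return visible

# Example: heading north, a café sits slightly ahead and to the right.
print(hud_annotations(52.520, 13.405, 0.0, [PointOfInterest("Café", 52.522, 13.406)]))
```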

Mode 2: Unified worlds aka Digital Windows aka Mixed Reality

Mixed Reality integrates digital aspects seamlessly into the physical environment. Instead of “screens in the air”, the digital elements are in and around physical objects. When done well, the user is not put off by the integration of digital and physical elements, and both feel equally natural and intuitive.

Such a system is also context aware, but beyond that it has spatial awareness. For example, a navigation system would use its understanding of the environment to highlight the actual lane you need to take, along with other relevant information.

AR HUD, Ceres Holographics, https://www.ceresholographics.com/ar-hud.html
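The difference to mode 1 can be illustrated with a small sketch: in a mixed reality system, content is given a pose in the world, and the device’s spatial tracking re-projects it into the user’s view every frame. The matrix math below is a generic look-at construction, not the API of any specific headset; the lane highlight and its coordinates are made up for illustration.

```python
import numpy as np

# Hypothetical mode-2 sketch: content is anchored to a pose in the world, and
# the headset's spatial tracking supplies a fresh head pose every frame. The
# renderer transforms world-anchored content into view space instead of
# drawing it at a fixed screen position.

def look_at_view_matrix(eye, target, up=np.array([0.0, 1.0, 0.0])):
    """Build a simple right-handed view matrix from a head pose."""
    forward = target - eye
    forward = forward / np.linalg.norm(forward)
    right = np.cross(forward, up)
    right = right / np.linalg.norm(right)
    true_up = np.cross(right, forward)
    view = np.eye(4)
    view[0, :3], view[1, :3], view[2, :3] = right, true_up, -forward
    view[:3, 3] = -view[:3, :3] @ eye
    return view

# A "lane highlight" anchored 5 meters ahead on the road surface (world space).
lane_anchor_world = np.array([0.0, -1.2, -5.0, 1.0])

# The head pose changes every frame; the anchor stays put in the world.
head_position = np.array([0.1, 0.0, 0.0])
view = look_at_view_matrix(head_position, head_position + np.array([0.0, 0.0, -1.0]))
print("Lane highlight in view space:", (view @ lane_anchor_world)[:3])
```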

Mode 3: World replacement aka Augmented Virtuality

This seems similar to mode 2, but it works the other way around: it integrates the physical world seamlessly into an artificial digital environment. In other words, the user is not able to see the outside world; instead, the physical environment is recreated in a context-specific way, matching physical reality where required.

holoride, https://www.holoride.com/

Understanding the challenges for AR devices

Let’s start with hardware considerations. From a technical point of view, AR hardware shares the usual challenges of mobile devices: performance, comfort and convenience. The more performance a device has, the more energy it requires and the more heat its components produce. Thermal management makes the device bigger and bulkier (heat sinks and active cooling), while bigger batteries make it heavier. However, compromising on any of those aspects leads to a sub-par device:

  • Compromising on thermals makes devices warmer. While fine for laptops, you do not want to wear something as hot as your laptop directly on your head.
  • Compromising on batteries makes the devices inconvenient to use when the operating time gets shorter than the desired usage scenario.
  • Compromising on performance makes the device too sluggish or unresponsive to use.

Unfortunately the technology used in AR headsets is so immature that EVERYTHING has to be a compromise. There is literally no way to create an “uncompromising” device, no matter how many resources a manufacturer has, as the technology is simply not there yet.
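A back-of-the-envelope calculation shows why. All numbers below are illustrative assumptions, not the specs of any real headset, but the shape of the trade-off holds regardless:

```python
# Back-of-the-envelope power budget for a hypothetical self-contained headset.
battery_capacity_wh = 16.3     # e.g. a ~4,400 mAh pack at 3.7 V
average_power_draw_w = 5.5     # SoC + displays + sensors + radios, sustained

runtime_h = battery_capacity_wh / average_power_draw_w
print(f"Estimated runtime: {runtime_h:.1f} h")        # ~3.0 h

# To reach a full 8-hour work day at the same power draw, the battery has to
# grow, and with it weight and volume (assuming ~250 Wh/kg lithium-ion cells):
target_runtime_h = 8
required_capacity_wh = target_runtime_h * average_power_draw_w
added_mass_g = (required_capacity_wh - battery_capacity_wh) / 250 * 1000
print(f"Battery needed: {required_capacity_wh:.0f} Wh "
      f"(~{added_mass_g:.0f} g of extra cells on your head)")
```

Lowering the power draw instead means giving up performance, and shrinking the thermal solution means a hotter device: every variable in this little budget is one of the compromises listed above.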

Form factor considerations

One way around this is to externalize the problem: put some of the components in an external enclosure or rely on another external system. Looking at the two competitors in the Mixed Reality category, Magic Leap decided to put all processing power into an external device called “Lightpack” that is connected to the headset (“Lightwear”) via cable, while Microsoft decided to go for a fully self-contained device.

Left: Magic Leap 1 — Right: Microsoft HoloLens 2

This way Magic Leap can use more powerful processors and bigger batteries, and tolerate more heat, as weight and heat end up on your hip, not on your face. The drawback is that the user needs to be more mindful when handling the device (there are dangling cables involved) and the overall weight is higher (headset + external processing pack). Overall this is not a bad solution, as long as these drawbacks are acceptable in your usage scenarios.

The form factor can even be configurable, where third party manufacturers can take a device and change the shell to their requirements. An example would be the HoloLens 2 Customization Program, which allows partners to customize a HoloLens 2 device. Further abstracted are hardware platforms, where the solution might not be a device, but rather base technologies that can be used by device builders to create their own solutions quickly. An example would be the Qualcomm Snapdragon XR2 platform.

Software platform considerations

There is also the software side. Assuming a manufacturer has created a device, what software platform are they using? None of the established mobile operating systems (iOS, Android) are really optimized for AR headsets. Sure, they can run AR applications and offer respective SDKs on smartphones, but that is very different from powering a dedicated headset — especially for modes 2 and 3. Android (AOSP) could be used as a base platform, for example, but it still requires a custom spatial UI, dedicated interfaces (hand tracking? gestures? voice?), an attractive development environment suited for spatial applications, security, remote management features, and so on. None of those things are mature for spatial use cases yet. In other words: it’s easy neither for developers nor for customers using a spatial device.

Audience considerations

That brings us to the purpose of the headset: Does it have a clearly defined use case and target audience or is it supposed to be general purpose?

Vuzix smart glasses, for example, are clearly targeted at enterprise scenarios. Their entire website is about field service, remote assist, manufacturing or warehouse solutions. It’s a professional tool you use at work, with no ambition to cater to mass market consumers or even early adopters.

Microsoft HoloLens 2 does the same thing. Microsoft calls it “The future of work”. While the branding is more fashionable and makes it look almost like a consumer device, the targeting is clearly enterprise and business.

Magic Leap 1 has always had more general purpose messaging. The product page itself is very modern and consumer oriented; the main website starts with “Magic Leap 1 for enterprise” before showing “Magic Leap 1 at home”.

A clear target audience is important because, with the hardware limitations as they are, it’s easier to develop devices around a clear use case. It’s also relevant because some use cases demand specific device considerations — for example safety certifications in certain enterprise environments like manufacturing or construction. It’s important not to evaluate every device based on its “mass market appeal”, but according to the specific use case and scenario.

Defining success

Defining success in the AR space can be hard. There are successful companies that focus on specific use cases, scenarios or verticals, are very profitable, and add tremendous value for a narrow niche of customers. This describes pretty much everybody in the enterprise space. Remember that due to technology limitations, every device has to be a compromise at the moment. It’s way easier to design a device for a very specific use case where you can build around the challenges, and some compromises are also more accepted: enterprise hardware is usually heavier and bulkier because it has to meet safety regulations.

With general purpose consumer devices, every compromise turns off yet another group of potential buyers.

To even get to a general purpose, consumer grade device, it has to meet roughly the following bar:

  • A price of 1,500 USD or less, comparable to super high end smartphones.
  • Production capacity of up to 15 million units per year, similar to the iPhone X in its first year as a high cost, high end smartphone.
  • Around 6–8 hours of battery life for a full day of use, assuming it’s worn and used during the commute, at work and for relaxation.
  • A weight of 200 grams / 7 oz or less, way heavier than regular glasses (around 20 grams / 0.7 oz) but lighter than a bike helmet. This is already stretching it for full day use, but acceptable if the ergonomics are done right.
  • Inclusive design for all kinds of head shapes, hairstyles and personal accessories, as well as for users with impaired vision or hearing.
  • Enough ruggedness to survive a fall or two, plus the usual mishandling by friends, kids and pets, and being tossed into a gym bag.
  • An active developer community, including variants of the most used smartphone apps, assuming the goal of a general purpose, consumer grade device is to replace the smartphone for most use cases.

That’s kind of the point, isn’t it? The endgame is to build the next smartphone-scale device, and the only chance of doing so is to replace the smartphone. To make a bold prediction, I believe that a true AR consumer device has to be mode 2. Mode 3 won’t be socially acceptable for a while (as it completely replaces vision), and mode 1 is just too similar to a smartphone to warrant a large scale switch to a new form factor.

In my view, the above is the minimum it would take to even have a chance at creating a large scale consumer AR product. But feel free to disagree and substitute your own criteria and thresholds for viable AR consumer devices.

Now, to be fair, there is an interesting alternative where AR headsets start as mode 1, augmenting the smartphone. In that case the smartphone becomes an external component of the glasses — a companion device. That was actually the original idea behind Google Glass. This strategy makes perfect sense if you’re an incumbent smartphone manufacturer: start with a companion device featuring basic functionality (similar to a smartwatch) that over time grows into a stand-alone product. This would not cannibalize their earnings, and they could build everything on their existing platform, gradually extending it. The bad news is that you’d need to be Apple, Google, Samsung or maybe Huawei to pull this one off.

Again, if your use case is not “replacing the smartphone” but a very defined vertical, success might be selling 10,000 units plus long term service and operations contracts. These kinds of devices might look strange at first, as they are usually highly optimized for that specific use case. Lugging a 3 kg / 6.6 pound device around might seem absurd, until you are a firefighter and the thing needs to take a beating and still work reliably in 300 °C environments.

Is it real?

This is probably the hardest one to answer. There are a lot of “future vision” videos and press releases that paint a very… optimistic picture of their solution. A prime example of this would be the original “one day” vision video by Google to promote Google Glass.

Google Glass Project — One day, Google, 2014

We learned very quickly that the actual Google Glass device was nowhere near that vision. Indeed, I would argue that neither the hardware nor the software of Glass did anything to advance Google towards the vision shown in the video. That said, I do believe it wasn’t Google’s intent to mislead customers, developers or investors. There is a need to create such visions to make the goals and ambitions of a solution tangible. However, it is sometimes hard to see behind the curtain and understand what is real (existing product) and what is not (vision). If in doubt, get a hands-on demo. And make sure it’s not a prototype or engineering sample, but the actual production model.

Summary: Understanding device potential

The above gives us a good framework for understanding and assessing the potential of specific AR headsets, summarized in the questions below (and sketched as a small checklist after them):

  • What mode is it? Is it a second screen, unified world or world replacement?
  • Is it self-contained or does it rely on other, external components? Is it customizable or a platform?
  • What software platform does it use? A new, self-made one, an adapted version of something existing, or a mature, common platform?
  • Does it have a specific use case or is it general purpose? Likewise: Does it have a specific audience or is it fully consumer grade? Does it pass your threshold for viable consumer devices or the regulatory requirements for an enterprise device?
  • And the most important question: Does it actually exist?
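To apply these questions consistently across several devices, they can be written down as a simple checklist. The sketch below is my own encoding of the framework; the field names and the sample entry are illustrative, not an official taxonomy:

```python
from dataclasses import dataclass, asdict
from typing import List

# A checklist encoding of the assessment questions above. Field names and the
# sample entry are illustrative, not an official taxonomy.

@dataclass
class HeadsetAssessment:
    name: str
    mode: str                    # "annotation", "unification" or "replacement"
    self_contained: bool         # or does it rely on an external pack / phone?
    customizable_or_platform: bool
    software_platform: str       # self-made, adapted, or mature common platform
    target_audience: str         # e.g. "enterprise", "consumer", "general purpose"
    meets_your_thresholds: bool  # price, weight, battery, certifications, ...
    actually_shipping: bool      # the most important question

def compare(devices: List[HeadsetAssessment]) -> None:
    """Print each assessment so devices can be compared side by side."""
    for device in devices:
        print(asdict(device))

# Usage: fill in one entry per device you want to assess, then compare.
compare([HeadsetAssessment(
    name="Example headset", mode="unification", self_contained=True,
    customizable_or_platform=False, software_platform="adapted",
    target_audience="enterprise", meets_your_thresholds=True,
    actually_shipping=True,
)])
```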

Now, what are you interested in? Do you want to assess a device based on its potential to be a consumer grade, multi-million unit seller? How well it fits a specific enterprise use case? Or which manufacturer has the most forward looking vision, regardless of their ability to execute? Based on that, define which answers you would prefer to see for the respective categories and then assess the device(s).

While not perfect, this should offer a robust model that allows you to assess the potential of different AR headsets and even compare devices and solutions.

Feel free to check, assess and discuss the solutions presented at CES 2020 to get a feel for it.

