The rationale for and the interplay between strategic digital trends
For a while now I have been talking about strategic digital trends, specifically about their impact and how they influence businesses and organizations. Strategic trends are long-term, society-scale trends that affect individuals and groups as well as businesses. With regard to digital, these strategic trends are commonly described loosely under the term “Digital Transformation”.
When looking at these trends, people tend to look at each of them in a vacuum. In my view it is also important to understand how individual trends interact with each other and how they change the landscape as they scale throughout society. For that it is also relevant to understand the key drivers behind why these trends are happening in the first place and what might accelerate their adoption.
So here is a collection of the most known / relevant strategic digital trends and how I view them. This is not meant to go into detail (it’s long enough already), but to explain what they are, what they enable and how they interact with each other. Read it more as a cheat-sheet than as an article. If there is interest in a specific trend, I might write a detailed follow up — please feel free to reach out.
The Internet of Things
The Internet of Things as a concept means adding network connectivity to physical objects. This might be through a mobile network (anything from GSM up to 5G), through a local cable-based or wireless network, or the object could be paired to another device that provides network connectivity.
The digitally augmented object is able to send data, usually telemetry data that it collects through a number of sensors. One typical use case is to send telemetry data about the current state of the object and its components, for example the number of rotations inside an engine or the current brightness of a dimmable LED light. Another use case is to send data about the environment around the object, for example the outside temperature, ambient light or humidity.
The object is also able to receive network data, usually to allow remote control of its features and functionality. A well-known example of this would be a smart light bulb that can be controlled via a smartphone or tablet.
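The two directions described above (telemetry out, commands in) can be sketched as follows. The class name, fields and JSON payloads are purely illustrative assumptions, not a real IoT protocol or product API:

```python
import json
import time

# Hypothetical sketch of a digitally augmented light bulb.
class SmartBulb:
    def __init__(self, device_id):
        self.device_id = device_id
        self.on = False
        self.brightness = 0  # percent

    def telemetry(self):
        """Outbound direction: report the object's current state as telemetry."""
        return json.dumps({
            "device_id": self.device_id,
            "timestamp": time.time(),
            "on": self.on,
            "brightness": self.brightness,
        })

    def handle_command(self, payload):
        """Inbound direction: remote control of features via network data."""
        command = json.loads(payload)
        if "on" in command:
            self.on = command["on"]
        if "brightness" in command:
            # Clamp to the physically meaningful range.
            self.brightness = max(0, min(100, command["brightness"]))

# A smartphone app would send a command; the bulb reports its new state.
bulb = SmartBulb("bulb-42")
bulb.handle_command('{"on": true, "brightness": 80}')
state = json.loads(bulb.telemetry())
```

In a real deployment the payloads would travel over a transport such as MQTT or HTTP, but the split into telemetry and commands stays the same.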
Note that this concept is not new. Medical equipment has been digitally augmented for decades, with vital sign monitors being able to share their data across devices and alert doctors in case of medical emergencies. More commonly, heating systems in houses and office buildings have been connected for over 20 years as they started collecting data from sensors to regulate the temperature across the property. Of course those kinds of systems started out rather large and expensive. What’s new is that smaller and cheaper objects are now connected in the same way.
Key drivers for IoT
While connected devices enable convenient and safe scenarios, the Internet of Things is driven as much by economics as it is by convenience. In fact, it is now cheaper to integrate more technology into objects than less. The rule of scale dictates that the more of an item is produced, the cheaper it gets. To achieve maximum scale, technology components are used not only in one product category, but are shared between them. That’s why technology giants like Samsung offer such a wide array of devices, from smartphones to TVs, fridges and other household appliances. If a component becomes obsolete in one product category, it will be replaced in all categories to again achieve maximum scalability.
Recently a product category emerged that drove miniaturization and integration of capabilities to the next level: smartphones. As smartphones became ubiquitous, they gained more abilities: touchscreens, cameras, fingerprint sensors, Wi-Fi and Bluetooth as well as increasingly faster mobile network connectivity. The technology also got smaller and lighter with lower power consumption. Finally, smartphones spawned mature operating systems, distribution platforms and developer ecosystems.
If a manufacturer wants to develop any kind of physical product that is somehow electrical, they will most likely look into smartphone technology as existing off-the-shelf components. Let’s look at printers for example. A cheap, modern printer will feature a color control screen, probably even a touch screen. It will print, fax and copy, be accessible via network plus USB and have an army of other neat little features. Instead of developing and testing new hardware components, developing or adapting an operating system and app environments, manufacturers would buy smartphone components and get most of the functionality “for free” — a smartphone is already able to do everything that a printer would need to do.
Smartphone components are already produced at scale, readily available at low prices through a mature supply chain. As a result they are used in everything electrical, and thus fridges and ovens are now able to surf the web and send notifications to our phones.
At this point it could be argued that nobody needs a connected fridge, connected oven, connected chair, connected coffee mug or connected toothbrush. But this is beside the point — not embedding digital abilities into objects will be more expensive than doing so. Non-digital objects will be pushed into the realm of focused professional niches and artisan products. Thus the assumption is: Everything that can be digital will be digital. Everything that is digital will be more digital. This will continue to the point where every object is digitally connected.
Machine Learning & Cognitive Services
Cognitive services are a specific subclass of machine learning algorithms that provide intelligence-like capabilities for sensors. For example, they can take microphone input and use voice recognition to determine if a user is talking and possibly identify the user. They provide speech analysis to turn the voice recording into text and then language understanding to extract the topics and intents behind the words. Cognitive services also provide object recognition for still pictures and video. They are able to identify objects, animals and people as well as their poses, activities and sentiments. They also analyze any kind of written text or signage. In other words: They are a collection of services to turn raw input from different kinds of sensors into actionable information.
Cognitive services are usually offered as a cloud-based family of services. Manufacturers and service providers can utilize these services and integrate them into their products. As a result customers can for example use natural language to talk to their smart lights, music systems, TVs or game consoles, which in turn use cognitive services to analyze these requests and turn them into actionable commands that they can react to. Security cameras use cognitive services to differentiate between the pet running around in the house and a stranger.
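The pipeline described above (raw sensor input to text to intent) can be sketched in miniature. The speech-to-text step is stubbed out and the keyword rules are illustrative assumptions; real cognitive services run these stages as cloud-hosted models:

```python
# Minimal sketch of the cognitive-services pipeline: audio -> text -> intent.
def speech_to_text(audio):
    # Stand-in for a cloud speech-recognition call; here the "audio"
    # already carries its transcript for demonstration purposes.
    return audio["transcript"]

def extract_intent(text):
    # Toy language-understanding step: map keywords to actionable intents.
    text = text.lower()
    if "light" in text and ("on" in text or "off" in text):
        return {"intent": "toggle_light", "on": "on" in text and "off" not in text}
    if "play" in text:
        return {"intent": "play_music"}
    return {"intent": "unknown"}

def understand(audio):
    # The full chain a smart light or speaker would delegate to the cloud.
    return extract_intent(speech_to_text(audio))

result = understand({"transcript": "Turn the living room light on"})
```

A real service would of course use trained models rather than keyword matching, but the contract is the same: raw input goes in, an actionable command comes out.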
The assumption is: Everything digital and connected will be able to utilize cognitive services to extend its abilities and understand what we say and do, as well as its immediate environment.
Key drivers for Cognitive Services
One goal of the Internet of Things is to reshape the way humans interact with their physical environment. It might be economical for watches, fridges, ovens, toothbrushes or light switches to evolve into digitally enhanced things with smartphone-like capabilities, but from a user experience point of view it’s just not practical. Most objects have very reduced and focused user interfaces that have been optimized for their original purpose and will not be able to control the full set of digitally augmented abilities. For example a simple on/off light switch will not necessarily be able to reflect the full capabilities of modern light bulbs, which are also able to regulate their brightness or even color.
Some objects might have enough surface area to act as a digital screen, for example walls, tables and windows. Adding a screen to a light switch, however, would essentially turn it into a smartphone. While this might work in some cases, not every object has the space for an additional screen and it’s just not practical to make every object more complex.
One way to solve this is to offload additional interfaces to a remote control. This might be a dedicated remote or a form of second screen, usually a smartphone or some form of control hub. While this will work, it is very tedious for users to pull out their smartphone, open the respective app and press a button just to turn on their oven. Users want direct control of the device they are using, even in their digitally enhanced state.
The goal of cognitive services is to create more relatable interfaces by using natural ways of interacting with objects that otherwise have no obvious interface. They turn connected objects into smart devices by providing the “smarts” in the first place and allowing new ways to control digital aspects and abilities of objects.
Edge Computing
The intelligent edge as a concept allows digitally augmented objects to largely operate without being reliant on the cloud. Instead, sensory data can be analyzed and processed directly on the device itself. That said, edge devices are not strictly offline — they have a symbiotic relationship with the cloud. Objects would interact with cloud services whenever possible and necessary, while not being dependent on them. They would operate “on the edge of the cloud”.
In practice that means that edge devices can for example directly run the machine learning models created by cognitive services. This increases the speed by which the devices can react and allows them to operate in environments with bad or no internet connection.
While the devices are able to operate independently, they are still bound to a central service. If for example the manufacturer or service provider issues an update of the machine learning models in the cloud, the devices would update their local versions whenever possible / as soon as they have network bandwidth available. Likewise the devices would send the telemetry data they collected while offline.
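The behavior just described (infer locally, queue telemetry while offline, sync both directions when connectivity returns) can be sketched like this. The class, the single-threshold "model" and the cloud dictionary are illustrative assumptions:

```python
# Sketch of the edge pattern: local inference plus opportunistic cloud sync.
class EdgeDevice:
    def __init__(self):
        self.model_version = 1
        self.threshold = 0.5       # stand-in for a locally deployed ML model
        self.pending_telemetry = []

    def infer(self, sensor_value):
        # Local inference works with or without a network connection.
        return "alert" if sensor_value > self.threshold else "ok"

    def record(self, sensor_value):
        # Telemetry is queued until bandwidth is available.
        self.pending_telemetry.append(sensor_value)

    def sync(self, cloud):
        # Symbiotic relationship: push queued telemetry, pull model updates.
        cloud["telemetry"].extend(self.pending_telemetry)
        self.pending_telemetry = []
        if cloud["model_version"] > self.model_version:
            self.model_version = cloud["model_version"]
            self.threshold = cloud["threshold"]

device = EdgeDevice()
device.record(0.7)
result_offline = device.infer(0.7)   # old model: 0.7 > 0.5, so "alert"
cloud = {"telemetry": [], "model_version": 2, "threshold": 0.9}
device.sync(cloud)
result_online = device.infer(0.7)    # updated model: 0.7 <= 0.9, so "ok"
```

The important property is that `infer` never touches the network; only `sync` does, and the device keeps working identically when `sync` cannot run.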
Key drivers for Edge Computing
As with IoT, the driver for edge devices is the rule of scale, combined with Moore’s law. Gordon Moore, co-founder of Intel, observed that the number of transistors in a dense integrated circuit doubles about every two years. Simplified, it means that processor speeds and overall processing power for computers roughly double every two years. It also means that the required physical size roughly halves for a given performance load. As a result more computing power is automatically pushed to the edge.
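As a back-of-the-envelope reading of that observation, a doubling every two years compounds quickly:

```python
# Moore's law as simple arithmetic: doubling every two years.
def moores_law_factor(years, doubling_period=2):
    """Growth factor in transistor count after the given number of years."""
    return 2 ** (years / doubling_period)

growth_decade = moores_law_factor(10)   # five doublings: a 32x increase
```

By the same logic, a workload that needs a rack of servers today fits a device a fraction of that size a decade later, which is exactly why edge hardware keeps absorbing previously cloud-only workloads.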
As an analogy: In the 60s and 70s the size and cost of computers were very prohibitive. As a result users used computer terminals as a cheap interface to access computing power. Instead of being a computer itself, a terminal would only provide input and output capabilities (usually a keyboard and monitor) while the actual computation would be done on a “real” mainframe computer elsewhere. Over time terminals gained computing power of their own and eventually evolved into personal computers, replacing mainframes for almost every use case.
Currently most connected devices are similar to terminals, as their target cost and size prevent the desired (and usually very computing-intensive) machine learning models from being run locally. As time progresses, cost and size barriers will inevitably fall, and the rule of scale dictates that newer hardware will be able to run previously impossible workloads.
Bots as Conversational Interfaces
Bots allow digital and connected objects to engage in natural language conversations. They are built upon cognitive services, which turn the user request into intents and sentiments. The bot would then search an internal database of topics and processes to select the most relevant response. Cognitive services are then leveraged again to turn the result into a natural language response.
A simple comparison is a dynamic decision tree that users navigate using natural language, for example via text messages or voice. This is a very efficient way to handle common or predictable transactions or information requests.
Let’s look at a really simple example conversation between a user and a bot that represents a hotel chain:
- User: “Hi, I would like to book a room in Paris from the 27. — 29. of October.”
- Bot: “Sure! Here are the rooms available in Paris for that period: …”
The conversation above is comparable to what can be expected from a phone-based decision tree, however taking a couple of shortcuts. The first sentence already contains what the user wants to do (“Book a room”), the location (“Paris”) and the date (“27. — 29. of October”), so the bot can skip asking for these individually and is able to continue with presenting specific options right away.
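This shortcut behavior is usually implemented as slot filling: the bot only asks for the data points the opening sentence did not already provide. The slot names and the toy keyword extractor below are illustrative assumptions, not a real bot framework:

```python
# Sketch of slot filling: skip questions whose answers were already given.
REQUIRED_SLOTS = ["action", "location", "dates"]

def extract_slots(utterance):
    # Stand-in for cognitive services turning text into intents and entities.
    slots = {}
    text = utterance.lower()
    if "book a room" in text:
        slots["action"] = "book_room"
    if "paris" in text:
        slots["location"] = "Paris"
    if "october" in text:
        slots["dates"] = "27.-29. of October"
    return slots

def missing_slots(slots):
    # These are the questions the decision tree still has to ask.
    return [s for s in REQUIRED_SLOTS if s not in slots]

slots = extract_slots("Book a room in Paris from the 27. to the 29. of October")
to_ask = missing_slots(slots)   # empty: the bot can present options right away
```

With a sparser opening such as just “Book a room”, the same loop would fall back to asking for location and dates one by one, exactly like a phone-based decision tree.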
Thus bots are very good at handling transactions, specifically one-time transactions with optional persistence between them. So when ordering another hotel room using the same bot, it might remember the user’s preferences and act accordingly.
Bots are also good at negotiating group consensus. Imagine the above example, but in a group chat where three friends would discuss the best hotel options, moderated by the bot. The complexity of a bot can be scaled easily, however their purpose usually stays very narrow. In a way they are very much like compartmentalized, specialized smartphone or web applications.
Key drivers for Bots
Most interactions between users and businesses are based on defined processes: Booking a hotel or means of transportation, buying goods, negotiating appointments, even answering service requests. All of those cases are usually modeled as transactions on websites or apps, even offline in print forms or automated phone systems. In those cases all relevant information is separated into required and optional data points that have to be entered individually by users according to the defined process.
However in conversations we are used to clustering multiple data points together, as seen above. This is not only more efficient, but also feels more relatable and pleasant to use as it mimics the normal “mode of operation” for (enabled) humans — it’s “how we talk”. So businesses are not only interested in efficiency (handling more requests), but also in strengthening the relationships with their customers by making their brand more approachable and pleasant to interact with.
Digital Twins
One aspect of an IoT device is to collect data about itself and about the context it is operating in. A sensor that measures how many people walk in and out of a store measures something external, while other sensors measure internal data points, for example how long they have been operating, their internal temperature and other telemetry data about their operability. In effect they create a digital likeness of themselves, describing their current state. Paired with enough knowledge about the static properties of the object (size, weight and so on), this creates an accurate digital replica, directly linked to the object’s physical form. This concept is called a digital twin.
As the digital twin is the aggregation of sensory data, changes in the physical object or its environment immediately affect the digital representation. Likewise, as digitally augmented objects accept remote commands to trigger their physical functions, changes in the digital representation affect the physical object as well.
The collected data can be stored and analyzed later. This allows a deeper look into how individual objects behave over time: “How did an individual machine behave at peak load times vs. idle times?” “Did the performance or operational capabilities degrade over time?” This also allows looking at multiple objects in a physical environment: “Are the performance metrics of one machine comparable with others in the same location?” “How do machines in one factory / location compare to the same / similar machines in another?” This makes it possible to differentiate between correlation and causation in an environment by looking at all the data points and understanding the interdependencies.
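A digital twin that supports this kind of question can be sketched as static properties plus a telemetry history. All names and numbers are illustrative:

```python
# Sketch of a digital twin: static properties plus aggregated sensor history.
class DigitalTwin:
    def __init__(self, static_props):
        self.static = static_props       # size, weight, model, ...
        self.history = []                # (load_level, temperature) samples

    def update(self, load_level, temperature):
        # Each telemetry message extends the twin's view of the object.
        self.history.append((load_level, temperature))

    def avg_temp(self, load_level):
        # "How did the machine behave at peak load vs. idle times?"
        temps = [t for l, t in self.history if l == load_level]
        return sum(temps) / len(temps) if temps else None

# Two twins of the same machine type in different locations.
machine_a = DigitalTwin({"model": "X1", "weight_kg": 350})
machine_b = DigitalTwin({"model": "X1", "weight_kg": 350})
for twin, peak_temp in ((machine_a, 80), (machine_b, 95)):
    twin.update("idle", 40)
    twin.update("peak", peak_temp)

# Fleet comparison: machine B runs hotter at peak than its sibling.
runs_hotter = machine_b.avg_temp("peak") > machine_a.avg_temp("peak")
```

The comparison at the end is the seed of the correlation-vs-causation analysis mentioned above: once many twins of the same type exist, outliers become visible.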
Key drivers for Digital Twins
Digital twins are a direct result of adding central management systems to IoT and edge devices. It would indeed be more effort to prevent the creation of digital twins than to utilize them. They provide huge returns on investment almost without additional costs and thus supercharge the adoption of IoT and edge computing.
One important scenario that digital twins enable is predictive maintenance. If all objects of a specific type are connected and transmit their operational telemetry data, they create a database of behaviors — their usage patterns, performance, maintenance cycles and outages. At some point this database has enough data points to be able to make statistically significant predictions: “Other machines of the same type with the same usage pattern required maintenance about now.” Systems like this can also calculate the likelihood for incidents like material fatigue or other potential outages in industrial environments, which is even more valuable.
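A minimal version of such a prediction can be sketched as a lookup against the fleet's maintenance history. The data points and the simple nearest-neighbour rule are illustrative assumptions, not a production model:

```python
# Sketch of predictive maintenance from fleet telemetry.
FLEET_HISTORY = [
    # (operating_hours_at_maintenance, average_load) for past machines
    (900, 0.8),
    (1500, 0.5),
    (2000, 0.3),
]

def predicted_maintenance_hours(avg_load):
    """Estimate when maintenance is due by finding the historical machine
    with the most similar usage pattern (here: average load)."""
    closest = min(FLEET_HISTORY, key=lambda rec: abs(rec[1] - avg_load))
    return closest[0]

# A machine under high average load is predicted to need maintenance earlier.
hours_high_load = predicted_maintenance_hours(0.75)
hours_low_load = predicted_maintenance_hours(0.35)
```

Real systems replace the nearest-neighbour lookup with statistical or machine learning models over many more telemetry dimensions, but the principle is the same: the fleet's history predicts the individual machine's future.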
Digital replicas can also be used for simulation purposes, often to understand behavior in complex systems. Such simulations contain a number of digital twins, representing real objects, which are placed into a simulated scenario like a city, a factory or a private home. Because digital twins are for all intents and purposes real “actors” within the scenario, they are able to realistically simulate behavior, especially interdependencies between individual objects. The automotive industry uses simulations like this to train and test models for autonomous driving. They place a vehicle consisting of a number of digital twins representing the IoT / edge devices responsible for its behavior in different scenarios and environments. Additional twins of objects (like road signs or traffic lights) or agents (people and animals) would populate these simulations and provide the models things to react to.
In industrial settings, simulations like this are used to test malfunctions and how redundancy systems would behave, or in training to let participants train on digital representations that behave like the real objects would, but without potential real world risks.
Computational Graphs
Computational graphs provide context how objects, people and environments relate to each other. In a way, they are digital twins for everything that is not an object: Processes, systems and interactions. They provide the glue between objects and the implicit and explicit rules in which they operate.
Computational graphs are also not new. The semantic web has tried to bring together information and how it relates to each other almost since the beginning of the world wide web. Especially with the rise of social platforms, computational graphs have been used to map relationships between people and groups.
In terms of things, computational graphs model how individual objects relate to each other. This might be a connection (“Object A is physically connected to object B in this manner”), how they work together (“Object A, B, C and D together create machine X”) and how they depend on each other (“Object A needs object B to operate”).
With people, graphs provide information about the state of a person (“busy”, “bored”, “talking”, “happy”), their contacts and how they relate to each other (“brother”, “son”, “friend”, “colleague”), the messages they exchanged and their current sentiment, the documents they have been working on alone or together, the objects they used or have access to, the places they are, have been or will be.
In terms of environments, graphs group together places spatially (This is a country, a city, a district, block, house, room), but also logically (this is mine, ours, what I want to visit).
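The three kinds of relations above (things, people, environments) fit the same underlying structure: nodes connected by typed edges. The relation names and node identifiers below are illustrative:

```python
from collections import defaultdict

# Sketch of a computational graph: typed edges between arbitrary nodes.
class Graph:
    def __init__(self):
        self.edges = defaultdict(list)   # node -> [(relation, other_node)]

    def relate(self, a, relation, b):
        self.edges[a].append((relation, b))

    def related(self, a, relation):
        # Query: which nodes does `a` have this relation to?
        return [b for rel, b in self.edges[a] if rel == relation]

g = Graph()
g.relate("object_a", "connected_to", "object_b")    # things
g.relate("object_a", "part_of", "machine_x")
g.relate("alice", "colleague_of", "bob")            # people
g.relate("alice", "works_in", "room_21")            # environments
g.relate("room_21", "inside", "building_7")

colleagues = g.related("alice", "colleague_of")
```

Objects, people and places all live in one graph, which is what lets later queries cross those boundaries ("which rooms do Alice's colleagues have access to?").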
Ultimately, the purpose of computational graphs is to provide access to deep insights, generated from usage patterns. Similar to digital twins, this will allow smart systems to find similarities and proactively generate rich, personalized scenarios based on the collected data.
Key drivers for Computational Graphs
Even if things become smart and people interact with them in a natural, intuitive way, they would still not be personal per se. They would have generic abilities and allow scenarios that fit the average user, but would not necessarily be tailored to specific individuals. There is a need to add a personalization layer to digital dimensions that knows the user, their context and their specific needs and relationships with their environments.
Let’s consider such a personalized scenario:
As the user gets up in the morning, their fitness band and bed would be aware that they have not slept well. Not only did they sleep for just 5 hours in total, but the sleep quality was very low as well.
As they brush their teeth, the smart toothbrush would pick up indicators that they might be on the verge of getting a cold.
As they get to their car to get to work, the calendar reminds them that they have a very important meeting in 25 minutes.
Based on this data, the car would conclude that the user might feel stressed and exhausted and automatically choose a more comfortable seating position and select the music accordingly.
This scenario is based on multiple specialized objects providing information, which other objects can then use to make decisions. Effectively multiple smart things can then work together semi-autonomously to deliver complex scenarios, turning smart (but individual) things into smart environments. But there is a catch: There needs to be a trusted platform that keeps track of the user, their context and relationships with the environment. It would contain their contacts, the messages they exchanged, the documents they have been working on alone or together, the objects they used or have access to, the places they are, have been or will be. A platform that all things could send information to and some trusted things can pull information from. This is the core need for computational graphs.
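The morning scenario above can be sketched as exactly this publish/pull pattern: specialized objects push observations into a trusted context store, and the car pulls the aggregated context to make a decision. All signal names and the decision rule are illustrative assumptions:

```python
# Sketch of a trusted platform that things publish context to.
context_store = {}   # stand-in for the per-user computational graph

def publish(source, key, value):
    # Fitness band, toothbrush, calendar etc. each contribute one signal.
    context_store[key] = {"value": value, "source": source}

def stressed_and_exhausted():
    # A trusted consumer (the car) combines signals from multiple objects.
    sleep = context_store.get("sleep_hours", {}).get("value", 8)
    minutes = context_store.get("minutes_to_meeting", {}).get("value", 999)
    return sleep < 6 and minutes < 30

publish("fitness_band", "sleep_hours", 5)
publish("calendar", "minutes_to_meeting", 25)

# The car adapts seating and music based on the pulled context.
car_mode = "comfort" if stressed_and_exhausted() else "normal"
```

Note that no object talks to another directly; the platform mediates everything, which is what makes access control and trust enforceable in one place.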
Digital Assistants as Personal Interfaces
The best way to describe a digital assistant is: “A service that is aware of your context at all times and brokers communication between the user and digital services”. Essentially this takes a bot and adds more contextual awareness to it by using computational graphs. The result is a service that can abstract and automate interactions between the user and other less personalized digital services.
Revisiting the earlier bot example, the digital assistant would be aware of the user’s calendar appointment in Paris and then offer to contact the bot of a preferred hotel chain to see if they have a room available. It would hand over the relevant information to the bot (action, location and dates), pre-populating the bot’s decision tree with as much information as possible. Ideally, all the user would need to do is acknowledge that everything is correct and confirm the transaction.
The true value of digital assistants stems from discrete but expensive interactions — seemingly simple inquiries that consist of a number of steps and might involve extensive knowledge or number crunching. Let’s take a look at the following question.
“Hey [assistant], where can we go for something nice to eat?”
To begin answering the question the system first needs to unpack a number of things. Who is included in the relevant group of “we”? Then the system would have to gather everybody’s likes, dislikes and potential no-go’s in terms of food. Finally the system would have to understand what everybody’s understanding of “nice” is. If such information is not available, the system might reach out to the users or their digital assistants to get it. With that, it would conduct a search of available options and make a choice based on the requirements. Each individual step is not necessarily hard, but the combination might be time consuming and involve a lot of communication.
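The chain of steps just described (resolve the group, gather preferences, filter, rank) can be sketched directly. All of the data and the ranking rule are illustrative assumptions:

```python
# Sketch of answering "where can we go for something nice to eat?".
group = ["alice", "bob", "carol"]          # step 1: who is "we"?
preferences = {                            # step 2: likes and no-go's
    "alice": {"likes": {"italian", "thai"}, "no_go": {"fast food"}},
    "bob":   {"likes": {"thai", "indian"},  "no_go": set()},
    "carol": {"likes": {"thai", "italian"}, "no_go": {"indian"}},
}
nearby = {"thai", "indian", "fast food", "burgers"}   # step 3: search results

def group_choice(group, preferences, options):
    # Drop anything that is a no-go for any member of the group.
    acceptable = set(options)
    for person in group:
        acceptable -= preferences[person]["no_go"]
    # Rank the rest by how many people actively like it.
    return max(acceptable,
               key=lambda option: sum(option in preferences[p]["likes"]
                                      for p in group))

choice = group_choice(group, preferences, nearby)
```

Each step is trivial in isolation; the assistant's value lies in having the group membership and everybody's preferences already on hand, so the whole chain runs without a single round of back-and-forth messages.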
So digital assistants are about handling chains of transactions, based on deep knowledge of the user’s context, which evolves and refines over time. They might learn general preferences and habits. Their purpose is general and all-encompassing, so in a way they are a monolithic application, however one that can command and control the existing army of specialized bots and services and thus has access to their capabilities.
In a way, bots are a conversational API and digital assistants are the applications consuming them.
Note that neither bots nor digital assistants are artificial intelligences. Understanding the intent or context behind a natural language request does not equal intelligence. The “intelligence” of a bot is determined by its capability to take the recognized intent of a user and provide relevant insights accordingly. That could mean answering a question correctly, adding new arguments to a discussion or generally carrying the conversation onwards in a meaningful way.
Key drivers for Digital Assistants
Ultimately digital assistants become brokers for private information, which makes trust the most important factor. This might disqualify a lot of the service providers offering bots and enable other players that are focused on handling personal data and trust to establish themselves as the central instance handling users’ data.
Currently every app, platform and service is collecting information about their users and is using the data as it sees fit. User location is usually known to Google, Facebook, Apple, Foursquare and others. Their daily routine and meetings are known by Facebook and Google, with professional aspects also known by Microsoft. Users are already giving out their data to numerous parties, in many cases without realizing it. Digital assistants are an explicit concentration of information, which as a result can be locked down and controlled more easily by users.
Mixed Reality
Mixed reality as a concept aggregates all kinds of approaches to let users experience digital environments in the real world. Most prominent are visual devices like virtual or augmented reality glasses, however there are also devices for other senses, like headphones, gloves and other tactile augmentations.
The spectrum (also known as the reality-virtuality continuum) within mixed reality ranges from total replacement of senses to optional augmentation of specific aspects. Here is a list of the most prominent modes, from least invasive to complete replacement:
- Physical Reality: The state of things as they “actually exist” through our human senses without any technology. Example: Driving a car in “real life”
- Annotated Reality: Using devices to provide information alongside the physical reality, usually context-aware to some extent. This can be done via screens, headphones or similar. This is sometimes also called “Second Screen”. Example: Using a GPS navigation device or head-up display to support wayfinding.
- Augmented Reality: Technology that seamlessly incorporates digital information into the respective sense, thus combining the digital and physical reality into one coherent perception. Example: Seeing directions integrated into the view of physical reality.
- Augmented Virtuality: Artificially created sensory experiences of people, environments and objects, which can include sight, touch, hearing, and smell, completely replacing the original sensory input while matching the physical reality. Example: Replacing visual perception with digital input, where the simulation matches reality.
- Virtual Reality: Artificially created sensory experiences of people, environments and objects, which can include sight, touch, hearing, and smell, completely replacing the original sensory input. Example: Replacing visual perception with digital input to transport you from your living room to a race track.
While the distinction between these “modes” makes sense from an experience point of view, the technology will eventually be able to switch between them. So for example glasses or headphones will eventually be able to block out external input as well as pass through external input or become transparent. However specialized solutions will remain as a niche segment.
As a result users are able to inject purely digital elements seamlessly into the physical world, creating a believable mixed reality, which is the sum of both. Eventually users will be able to control the level of digital input, ranging from no digital augmentation to fully replacing all input.
Key drivers for Mixed Reality
With mixed reality, digital screens and interfaces will follow your world, extending and changing as they are needed. This will continue to the point where they are not screens anymore, but fully volumetric augmented experiences, which will seamlessly integrate into the real world environment. Compare this to the mobile phone, which we use the same way: We take it out whenever we need it to augment our daily routine with communication, wayfinding, information and other things. But instead of having an extra screen, imagine the phone’s output as an overlay over your natural vision. As mobile phones became ubiquitous in our lives, mixed reality will do the same thing.
If smart physical environments with digital abilities and behaviors exist, then users need to be able to immerse themselves in this physical-digital hybrid world in a tangible and very relatable way. As these systems interact with environments, personal scenarios might have inadvertent effects on the physical reality, for example allowing or blocking access to rooms or buildings. Thus it is vital that users understand their impact in these digital-physical environments. And this brings us to mixed reality.
If all the above is about extending the physical world with digital abilities, behaviors, entities and capabilities, mixed reality allows users to immerse themselves fully in this physical-digital hybrid world in a tangible and very relatable way. In that sense, mixed reality is a magic window into these digital layers, making them visible and actionable for users. It does not replace the real world input, but augments it, weaving things that do not exist into the user’s model of reality.
All these trends will converge and lead to a connected world, where the infrastructure as well as the inhabitants are digitally augmented. However as processing power, storage and bandwidth increase, there are fewer limits on how many digital dimensions can exist at the same time. This means that there will be not only one connected world, but many parallel ones. While the physical reality stays (more or less) fixed, the digital overlay can change as easily as switching TV channels. Some of those channels might be educational, some inspirational, some might feature entertainment, others would be for work and so on.
Depending on the currently selected channel, the usage, behavior and capabilities of physical locations might change as well. A chair is a chair and not a chair as it might be augmented or adapted to be something else entirely in some digital dimension. Access to capabilities might be controlled via interfaces in a digital dimension that only authorized users can see and inhabit. Let’s consider a simple scenario of a workspace:
As an employee enters the work space, her AR glasses would change the scenario to a work one, albeit personalized. The walls would be covered with relevant general news, context and inspiration that can be expanded with a gesture.
The personal digital assistant also changes into work mode, feeding her the most important priorities of the day. New inquiries are triaged by the assistant and discussed with her so she has an outlook of how the day will play out and she can brief the assistant to negotiate any changes.
As she starts working on a specific project, she switches the digital channel. The wall closest to her changes into a huge whiteboard and prominently features insights related to the project. The formerly empty table now contains a mix of physical and purely digital tools for manipulating and interacting with it. As a shared environment this is also available to colleagues within the same project. Personalization options tailor details to each individual employee.
As more colleagues start working on the project, the room fills with either real persons or digitally projected ones. The latter work remotely but are projected into the workspace, just as the people in the office are projected into theirs. This creates a seamless blend of the team, working side by side.
As she eventually moves on to another project, she switches project spaces again, adapting to different spaces, tools and tasks.
If physical reality can change completely for each individual, we can only differentiate between “shared versions of reality” and “personal versions of reality”, both of which can be manipulated digitally at will. It’s all about enhancing us with a digital layer that we can act and live in.
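To make the idea of parallel digital dimensions a bit more concrete, here is a minimal toy sketch in Python. All names (`Channel`, `PhysicalSpace`, the sample overlays) are hypothetical illustrations, not any real platform’s API: the physical inventory stays fixed, while each channel carries its own overlays and its own access list.

```python
from dataclasses import dataclass, field

@dataclass
class Channel:
    name: str
    shared: bool                                   # shared vs. personal version of reality
    authorized: set = field(default_factory=set)
    overlays: dict = field(default_factory=dict)   # physical object -> digital augmentation

class PhysicalSpace:
    def __init__(self, objects):
        self.objects = objects                     # fixed physical inventory
        self.channels = {}

    def add_channel(self, channel):
        self.channels[channel.name] = channel

    def view(self, user, channel_name):
        """What a given user perceives when tuned to a channel."""
        channel = self.channels[channel_name]
        if not channel.shared and user not in channel.authorized:
            # Unauthorized users fall back to plain physical reality.
            return {obj: None for obj in self.objects}
        return {obj: channel.overlays.get(obj) for obj in self.objects}

space = PhysicalSpace(["wall", "table", "chair"])
space.add_channel(Channel("work", shared=False, authorized={"alice"},
                          overlays={"wall": "project whiteboard",
                                    "table": "digital tools"}))

print(space.view("alice", "work"))   # alice sees the augmented workspace
print(space.view("bob", "work"))     # bob sees only the physical objects
```

The same physical room can host any number of such channels; switching channels swaps the overlay map while the `objects` list never changes.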
The assumption is: This offers an experience comparable to living in a hyper-personalized theme park, where every experience caters to each individual’s needs, feelings and dreams while connecting them to others around them, either through magical physical things or through purely virtual magic.
Here is the “index card” summary of the strategic digital trends discussed. Again, if there is interest in discussing a specific trend further (for example the social / cultural aspects and interplays), I might write a detailed follow up — please feel free to comment or reach out. Take care.
Internet of Things
What it is: Adding network connectivity to physical objects.
What it enables: Everything that can be digital will be digital. Everything that is digital will be more digital. This will continue to the point where every object is digitally connected.
Machine Learning
What it is: Providing intelligence-like capabilities for sensors.
What it enables: Everything digital is able to connect to digital services that “make sense” of sensory data, adding the “smarts” to smart objects.
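As a rough illustration of “making sense” of sensory data, here is a deliberately simple sketch (plain statistics, not a real ML pipeline): it flags telemetry readings that deviate strongly from the recent baseline, the kind of judgment a smart object would otherwise lack. The function name and sample data are made up.

```python
from statistics import mean, stdev

def find_anomalies(readings, window=5, threshold=3.0):
    """Return indices of readings far outside the recent baseline."""
    anomalies = []
    for i in range(window, len(readings)):
        baseline = readings[i - window:i]
        mu, sigma = mean(baseline), stdev(baseline)
        if sigma > 0 and abs(readings[i] - mu) > threshold * sigma:
            anomalies.append(i)
    return anomalies

# Engine rotation telemetry with one suspicious spike.
rpm = [1500, 1502, 1498, 1501, 1499, 1500, 4200, 1503]
print(find_anomalies(rpm))  # -> [6]
```

A real deployment would use a learned model rather than a fixed threshold, but the shape is the same: raw sensor stream in, interpretation out.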
Edge Computing
What it is: Digitally augmented objects can largely operate without relying on the cloud.
What it enables: Everything digital is going to be mobile, even in situations and environments with no or limited internet access.
Builds on: IoT, ML
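The edge pattern can be sketched in a few lines of Python. Everything here (`EdgeDevice`, the brightness threshold) is a hypothetical illustration: decisions are made locally with no cloud round trip, and telemetry queues up until connectivity returns.

```python
class EdgeDevice:
    def __init__(self):
        self.online = False
        self.outbox = []       # telemetry waiting for connectivity
        self.uploaded = []     # stands in for the cloud backend

    def process(self, reading):
        # The decision is made on-device, even while offline.
        decision = "dim" if reading > 800 else "hold"
        self.outbox.append((reading, decision))
        if self.online:
            self.sync()
        return decision

    def sync(self):
        # Opportunistic upload once a connection is available.
        self.uploaded.extend(self.outbox)
        self.outbox.clear()

device = EdgeDevice()
device.process(900)        # works offline, decision made locally
device.online = True
device.process(200)        # now also flushes the queued telemetry
print(device.uploaded)
```

The object keeps functioning with no or limited internet access; the cloud only ever sees a (possibly delayed) copy of what already happened at the edge.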
Bots
What it is: Digitally augmented objects can engage in natural language conversations.
What it enables: Everything digital is able to understand what we say and do and able to talk or otherwise naturally interact with us in return.
Builds on: ML
Digital Twins
What it is: An accurate digital replica of a digitally augmented object, directly linked to its physical form.
What it enables: Everything physical also has a digital layer, and the two can be used interchangeably: changes in the physical form translate into the digital dimension and vice versa.
Builds on: IoT
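The two-way link of a digital twin can be sketched as follows. This is a minimal toy model, not any real twin platform’s API; the class and field names are invented for illustration.

```python
class DigitalTwin:
    def __init__(self, object_id):
        self.object_id = object_id
        self.state = {}

    def on_telemetry(self, key, value):
        # Physical -> digital: sensor updates mirror into the twin.
        self.state[key] = value

    def set_target(self, key, value):
        # Digital -> physical: a change to the twin becomes a command
        # for the real object.
        self.state[key] = value
        return {"object": self.object_id, "command": {key: value}}

twin = DigitalTwin("led-light-7")
twin.on_telemetry("brightness", 40)          # reported by the device
command = twin.set_target("brightness", 80)  # pushed back to the device
print(twin.state, command)
```

The point is the symmetry: the same `state` is written from both directions, which is what makes the physical and digital layers interchangeable.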
Graphs
What it is: A personalization layer that knows about the user, their context, their specific needs and their relationships with their environment.
What it enables: Everything digital benefits from understanding personal context, making sure that it interprets the current situation correctly and does not act against the interests of the user.
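Such a personalization layer is essentially a graph of typed relationships. A minimal sketch, with entirely hypothetical node and relation names:

```python
from collections import defaultdict

class ContextGraph:
    def __init__(self):
        self.edges = defaultdict(list)   # node -> [(relation, node)]

    def add(self, subject, relation, obj):
        self.edges[subject].append((relation, obj))

    def related(self, subject, relation):
        return [o for r, o in self.edges[subject] if r == relation]

graph = ContextGraph()
graph.add("alice", "works_at", "office-3")
graph.add("alice", "prefers", "warm lighting")
graph.add("alice", "owns", "led-light-7")

# A service consults the graph before acting on Alice's behalf.
print(graph.related("alice", "prefers"))
```

Services query the graph instead of guessing, which is how they avoid acting against the user’s interests.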
Digital Assistants
What it is: Handling chains of transactions based on deep knowledge of the user’s context, which evolves and refines over time.
What it enables: Everything digital is part of a personalization layer that reactively and proactively tries to adapt the digital and physical reality to our benefit.
Builds on: Bots, ML, Graphs
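The triage behavior from the workspace scenario can be sketched as a toy function: the assistant ranks incoming inquiries against what it knows about the user’s current priorities. The scoring scheme and sample data are invented for illustration; a real assistant would draw both from the personal graph.

```python
def triage(inquiries, priorities):
    """Order inquiries by how well they match the user's priorities."""
    def score(inquiry):
        return max((weight for topic, weight in priorities.items()
                    if topic in inquiry), default=0)
    return sorted(inquiries, key=score, reverse=True)

# Priorities the assistant has learned from the user's context.
priorities = {"project-x": 3, "budget": 2}
inquiries = ["lunch invite", "budget review", "project-x blocker"]

print(triage(inquiries, priorities))
```

From here, "handling chains of transactions" means the assistant not only ranks the inquiries but also negotiates follow-ups (scheduling, delegating, declining) on the user’s behalf.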
Mixed Reality
What it is: Allowing users to immerse themselves fully in physical-digital hybrid dimensions in a tangible and very relatable way.
What it enables: Every digital aspect of a thing or an environment of things can be experienced and manipulated directly, even without deep knowledge of the technologies involved.
Builds on: ML, Graphs, Twins
The Connected World
What it is: The seamless integration of all the above into one connected world, which can be split up into many parallel digital dimensions.
What it enables: Multiple digital dimensions exist in parallel, some of them shared, some private — some acting based on only one person, some acting for groups.
Builds on: ML, Graphs, Twins, MR