Relationship Design is the profession of crafting and orchestrating the digital aspects of our lives into meaningful and personalized experiences. That, however, raises the question of what actually constitutes a "good" relationship between a human and a digitally augmented thing or environment, and how it could be measured.
A vision of relatable objects and systems, based on Star Trek TNG
In Star Trek TNG it is interesting to look at the level of intelligence the computer displays, especially when simulating virtual agents in the holodeck. The intelligence of these holographic agents fluctuates wildly, based on whatever fits the desired story line of the respective episode. But this might actually make sense. If we understand the Enterprise as a cloud data center and "the computer" as the main interface distributing computing power in a sensible manner, it would allocate only as few cycles as needed to fulfill the intended purpose.
In TNG episode 3x06, for example, we learn that simulating an agent with the full cognitive abilities of a research scientist takes a lot of energy cycles. Yet when the same program is run in episode 4x16, the agent doesn't even distinguish between individual users, causing an embarrassing moment for the owner of the holodeck program.
So we could assume that the computer reduces the cognitive cycles of the simulation down to a default when the program is not used as a scientific research partner. And we could also assume that the computer automatically decides the priority of every request any crew member makes.
Assuming there are reasons to limit distributed computing power in the future, we would see the same in real life: things might be superintelligent at one moment and downright stupid at the next. The design question for such systems then becomes: "How do we use the least amount of resources while staying relatable?"
And that will be one of the main design drivers for local, intelligent systems (like digital assistants or any digitally augmented thing) as well. But how do we measure “relatable enough”?
A model that measures the relationship between digitally augmented objects and users
Some aspects of a good relationship are objective and can be clearly defined, measured and evaluated. Looking at these aspects, they can be grouped into the overall notion of “utility”: How well the object or environment does what it is supposed to do according to the intentions of a specific user.
Other aspects are subjective, meaning they are hard to define, not easy to measure and potential results are open for discussion. These are about how relatable and trustworthy the object or environment is from the perspective of the user — how respectful it is.
Both utility and respect blend into each other and create secondary aspects, which are nevertheless important for the design process. The primary aspects are directly associated with utility (speed, accuracy, robustness) and respect (privacy, transparency, trust) and form the core product personality, while the secondary aspects (joy of use, novelty, attention, tonality and timing) form the character.
In practice, the primary aspects are largely defined by the brand and its values, while the secondary ones are shaped by the product itself and the aspired product promise.
Both also influence each other: The more a user respects an object, the more of its features and abilities are utilized. The more a user can rely on the utility of an object, the more they will feel inclined to respect it.
Note that while you can apply this model to digital assistants and conversational interfaces right now, it is meant to be forward-looking, toward a time when everything that can be digital, smart and autonomous will be.
Let’s dissect each of these aspects in detail and how we could measure them. While not perfect, it’s a starting point to better understand the user-object dynamics in a future, fully digitally augmented world.
Speed is the end-to-end time from the trigger to the executed action. For digital assistants this represents the entire journey through the pipeline, from recording the audio through speech-to-text, natural language understanding, business logic, natural language generation and text-to-speech, to responding to the user's intent. This also includes infrastructure times like device boot or wake-up: essentially, the time to readiness.
That said, speed also has a subjective component. For example, providing feedback during the processing of a request can positively impact the user's perception of speed. For text-based conversational interfaces this might be a "partner is typing" indicator, for voice-based systems an interjection like "Let me think" or "Hm…". The talking speed also influences the perception of speed, regardless of the actual waiting time.
Also, faster is not always better. If a system reacts too quickly, it can break user expectations just as much as reacting too slowly. In a sense it is like an uncanny valley that leads to mistrust if a system seems too quick, too smart, too all-knowing. So while measuring speed produces objective numbers, it is important to also establish a baseline range of acceptable results.
Levels of speed
Too effective: The system reacts too quickly to active requests or proactive cues, leading to negative acceptance as users perceive it as too intrusive.
Perceivably positive: The system is noticeably fast and indeed faster than most users expect, which is seen as positive and does not have negative effects or diminishing returns yet.
Expected: The system speed is in line with most users' expectations.
Perceivably negative: The system is noticeably slower than expected, but still within an acceptable range for most users to consider it useful.
Too slow: The system is noticeably too slow and users perceive it as tedious to use, leading to a drop in adoption.
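To make the levels above concrete, here is a minimal Python sketch that maps a measured end-to-end latency onto them. The threshold values are entirely hypothetical placeholders; real bands would have to come from user research for the specific product.

```python
# Hypothetical perception bands (upper bound in seconds -> level).
# These numbers are illustrative assumptions, not research results.
SPEED_BANDS = [
    (0.3, "too effective"),        # so fast it feels intrusive or uncanny
    (1.0, "perceivably positive"), # noticeably snappy, still comfortable
    (2.5, "expected"),             # in line with typical expectations
    (5.0, "perceivably negative"), # slow, but still acceptable
]

def speed_level(latency_s: float) -> str:
    """Map a measured trigger-to-action latency onto a perception level."""
    for upper_bound, level in SPEED_BANDS:
        if latency_s < upper_bound:
            return level
    return "too slow"
```

A product team would tune the band boundaries per modality, since a voice assistant and a text interface set very different expectations.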
Accuracy means how correctly the user's intent was fulfilled. Usually this is measured against the defined feature set of the system, where the feature requirements specify what the system is supposed to be able to do, with accompanying test cases to sign off that a feature works as intended.
This applies to reactive as well as proactive actions by the system and can be tested and verified at scale.
Note that accuracy does not measure what a system should be able to do and what users would expect from the system. It is purely meant to define and measure how well the system does what it is supposed to do, using the assumed way to do it. It “works as intended.”
Levels of accuracy
On point: The system does exactly what it was asked to do or what it should be doing in specific situations. The user perceives the request as fulfilled.
Arguable miss: While the system does something in line with what was asked, it might not do exactly what it was supposed to do. It delivers some intended value, but it is a failure in the strictest sense of the requirements.
Clear miss: The system does not do what it was supposed to, or does something completely different. It does not deliver any intended value to the user.
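As a sketch of how such sign-off could be automated, the following hypothetical grader compares an expected intent and slot values against what the system actually produced, returning one of the three levels above. The intent/slots dictionary structure is an assumption for illustration, not any real framework's API.

```python
def accuracy_level(expected: dict, actual: dict) -> str:
    """Grade one test case against the three accuracy levels:
    'on point'      - correct intent and all required slots match
    'arguable miss' - correct intent, but some slot values are off
    'clear miss'    - wrong intent, no intended value delivered"""
    if actual.get("intent") != expected.get("intent"):
        return "clear miss"
    slots_ok = all(actual.get("slots", {}).get(key) == value
                   for key, value in expected.get("slots", {}).items())
    return "on point" if slots_ok else "arguable miss"
```

Run at scale over a regression suite, the share of each level per feature gives the objective accuracy number the section describes.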
Robustness is how well an intent can be fulfilled when the request does not follow the defined way of triggering it. In other words: how well the system deals with a certain level of ambiguity.
Note that a good outcome does not necessarily mean that the intent was fulfilled, but rather that the system reacted well. For example, an acceptable reaction might be to ask for clarification, or some other way of failing gracefully that does not create a negative response from the users.
Levels of robustness
Fulfilled: The system is directly able to infer the correct intent based on the incomplete information.
Continued: The system understood the basic intent, however needs further clarification on details, and is able to ask for them in a way the user finds natural and “in flow”. Ideally the user does not even consciously notice the additional request.
Deflected: The request was too ambiguous for the system to even understand the general intent. It chooses either a generalized response or one at random that most users still find acceptable and will respond to with further clarification.
Aborted: The system does not understand the request and communicates that it will not consider it as valid input. The user is informed, however does not receive any actual value.
Betrayed: The system (for whatever reason) decides to abort the request and not react to it at all. Alternatively, it does something random and does not inform the user about it.
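One way the levels above could drive actual system behavior is a simple dispatch on the language understanding component's intent confidence and on missing slot values. Everything here is illustrative: the function, the threshold values and the slot model are assumptions, and "betrayed" is deliberately absent because silently dropping a request should never be a designed outcome.

```python
def robustness_outcome(intent_confidence: float,
                       missing_slots: list[str]) -> tuple[str, str]:
    """Decide how to react to a possibly ambiguous request.
    Returns (level, suggested reaction). Thresholds are hypothetical."""
    if intent_confidence >= 0.8 and not missing_slots:
        # Intent and details are clear despite incomplete phrasing.
        return ("fulfilled", "execute the inferred intent directly")
    if intent_confidence >= 0.8:
        # Intent is clear, details are not: ask "in flow".
        return ("continued", "ask for: " + ", ".join(missing_slots))
    if intent_confidence >= 0.4:
        # General intent unclear: fail gracefully, invite clarification.
        return ("deflected", "offer a generalized response")
    # Nothing understood: at least inform the user.
    return ("aborted", "communicate that the request was not understood")
```

The key design point the sketch encodes is that every branch informs the user somehow; there is no silent failure path.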
Privacy is about being in control of personal information, matters and relationships. It is a spectrum based on what a society perceives to be appropriate to share or be shared in specific contexts. Privacy goes hand in hand with transparency, where privacy covers the aspect that data is gathered and shared and transparency is about understanding what happened with that data after the fact.
Again, it is important to understand that no absolutes exist in terms of privacy. In some societies, government surveillance has become common in public spaces, to the point where the reasonable expectation is that everything that happens outside is in some form recorded and analyzed by government officials. In other societies it is common for private organizations to track everything that happens within a user's property, outside and inside, so there might be the expectation that specific roles or functions can invade personal privacy. The sharing of private information can also be seen as a trade, to enable certain scenarios or to offset the price of a product, where users "pay" by sharing information that enables targeted advertising, for example. Our notion of privacy changes as technologies and societies evolve.
This makes the levels of privacy highly dependent on the region and culture the product is introduced in. However, they can be abstracted to what the system records, when it records, and how much control the user has over the sensory abilities of the digitally augmented object.
Levels of privacy
Passive: All sensors are inactive or disconnected until the user explicitly activates them, for example via a physical button that provides power to the sensors.
Bonus: Sensors are only active for a short period of time that is required to understand requests and deactivate automatically afterwards.
Local: All sensor data used or collected by the device is handled locally and never shared with any external service. Note that for all intents and purposes the device can still be connected to the network to send and receive anonymized data.
Shared privately: Sensor data is shared with external services, but all data is encrypted on the device with the user's key. The service provider has no way to read the raw sensor data. Note that this has nothing to do with encryption on the transport layer; instead the actual sensor data and other content is encrypted, up to the point where even metadata is encrypted or anonymized.
Shared openly: Sensor data is shared unencrypted or can be decrypted and read / used by external services.
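The privacy levels above can be modeled as an ordered scale that the rest of the system consults before letting data leave the device. This is a hypothetical sketch: the enum member names paraphrase the levels (the second level is labeled ON_DEMAND here), and the gate function is an illustrative assumption, not a real API.

```python
from enum import IntEnum

class PrivacyLevel(IntEnum):
    """Ordered from most to least private, mirroring the levels above."""
    PASSIVE = 0           # sensors powered off until explicit activation
    ON_DEMAND = 1         # sensors active only while handling a request
    LOCAL = 2             # data processed on-device, never shared raw
    SHARED_PRIVATELY = 3  # shared end-to-end encrypted with the user's key
    SHARED_OPENLY = 4     # readable by external services

def may_upload_raw_sensor_data(level: PrivacyLevel) -> bool:
    """Raw, readable sensor data may only leave the device when the
    user has accepted open sharing; every stricter level forbids it."""
    return level == PrivacyLevel.SHARED_OPENLY
```

Because the scale is ordered, a product could also ship region-specific defaults while letting the user tighten the level at any time.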
Transparency is about how well the user understands what the device does: how it understands requests, which information it collects (which ties back to privacy above) and what happens with the collected information. It is also about how much control the user has over their personal data and the data they specifically create within the system.
Levels of transparency
Obscured: The system does not communicate or otherwise indicate which data points it collects, when it is collecting them and what it does with them.
Informed: The user knows which data points the system collects about them and when they are collected. However, the system is not transparent about the actual content of the collected information.
Bonus: The overview is easily understandable and accessible.
Aware: The user not only sees which data points are collected, but can see all collected data as the system processes and shares the information.
Bonus: This includes system metadata that the user did not explicitly share, but that was created by the system itself.
Bonus: The user can access the overview at any time with any device / input.
Empowered: The user can select and delete individual information from the system that they deem private and do not want to share. This might roll back any traces of the information within the system and change / anonymize certain ML models.
Entitled: The user can see who has access to this data: Systems, organizations and people. They might or might not be able to approve / block requests for their data individually.
Bonus: The system shows a log of who accessed the information and when.
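The access log behind the "Entitled" bonus could be sketched as a minimal append-only record of who touched which data point and when. This is an illustrative data structure under assumed names, not any existing system's API.

```python
import time

class AccessLog:
    """Append-only record supporting the 'Entitled' transparency level:
    the user can review every access to a given data point."""

    def __init__(self) -> None:
        self._entries: list[dict] = []

    def record(self, accessor: str, data_point: str) -> None:
        """Log that `accessor` (a system, organization or person)
        accessed `data_point` at the current time."""
        self._entries.append({"who": accessor,
                              "what": data_point,
                              "when": time.time()})

    def history(self, data_point: str) -> list[dict]:
        """Everything the user can review about one data point."""
        return [e for e in self._entries if e["what"] == data_point]
```

A real implementation would need the log itself to be tamper-evident and covered by the same privacy guarantees as the data it describes.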
Trust builds on privacy and transparency and extends them into an awareness of the implications any action of the system has, whether on the user, their peers and environment, or the world overall.
It is about understanding what the system actually does overall, which actions it took and why. Especially with proactive or rule-based systems it might not be obvious when and why an action was taken. Even if the user created the rule in the first place, they might have forgotten about it since and be surprised when it eventually triggers an action. Trust is broken if an action of the system has unintended consequences.
Levels of trust
Broken: The users cannot comprehend the actions of the system; to them it behaves erratically and unpredictably.
Explicit: The system relies on being transparent and private and on the user to verify its behavior and actions. It might include a history of actions and explain what triggered them.
Implicit: The system is proactively transparent about its actions as well as the triggers and original user intent behind them. The user can see a history of actions, understand what triggered them and directly change the triggers and intents if something unintended happened.
Love: At some point understanding the abilities and associated implications does not require constant reinforcement anymore. Levels of trust can be transferred from one object to another within the same product family, ecosystem or even manufacturer / brand.
Again, while not perfect, it’s a starting point to better understand user-object dynamics in a future, fully digitally augmented world. I do have levels for the secondary aspects as well, but for now I’d love to hear your perspective on the above metrics and if you have any models that you use yourself when designing digital assistants / digitally augmented objects.