Why recommendations are the Product Manager’s worst enemy

Odi Paneth

December 20, 2021 | 7 min read

Right before we start, TL;DR:

  • If you manage products that recommend anything to your user, this article will help you.
  • Recommendations are suggestions from the product to the user on how to make the most of the product.
  • Recommendations usually help our users by narrowing a selection and optimizing their workflows.
  • User trust and a sense of control are the two main factors we need to keep in mind while working on products that make recommendations.
  • When recommending, give more than one option, explain why you recommended this, allow the user to respond, and make the recommendation as human as possible.

The real issue

Product management has a real problem that we are not talking about: we hate giving recommendations in our product and even more than that, we hate explaining why we gave that recommendation.

Why am I saying this? Just a few days ago at a big product conference (online, of course), some colleagues in a panel got into a discussion about recommending the best course of action to our users. I was expecting the discussion to focus on recommended best practices, but it quickly escalated to what goes wrong when your product gives recommendations.

My colleagues spoke about backlash: “The customer did not follow our recommendation”, “The customer told us the recommendation is not accurate”, “The customer did not understand why I am recommending this”. And yet, the average user eventually tends to change their usual patterns based on the recommendation and gain more value; somehow, the NPS scores still come out lower than before.

I immediately remembered my experience at a past company and the hardship we had trying to build recommendation-based products (we referred to recommendations as “insights”) for bar managers.

But first, what problems do we even solve with recommendations?

Recommendations in a product are suggestions that the product produces (using a recommendation engine or any other tool to generate them). These recommendations can be the center of the product or a feature in a product. These could be created by extremely complex systems that customize the recommendation according to a lot of data or could be completely “dumb”.

In the B2C market, we are really familiar with selection recommendations. Netflix, for example, knows our preferences and shows us the most relevant choices according to our taste. Another example is Amazon’s “Frequently bought together” feature. These selection recommendations are usually meant to improve the user experience by focusing the selection according to the user’s taste. The product seems to predict what you want to see or need to buy and offers it to you. This recommendation type can be tested by NPS.

Amazon's "Frequently bought together" selection recommendation

In the B2B market, however, the situation is a bit different. Companies offer their clients tools that, if misused, would probably cost a lot of money. The recommendation solves a problem for users in their daily work. For example, campaigners in the advertising world run insight tools over their raw data to produce insights. When we act on an insight produced by a bot or an AI engine, we put a lot of trust in these tools to produce an equal or better outcome than what we could have produced ourselves. This is what we might call an optimization recommendation: we are attempting to optimize a variable (revenue, clicks, sign-ups, etc.) using data run through some kind of algorithm or AI-based engine. This recommendation type can be tested (mostly) by retention metrics.

There are other recommendation types, but these are the main two types. They can each appear in B2C, B2B, and internal tools.

Selection improves the user experience; optimization improves a working metric

Why do my recommendations keep getting rejected?

So, from my personal experience, this is all about trust. The user doesn’t necessarily trust you, the delivery method, the data, AI engines in general, or any other component in the system that might stick out.

From a personal perspective: at one of my past jobs, I tried to build an insight engine that would tell my user, a bar owner, what action to take regarding wasted beer (we used cool IoT devices to measure draft beer consumption). Even when we had strong confidence in an insight, the client rarely felt the same. Responses such as “I know my bar best” or “you don’t know how I run my business” came up a lot, and we had to use a human mediator to deliver the automatic insights. We found that the same insights, delivered by a human, were significantly better received and acted on by the bar owner.

Other PMs’ experiences were slightly different. A PM from a successful B2B startup offering predictive analytics for hardware maintenance mentioned that repetitiveness and false alarms were the biggest churn-driving and trust-damaging issues with their recommendations. User engagement dropped and support tickets soared after the second false alarm from their product, even though the product’s precision was much higher than human precision.

It’s not even just about the data: the slightest mistake can signal to the user that recommendation engines are (obviously) not perfect, and even visually broken recommendations or misspelled sentences can cause mistrust.

A missing logo got the recommendation rejected

The customer uses my recommendation, but NPS is still low

This is usually related to agency: the user’s ability to feel in control, powerful, and capable.

The first option is that “the machine” has taken the user’s agency in a way that makes the user feel powerless, as if they no longer rule their own destiny and have little or no control. Imagine a help desk agent assisting customers with their software issues being replaced by an AI that can understand the customer’s request very quickly, leaving the agent (the user of the engine) with only the tasks of writing the reply email and verifying the AI’s output. The agent would feel weak next to the recommendation engine and would most likely object to using it at all.

Another option is that the user cannot do anything with the recommendation: the user is powerless to react to the engine’s suggestion. For example, while working for a gaming app company, I tried to create an engine that could predict hardware issues that hurt the gaming experience. When suggesting that the gamer stop resource-heavy processes that were not game-related, we had to hide critical processes that hurt game performance but that the user could not shut down. If we had shown recommendations that the user could not act on (even if completely correct), the user would have felt a loss of agency and would not have continued to use the product.

What works

Try to show a selection of recommendations: this appears a lot in B2C. If you can, make sure you have more than two recommendations (but not too many!).

The weight of hitting a hole-in-one recommendation is massive; the more recommendations you add, the less weight a single recommendation carries. To keep a customer happy, you usually don’t have to be accurate in every single recommendation, just in one out of many.
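As a rough sketch of this idea (the scoring engine and data shapes here are assumptions, not any particular product's API), showing a small selection instead of a single bet can be as simple as taking the top few scored candidates:

```python
def pick_selection(scored_items, k=4):
    """Return a small selection of recommendations instead of one single bet.

    scored_items: list of (item, score) pairs from any recommendation engine.
    k: how many to show -- more than two, but not too many.
    """
    ranked = sorted(scored_items, key=lambda pair: pair[1], reverse=True)
    return [item for item, _ in ranked[:k]]
```

Only one of the k items has to land for the user to feel the engine “gets” them, which is why a small selection is usually safer than a single all-or-nothing recommendation.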

In B2C, a clear example is the “Recommended for you” list on YouTube. It is compiled from videos similar to what you are watching now, your history, and what other users who watched the same videos have liked. YouTube wants to increase your watch time, and offering a selection helps YouTube mitigate the risk of a single recommendation that hurts the user experience.

A lot of other video-playing websites (especially news-related ones) prefer auto-playing the next recommendation without any user choice, and with that, they are betting the session on one single video. We would call this a segue recommendation: the user is clearly happy with the last content viewed, and using the trust built by that content, the engine offers another related piece of content without stopping to get the user’s confirmation.

A selection recommendation by email from Amazon's Audible, using an author's name as a poster child

Create a “poster child” to explain your engine’s selection: some companies use the poster-child method, in which they show one easy-to-explain parameter that helped them recommend a specific choice.

For example, Amazon’s Audible sends weekly recommendation emails that start (in the title!) with “If you enjoyed [author’s_name]” and lead to a selection of similar books, hooking me from a poster child, a book I had already finished, into a new recommendation.

An email I received from Audible after finishing “The Way of Kings” by Brandon Sanderson using the author as a recommendation poster child.

The poster child is also used by the recruitment platform Gloat, which highlights certain parameters that led it to believe a job recommendation is a good fit for you, for example, “Join coworkers from Overwolf, who are there right now”. Gloat mitigates my mistrust by telling me that workers with a history like mine (coworkers from my past workplace) also work at Playtika, thus explaining why it is suggesting this position to me.

Here Gloat used coworkers from past employers to explain the AI's choice

There is another level of complexity in Gloat’s recommendations. Specifically, Gloat claims to use AI to generate them. Most AI engines’ choices (especially neural networks’) are not very intuitive and are very hard to explain, meaning that Gloat probably has to cover some very odd recommendations with very general poster children, and may need a whole engine just to produce these poster children.
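A minimal sketch of the poster-child idea (the field names here are hypothetical, not Gloat's or Audible's actual data model): alongside each recommendation, pick one attribute it shares with the user's history and surface that as the explanation, falling back to a generic line when the engine's real reasons are too opaque to state:

```python
def poster_child(recommendation, user_history):
    """Pick one easy-to-explain shared attribute as the visible 'reason'.

    recommendation / user_history: dicts with a hypothetical 'tags' field
    (authors finished, past employers, genres, ...).
    """
    shared = sorted(set(recommendation["tags"]) & set(user_history["tags"]))
    if shared:
        return f"Because you enjoyed {shared[0]}"
    # Neural-network picks are often hard to explain; fall back to a general line.
    return "Popular with users like you"
```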

Allow the user to give feedback, even if it’s fake: did you know that most pedestrian crossing buttons don’t even work?

Well, they do work, but not by making the light turn green any faster. They work by giving the pedestrian a sense of agency, which makes them more willing to wait and not cross at a red light. We can thus conclude that letting the user give feedback on your recommendation may increase both their trust and their sense of agency. I am not saying that every recommendation product that requests feedback does nothing with it: in AI-based products, for instance, human feedback on the output is critical and usually improves precision significantly, but that is not the point for now.

If you want to minimize the damage a recommendation can do, add like and dislike buttons (or approve and dismiss). Product managers who used this method found that it reduces user backlash over low precision, increases engagement and trust, and raises satisfaction with the recommendation engine.
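A sketch of the dismiss/approve mechanic (the names are illustrative): record the vote, hide anything explicitly dismissed, and keep the signal around so the engine can learn from it later:

```python
class FeedbackStore:
    """Tracks like/dislike votes on recommendations."""

    def __init__(self):
        self.votes = {}  # rec_id -> True (liked) / False (dismissed)

    def record(self, rec_id, liked):
        self.votes[rec_id] = liked

    def should_show(self, rec_id):
        # Hide anything explicitly dismissed; unvoted items stay visible.
        return self.votes.get(rec_id, True)
```

Even when the votes feed nothing downstream, honoring a dismissal immediately is what preserves the user’s sense of agency.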


Anthropomorphism! Make your recommendation as human as possible: products that seem more human increase our trust.

For example, conversational bots that recommend actions to their users are more likely to be used if they introduce themselves with human names.

This does not mean you have to lie to your user and claim that the recommendation was made by a human, but it does help to make the recommendation feel like it was created by one, for example, by adding emoticons and varying the copy across consecutive recommendations so they don’t feel “robotic”.
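As an illustration (the templates are invented for the example), rotating the copy across consecutive recommendations is enough to break the robotic repetition:

```python
import itertools

# Hypothetical copy variants; cycling through them avoids a repetitive feel.
_templates = itertools.cycle([
    "You might like {item} 🙂",
    "Fancy giving {item} a try?",
    "{item} looks like a good fit for you",
])

def next_copy(item):
    """Return the next copy variant, so consecutive recommendations read differently."""
    return next(_templates).format(item=item)
```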

So, in summary (same as the TL;DR):

  • If you manage products that recommend anything to your user, this article will help you.
  • Recommendations are suggestions from the product to the user on how to make the most of the product.
  • Recommendations usually help our users by narrowing a selection and optimizing their workflows.
  • User trust and a sense of control are the two main factors we need to keep in mind while working on products that make recommendations.
  • When recommending, give more than one option, explain why you recommended this, allow the user to respond, and make the recommendation as human as possible.