I worked with Fran Riádigos, founder of ChattyLabs, on a conversational interface that lets users listen to and reply to their instant messages aloud while driving.
For localization, we decided to support English, Spanish, German, French, Italian and Portuguese, based on the app's download statistics.
The challenge was to design a conversational interface in the simplest and most natural way possible, so that users could interact with the app without even being aware of doing so.
The main challenges were the context (people are driving), accessibility (the system responds differently depending on the speaker's gender) and the limitations of the technology (Google Cloud's speech recognition struggles with natural-sounding speech in strong accents).
We had to take multiple contextual conditions into account. First, the environmental conditions: engine noise, plus wind, traffic, car horns or street sounds when the car windows are open. Second, the other things the user might be doing at the same time, such as using apps like Google Maps, talking to passengers or taking an incoming phone call.
To make it easier for the system to recognise what users say, we decided to limit the interactions and use short keywords.
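The idea of a limited vocabulary of short keywords can be sketched as a tiny command matcher. This is a hypothetical illustration, not ChattyLabs' actual code; the command names and keywords are assumptions:

```python
# Minimal sketch of keyword-based command matching (hypothetical).
# A small, fixed vocabulary of short keywords is far easier for a
# speech recogniser to match reliably than free-form sentences.

COMMANDS = {
    "read": "READ_MESSAGE",
    "reply": "REPLY_TO_MESSAGE",
    "stop": "STOP_READING",
    "next": "NEXT_MESSAGE",
}

def match_command(transcript: str):
    """Map a raw speech-recognition transcript to a command.

    The transcript may contain filler words ("um, reply please"),
    so we scan for any known keyword instead of requiring an
    exact match.
    """
    for word in transcript.lower().split():
        if word in COMMANDS:
            return COMMANDS[word]
    return None  # unrecognised: better to re-prompt than to guess
```

With this approach, `match_command("um reply please")` resolves to the reply command, while unrelated speech returns `None` and the app can simply ask again.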
Defining User flows
First of all, we designed the Read message user flow. Although we had to consider all the technological processes involved (for instance, don't read a message aloud while the user is on a call), it was a straightforward process.
Everything became more complex when defining the Reply to message user flow. We considered different approaches to enable users to actively send a message themselves… In the end, we gave The App an active role in the conversation, to limit the options and control errors.
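An app-driven reply flow like this can be sketched as a simple scripted dialog, where The App asks one narrow question at a time. Again, this is a hypothetical sketch, not the actual implementation; the prompts and function names are made up:

```python
# Hypothetical sketch of an app-driven "Reply to message" flow.
# The App leads the conversation step by step, so the user only
# ever answers one narrow question at a time.

def reply_flow(say, listen):
    """Run the reply dialog.

    say(text) -- speak a prompt to the user (text-to-speech)
    listen()  -- return the user's recognised answer (speech recognition)
    Returns the message to send, or None if the user cancelled.
    """
    say("Do you want to reply? Say yes or no.")
    if listen().strip().lower() != "yes":
        return None

    say("Say your message after the tone.")
    message = listen()

    # Repeat the message back before sending, which limits the
    # cost of recognition errors.
    say(f"Your message is: {message}. Say send or cancel.")
    if listen().strip().lower() == "send":
        return message
    return None
```

Driving the script with canned answers, `reply_flow(say=lambda t: None, listen=iter(["yes", "On my way", "send"]).__next__)` returns `"On my way"`, while answering "cancel" at the last step returns `None`.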
We did some brainstorming about hypothetical situations that could happen: imagine you are in the car with your mother and your partner texts you saying that the most horrible thing that could happen this weekend would be seeing your mother 😅
We also worked hard on translating the conversation, not only to make the app accessible across countries, but to make it sound natural.
Besides that, we added a custom notification on the lock screen to let the user stop the reading of messages at any time.
First, we wanted to see how users reacted to the dialog. We ran face-to-face user testing in which I acted as The App, simply saying the sentences aloud. There were no errors and users seemed very pleased.
Next, we tested whether users were able to reply to a message. You could feel the users' uncertainty: most of them didn't know what to say in order to reply. This, along with some errors, ended up making them feel vulnerable.
We also uncovered some problems with the speech recognition when it came to female voices, which caused a lot of frustration among the female participants.
Iteration on Design
To solve this, we decided to help users by telling them the voice commands they should say to interact with the system, and we added a confirmation for each action users performed. We kept the voice commands short and simple to help the system recognise them.
It might look like we were adding more information than needed, but when we watched people using the product live, we realised it was much easier for them and they felt much safer and more comfortable.
After iterating on the design, the number of messages sent successfully increased by 75%.
We tracked the Send and Reply to message user flows to find out whether users met their goals or got stuck in the process and, if so, to identify where.
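Tracking where users get stuck amounts to funnel analysis over the flow's steps. A minimal sketch, assuming hypothetical step names (these are not the actual events ChattyLabs tracked):

```python
# Sketch of funnel tracking for a reply flow (hypothetical step
# names). Each session records the steps the user reached; counting
# sessions per step shows where in the flow users drop off.

from collections import Counter

STEPS = ["flow_started", "command_recognised", "message_dictated",
         "message_confirmed", "message_sent"]

def funnel(sessions):
    """sessions: a list of step-name lists, one per user session.
    Returns (step, sessions_that_reached_it) pairs in flow order."""
    reached = Counter()
    for steps in sessions:
        for step in steps:
            reached[step] += 1
    return [(step, reached[step]) for step in STEPS]
```

Comparing adjacent counts in the output pinpoints the step with the biggest drop-off; in our case that step was replying, which is what drove the iteration above.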
Most users were very satisfied and gave us a lot of ideas for new features, such as adding more information to the message or asking for confirmation before sending it.
You really need to take the user's context and the technology's constraints into account when designing usable conversational interfaces. Some technologies are simply not ready for interaction in an open environment, so you cannot give users all the freedom that would be desirable. This is where UX plays a crucial role: figuring out how to deal with those limitations.
When it comes to designing voice user interfaces, the design patterns are slightly different. For instance, users found it easier to interact with the app when we said the same sentences over and over, even if it might seem repetitive. That was the first step in onboarding users on how to interact with the app, and it also serves as a metric to discover at what point they get used to the interaction and ask for more advanced options.
To user-test a conversational interface, you can start by saying the sentences yourself; there is no need for a prototype or even the app.
Building the whole conversation and connecting all the services the interface needs is a huge development effort, and releasing an app with a poor user experience is a risk. You can try the app yourself to identify potential user pain points related to the technology itself and work around those constraints in advance.
It's crucial to do user testing in context, both to see how the app performs in harsh conditions and to uncover new user scenarios.