Product Design | UX / UI | 3D | Visual Art


Your good coffee finder: Pre-order, Pay on the go, Pick up

In this project, we were challenged to determine which requirements a voice-enabled interface must meet to be easy to use, convenient and meaningful. Several questions had to be answered, such as: How does language change the design of graphical user interfaces? Which criteria assess the quality of the interface? When and where does a screen make more sense, and when does voice?

01 — Research & Ideation

To understand how voice assistants (VAs) are being used, we created a survey asking about general user habits, that is, how users interact with their assistant. In addition, short interviews and reading several articles and developer documents about Alexa and Google Assistant helped us understand the design possibilities. We learned that VAs are currently used mostly for simple tasks (controlling smart devices or checking the weather) and search queries (Google or locations in Google Maps).

First Idea

Based on our desk research, we decided to tackle the latter, more precisely searching for restaurants, and to extend it to ordering food or coffee and reserving seats. Inspired by the Google Duplex demo for Google Assistant at Google I/O 2018, we also asked ourselves: What if we turned it around and there were “branded” VAs for different venues picking up the phone, each with its own unique voice?

Based on these aspects the app/service should make the following possible:

  • The user can order and/or reserve through the VA
  • The venue’s VA handles the user’s ordering and reservation queries, reducing the workload of the person who normally picks up the phone
  • Each venue’s or restaurant chain’s VA should have a unique voice, so that the user can identify it and connect it with the brand of the venue or restaurant

Survey & Interviews

In order to test our idea we first started with a survey to understand general user habits.

The survey consisted of two parts (two scenarios): using the VA in general, and communicating with a “branded” VA. We also asked participants how they felt about the idea of communicating with a “branded” VA.

About 43 people filled out the survey and the most interesting results were as follows:

  • 80% prefer using a female VA
  • 59% are using the VA while commuting or in hands-free situations
  • 83% would find a branded VA interesting
  • VAs are mostly used for setting reminders/timers and searching things on Google

Personal interviews also helped us find out what people feel and think about VAs in general and branded VAs in particular. Here, we also had the opportunity to test small ordering and reserving scenarios with the assistant. The main findings were:

  • The VA should be able to learn from user demands and give the right suggestions when necessary
  • Conversations with the VA should be efficient but not sound impolite
  • Ordering or reserving should work seamlessly with voice, but the user still wants a visual confirmation of the payment or reservation afterwards

Market Analysis

Furthermore, we did a rough market analysis in which we mainly looked at apps and services with features and functions similar to our idea.

Establishments like Starbucks or McDonald’s have services that let you (pre-)order food or beverages, but the more interesting ones were those that offered more variety from different restaurants or coffee shops.


With OpenTable, the user can make a restaurant reservation based on the desired arrival time and the number of guests. It then suggests time slots in which a table is available.


Skip is an Australia-based app that lets the user pre-order food and/or beverages from different establishments. The user then pays through the app and picks up the order.


pickpack is a new service based in Berlin, Germany, with a concept similar to Skip’s. In addition to food, it offers other products such as books or fashion.

With this, we were able to confirm that there is interest in these services and that we were on the right track with our concept.

Target Group

From what we had gathered so far, we defined our target group as people who want the process of getting food or drinks to be more streamlined and efficient. On occasion, they want to use the service by voice while doing other things.

Idea Definition

Pre-ordering and reserving should be the main focus of our app/service. Furthermore, the user should be able to interact with it through a voice interface without relying on a screen. This opens up the possibility of interacting through other voice-enabled devices (e.g. smartwatches, smart home devices) in different situations, especially when the user’s hands are not free (e.g. while cooking or driving).

02 — Testing & Prototyping

For the first test phase, we developed dialogues for the interaction with the VAs. We tested the conversations using the “Wizard of Oz” method. For the VA, we used the voice output of a MacBook, driven by a Python script that let us play predefined phrases via the input line. Since our initial concept had the user communicate with two VAs, the second VA was spoken by Dustin. The test users were fellow students.

Involved in the dialogue were the user, the Google Assistant (GA) and the branded VA of the coffee shop or restaurant.
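A script of this kind can be quite small. The following is only a sketch of how such a “Wizard of Oz” helper might look, not the script we actually used: the phrase keys, the example sentences and the function names are made up for illustration, and it assumes the macOS `say` command is available for speech output.

```python
import subprocess

# Hypothetical phrase table: short keys the "wizard" types, mapped to
# the sentences the assistant should speak. Not the project's real lines.
PHRASES = {
    "greet": "Hi, what can I get you today?",
    "confirm": "Got it. Your order will be ready in ten minutes.",
    "bye": "Thanks, see you soon.",
}


def say_macos(text):
    """Speak text through the macOS `say` command line tool."""
    subprocess.run(["say", text], check=True)


def handle(key, speak=say_macos, phrases=PHRASES):
    """Speak the phrase stored under `key`.

    Returns the spoken phrase, or None if the key is unknown.
    `speak` can be swapped out, e.g. for `print` on other systems.
    """
    phrase = phrases.get(key.strip().lower())
    if phrase is not None:
        speak(phrase)
    return phrase


def run_console():
    """Read keys from the input line until the wizard types 'quit'."""
    print("Keys:", ", ".join(PHRASES), "(or 'quit')")
    while True:
        key = input("> ")
        if key.strip().lower() == "quit":
            break
        if handle(key) is None:
            print("unknown key")
```

Calling `run_console()` starts the prompt. During a test session, the person playing the wizard listens to the test user and types the key of the matching reply, so the user only ever hears the synthetic voice.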

We tested two use cases in two versions of dialogue length each:

  • Order a coffee (in short and long dialogue).
  • Reserve a table (also in short and long dialogue).

The long dialogue was designed for first-time users and the short one for users who already have experience with the app. The dialogues were arranged in such a way that the user would speak with the Google Assistant first. The GA then forwards the user to the branded VA.

Dialogue 1: Ordering coffee, long version

Dialogue 1: Ordering coffee, short version

Dialogue 2: Reserving a table, long version

Dialogue 2: Reserving a table, short version

Our testing showed that having a conversation with two VAs was cumbersome and unnecessarily complicated the process. Realizing this, we had to rethink our concept and simplify the interaction.

Change of Concept

After much feedback, we realized that we had to change our concept. Relying on a voice-only interface with long dialogue flows was riddled with things that could go wrong or annoy the user:

  • For many users the dialogues might be too long, where a screen would be more efficient
  • The possibility of the VA not understanding the user is still real, in which case the user would rather fall back to the screen

We decided to focus our efforts on designing an app/service where the user can rely on both the screen and the VA. Furthermore, we kept only the idea of pre-ordering and picking up coffee. Additional interviews helped us redefine our target group as well.

Redefining Target Group

Our target group hadn’t changed much apart from the fact that they enjoy good coffee but also want it quickly. They dislike standing in long lines, since they are often in a hurry and need to get to an appointment quickly.

03 — Final Product

We created an app and service that enables the customer to pre-order a favourite coffee, pay on the way and then pick it up at the designated coffee shop, all through a voice-enabled interface.

Instead of letting the user search a map for a coffee shop that serves great coffee, our app searches for a shop based on the user’s preferred type(s) of coffee. The main benefit is getting a good coffee quickly while reducing screen time on a map.

This has especially positive implications when using the app hands-free and with voice only, which makes the map obsolete in certain conditions. Voice-based interaction should also be easier, since the volume of information returned for a spoken search query is reduced to a minimum.
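The preference-based search can be sketched roughly as follows. This is an illustration under assumptions rather than the app’s actual logic: the `Shop` structure, the field names and the choice of straight-line (haversine) distance as the ranking criterion are all hypothetical.

```python
from dataclasses import dataclass
from math import asin, cos, radians, sin, sqrt


@dataclass(frozen=True)
class Shop:
    """Hypothetical record for a coffee shop known to the service."""
    name: str
    lat: float
    lon: float
    drinks: frozenset  # drink types the shop offers


def distance_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres, using the haversine formula."""
    earth_radius_m = 6_371_000
    dlat = radians(lat2 - lat1)
    dlon = radians(lon2 - lon1)
    a = (sin(dlat / 2) ** 2
         + cos(radians(lat1)) * cos(radians(lat2)) * sin(dlon / 2) ** 2)
    return 2 * earth_radius_m * asin(sqrt(a))


def closest_shop(preset_drink, user_lat, user_lon, shops):
    """Return the nearest shop that serves the preset drink, or None."""
    candidates = [s for s in shops if preset_drink in s.drinks]
    if not candidates:
        return None
    return min(
        candidates,
        key=lambda s: distance_m(user_lat, user_lon, s.lat, s.lon),
    )
```

Because the filter runs before the distance ranking, a shop that is very close but does not serve the preset drink is never suggested. This is what lets a single spoken answer replace browsing a map.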

We called the app Kupp.

Application Structure

Creating an application structure helped us understand which features needed to be implemented and which conditions needed to be met for certain possibilities of interaction to be available.

The following key features needed to be implemented:

  • Explain how the app works through onboarding. The user needs to be able to easily create a coffee preset. The available types of coffee and their minor variations need to be the most common ones that every coffee shop offers.
  • The user needs to be able to find the closest/best coffee shop based on the activated preset (and, if desired and connected with the calendar, also on the user’s saved destinations from upcoming appointments).
  • Payment should happen through the app on the go, so the user doesn’t have to pay at the coffee shop. For this, there is a circular area surrounding the coffee shop: its radius should match the maximum time the user may need to walk to the shop and still arrive when the coffee is ready for pick-up.
  • When picking up the coffee, the user should be identifiable quickly and easily in order to receive the correct coffee.
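The payment area from the third point can be expressed as a simple calculation. The sketch below is one possible reading of it, not a specification: it assumes the radius is the distance a user can walk while the coffee is being prepared, and the walking speed of 1.4 m/s (roughly 5 km/h) and all names are assumptions.

```python
# Assumed average walking pace in metres per second (roughly 5 km/h).
WALKING_SPEED_M_PER_S = 1.4


def payment_radius_m(prep_time_s, speed=WALKING_SPEED_M_PER_S):
    """Radius of the payment area in metres.

    A user standing on the edge of the area needs exactly
    `prep_time_s` seconds to walk to the shop.
    """
    return prep_time_s * speed


def can_pay_on_the_go(distance_to_shop_m, prep_time_s,
                      speed=WALKING_SPEED_M_PER_S):
    """True if the user is inside the payment area of the shop."""
    return distance_to_shop_m <= payment_radius_m(prep_time_s, speed)
```

With a preparation time of five minutes (300 seconds), the radius under these assumptions comes to about 420 metres. A user 350 metres away could already pay, while a user 500 metres away would have to walk closer first.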

The structure of the app is separated into three sections:

  • Onboarding/Walkthrough: Explaining the app and the idea behind it to the user
  • Coffee preset view: The structure of how the user may create his/her favourite coffee
  • Map view: The main logic behind selecting a coffee preset, choosing a coffee shop, and ordering, paying and picking up the coffee

After finishing the onboarding, the user is directed to the map view, where several conditions are checked in the backend and frontend. For example, if the user has not created a preset yet, the app asks for one to be created before proceeding; or, if the user is inside the specified area around the coffee shop, he/she is able to pay on the go.

Additionally, apart from the initial onboarding screens, the idea is to enable the user to use the VA and screen whenever he/she desires to do so. We didn’t separate the structure into voice or screen but created it in a way so that the app could be usable in both ways as much as possible. Even special use cases—such as paying with voice or screenless navigation to the coffee shop—are plausible if the technology behind it or the user allows it.

User Journeys & Screens

To demonstrate the app we decided to showcase its features in three parts:

  • Creating a preset
  • Getting a coffee
  • Getting a coffee based on an appointment

Additionally—if the user is already familiar with creating presets—it is also possible to create them via the voice assistant.

After that, we use two personas to help convey the two journeys of “getting a coffee” and “getting a coffee based on an appointment”. The first persona uses the app through the screen only, whereas the second one already knows the app quite well and mostly uses voice.

Audrey is an illustrator from London and is visiting Berlin. 
It is morning. She doesn’t know Berlin very well and is strolling around to get to know it a bit better. She likes to drink her special daily cappuccino while walking and decides to use Kupp to look for the closest coffee shop offering it.

The closest shop is “Fat Elephants” and she orders a coffee from there. The interface changes slightly, adding a red border around the screen, to indicate that she has ordered her coffee and cannot change her preset unless she cancels the order.
She continues walking and enters the blue payment area.
The interface changes again for her to be able to pay on the go. 
She does so and is now able to pick up her coffee.
In the shop she then shows the barista her QR-Code and receives her coffee. The interface resets itself.

Jackson is a design freelancer based in Berlin and has been using Kupp for quite some time.
His work requires him to commute quite often and he has many appointments with his clients.
He mainly uses voice through his wireless headset to order his coffee and only looks at the app for directions, to pay and to show his QR code.

He looks at his phone to see where to go.

He proceeds to follow the directions to the coffee shop and enters the payment area.
Kupp notifies him of this with a notification sound. The app also asks him if he wants to pay for his order.
He does so and then picks his coffee up.
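Both journeys above pass through the same sequence of order states, whether the user taps or talks. A small state machine is one way to picture this. The state and event names below are invented for illustration, and allowing cancellation only directly after ordering is an assumption.

```python
from enum import Enum, auto


class State(Enum):
    BROWSING = auto()  # no open order, preset can be changed
    ORDERED = auto()   # coffee ordered, user still outside payment area
    PAYABLE = auto()   # user has entered the payment area
    PAID = auto()      # paid, QR code can be shown at the shop


# Allowed (state, event) pairs and the state they lead to.
TRANSITIONS = {
    (State.BROWSING, "order"): State.ORDERED,
    (State.ORDERED, "cancel"): State.BROWSING,
    (State.ORDERED, "enter_payment_area"): State.PAYABLE,
    (State.PAYABLE, "pay"): State.PAID,
    (State.PAID, "show_qr_code"): State.BROWSING,  # interface resets
}


def step(state, event):
    """Return the next state, or raise ValueError if the event is not allowed."""
    try:
        return TRANSITIONS[(state, event)]
    except KeyError:
        raise ValueError(
            f"{event!r} is not allowed while {state.name}"
        ) from None


def preset_locked(state):
    """The preset can only be changed while there is no open order."""
    return state is not State.BROWSING
```

Keeping the states independent of the input channel reflects the decision not to split the app structure into voice and screen: a tap and a spoken command would simply trigger the same event.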

04 — Conclusion & Learnings


Even though the idea is to someday communicate with a VA as you would with another person, today you still have to design with technological limitations in mind and for today’s users’ mindset and readiness. Knowing this early on was essential for writing concise and realistic dialogue flows.

Voice — Screen

At first, we were adamant about designing an app that uses the screen as little as possible, but luckily we realized early on that a balanced mix of voice and screen UI helped solve many design problems.

Fail early & Keep it Simple

Our initial idea was to add a branded VA to the interface, which opened up many possibilities but also complicated many things. Apart from the technological hurdles it came with, it neither solved users’ problems nor catered to their needs. Pivoting to a smaller and more focused idea helped us refocus on the user.

This project was developed in collaboration with Dustin Kummer, Yi-Ruo Lin and Hsin-Tung Chen. I thank you for the pleasant cooperation.


© 2020 Dunkelviolett
