Many national governments opted to launch nationwide corona warning apps. Germany’s Corona-Warn-App now counts over 24 million downloads. The higher the adoption rate of the app, the more it can contribute to containing the COVID-19 pandemic by breaking infection chains. User experience is crucial for the adoption rate of any app, whether it is a business or a consumer-grade app. Thus, there was a strong need for design decisions to be supported by empirical data from user research.
The Corona-Warn-App is a mobile app that helps trace infection chains in Germany by notifying users if they have been exposed to COVID-19. To evaluate the usability of the design, the SAP UX Research team conducted one lab study along with several online studies. Whereas the online studies could provide fast answers to specific questions, the lab study gave in-depth feedback from a broad sample of the German population. To find out about possible usability issues, one-to-one interviews were conducted under strict safety measures.
Three key challenges
A project of this magnitude has many challenges. The top three were high confidentiality, time pressure, and coordination efforts. There were serious concerns about screens leaking to the public before they were published on GitHub. Until their release, they were highly confidential. While code can be read only by experts, screens have great potential to be misunderstood and criticized out of context. For this reason, online user research activities that showed screens could not start until the screens were released on GitHub, about two weeks before the go-live of the app.
To develop the app, SAP collaborated very closely with, among others, Deutsche Telekom, the Robert Koch Institute, the German Chancellery, the Federal Ministry of Health, and the Federal Press Office, which meant extremely high coordination efforts. Daily syncs took place with the whole team, including on weekends and public holidays.
Besides the design questions that existed from the beginning, more questions came up during the design process, for example when feedback from previous user research activities was incorporated. Several online studies were therefore conducted, each flexible enough to focus on details and embrace changes. This allowed user research to keep pace with an agile development environment.
Finally, the app’s complexity and its data volume, in particular the verbal and textual feedback from hundreds of participants, made user research and its evaluation very time-intensive.
Recruiting test participants
Unlike in functional testing, participants in usability testing rarely need a specific technical qualification. However, they need to be representative of the target user population. The Corona-Warn-App, unlike most commercial products, addresses an entire society. Consequently, for each study, participants needed to be carefully selected to match the demographic parameters relevant to the respective question. In some cases, specialized recruiting agencies were engaged to recruit participants. In other cases, the team could work with employees of SAP and the various stakeholder organizations of the project. At SAP, more than 1,000 employees quickly signed up for a registration list via the internal SAP SignUp app. From this group, samples of study participants could be drawn in an extremely short amount of time, which helped tremendously in answering research questions rapidly.
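Drawing a sample that matches demographic parameters from a registration list can be sketched as a simple quota-based (stratified) draw. The following is only an illustration under stated assumptions: the function name `draw_stratified_sample`, the `age_group` field, and the quotas are hypothetical and do not describe the tooling the team actually used.

```python
import random
from collections import defaultdict


def draw_stratified_sample(pool, quotas, key, seed=None):
    """Draw volunteers from `pool` so that each stratum approaches its quota.

    pool   -- list of dicts describing registered volunteers (hypothetical schema)
    quotas -- mapping of stratum value -> number of participants wanted
    key    -- dict key that holds the stratum value, e.g. "age_group"
    seed   -- optional seed to make the draw reproducible
    """
    rng = random.Random(seed)

    # Group the volunteers by stratum.
    by_stratum = defaultdict(list)
    for person in pool:
        by_stratum[person[key]].append(person)

    sample = []
    for stratum, wanted in quotas.items():
        candidates = by_stratum.get(stratum, [])
        # If a stratum has fewer volunteers than its quota, take all of them.
        sample.extend(rng.sample(candidates, min(wanted, len(candidates))))
    return sample


# Illustrative registration list, not real data.
volunteers = [
    {"id": 1, "age_group": "18-29"},
    {"id": 2, "age_group": "18-29"},
    {"id": 3, "age_group": "30-49"},
    {"id": 4, "age_group": "30-49"},
    {"id": 5, "age_group": "50+"},
]
invited = draw_stratified_sample(
    volunteers, {"18-29": 1, "30-49": 2, "50+": 1}, key="age_group", seed=42
)
```

A stratum that cannot fill its quota from the list is a signal that an external recruiting agency may be needed for that group.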
Text comprehension and acceptance
The first online study with 182 participants from SAP and the externally involved parties took place while the app was still in development. User researcher Bernard Rummel emphasizes the importance of this study: “It was a door opener for user research, and we’ve earned the trust of the entire team: We launched the external version of the study at 4pm on Friday, and on Saturday around noon I could show the first result presentation.” The focus was on the onboarding process, the risk cards, and the overall acceptance of the app.
To achieve a comparable basis, participants were asked at the beginning to self-evaluate their prior knowledge of the Corona-Warn-App. Then, click tasks in UserZoom were used to track the navigation path participants took during onboarding. As you can see below, it went very smoothly.
Afterwards, a questionnaire on comprehensibility, trustworthiness, probability of use, and recommendation helped to identify possible misunderstandings. The results showed that the green and red risk cards would be effective in prompting the intended behavior. Also, all the quantitative acceptance and likelihood scores were positive overall.
Yet, some participants weren’t reliably able to understand and switch the status of the risk assessment. A common misunderstanding was that the app could warn in real time about someone’s infection state. Data protection measures were appreciated, but not quite understood by a small but non-negligible number of participants. For example, some assumed that the app would track their location or use the contacts in their address book.
“Officialese German is 100 per cent correct, but not understandable,” explains Bernard. “We’ve tested the meaning of different terms by asking what they meant. Then, how that meaning would be described the best.” As a consequence, a lot of concrete terminology feedback could be provided to the Robert Koch Institute writing team.
Onboarding and settings
The second study again examined the onboarding process, a crucial part of the app. If users drop out during onboarding, the app won’t be used. Also, if users don’t allow notifications or Bluetooth usage, the app won’t work properly. So, the question for the 164 participants was: If they take a wrong turn, would they notice, and where would they click to recover? To answer this question, click tasks were used. Participants were shown screenshots and asked where they would click to achieve the desired state of the app. The image below shows an example of a click map resulting from such a task.
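The basic evaluation behind such a click task can be sketched in a few lines: each recorded click is checked against the screen region that would lead to the desired state. This is a minimal sketch under assumptions; the coordinates, the rectangular target, and the function name `hit_rate` are hypothetical and unrelated to how UserZoom computes its results.

```python
def hit_rate(clicks, target):
    """Return the share of clicks that landed inside the target rectangle.

    clicks -- list of (x, y) pixel coordinates, one per participant
    target -- (left, top, right, bottom) of the correct control, in pixels
    """
    if not clicks:
        return 0.0
    left, top, right, bottom = target
    hits = sum(
        1 for x, y in clicks
        if left <= x <= right and top <= y <= bottom
    )
    return hits / len(clicks)


# Illustrative clicks on a screenshot, not real study data.
recorded_clicks = [(10, 10), (50, 50), (200, 300), (55, 60)]
settings_toggle = (40, 40, 100, 100)  # hypothetical target region
share_correct = hit_rate(recorded_clicks, settings_toggle)
```

Clicks outside the target are just as informative as hits, because their clusters show where participants expected the control to be.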
Another part of the study was an A/B test with three variants to determine the most intuitive way to access risk detail info.
Following up on misunderstandings discovered in the first study, this study was also used to compare the understanding of critical concepts before and after onboarding in more detail, to assess the onboarding’s effectiveness.
Reporting a COVID-19 test result
To securely report a COVID-19 test result, users need to scan a QR code or, in case their testing facility doesn’t have the capability yet, manually enter a code they receive via phone. Because the existing literature on the subject didn’t provide enough information about error sources and error rates for such a code entry process, the team decided to test it.
In a first round, participants were asked to simply enter PINs of different lengths and groupings. A record-breaking 842 SAP colleagues participated in this test within eight hours, providing substantial information about error rates and patterns. Only after the study was the decision made to use alphanumeric codes, to reduce the likelihood of attackers guessing the code.
In a second round, the process was mimicked in more detail. Participants were asked to listen to audio-recorded alphanumeric codes, which they would then enter into a prototype running on a mobile phone. Participants for this study were recruited by a specialized agency to match certain age and education profiles. This way, researchers could determine group-specific drop-out rates as well as error patterns and give precise recommendations on which characters and groupings to use.
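Error rates and error patterns of this kind can be derived by comparing each expected code with what the participant actually typed. The sketch below assumes a hypothetical data format of `(expected, entered)` pairs and made-up codes; it is not the team’s analysis, and it deliberately ignores insertions and omissions, which would require sequence alignment.

```python
from collections import Counter


def normalize(code):
    """Drop separators and case so that grouping alone is not counted as an error."""
    return "".join(ch for ch in code.upper() if ch.isalnum())


def entry_error_rate(trials):
    """Return the share of trials in which the entered code differs from the expected one."""
    if not trials:
        return 0.0
    wrong = sum(
        1 for expected, entered in trials
        if normalize(expected) != normalize(entered)
    )
    return wrong / len(trials)


def substitution_counts(trials):
    """Count (expected_char, entered_char) pairs for mistakes of equal length."""
    pairs = Counter()
    for expected, entered in trials:
        exp, ent = normalize(expected), normalize(entered)
        if len(exp) != len(ent):
            continue  # insertions/omissions are skipped in this sketch
        for exp_char, ent_char in zip(exp, ent):
            if exp_char != ent_char:
                pairs[(exp_char, ent_char)] += 1
    return pairs


# Made-up trials for illustration only.
trials = [
    ("AB3-D5F", "ab3d5f"),   # correct, only formatting differs
    ("M7N-2PQ", "N7N-2PQ"),  # M heard as N
    ("B8K-4TZ", "88K4TZ"),   # B confused with 8
    ("XY9", "XY"),           # omission, ignored by substitution_counts
]
rate = entry_error_rate(trials)
confusions = substitution_counts(trials)
```

Frequently confused pairs in such a table are natural candidates for exclusion from the code alphabet, and running the same functions per code grouping or per participant group gives group-specific rates.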
The fifth study with 84 participants analyzed more details of the onboarding process once again.
A click prototype was used in an A/B test of two possible positions for the “next” button. According to Apple’s iOS design guidelines, the default button is usually on top. However, when there is more than one button, the position of the button changes as you follow the process. This creates a source of error if users keep pressing and accidentally press a button they are not supposed to press. In the alternative design, the “next” button was consistently placed in the same spot on the UI. The study showed that the alternative design did not influence the error rate but provided a slight time advantage.
Since this study was conducted after the app’s launch, there was an opportunity to track the development of participants’ knowledge about the app. The “real-time tracking theory” was stated less frequently than in the first studies. Once again, free-text questions were extremely helpful in uncovering further aspects that could be improved and that weren’t covered by the direct questions.
The concept of the app relies heavily on users’ willingness to report positive test results. In this process, the onset time of symptoms is an important parameter for estimating the risk of others who have been exposed to the infected person. Again, three design variants were tested internally at SAP to determine typical interaction patterns and find the optimal design. Here again, questions about the in-app information helped identify important misconceptions, for instance that indicating symptoms would be mandatory (which is not the case).
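Whether two button positions differ in error rate can be checked with a two-proportion z-test, sketched below using only the Python standard library. The counts are invented for illustration and the test choice is an assumption; the source does not state which statistical method the researchers applied, and with small groups an exact test may be more appropriate.

```python
import math


def two_proportion_z_test(errors_a, n_a, errors_b, n_b):
    """Compare the error rates of two variants.

    Returns (z, p_value), where p_value is two-sided and based on the
    normal approximation with a pooled standard error.
    """
    p_a = errors_a / n_a
    p_b = errors_b / n_b
    pooled = (errors_a + errors_b) / (n_a + n_b)
    standard_error = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    if standard_error == 0:
        return 0.0, 1.0  # no variation at all, so no detectable difference
    z = (p_a - p_b) / standard_error
    # Two-sided p-value: 2 * (1 - Phi(|z|)) equals erfc(|z| / sqrt(2)).
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Hypothetical counts: participants with at least one wrong tap per variant.
z, p_value = two_proportion_z_test(errors_a=6, n_a=40, errors_b=5, n_b=40)
```

A large p-value here would correspond to the kind of finding reported above, namely no measurable influence on the error rate; completion times would need a separate comparison.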
Conclusion: user research is no bottleneck
This extremely results-driven project showed that user research can be done very fast and even on a large scale with different methods. “It was a great collaboration and atmosphere despite the high time pressure,” summarizes Bernard Rummel. “Proficiency increased with practice, for example when it came to quickly presenting results as well as developing fast click path and text analysis methods. And of course, reusing parts of studies saved time.”
Besides the time pressure and coordination efforts, working under public scrutiny was new. “The team spirit was extraordinary. We all knew: Only a high adoption rate ensures effectiveness and an added value for society,” reflects Christoph Strasser, SAP’s user research lead for the project. “We are happy to see that user research had a great direct impact on the app’s usability.”
Stay tuned for the second part of the blog, which will be about the lessons learned from the usability tests for the Corona-Warn-App. Also, have a look at the blog “With Over 24 Million Downloads, Germany’s COVID-19 App Helps Break The Infection Chain” to find out about the stories of other teams involved in creating the Corona-Warn-App, and watch the accompanying video “German Corona-Warn-App: An SAP User Experience Story”.