July, 2020

Remote Testing in a Covid-19 world

Nothing new, but different than before… As the Covid-19 pandemic continues to force people to work from home, companies are embracing new ways of working remotely. Fortunately, UX researchers have been doing remote studies for decades, so there are many methods and tools available to help.

Having said this, it’s not business as usual. Most UX researchers have focused on in-person methods, so their experience with remote research can be limited. That’s why we have written a series of articles on remote research. This article focuses on remote testing, which covers a range of approaches.

Remote testing is underestimated by many researchers. In-person testing is essential for many studies (especially for explorative and contextual research). However, remote testing has many advantages too. This article outlines what remote testing can offer. That knowledge is important while Covid-19 is pervasive, and it will remain useful afterwards, when it can help strike a better balance between in-person and remote research methods. Wouldn’t that be an unexpected positive effect of Covid-19?

Introduction to Remote User Testing Methods

Before we discuss any details, it’s important to point out that there is a wide range of remote testing methods, and an even wider range of tools and services. To give you a better overview, we’ve grouped remote testing methods into three groups.

1. Moderated Remote Testing

This is the method which most resembles in-person testing. There is a moderator and a participant and they interact in real time. The moderator’s role is to observe the participant as they try to complete the task and ask follow-up questions as they progress. Moderated remote testing can be done using a video conferencing tool with screen-sharing functionality (such as Zoom, GoToMeeting or Skype) and a webcam to capture the participant’s non-verbal feedback.

Moderated remote tests allow you to obtain valuable qualitative feedback which isn’t possible to get from an unmoderated test. If the participant stops thinking out loud, the moderator can ask what they are thinking, and if they don’t understand the task, the moderator can provide clarification. This makes moderated testing well suited to testing early prototypes and complex functionality, for example.

2. Unmoderated Remote Testing

Unmoderated remote tests are performed by the participant in their own time, without a moderator. The tests use a written test script which may or may not include additional questions. Specialised testing tools such as AppQuality, UserTesting, UserFeel or Userlytics are widely used for this purpose. These allow you to share a link to your product or prototype and upload a script with detailed test instructions. Although these are unmoderated tools, they still allow for qualitative research by asking the participants to perform the tasks while thinking aloud. The tools record this audio commentary alongside the screen recording and, in some cases, webcam footage of the user’s face. Researchers can access this data directly after completion.

Unmoderated remote tests allow you to quickly gather feedback from multiple users. Recruitment is typically quick and multiple users can complete the test at the same time, so there’s no need for moderators to schedule defined session times. The lack of a moderator or venue also keeps costs low. Most unmoderated testing tools can also help you with recruitment of test participants — which makes them easy to set up.

The downside of unmoderated testing is that it doesn’t allow for unplanned follow-up questions or clarifications. This means it isn’t a good method for testing prototypes that require assistance, or for deeper exploration of concepts and ideas.

3. Quantitative Remote Testing

User tests are normally aimed at collecting qualitative data, as they should give insight into the ‘how’ and ‘why’ of user behaviour. However, if you are more interested in the ‘what’ and ‘how many’, you can focus on quantitative methods. These are typically more useful if you want to measure usability over time or persuade more data-orientated stakeholders.

Quantitative user tests are always unmoderated and focus on metrics such as time spent on task, satisfaction, success rate and perceived difficulty. Most unmoderated test tools offer a combination of both quantitative and qualitative metrics. Some quantitative testing platforms specialise in particular domains, such as information architecture, and offer features such as heatmaps, scrollmaps or tree testing of site maps.
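
To make these metrics concrete, here is a minimal sketch of how success rate, time on task and perceived difficulty could be summarised from exported session data. The record layout, field names and values are hypothetical and not taken from any particular testing tool.

```python
from statistics import median

# Hypothetical session records; the field names and values are illustrative only.
sessions = [
    {"participant": "P1", "completed": True, "seconds": 48.0, "difficulty": 2},
    {"participant": "P2", "completed": True, "seconds": 65.0, "difficulty": 3},
    {"participant": "P3", "completed": False, "seconds": 120.0, "difficulty": 5},
    {"participant": "P4", "completed": True, "seconds": 52.0, "difficulty": 2},
]


def summarise(sessions):
    """Summarise one task across all participants."""
    completed = [s for s in sessions if s["completed"]]
    return {
        # Share of participants who completed the task.
        "success_rate": len(completed) / len(sessions),
        # Median time on task, successful attempts only (assumed convention).
        "median_time_on_task": median(s["seconds"] for s in completed) if completed else None,
        # Mean self-reported difficulty on an assumed 1 (easy) to 5 (hard) scale.
        "mean_difficulty": sum(s["difficulty"] for s in sessions) / len(sessions),
    }


summary = summarise(sessions)
print(summary)
```

Reporting time on task for successful attempts only is one common convention; whichever you choose, keep it the same between test rounds so results stay comparable.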

Having an overview of the methods is great, but how do you choose between them? In the sections below we will discuss the strengths and weaknesses of each method. In addition, we will address recruitment and sample size.

When to use Moderated Remote Testing

Moderated remote testing is a great alternative to in-person research. In many ways it allows for a typical test session, in which behaviour can be observed and the moderator can ask questions there and then.

The strengths of this method are:

  • You can use prototypes that require assistance: the moderator can prompt actions and intervene if the participant is stuck due to limits of the prototype.
  • You can test in the real context of the users.
  • You can ask follow up questions when you observe an unexpected event or user behaviour.
  • It doesn’t require a rigid interview script, as the moderator adds flexibility.

The weaknesses of this method are:

  • You cannot control the setup at the participants’ location. Get used to unwanted noises (e.g. dogs barking and children interrupting).
  • Testing on mobile devices is tricky: setting up an appropriate platform to share the screen of the participant’s mobile can be difficult and you don’t see the hand gestures of the participant.
  • Moderation and analysis are as time-consuming as in-person interviews and equally expensive. Therefore it is only suitable for relatively small sample sizes.
  • Quality of the data collected is often dependent on the quality of the internet connection (therefore it is unsuitable for participants located in low bandwidth areas). You should check the quality of the participants’ internet connection in advance of each interview.
  • Simultaneous translation streaming can be tricky to set up. For example, see the illustration below for an actual moderated remote test using Zoom. In this situation, the moderator and participant were speaking Dutch, while the notetaker and client had to listen to the translated audio stream.

Figure: Illustration of moderated remote test set-up with simultaneous translation

Moderated remote testing is best used in the early stages of development. Your prototype probably requires guidance, and unknowns about the user’s motivation and understanding may need to be investigated in the session. It is very good at answering the question ‘why’ and getting feedback on the motivations and behaviour of the users.

When to use Unmoderated Remote Testing

Sometimes the speed of the project, the need to test a large sample or budget constraints work against the use of moderated methods. In those cases it is possible to use unmoderated remote testing, using a variety of online services and platforms. They offer easy reach to a wide audience of participants who can work in parallel, joining the study when they are available. This makes the data collection phase extremely fast, as participants can complete their sessions simultaneously.

Unmoderated Remote Testing: participants are instructed to perform the task while verbalizing their actions, impressions and feelings (Thinking Aloud method). The audio/video of the session is recorded and collected for later analysis.

The main strengths of Unmoderated Remote Testing are:

  • It is possible to view the audio/video recording of user actions and comments to identify the reasons for behaviours and metrics.
  • It is possible to get a feeling for the participants’ emotional state and their frustration from visual cues in their expressions (if you have a webcam) and from their voice.
  • Data collection for large samples is faster than with moderated testing.
  • Recruitment can easily be done through crowdsourcing platforms (where participants join on a voluntary basis and are only checked by a screener questionnaire). Recruitment through this method can be surprisingly quick (days rather than weeks).
  • Multi-language testing can be conducted by a single team, as they can be supported by translators to localise the test and also translate verbal comments on completion. (Note: some understanding of the local language is still needed to check the translation.)

The main weaknesses of Unmoderated Remote Testing are:

  • It typically requires the same amount of time to analyse as moderated testing, as the researcher has to watch the session recordings. It is possible to only watch some of the sessions (based on a review of the performance metrics recorded), but this does compromise analysis quality.
  • It is only suitable for prototypes that don’t require assistance as there is no moderator to explain or provide guidance. When the participants get stuck they will abandon the task without the possibility of recovering them.
  • Post-task and post-test questions are pre-set and cannot deepen the understanding of events just recorded in the session. Unforeseen behaviour cannot be questioned.
  • It requires serious preparation of strict question scripts (there is no flexibility in the script), which is time-consuming.
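
The first weakness above mentions watching only some of the sessions, selected on the recorded performance metrics. A minimal sketch of such a triage step is shown below. The record layout is hypothetical, and the rule used here (review every failed session plus any that took more than 1.5 times the median successful time) is an assumption for illustration, not a recommendation from any tool. As noted above, skipping sessions does compromise analysis quality.

```python
from statistics import median

# Hypothetical per-session metrics exported from an unmoderated testing tool.
sessions = [
    {"participant": "P1", "completed": True, "seconds": 50.0},
    {"participant": "P2", "completed": True, "seconds": 60.0},
    {"participant": "P3", "completed": False, "seconds": 90.0},
    {"participant": "P4", "completed": True, "seconds": 140.0},
    {"participant": "P5", "completed": True, "seconds": 55.0},
]


def sessions_to_review(sessions, slow_factor=1.5):
    """Return participants whose recordings should be watched first."""
    completed_times = [s["seconds"] for s in sessions if s["completed"]]
    # Without any successful session there is no baseline, so only failures are flagged.
    threshold = slow_factor * median(completed_times) if completed_times else float("inf")
    return [
        s["participant"]
        for s in sessions
        if not s["completed"] or s["seconds"] > threshold
    ]


to_review = sessions_to_review(sessions)
print(to_review)
```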

When to use Quantitative Remote Testing

Quantitative Remote Testing: participants perform the tasks, and only their clicks and interactions are recorded, along with their answers to post-task and post-test questions. Typically there is no audio or video recording of the session, so no lookback analysis is required.

Several remote testing platforms on the market allow participants to complete the test sessions on their own, without the need to record the session.

The main strengths of Quantitative Remote Testing are:

  • The test is completely automated, therefore there is a minimal amount of work to be done to collect the data.
  • Analysis and reporting can also be partly automated by recording clicks and navigation paths, as well as performance metrics (such as time on task) and participants’ self-reported metrics.
  • Testing platforms can help you make sense of the data by using algorithms to ease the analysis.
  • You can aim to quantitatively measure performance and errors on a statistically significant scale, since a bigger sample doesn’t translate into a longer amount of time for analysis.
  • Quantitative data can be persuasive to business stakeholders, either by itself or in combination with qualitative data.
  • Quantitative tests allow for reliable comparison of test results over time (benchmarking).

The main weaknesses of Quantitative Remote Testing are:

  • It is not possible to easily determine the motivations and the reasons behind the detected behaviour of the participants. For example, you can discover how many participants clicked on the back button at the payments page. However, you wouldn’t know whether they did it because the desired payment method is unavailable or because the call to action on the page isn’t clear.
  • The lack of moderation and live video recording limits the insights you can collect about participants’ opinions and impressions to the questions you have designed in advance and built into the test. One possible solution would be to include open-ended questions in the post-task and post-test questionnaires; however, this would lengthen the amount of time needed for analysis.

Purely quantitative automated testing doesn’t include thinking aloud and video recording. It is entirely different from other remote testing methods and is not a replacement for moderated testing. It is useful when the scope of the project is limited and when you have simple questions with straightforward answers. It is more suited to the later stages of the development process, when design is nearly done and there are a few straightforward outstanding questions about features.

Other forms of quantitative unmoderated testing are those used in Information Architecture design, such as card sorting and tree testing. In those methods the goal is to determine the users’ mental model of the information, or how well an Information Architecture matches that model. The lack of user comments and verbalization is therefore a small sacrifice compared to the advantages of an automated cluster analysis of results collected from a large number of participants.
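
As an illustration of what such automated analysis typically starts from, the sketch below computes how often each pair of cards was placed in the same group across participants. The card names and sorting results are hypothetical, and this is only the co-occurrence step. A full cluster analysis would feed these similarities into a clustering algorithm.

```python
from collections import Counter
from itertools import combinations

# Hypothetical open card sort results: one list of groups per participant.
sorts = [
    [{"Invoices", "Payments"}, {"Profile", "Password"}],
    [{"Invoices", "Payments", "Profile"}, {"Password"}],
    [{"Invoices", "Payments"}, {"Profile", "Password"}],
]


def co_occurrence(sorts):
    """Return, per pair of cards, the share of participants who grouped them together."""
    counts = Counter()
    for groups in sorts:
        for group in groups:
            # Sort the card names so each pair is always keyed in the same order.
            for a, b in combinations(sorted(group), 2):
                counts[(a, b)] += 1
    return {pair: n / len(sorts) for pair, n in counts.items()}


similarity = co_occurrence(sorts)
for pair, share in sorted(similarity.items(), key=lambda item: -item[1]):
    print(pair, round(share, 2))
```

Pairs with a high share are candidates for sitting together in the navigation, while low shares suggest that users see the items as unrelated.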

Recruitment and Sample Size

How many participants should you consider for remote testing? In general, the considerations for normal in-person tests apply: qualitative research can be conducted on a small sample, from a few individuals up to dozens. This is because you go for quality over quantity: in-depth insights instead of statistically significant validation. In this case the best method is moderated testing, as you have the guidance of a moderator to better understand and explore those insights.

If you have a better understanding of the problem already, but you want to collect articulate feedback from users, you can also choose unmoderated testing if the platform allows recording Thinking Aloud. In this case you also want a relatively small sample size to keep the analysis effort (listening to recordings) as low as possible. If you are investigating a particular event or behaviour that can be tracked (e.g. a click on a call to action), you may need a large sample to allow you to filter the participants you want to focus on. There are some unmoderated remote testing platforms (e.g. UserTribe, AppQuality) that offer crowdsourced researchers to whom you can delegate listening to Thinking Aloud recordings. These platforms are effective time savers, but they usually come with a price tag comparable to doing it yourself.

If your research question is simple enough to be answered by remote unmoderated testing without session recording, and you would like to use a large sample size, you should definitely consider quantitative testing. This is particularly appropriate when you want to compare precise performance metrics of alternative solutions. In this case a large sample of tens or even hundreds of participants is valuable to get significant results.
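
To see why sample size matters when comparing two alternatives, here is a minimal sketch of a two-proportion z-test on task success rates. The participant counts are hypothetical, and the test relies on a normal approximation that is an assumption here and becomes unreliable for very small samples.

```python
from math import sqrt
from statistics import NormalDist


def two_proportion_z_test(success_a, n_a, success_b, n_b):
    """Two-sided z-test for a difference in success rate between designs A and B."""
    p_a, p_b = success_a / n_a, success_b / n_b
    pooled = (success_a + success_b) / (n_a + n_b)
    standard_error = sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    if standard_error == 0:
        raise ValueError("All participants succeeded or all failed; the test is undefined.")
    z = (p_a - p_b) / standard_error
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value


# Same observed success rates (80% vs 60%), hypothetical sample sizes.
_, p_small = two_proportion_z_test(8, 10, 6, 10)
_, p_large = two_proportion_z_test(80, 100, 60, 100)
print(f"10 per design:  p = {p_small:.3f}")
print(f"100 per design: p = {p_large:.3f}")
```

With ten participants per design the difference cannot be distinguished from chance at the usual 5% level, while the same rates observed with a hundred participants per design can.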

Most modern remote testing tools include recruiting services, either based on a panel of testers, or by fresh recruitment through social media (crowd recruitment). This provides fast recruitment (a matter of days instead of weeks), but quality can suffer if you are not careful. We advise taking extra care in designing a screener questionnaire to exclude participants not relevant for your study (with clear exclusion criteria) and making it mandatory to complete the questionnaire in order to be admitted to the study. You might want to build in some redundancy in your sample for safety.
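
The two pieces of advice above can be sketched as follows: a screener with explicit exclusion criteria, and a simple calculation of how many participants to admit to cover expected drop-out. The screener questions, the criteria and the 20% drop-out figure are all hypothetical and should be replaced with those of your own study.

```python
from math import ceil


def passes_screener(answers):
    """Apply explicit exclusion criteria to one candidate's screener answers."""
    # Hypothetical criterion: exclude people who work in UX or market research.
    if answers["works_in_ux_or_market_research"]:
        return False
    # Hypothetical criterion: only admit people who shop online at least monthly.
    if not answers["shops_online_monthly"]:
        return False
    return True


def participants_to_admit(target_sample, expected_dropout_percent):
    """Admit enough participants that the target is still met after drop-out."""
    return ceil(target_sample * 100 / (100 - expected_dropout_percent))


candidate = {"works_in_ux_or_market_research": False, "shops_online_monthly": True}
print(passes_screener(candidate))
print(participants_to_admit(target_sample=20, expected_dropout_percent=20))
```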

Conclusion: add remote testing to your mix!

User testing is too important to ignore in a Covid-19 world. Therefore, many UX researchers are seriously considering remote testing for the first time. Fortunately, there are well established methods, tools and services.

This article has shown what remote testing can offer, especially compared to in-person testing. And in a Covid-19 restricted world, the advantages of remote testing become even more apparent. We urge every UX researcher to become (more) familiar with remote testing. Whether it is during the Covid-19 pandemic or afterwards, remote testing is a great addition to your toolbox.

We hope that this article has increased your awareness of the available methods of remote testing. Do not let the complexity of advanced methods scare you; the basics are actually not that difficult. After all, it is just another way of doing user research!

If you found this article interesting, please have a look at our other articles as well. They focus on other aspects of doing UX research in a Covid-19 world.

Thanks for reading. Be safe and please share your comments with us!