Testing multiple concepts – Insight FAQs

Posted by Ray Poynter, 4 March 2021


Earlier this week I was asked for ideas on how to conduct a quantitative test of a series of alternatives (e.g. concepts, ads, messages, etc.). Here is an anonymized version of my reply.

The request was for a test that provides diagnostic information about each concept and also yields a comparative metric, allowing the concepts to be ranked.

My four key points are:

1) If you want a comparative effectiveness metric (for example, likely to buy), ask this before asking people to comment on or rate aspects of the stimuli. There is lots of evidence that rating the elements of products, ads or concepts changes the evaluation AND makes the evaluation less predictive.
(For example, T. Wilson & J. Schooler, "Thinking Too Much: Introspection Can Reduce the Quality of Preferences and Decisions", Journal of Personality and Social Psychology, 1991.)

2) Ideally, your tests should be monadic (at least for the evaluative measure). This means using a larger total sample than a sequential method would need, because each concept requires its own cell. Sequential methods are also less realistic for the participant, because seeing one concept sensitizes them to the ones that follow.
(Note: a monadic test is one where each concept is seen by a different sample of people. A sequential approach asks each participant to see, assess and rate several concepts, one after the other.)
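
For anyone scripting the fieldwork, a monadic design amounts to randomly assigning each respondent to exactly one concept cell. Here is a minimal sketch in Python; the concept labels, respondent IDs and cell sizes are all illustrative, not from the original request:

```python
import random

concepts = ["Concept A", "Concept B", "Concept C"]   # illustrative labels
respondents = [f"R{i:03d}" for i in range(1, 601)]   # e.g. 600 completes

random.seed(42)  # fixed seed so the allocation is reproducible
random.shuffle(respondents)

# Deal the shuffled respondents evenly into one cell per concept, so each
# concept is evaluated by a separate ~200-person sample (monadic design).
cells = {c: respondents[i::len(concepts)] for i, c in enumerate(concepts)}

for concept, cell in cells.items():
    print(concept, "n =", len(cell))
```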

3) If (for sample size reasons) you need to use a sequential method (i.e. people see more than one execution), rotate the order in which the concepts are shown. At the analysis stage, compare each concept's results among people who saw it first with its results among those who saw it later in the rotation (a sketch of this check follows below). If there are big differences, ask the client for more time/money to increase the sample size.
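
As a sketch of that analysis step, the code below compares each concept's top-two-box score among people who saw it first against those who saw it later. It assumes pandas and hypothetical column names (`concept`, `position`, `top2`); the data are invented for illustration:

```python
import pandas as pd

# Hypothetical sequential-test data: one row per (respondent, concept) exposure.
# 'position' is where the concept appeared in that respondent's rotation;
# 'top2' is 1 if the respondent used one of the top two scale points.
df = pd.DataFrame({
    "concept":  ["A", "B", "C", "B", "C", "A", "C", "A", "B"],
    "position": [ 1,   2,   3,   1,   2,   3,   1,   2,   3 ],
    "top2":     [ 1,   0,   0,   1,   1,   0,   0,   1,   0 ],
})

# Compare first-exposure scores with later-exposure scores for each concept.
df["seen_first"] = df["position"] == 1
check = df.groupby(["concept", "seen_first"])["top2"].mean().unstack()
check.columns = ["later", "first"]   # False -> later, True -> first
check["gap"] = check["first"] - check["later"]
print(check)  # a large 'gap' signals an order effect worth flagging
```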

4) For the evaluative question (for example, likelihood of trying, using or buying), try to use one that you have benchmarks for. If you do not have benchmarks, use a 5 or 7-point scale and base your metric on the top two values only – and accept that your results are relative, not absolute.
(Relative results suggest that, for example, Concept A is preferred to Concepts B and C. Absolute measures, based on benchmarks and modelling, try to estimate the actual, in-market, effect of the concept.)
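
To make the top-two-box idea concrete, here is a minimal sketch of the relative comparison on a 5-point scale; the ratings are invented and the column names are hypothetical:

```python
import pandas as pd

# Invented 5-point purchase-intent ratings (5 = definitely would buy).
ratings = pd.DataFrame({
    "concept": ["A"] * 5 + ["B"] * 5,
    "rating":  [5, 4, 3, 4, 2,  3, 2, 4, 1, 3],
})

# Top-two-box score: share of respondents answering 4 or 5.
ratings["top2"] = ratings["rating"] >= 4
t2b = ratings.groupby("concept")["top2"].mean()
print(t2b)  # read relatively (A vs B), not as an in-market forecast
```

On these invented numbers Concept A scores 0.6 and Concept B scores 0.2, so A is preferred – a relative reading, not a forecast of in-market performance.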

Your thoughts?

This was my suggestion; what would you add, delete, or change?