I have just been reading an article about social media as a potential replacement for traditional market research on Research-Live and it made me want to scream!!! As the founder of NewMR I am a fan of new techniques, I was one of the first to use CAPI, one of the first to use simulated test markets, one of the first to use online research, and one of the first to use MROCs – and I wrote The Handbook of Online and Social Media Research.
But there are some basic rules we all need to stick to if we are to assess new tools. We need to be able to tell clients whether they are the same as, worse than, better than, or different from existing tools, and when to have confidence in them and when not to. To make that assessment we need to stick to some very basic rules – and the rules are different for qual and quant.
Here are a few of my key rules for quant research.
A big sample is not a population. In the Research-Live article Mark Westaby said, about the UK, “We track tweets from millions of unique supermarkets users, who in fact represent between 1 in 10 and 1 in 20 of all consumers who ever use a supermarket. With these numbers we’re not just tracking a sample but the population itself.” NO!!! 1 in 10 is a sample. When we use samples we are using the 1 in 10 to help us assess what the other 9 in 10 do – sometimes this works and sometimes it doesn’t. If the sample is a true random probability sample it usually works, but a sample like this usually is not a random probability sample. The 1 in 10 people in the UK who are left-handed would not give you a good guide to how the other 9 in 10 use their hands. But the 1 in 10 people in the UK with ginger/reddish hair would provide a pretty good assessment of beer, breakfast, and car preferences.
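The random-versus-self-selected distinction can be shown with a toy simulation (the population size, the 30% preference figure, and the "tweeters" subgroup are all invented for illustration): a random 1-in-10 sample recovers the population figure, while an equally large self-selected 1-in-10 sample does not.

```python
import random

random.seed(1)

# Hypothetical population of 100,000 shoppers; 30% overall prefer brand A.
# 10,000 of them tweet about supermarkets, and among those tweeters the
# preference rate is 60% -- a deliberately unrepresentative subgroup.
tweeters = [1] * 6_000 + [0] * 4_000      # 60% prefer brand A
others = [1] * 24_000 + [0] * 66_000      # ~26.7% prefer brand A
population = tweeters + others

def share(sample):
    """Proportion of a sample preferring brand A."""
    return sum(sample) / len(sample)

# A true random probability sample of 1 in 10 tracks the population well.
random_sample = random.sample(population, 10_000)

# The tweeters are ALSO "1 in 10" -- but they are self-selected, not random.
biased_sample = tweeters

print(f"population:       {share(population):.3f}")     # exactly 0.300
print(f"random 1-in-10:   {share(random_sample):.3f}")  # close to 0.300
print(f"tweeter 1-in-10:  {share(biased_sample):.3f}")  # exactly 0.600
```

Sample size alone does nothing to remove the bias: making the self-selected group ten times bigger would still give 0.600.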
Causality Matters. Chris Anderson, editor of Wired and author of Free and The Long Tail, has said about big data that we won’t need to worry about the scientific method or causality once we have enough data. Nate Silver demolishes this claim in his book The Signal and the Noise: a model and an understanding of causality become more, not less, important as the amount of noise (and most big data is noise) increases.
Causality is more than a sequence. Every day I eat my breakfast and later the day becomes warmer – so, by that logic, eating breakfast causes the world to heat up. Causality requires a model, and in most cases it can only be tested via a controlled experiment.
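The breakfast example can be made concrete with two toy series (both invented): cumulative breakfasts eaten and a simple month-on-month warming trend. Neither causes the other, yet because both simply trend upwards their correlation is essentially perfect.

```python
import math

# Two series that both trend upward over 12 months: cumulative breakfasts
# (roughly one per day) and a toy warming trend. No causal link exists.
breakfasts = [30 * m for m in range(1, 13)]        # cumulative breakfasts
temperature = [5 + 1.5 * m for m in range(1, 13)]  # invented warming trend

def pearson(xs, ys):
    """Pearson correlation coefficient, computed from first principles."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

# Perfectly correlated, zero causation.
print(round(pearson(breakfasts, temperature), 3))  # 1.0
```

Any two series that share a trend will correlate like this, which is exactly why correlation in observational (including social media) data cannot stand in for a causal model.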
Extrapolation is much less reliable than interpolation (inside the box is better than outside the box). This is true at the mathematical level: it is more reliable to fit a curve to a set of points and estimate the spaces in between than to estimate where the line will go next. But it is also true for consumers answering our surveys. How many times will I eat breakfast next week? Easy question – inside the box, i.e. interpolation. How many times will I eat a burger next month? Not as easy, but I can give an estimate that will be close to the average of what I have done in the past. How many times will you go to the gym over the next 12 months with the new gym membership you have just bought? You might be right in your estimate, but you probably won’t be – this is outside the box, i.e. extrapolation.
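The mathematical point can be sketched with a deliberately simple example (the choice of sqrt(x) and the fitting range are arbitrary): fit a straight line to a gently curving function over x = 1..9, then compare the error of a prediction inside that range with one far outside it.

```python
import math

# Fit a straight line to sqrt(x) on x = 1..9, then predict inside the
# fitted range (interpolation) and far outside it (extrapolation).
xs = list(range(1, 10))
ys = [math.sqrt(x) for x in xs]

# Ordinary least squares, computed by hand.
n = len(xs)
mx, my = sum(xs) / n, sum(ys) / n
slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum((x - mx) ** 2 for x in xs)
intercept = my - slope * mx

def predict(x):
    return intercept + slope * x

interp_err = abs(predict(5.5) - math.sqrt(5.5))  # inside the box: small
extrap_err = abs(predict(36) - math.sqrt(36))    # outside the box: large
print(f"interpolation error: {interp_err:.2f}, extrapolation error: {extrap_err:.2f}")
```

The model is wrong in both cases – a straight line is not a square root – but inside the fitted range the error is a rounding issue, while outside it the error grows without limit.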
One test can disprove something, but it can’t prove something. If I test a new method (say social media or mobile) and it gives the same result as a method I feel is correct, then one of three things is true: a) the new method generally gives the same results as the old method, b) the new method sometimes gives the same result, or c) it was pure luck. More tests are needed, and the tests should be designed to show whether a), b), or c) is the most likely explanation.
All too often in MR a study compares two approaches, finds few differences, and implies that the two are broadly comparable. No! The test shows only that they are sometimes the same; we can’t tell whether that is often, sometimes, or rarely true.
By contrast, if two tests are run and they produce a different result, then this would tend to disprove the idea that the results are broadly comparable. However, it does not disprove the contention that the two methods are comparable under some circumstances.
If two results differ it does not mean one is right. Quite often when a new method is tested, say online versus CATI, and a difference is found, the implication is that the established method is correct and the new method wrong – or, less commonly, that the new is right and the old wrong. However, there is also the possibility that both are wrong.
More data does not always help. Nate Silver highlights this issue in the context of predicting recessions. There have been 11 recessions in the US since the Second World War. Silver quotes 400 economic variables that are available to model the causes and predictors of recession. With 400 variables and only 11 cases there are millions of possible solutions. As researchers, we would prefer there to be 11 variables and 400 recessions. More cases usually help; more variables only help if they can be organised and structured into a model, and if, after that processing, the number of cases exceeds the number of variables.
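A toy simulation makes the 400-variables-versus-11-cases problem vivid (the variables here are pure coin flips, so by construction none has any real relationship to recessions): with 400 candidate predictors and only 11 cases, some noise variable will almost always look like an excellent predictor by chance alone.

```python
import random

random.seed(7)

# 11 binary outcomes stand in for the post-war recession record; 400 random
# binary "indicators" stand in for the economic variables. None of the
# indicators carries any real information -- they are all coin flips.
N_CASES, N_VARS = 11, 400
recessions = [random.randint(0, 1) for _ in range(N_CASES)]
indicators = [
    [random.randint(0, 1) for _ in range(N_CASES)] for _ in range(N_VARS)
]

def matches(indicator):
    """How many of the 11 cases this indicator 'predicts' correctly."""
    return sum(i == r for i, r in zip(indicator, recessions))

best = max(matches(ind) for ind in indicators)
print(f"best of {N_VARS} pure-noise indicators matches {best} of {N_CASES} cases")
```

With this many candidate variables, the best pure-noise indicator typically matches 9, 10, or even all 11 cases – which is why a model, not a variable hunt, is needed when variables outnumber cases.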
We do need and want change. New ideas should be encouraged, but in assessing them there are a few basic rules we need to adhere to. It is fine to try something untested, provided one says it is untested. It is fine to be encouraged if a trial shows few differences from a benchmark. But it is not fine to say the technique has been proved, nor that the scientific approach to proof does not matter.
What are your suggestions for basic rules that should be adhered to?