Let’s protect the basic truths of market research methodology
I have just been reading an article on Research-Live about social media as a potential replacement for traditional market research, and it made me want to scream!!! As the founder of NewMR I am a fan of new techniques: I was one of the first to use CAPI, simulated test markets, online research, and MROCs, and I wrote The Handbook of Online and Social Media Research.
But there are some basic rules we all need to stick to if we are to assess new tools. We need to be able to tell clients whether a new tool is the same as, worse than, better than, or simply different from existing tools, and when to have confidence in it and when not to. To make that assessment we need to follow some very basic rules, and the rules are different for qual and quant.
Here are a few of my key rules for quant research.
A big sample is not a population. In the Research-Live article Mark Westaby said, about the UK, “We track tweets from millions of unique supermarkets users, who in fact represent between 1 in 10 and 1 in 20 of all consumers who ever use a supermarket. With these numbers we’re not just tracking a sample but the population itself.” NO!!! 1 in 10 is a sample. When we use samples we are using the 1 in 10 to help us assess what the other 9 in 10 do – sometimes this works and sometimes it doesn’t. If the sample is a true random probability sample it usually works, but a sample like this is usually not a random probability sample. The 1 in 10 people in the UK who are left-handed would not give you a good guide to how the other 9 in 10 use their hands. But the 1 in 10 people in the UK with ginger/reddish hair would provide a pretty good assessment of beer, breakfast, and car preferences.
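To make the point concrete, here is a minimal sketch in Python. Every number in it (the brand, the 30% true share, the 55% Twitter skew) is invented for illustration, not taken from the article: a huge but non-random sample can miss the population value by a mile, while a small random probability sample lands close to it.

```python
import random

# Invented numbers: a population of 1,000,000 shoppers, 30% of whom prefer
# Brand A overall, but the 10% who are active on Twitter skew differently
# and prefer Brand A 55% of the time.
random.seed(1)
population = []
for _ in range(1_000_000):
    on_twitter = random.random() < 0.10
    prefers_a = random.random() < (0.55 if on_twitter else 0.272)
    population.append((on_twitter, prefers_a))

true_share = sum(p for _, p in population) / len(population)

# The 'big' sample: every Twitter-active shopper (1 in 10, but not random)
twitter_sample = [p for t, p in population if t]
# A small random probability sample of 1,000 shoppers
random_sample = [p for _, p in random.sample(population, 1000)]

print(f"True share preferring Brand A:         {true_share:.3f}")
print(f"Estimate from ~100,000 Twitter users:  {sum(twitter_sample) / len(twitter_sample):.3f}")
print(f"Estimate from 1,000 random shoppers:   {sum(random_sample) / len(random_sample):.3f}")
```

The size of the non-random sample does nothing to remove its bias; the small random sample is the one that tells us about the other 9 in 10.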
Causality Matters. Chris Anderson of Wired, and author of Free and The Long Tail, has said about big data that we won’t need to worry about the scientific method or causality once we have enough data. Nate Silver demolishes this in his book The Signal and The Noise: a model and an understanding of causality become more important as the amount of noise (and most big data is noise) increases.
Causality is more than a sequence. Every day I eat my breakfast and later the day becomes warmer; by that logic, eating breakfast causes the world to heat up. Causality requires a model, and in most cases it can only be tested via a controlled experiment.
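A toy example, with invented data, shows why sequence and co-movement are not causation: two series that merely trend in the same direction over time correlate almost perfectly, yet neither drives the other.

```python
import numpy as np

# Invented data: one series grows every year, the other drifts upward too.
rng = np.random.default_rng(0)
years = np.arange(30)

cumulative_breakfasts = 365.0 * (years + 1)                        # grows every year
avg_temperature = 14.0 + 0.02 * years + rng.normal(0, 0.05, 30)    # also drifts upward

r = np.corrcoef(cumulative_breakfasts, avg_temperature)[0, 1]
print(f"Correlation between breakfasts eaten and temperature: r = {r:.2f}")
# r is close to 1, but only a causal model, ideally tested in a controlled
# experiment, could tell us whether either series drives the other.
```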
Extrapolation is much less reliable than interpolation (inside the box is better than outside the box). This is true at the mathematical level: it is more reliable to fit a curve to a set of points and then work out the spaces in between than to estimate where the line will go next. But it is also true for consumers answering our surveys. How many times will I eat breakfast next week? Easy question – inside the box, i.e. interpolation. How many times will I eat a burger next month? Not as easy, but I can give an estimate that will be close to the average of what I have done in the past. How many times will you go to the gym over the next 12 months with the new membership you have just bought? You might be right in your estimate, but you probably won’t be – this is outside the box, i.e. extrapolation.
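Here is a quick sketch of the mathematical point, using synthetic data: the same fitted curve that does well between the observed points does far worse beyond them.

```python
import numpy as np

# Synthetic data: fit a cubic to noisy observations of a gently curving
# function, then compare errors inside the observed range (interpolation)
# with errors beyond it (extrapolation).
rng = np.random.default_rng(42)
truth = lambda x: np.sin(x / 2)

x_obs = np.linspace(0, 6, 25)
y_obs = truth(x_obs) + rng.normal(0, 0.05, x_obs.size)
coeffs = np.polyfit(x_obs, y_obs, deg=3)

x_inside = np.linspace(1, 5, 50)      # inside the box
x_outside = np.linspace(7, 10, 50)    # outside the box

err_inside = np.abs(np.polyval(coeffs, x_inside) - truth(x_inside)).mean()
err_outside = np.abs(np.polyval(coeffs, x_outside) - truth(x_outside)).mean()

print(f"Mean error interpolating (x from 1 to 5):  {err_inside:.3f}")
print(f"Mean error extrapolating (x from 7 to 10): {err_outside:.3f}")
# The extrapolation error is far larger than the interpolation error.
```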
One test can disprove something, but it can’t prove something. If I test a new method (say social media or mobile) and it gives the same result as a method I feel is correct, then one of three things is true: a) the new method generally gives the same results as the old method, b) the new method sometimes gives the same result, or c) it was pure luck. More tests are needed, and the tests should be designed to show whether a), b), or c) is the most likely explanation.
All too often in MR we see a study that compares two approaches, finds few differences, and implies that the two are broadly comparable. No! Such a test shows that they are sometimes the same, but it cannot tell us whether that is often, sometimes, or rarely true.
By contrast, if two tests are run and they produce different results, that tends to disprove the idea that the results are broadly comparable. However, it does not disprove the contention that the two methods are comparable under some circumstances.
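A toy simulation makes the point. The 40% agreement rate below is invented: if a new method happens to agree with the old one on only some topics, a single side-by-side study will still report “no difference” a good share of the time, so it takes a programme of tests across many topics to separate explanations a), b), and c).

```python
import random

# Assumption for illustration only: the new method matches the established
# one on 40% of topics and diverges on the other 60%.
random.seed(7)
agreement_rate = 0.40

# How often does a one-off validation study on a single topic 'pass'?
single_study_passes = sum(random.random() < agreement_rate for _ in range(10_000))
print(f"One-off studies finding 'no difference': {single_study_passes / 10_000:.0%}")

# A programme of tests across many topics is far more informative.
topics_tested = 20
passes = sum(random.random() < agreement_rate for _ in range(topics_tested))
print(f"Topics (out of {topics_tested}) where the methods agreed: {passes}")
# The pattern across many topics, not a single pass, is what lets us decide
# whether a), b), or c) is the most likely explanation.
```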
If two results differ it does not mean one is right. Quite often when a new method is tested, say online versus CATI, and a difference is found, the implication is that the established method is correct and the new method wrong, or, less commonly, that the new is right and the old wrong. However, there is also the possibility that both are wrong.
More data does not always help. Nate Silver highlights this issue in the context of predicting recessions. There have been 11 recessions in the US since the Second World War, and Silver notes that some 400 economic variables are available to model the causes and predictors of recession. With 400 variables and only 11 cases there are millions of possible solutions. As researchers, we would prefer there to be 11 variables and 400 recessions. More cases usually help; more variables only help if they can be organised and structured into a model, and if, after that processing, the number of cases exceeds the number of variables.
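A small sketch of why 400 variables and 11 cases is a trap. All the data here are random noise by construction: with more variables than cases, a least-squares model fits the past perfectly and yet predicts nothing.

```python
import numpy as np

# Random data: 11 cases (recessions) and 400 candidate predictor variables.
rng = np.random.default_rng(3)
n_cases, n_vars = 11, 400
X = rng.normal(size=(n_cases, n_vars))    # predictors are pure noise
y = rng.normal(size=n_cases)              # the outcome is also pure noise

# Minimum-norm least-squares solution: one of infinitely many exact fits.
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
print(f"Largest in-sample error: {np.abs(X @ beta - y).max():.2e}")   # effectively zero

# The same coefficients applied to new, equally random data predict nothing;
# the perfect fit was an illusion created by having more variables than cases.
X_new, y_new = rng.normal(size=(n_cases, n_vars)), rng.normal(size=n_cases)
print(f"Typical out-of-sample error: {np.abs(X_new @ beta - y_new).mean():.2f}")
```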
So?
We do need and want change. New ideas should be encouraged, but in assessing them there are a few basic rules we need to adhere to. It is fine to try something untested, provided one says so. It is fine to be encouraged if a trial shows few differences from a benchmark. But it is not fine to say the technique has been proved, nor that the scientific approach to proof does not matter.
Suggestions?
What are your suggestions for basic rules that should be adhered to?
6 thoughts on “Let’s protect the basic truths of market research methodology”
I saw the Research-Live piece, too. The social media argument was so silly I could not get worked up enough to even comment. In other words, his underlying assumptions were so ridiculous that his results were meaningless to me. But to give the devil his due, at least he stated his assumptions, which is more than we often get with new methods. So here is my rule: don’t just show me empirical results (which are important); also clearly state the assumptions one has to accept for the results to represent whatever they purport to represent. Empirical results aren’t everything; theoretical arguments still matter.
As you Brits like to say, “brilliant”. Don’t forget, you can’t test breakthrough creative [sarcasm].
Rock on, Ray. Most people have only a dim understanding of numbers and statistics. My knowledge is barely better than dim, so it’s always good to get an in-your-face reminder. 🙂
I think you’ve misunderstood my comment, Ray. Put simply, we can draw sub-samples from our massive sample that would otherwise be treated as outliers, and by analysing these comments we can determine things that traditional methods wouldn’t have a cat-in-hell’s chance of finding. How do we know this works? Because we can track our findings against completely independent data to verify and validate them. As a result we’re able to help brands understand things that have previously been complete mysteries.
Btw, I completely agree with your comment about causality and that Chris Anderson is wrong. But that’s not the way we do things.
Hi Ray, thanks for this.
I am dipping into Order Out of Chaos by Ilya Prigogine. (It’s an old paperback but it deals with fundamental stuff.)
It charts the historical evolution and meaning of science, from Newtonian science to the complexity and quantum sciences of the 21st century, and the relationship between science and culture.
I do not profess to understand more than 50% of it, but sometimes a passage really sings out with relevance for the more prosaic subject of market research. This was one that I think relates to your points about the need for a model:
“Whatever we call reality, it is revealed to us only through the active construction in which we participate. As it is concisely expressed by D. S. Kothari, ‘The simple fact is that no measurement, no experiment or observation is possible without a relevant theoretical framework.’”
Thanks for calling out the basics in the context of our work on understanding how customers choose. We should talk more about our models, our thinking frameworks, and how we construct them.
Absolutely. Thank you. People consistently forget that though the MR technique may be new, the fundamental statistical principles are the same!