As part of writing our new book on Mobile Market Research (which should be available in September) I have been reading a lot of research-on-research (RoR) related to mobile studies.
RoR can provide insights into whether a research technique works or not, the extent to which it works, or how it works. However, RoR is often over-interpreted. Running a single test does not ‘prove’ a technique works, and a single negative result rarely proves a technique is without merit.
The following observations should be kept in mind when reading the results of RoR. For the purpose of this illustration, consider the possible outcomes of an experiment with two cells. Each cell examines the same phenomenon (e.g. survey questions) via a different method (e.g. online versus mobile) – calling the methods A and B.
- A and B produce results that are statistically significantly different. This does not mean that A and B will always produce different results. The difference could be due to chance, or there could have been a flaw in the test. But, even if the test was fair and well-constructed, the difference only indicates that the methods A and B will sometimes produce different results; it does not say they will always produce different results. If the tests are repeated, for different products, with different surveys, and with different sorts of customers, and if the differences between A and B keep appearing – then researchers will start to assume that there probably is a general difference between A and B.
- A and B produce results where the differences are not big enough to be statistically significant. This type of result is often misinterpreted, leading to the false conclusion that the cells are the same. A test that fails to prove that A and B are different does not show that A and B are the same, or even similar. If a test indicates a difference at only the 89% confidence level, convention dictates that this falls short of the 95% threshold – so the difference is dismissed as not significant. But that does not mean A and B are the same; in this example the evidence still leans towards a difference, just not strongly enough to meet the convention. When a difference is not statistically significant, it often means that the sample size was not large enough to confirm the difference. With a large enough sample size, almost any difference is significant; with a small enough sample size, many important differences will be judged not to be significant.
- A and B produce results that show the difference between A and B is small. Instead of testing for differences, researchers can test whether a difference is smaller than a specific value, which in practice means testing whether two cells are the same. Consider a case where a test shows that A and B produce the same result, within a specified boundary of what is meant by ‘the same’. This test would not have shown that A and B will always produce the same result as each other, simply that they can sometimes produce the same result. If several tests are run, in different contexts, and A and B keep producing similar results, then researchers will start to form the view that A and B will generally produce similar results.
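The first observation – that a significant difference can arise purely by chance – is easy to demonstrate with a simulation. The sketch below (the sample sizes, means, and 1,000 repetitions are illustrative choices, not from any real study) draws both cells from the same population, so any ‘significant’ difference it finds is a false positive; at the 95% level, roughly 5% of tests will flag one anyway.

```python
import random
import statistics

def two_sample_t(xs, ys):
    """Welch-style t statistic for two independent samples."""
    nx, ny = len(xs), len(ys)
    vx, vy = statistics.variance(xs), statistics.variance(ys)
    se = (vx / nx + vy / ny) ** 0.5
    return (statistics.mean(xs) - statistics.mean(ys)) / se

random.seed(42)
runs = 1000
false_positives = 0
for _ in range(runs):
    # Both cells draw from the SAME population: any 'difference' is pure chance.
    a = [random.gauss(5.0, 1.0) for _ in range(100)]
    b = [random.gauss(5.0, 1.0) for _ in range(100)]
    if abs(two_sample_t(a, b)) > 1.96:  # approximate 95% two-sided criterion
        false_positives += 1

print(f"False positives: {false_positives} out of {runs} tests")
```

This is why one significant result should prompt replication, not a conclusion: repeating the test in new contexts is what separates a chance finding from a general difference.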
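The second observation – that ‘not significant’ often just means ‘sample too small’ – can be seen by holding a difference fixed and varying only the sample size. In this hypothetical example (the 50% vs 55% figures and cell sizes are invented for illustration), the same five-point gap fails the 95% test with 200 respondents per cell but passes it comfortably with 2,000.

```python
import math

def two_prop_z(p1, p2, n):
    """z statistic for comparing two observed proportions, n respondents per cell."""
    pooled = (p1 + p2) / 2
    se = math.sqrt(pooled * (1 - pooled) * (2 / n))
    return (p2 - p1) / se

# The same 5-point gap (50% vs 55% agreement) under two sample sizes:
small = two_prop_z(0.50, 0.55, 200)    # z ≈ 1.0, below the 1.96 cut-off
large = two_prop_z(0.50, 0.55, 2000)   # z ≈ 3.2, well above it

print(f"n=200 per cell:  z = {small:.2f} (not significant at 95%)")
print(f"n=2000 per cell: z = {large:.2f} (significant at 95%)")
```

The underlying difference is identical in both cases; only the power of the test has changed.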
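The third observation describes equivalence testing, which statisticians often implement as ‘two one-sided tests’ (TOST): declare two cells the same only if the confidence interval for their difference sits entirely inside a pre-agreed margin. The sketch below is a simplified version of that idea; the 1-point observed gap, the ±3-point margin, and the sample sizes are assumptions chosen for illustration.

```python
import math

def equivalent(p1, p2, n, margin, z90=1.645):
    """TOST-style check: is the 90% CI for (p2 - p1) wholly inside ±margin?"""
    diff = p2 - p1
    se = math.sqrt(p1 * (1 - p1) / n + p2 * (1 - p2) / n)
    lo, hi = diff - z90 * se, diff + z90 * se
    return -margin < lo and hi < margin

# Methods differ by 1 point (50% vs 51%); anything within ±3 points counts as 'the same'.
print(equivalent(0.50, 0.51, n=500, margin=0.03))    # small sample: equivalence not confirmed
print(equivalent(0.50, 0.51, n=5000, margin=0.03))   # large sample: equivalence confirmed
```

Note the asymmetry with ordinary significance testing: here a larger sample makes it easier to confirm sameness, because the confidence interval tightens around the small observed difference.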