Jun 17 2014

Most samples used by market research are in some sense the ‘wrong’ sample. They are the wrong sample because of one or more of the following:

  • They miss people who don’t have access to the internet.
  • They miss people who don’t have a smartphone.
  • They miss the 80%, 90%, or 99% who decline to take part.
  • They miss busy people.
Sampling methods that suffer from these problems include:
  • Central location interviewing misses the people who don’t come into central locations.
  • Face-to-face, door-to-door struggles with people who tend not to be home or who do not open the door to unknown visitors.
  • RDD/telephone misses people who decline to be involved.
  • Online access panels miss the 95%+ who are not members of panels.
  • RIWI and Google Consumer Surveys – miss the people who decline to be involved, and under-represent people who use the internet less.
  • Mobile research – typically misses people who do not have a modern phone or a reliable internet package/connection.

But, it usually works!

If we look at what AAPOR calls non-probability samples with an academic eye, we might expect the research to usually be ‘wrong’. In this case ‘wrong’ means it gives misleading or harmful advice. Similarly, ‘right’ means it gives information that supports a better business decision.

The reason that market research is a $40 billion industry is that its customers (e.g. marketers, brand managers, etc.) have found it is ‘right’ most of the time. Which raises the question “How can market research usually work when the sample is usually ‘wrong’?”

There are two key reasons why the wrong sample gives the right answer:

  1. Homogeneity
  2. Modelling

Homogeneity
If different groups of people believe the same thing, or do the same thing, it does not matter very much who is researched. As an experiment, take your last few projects and split the data by region, by age, or by gender. In most cases you will see there are differences between the groups, often differences big enough to measure, but in most cases the differences are not big enough to change the message.
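As a rough illustration of this kind of split check, here is a minimal sketch using pandas; the column names, age bands, and ratings are entirely made up for illustration, not taken from any real project.

```python
# A minimal sketch of the split check described above, using pandas and
# entirely made-up ratings. Column names and figures are hypothetical.
import pandas as pd

# Hypothetical survey data: a 1-10 rating of a pack design.
df = pd.DataFrame({
    "gender":   ["F", "M", "F", "M", "F", "M", "F", "M"],
    "age_band": ["18-34", "18-34", "35-54", "35-54",
                 "18-34", "35-54", "35-54", "18-34"],
    "rating":   [7, 6, 8, 7, 7, 7, 8, 6],
})

# Mean rating for each split: differences usually exist and are often
# measurable, but note how small they are relative to the overall mean.
print(df.groupby("gender")["rating"].mean())
print(df.groupby("age_band")["rating"].mean())
print("Overall mean:", df["rating"].mean())
```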

The reason there are so often few important differences is that we are all more similar to each other than we like to think. This is homogeneity. The level of homogeneity increases if we filter by behaviour. For example, if we screen a sample so that they are all buyers of branded breakfast cereal, they are instantly more similar (in most cases) than the wider population. If we then ask this group to rank 5 pack designs, there will usually be no major differences by age, gender, location, etc. (I will come back to this use of the word ‘usually’ later).

In commercial market research, our ‘wrong’ samples usually make some effort to reflect the target population: we match their demographics to the population, and we screen them by interest (for example, heavy, medium, and light users of the target brand). The result is that, surprisingly often, an online access panel or a Google Consumer Surveys test will produce useful and informative answers.

The key issue is usually not whether the sample is representative in a statistical sense, because it usually isn’t; the question should be whether it is a good proxy.

Modelling
The second way that market researchers make their results useful is modelling. If a researcher finds that their data source (let’s assume it is an online access panel) over-predicts purchase, they can down-weight its predictions; if they find their election predictions understate a specific party, they can up-weight the results. This requires having lots of cases, and it assumes that something that worked in the past will work in the future.
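A minimal sketch of this kind of down-weighting, assuming we hold a small history of paired stated/actual figures; all the numbers below are hypothetical.

```python
# A minimal sketch of down-weighting a panel's over-prediction, with
# entirely hypothetical figures. 'stated' is what panel members claimed;
# 'actual' is what past studies showed really happened in market.
past_stated = [0.40, 0.35, 0.50, 0.45]   # hypothetical stated purchase rates
past_actual = [0.20, 0.18, 0.26, 0.22]   # hypothetical observed purchase rates

# One simple calibration factor estimated from the historical pairs.
calibration = sum(past_actual) / sum(past_stated)

# Apply the factor to a new study's stated purchase intent.
new_stated = 0.42
adjusted = calibration * new_stated
print(f"Calibration factor: {calibration:.2f}")   # ~0.51
print(f"Adjusted prediction: {adjusted:.1%}")     # ~21.2%
```

In practice the calibration would be estimated per category or per data source, and it only holds for as long as the past keeps resembling the future – exactly the assumption noted above.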

So, what’s the problem?

The problem for market research is that there is no established body of knowledge or science to work out when the ‘wrong’ sample will give the right answer, and when it will give the ‘wrong’ answer. Some of the cases where the wrong sample gave the wrong answer include:

  • The 1936 US presidential election, when a sample of 2 million people failed to predict that Roosevelt would beat Landon.
  • In 2012, Google Consumer Surveys massively over-estimated the number of people who edit Wikipedia – perhaps by as much as 100% – see Jeffrey Henning’s review of this case.

My belief is that market researchers need to get over the sampling issue by recognising the problems and by seeking to identify when the wrong sample is safe, when it is not safe, and how to make it safer.

When and why does the right sample give the wrong answer?

However, there is probably a bigger problem than the wrong sample: using the right sample but getting the wrong answer. There is a wide variety of reasons, but the key ones include:

  • People generally don’t know why they do things, and they don’t know what they will do in the future, but they will usually answer our questions.
  • Behaviour is contextual – for example, choices are influenced by what else is on offer – but research is too often context free, applies the wrong context, or assumes the context is consistent.
  • Behaviour is often not linear, and quite often does not follow the normal distribution – but most market research is based on means, linear regression, correlation, etc. (see the sketch after this list).
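To make the last point concrete, here is a minimal sketch with made-up scores: a polarised, bimodal response pattern whose mean describes a ‘typical’ respondent who does not actually exist.

```python
# A minimal sketch, with made-up numbers, of a polarised (bimodal)
# distribution where the mean describes a respondent who does not exist.
ratings = [1, 1, 2, 1, 2, 9, 10, 9, 10, 9]  # hypothetical 1-10 scores

mean = sum(ratings) / len(ratings)
print(f"Mean rating: {mean:.1f}")  # 5.4, yet nobody scored 3-8

# Share of respondents anywhere near the mean.
near_mean = sum(1 for r in ratings if 4 <= r <= 7) / len(ratings)
print(f"Share rating 4-7: {near_mean:.0%}")  # 0% in this example
```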

A great example of the right sample giving the wrong answer is New Coke. The research, evidently, did not make it clear to participants that this new flavour was going to replace the old flavour, i.e. they would lose what they saw as “their Coke”.

In almost every product test conducted there are people saying they would buy it who would certainly not buy it. In almost every tracker there are people saying they have seen, or even used, products they have not seen – check out this example.

The issue that researchers need to focus on is total error: not sampling error, not total survey error, but total error. We need to focus on producing useful, helpful advice.


Jun 12 2013

Earlier this month, NewMR held its first Explode-A-Myth session (find the recordings by clicking here) and my contribution was a discussion of why there is no method that is a melange of qual and quant, because the underlying paradigms are different.

Through the Q&A session at that event, and in particular a question from Betsy Leichliter, I gained a clearer understanding of the core difference between qual and quant. Betsy asked “So should the ‘qual’ or ‘quant’ labels be driven by the method of analysis, not necessarily the method of ‘data collection’?”. I think Betsy’s question is the best answer I have seen to the question of what the difference is between qual and quant.

Within reason, any data can be assessed quantitatively or qualitatively. Of course, there are some limits to both approaches. A very small amount of data is likely to produce findings that are hard to generalise. We can count the sales of brand X, in one store, on one day, but it is hard to draw any inferences about the world from that. Similarly, ten thousand open-ended responses could only be assessed qualitatively with a large team or a large amount of time.

The quantitative approach is based on an assumption that there is a ‘real’ world, which we can measure objectively (or, at least, that we can get fairly close to that ideal). The underlying beliefs are a) that it is the method that provides the results (different researchers should provide the same answer if they use the same method on the same data), and b) that the researcher is discovering and reporting something that exists.

The qualitative approach, as it has developed over the past thirty years, is based (for most researchers) on a constructionist paradigm (there are several different models, but they all tend to be constructionist). The researcher does not discover truths, the researcher creates a narrative that provides useful insight into what is happening. The researcher is part of the analysis, different researchers will provide different narratives, and the value of the narrative depends on the ability of the researcher to observe what is happening, to synthesise an analysis, and to create a narrative that conveys something useful to the end client.

The key difference between quant and qual is the difference between discovering and creating, overlaid with the ritual of using numbers for quant and words for qual.