Aug 26, 2014
 
No More Surveys

Back in March 2010, I caused quite a stir with a prediction, at the UK’s MRS Conference, when I said that in 20 years we would not be conducting market research surveys. I followed my conference contribution with a more nuanced description of my prediction on my blog.

At the time the fuss was mostly from people rejecting my prediction. More recently there have been people saying the MR industry is too fixated on surveys, and my predictions are thought by some to be too cautious. So, here is my updated view on why I think we won’t be conducting ‘surveys’ in 2034.

What did I say in 2010?
The first thing I did was clarify what I meant by market research surveys:

  • I was talking about questionnaires that lasted ten minutes or more.
  • I excluded large parts of social research; some parts of which I think will continue to use questionnaires.

Why no more surveys?
In essence, there are three key reasons why I think surveys will disappear:

  1. The decline in response rates means that most survey research is being conducted with an ever smaller proportion of the population, who are taking very large numbers of surveys (in many cases several per week). This raises a growing number of concerns that the research is going to become increasingly unrepresentative.
  2. There are a growing number of areas where researchers feel that survey responses are poor indicators of true feelings, beliefs, priorities, and intentions.
  3. There are a growing number of options that can, in some cases, provide information that is faster, better, cheaper – or some combination of all three. Examples of these options include: passive data, big data, neuro-stuff, biometrics, micro-surveys, text processing of open-ended questions and comments, communities, and social media monitoring.

Surveys are the most important thing in market research!
There is a paradox, in market research, about surveys, and this paradox is highlighted by the following statements both being true:

  1. The most important data collection method in market research is the survey (over half of all research conducted, in terms of dollars spent, is conducted via surveys).
  2. The most important change in market research data collection is the move away from surveys.
Because surveys are currently so important to market research there is a vast amount of work going on to improve them, so that they can continue to deliver value, even whilst their share of MR declines. The steps being taken to improve the efficiency and efficacy of surveys include:
  • Mobile surveys
  • Device agnostic surveys
  • Chunking the survey into modules
  • Implicit association
  • Eye-tracking
  • Gamification
  • Behavioural economics
  • Biometrics
  • In the moment research
  • Plus a vast range of initiatives to merge other data, such as passive data, with surveys.

How quickly will surveys disappear?
When assessing how quickly something will disappear we need to assess where it is now and how quickly it could change.

It is hard to know exactly how many surveys are being conducted, especially with the growth of DIY options. So, as a proxy I have taken ESOMAR’s figures on market research spend.

The table below shows the proportion of global, total market research spend that is allocated to: Quant via surveys, Quant via other routes (e.g. people meters, traffic, passive data etc), Qual, and Other (including secondary data, consultancy and some proportion of communities).

The first three rows show the data reported in the ESOMAR Global Market Research reports. Each year reflects the previous year’s data. The data show that surveys grew as a proportion of research from 2007 to 2010. This was despite a reduction in the cost of surveys as F2F and CATI moved to online. From 2010 to 2013 there was indeed a drop in the proportion of all research spend that was devoted to surveys. However, given the falling cost of surveys and the continued growth of DIY, it is likely that the absolute number of surveys may have grown from 2010 to 2013.

Other quant, which covers many of the things that we think will replace surveys, fell from 2007 to 2010. In many cases this was because passive collection techniques became much cheaper. For example the shift from expensive services to Google Analytics.

The numbers in red are my guess as to what will happen over the next few years. My guess is based on 35 years in the industry, talking to the key players, and applying what I see around me.

I think surveys could lose 9 percentage points in 3 years – which is a massive change. Does anybody seriously think it will be much faster? If surveys lose 9 percentage points they will fall below 50% of all research, but still be the largest single method.

I am also forecasting that they will fall another 11 percentage points by 2019 – trends often accelerate – but again, does anybody really think it will be faster? If that forecast is true, by 2019 about one-third of paid for research will still be using surveys. Other quant will be bigger than surveys, but will not be a single approach; there will be many forms of non-survey research.

I also think that Other (which will increasingly mean communities and integrated approaches) and qual will both grow.

What do you think?
OK, I have nailed my flag to the mast, what do you think about this issue? Are my forecasts too high, about right, or too low? Do you agree that the single most important thing about existing data collection methods is the survey process? And, that the most important change is the movement away from surveys?


 

Jun 21, 2014
 
Nissan Small Car

A very large part of market research is based on asking people questions, for example in surveys, focus groups, depth interviews, and online discussions. In general, people are very willing to answer our questions, but the problem is that they will do it even when they can’t give us the right answer.

At IIeX last week, Jan Hofmeyr shared the results of some research where respondents had been asked which brand they buy most often, and he compared their answers to their last 3 and last 6 purchases from audit data. He found that in the last 3 purchases 68% of people had not bought the product they claimed to buy ‘most often’, and in the last 6 purchases 58% of people had not bought their ‘most often’ brand.

The video below is designed for entertainment, but it illustrates the bogus answer problem really well:

There are two key reasons why asking questions can produce bogus answers:

  1. Social desirability bias. People are inclined to try to show themselves in the best possible light. Ask them how often they clean their teeth and they are going to want to give an answer that makes them look good, or at least does not imply they are lazy or dirty. In the video, many of the people know that music fans are supposed to know about music, so they don’t want to appear dumb.
     
  2. Being a poor witness to our own motivations and actions. Writers like Daniel Kahneman, Dan Ariely, and Mark Earls have written about how people tend to be unaware of how they make decisions. Some of the people in the video, primed by the question to assume that they know about the brand, may be deceived by their own thought processes, with what they do know being used as a pattern generator to produce plausible thoughts.

Of course, in addition to these two reasons, some people simply lie – but in my experience that is a tiny proportion (when seeking the views of customers and the general public) compared with the two reasons listed above. However, the problem of conscious lies increases if incentives are offered.

One way to reduce the number of false answers is to make it much easier for people to not answer a question, ideally by not having to say “I don’t know”, and letting people guide you to the strength of their answer. Look at the video again and you will see that many of the people being interviewed are trying to signal they don’t really know about the bands, for example “I don’t know any of their music but I’ve heard from my friends that ….”. For the sake of the interview and the comedy situation the interviewer presses them into appearing to know more. In an information gathering process we should take that as a cue to back off and make it safe or even ‘wise’ to avoid going any further.

Another important step is to avoid asking questions that most people won’t ‘know’ the answer to, such as “What is the most important factor to you when selecting a grocery store?”, “How many cups of coffee will you drink next week?”, “How many units of alcohol do you drink in an average week?”.

If you’d like to know more about asking questions, check out this presentation from Pete Cape.

The problems with direct questions are one of the major reasons that market researchers are looking towards techniques that use one or more of the following:

  • Implicit or ‘neuro’ techniques, such as facial coding, implicit association, and voice analytics.
  • Passive observations, i.e. recording what people actually do.
  • In the moment research, where people give their feedback at the time of an event, not at a later date via recall.


Jun 17, 2014
 

Most samples used by market research are in some sense the ‘wrong’ sample. They are the wrong sample because of one or more of the following:

  • They miss people who don’t have access to the internet.
  • They miss people who don’t have a smartphone.
  • They miss the 80%, 90%, or 99% who decline to take part.
  • They miss busy people.
Samples that suffer these problems include:
  • Central location tests miss the people who don’t come into central locations.
  • Face-to-face, door-to-door struggles with people who tend not to be home or who do not open the door to unknown visitors.
  • RDD/telephone misses people who decline to be involved.
  • Online access panels miss the 95%+ who are not members of panels.
  • RIWI and Google Consumer Surveys – miss the people who decline to be involved, and under-represent people who use the internet less.
  • Mobile research – typically misses people who do not have a modern phone and who do not have a reliable internet package/connection.

But, it usually works!

If we look at what AAPOR call non-probability samples with an academic eye we might expect the research to usually be ‘wrong’. In this case ‘wrong’ means gives misleading or harmful advice. Similarly, ‘right’ means gives information that supports a better business decision.

The reason that market research is a $40 billion industry is that its customers (e.g. marketers, brand managers, etc.) have found it is ‘right’ most of the time. Which raises the question: “How can market research usually work when the sample is usually ‘wrong’?”

There are two key reasons why the wrong sample gives the right answer and these are:

  1. Homogeneity
  2. Modelling

Homogeneity
If different groups of people believe the same thing, or do the same thing, it does not matter, very much, who is researched. As an experiment look at your last few projects and look at the data split by region, split by age, or split by gender. In most cases you will see there are differences between the groups, often differences big enough to measure, but in most cases the differences are not big enough to change the message.

The reason there are so often few important differences is that we are all more similar to each other than we like to think. This is homogeneity. The level of homogeneity increases if we filter by behaviour. For example, if we screen a sample so that they are all buyers of branded breakfast cereal, they are instantly more similar (in most cases) than the wider population. If we then ask this group to rank 5 pack designs, there will usually be no major differences by age, gender, location etc (I will come back to this use of the word usually later).

In commercial market research, our ‘wrong’ samples usually make some effort to reflect the target population: we match their demographics to the population and we screen them by interest (for example, heavy, medium, and light users of the target brand). The result of this is that, surprisingly often, an online access panel or a Google Consumer Surveys test will produce useful and informative answers.

The key issue is usually not whether the sample is representative in a statistical sense, because it usually isn’t; the question should be whether it is a good proxy.

Modelling
The second way that market researchers make their results useful is modelling. If a researcher finds that their data source (let’s assume it is an online access panel) over-predicts purchase, they can down-weight their predictions; if they find their election predictions understate a specific party, they can up-weight the results. This requires having lots of cases and assumes that something that worked in the past will work in the future.
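
As a minimal sketch of the idea (the figures and the simple ratio adjustment are illustrative assumptions, not a description of any particular panel or model):

```python
# Minimal sketch: calibrating panel-based predictions against known past outcomes.
# The numbers and the simple ratio adjustment are illustrative assumptions.

def calibration_factor(past_predictions, past_actuals):
    """Ratio of what actually happened to what the panel predicted, across past studies."""
    return sum(past_actuals) / sum(past_predictions)

def adjust(new_prediction, factor):
    """Down-weight (factor < 1) or up-weight (factor > 1) a new panel estimate."""
    return new_prediction * factor

# Example: the panel has historically over-predicted purchase by about 25%
past_predictions = [0.40, 0.35, 0.50]   # stated purchase intention in past tests
past_actuals     = [0.32, 0.28, 0.40]   # purchase rates later observed in market
factor = calibration_factor(past_predictions, past_actuals)   # = 0.8

print(adjust(0.45, factor))   # a new 45% stated intention is scaled down to 36%
```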

So, what’s the problem?

The problem for market research is that there is no established body of knowledge or science to work out when the ‘wrong’ sample will give the right answer, and when it will give the ‘wrong’ answer. Some of the cases where the wrong sample gave the wrong answer include:

  • The 1936 US presidential election, where a sample of 2 million people failed to predict that Roosevelt would beat Landon.
  • In 2012 Google Consumer Surveys massively over-estimated the number of people who edit Wikipedia – perhaps by as much as 100% – see Jeffrey Henning’s review of this case.

My belief is that market researchers need to get over the sampling issue, by recognising the problems and by seeking to identify when the wrong sample is safe, when it is not safe, and how to make it safer.

When and why does the right sample give the wrong answer?

However, there is probably a bigger problem than the wrong sample. This problem is when we use the right sample, but we get the wrong answer. There are a wide variety of reasons, but key ones include:

  • People generally don’t know why they do things, and they don’t know what they are going to do in the future, but they will usually answer our questions.
  • Behaviour is contextual, for example choices are influenced by what else is on offer – research is too often either context free, or applies the wrong context, or assumes the context is consistent.
  • Behaviour is often not linear, and quite often does not follow the normal distribution – but most market research is based on means, linear regression, correlation etc.

A great example of the right sample giving the wrong answer is New Coke. The research, evidently, did not make it clear to participants that this new flavour was going to replace the old flavour, i.e. they would lose what they saw as “their Coke”.

In almost every product test conducted there are people saying they would buy it who would certainly not buy it. In almost every tracker there are people saying they have seen, or even used, products they have not seen – check out this example.

The issue that researchers need to focus on is total error, not sampling error, not total survey error, but total error. We need to focus on producing useful, helpful advice.


Apr 13, 2014
 
Shibuya At Night

OK, let’s get one thing clear from the outset; I am not saying social media mining and monitoring (the collection and automated analysis of large quantities of naturally occurring text from social media) has met with no success. But, I am saying that in market research the success has been limited.

In this post I will highlight a couple of examples of success, but I will then illustrate why, IMHO, it has not had the scale of success in market research that many people had predicted, and finally share a few thoughts on where the quantitative use of social media mining and monitoring might go next.

Some successes
There have been some successes and a couple of examples are:

Assessing campaign or message breakthrough. Measuring social media can be a great way to see if anybody is talking about a campaign or not, and of checking whether they are talking about the salient elements. However, because of some of the measurement challenges (more on these below) the measurement often ends up producing a three-level result: a) very few mentions, b) plenty of mentions, c) masses of mentions. In terms of content the measures tend to be X mentions on target, or Y% of the relevant mentions were on target – which in most cases is informative, but does not produce a set of measures with absolute utility, or measures that can be tightly aligned with ROI.

An example of this use came with the launch of the iPhone 4 in 2010. Listening to SM made it clear that people had detected that the phone did not work well for some people when held in their left hand, that Apple’s message (which came across as) ‘you should be right handed’ was not going down well, and that something needed to be done. The listening could not put a figure on how many users were unhappy, nor even if users were less or more angry than non-users, but it did make it clear that something had to be done.

Identifying language, ideas, topics. By adding humans to the interpretation, many organisations have been able to identify new product ideas (the Nivea story of how it used social media listening to help create Nivea Invisible for Black and White is a great example). Other researchers, such as Annie Pettit, have shown how they have combined social media research with conventional research, to help answer problems.

Outside of market research. Other users of social media listening, such as PR and reaction marketers appear to have had great results with social media, including social media listening. One of the key reasons for that is that their focus/mission is different. PR, marketing, and sales do not need to map or understand the space, they need to find opportunities. They do not need to find all the opportunities, they do not even need to find the best opportunities, they just need to find a good supply of good opportunities. This is why the use of social media appears to be growing outside of market research, but also why its use appears to be in relative decline inside market research.

The limitations of social media monitoring and listening
The strength of social media monitoring and listening is that it can answer questions you had not asked, perhaps had not even thought of. Its weakness is that it can’t answer most of the questions that market researchers’ clients ask.

The key problems are:

  • Most people do not comment in social media, most of the comments in social media are not about our clients’ brands and services, and the comments do not typically cover the whole range of experiences (they tend to focus on the good and the bad). This leaves great holes in the information gathered.
  • It is very hard to attribute the comments to specific groups, for example to countries, regions, to users versus non-users – not to mention little things like age and gender.
  • The dynamic nature of social media means that it is very hard to compare two campaigns or activities, for example this year versus last year. The number of people using social media is changing, how they are using it is changing, and the phenomenal growth in the use of social media by marketers, PR, sales, etc is changing the balance of conversations. Without consistency, the accuracy of social media measurements is limited.
  • Most automated sentiment analysis is considered by insight clients and market researchers to either be poor or useless. This means good social media usage requires people, which tends to make it more expensive and slower, often prohibitively expensive and often too slow.
  • Social media deals with the world as it is, brands can’t use it to test ads, to test new products and services, or almost any future plan.

The future?
Social media monitoring and listening is not going to go away. Every brand should be listening to what its customers and in many cases the wider public are saying about its brands, services, and overall image. This is in addition to any conventional market research it needs to do; this aspect of social media is not a replacement for anything, it is a necessary extra.

Social media has spawned a range of new research techniques that are changing MR, such as insight communities, smartphone ethnography, social media bots, and netnography. One area of current growth is the creation of 360 degree views by linking panel and/or community members to their transactional data, passive data (e.g. from their PC and mobile device), and social media data. Combined with the ability of communities and panels to ask questions (qual and quant) this may create something much more useful than just observational data.

I expect more innovations in the future. In particular I expect to see more conversations in social media initiated by market researchers, probably utilising bots. For example, programming a bot to look out for people using words that indicate they have just bought a new smartphone and asking them to describe how they bought it, what else they considered etc – either in SM or via asking them to continue the chat privately. There are a growing number of rumours that some of the major clients are about to adopt a hybrid approach, combining nano-surveys, social media listening, integrated data, and predictive analytics, and this could be really interesting, especially in the area of tracking (e.g. brand, advertising, and customer satisfaction/experience).

I also expect two BIG technical changes that will really set the cat amongst the pigeons. I expect somebody to do a Google and introduce a really powerful, free or almost free alternative to the social media mining and monitoring platforms, and I expect one or more companies to come up with sentiment analysis solutions that are really useful. I think a really useful platform will include the ability to analyse images and videos, to follow links (many interesting tweets and shares are about the content of the link), to build a PeekYou type of database of people (to help attribute the comments), and will have a much better text analytics approach.

 

Apr 07, 2014
 

Last week Jeffrey Henning gave a great #NewMR lecture on how to improve the representativeness of online surveys (click here to access the slides and recordings). During the lecture he touched lightly on the topic of calculating sampling error from non-probability samples, pointing out that it did not really do what it was supposed to. In this blog I want to highlight why I recommend using this statistic as a measure of reliability, but not validity.

If we calculate the sampling error for a non-probability sample, for example from an online access panel, we are not representing the wider population. The population for this calculation is just those people who might have taken the survey. For example, just those members of the online access panel who met the screening criteria and who were willing (during the survey period) to take the study. The sampling error tells us how good our estimates of this population are (i.e. those members of the panel who met the criteria and who were willing to take a survey at that particular time).

If we take a sample of 1000 people from an online access panel and we calculate that the confidence interval is +/-3% at the 95% level, what we are saying is that if we had done another test, on the same day, with the same panel, with a different group of people, we are 95% sure that the answer we would have got would have been within 3% of the first test. That is a measure of reliability. But we are not saying that if we had measured the wider population the answer would have been within 3%, or 10% or any other number we could quote.
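
As a quick check of the arithmetic, here is a minimal sketch of where the +/-3% comes from (assuming a proportion near 50% and treating the panel sample as if it were a simple random sample of the panel’s willing respondents):

```python
import math

def margin_of_error(n, p=0.5, z=1.96):
    """Half-width of the 95% confidence interval for a proportion (simple random sampling assumed)."""
    return z * math.sqrt(p * (1 - p) / n)

print(round(margin_of_error(1000) * 100, 1))   # ~3.1 percentage points for n = 1000
```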

The sampling error statistic from a panel is not about validity, since we can’t estimate how representative the panel is of the wider population. But, it does give us a statistical measure of how likely we are to get the same answer again if we repeat the study on the same panel, with the same sample specification, during the same period of time – which is a pretty good statement of reliability.

Note: to researchers, reliability is about whether something measures the same way each time. Validity relates to whether what is measured is correct. A metal metre ruler that is 10cm short is reliable (it is always 10cm short), but it is not as valid as we would like.

My recommendation is to calculate the sampling error and use it to indicate which values from the non-probability sample are at least big enough to be reliable. But let’s not claim it represents the sampling error of the wider population, nor that it directly links to validity.

I would recommend adding text something like: “The sampling reliability of this estimate at the 95% level is +/- X%, which means that if we used the same sampling source 20 times, with the same specification, we would expect the answers to be within X% 19 times.”

Total Survey Error
Another reason to be careful with sampling error is that it is only one source of error in a survey. Asking leading questions, asking questions that people can’t answer (for example because we are poor witnesses to our own plans and motivations), or asking questions that people don’t want to answer (for example because of social desirability bias), can all result in much bigger problems than sampling error.

Researchers can sometimes be too worried about sampling error, leading them to ignore much bigger sources of error in their work.

 

Dec 23, 2013
 
The material below is an excerpt from a book I am writing with Navin Williams and Sue York on Mobile Market Research, but its implications are much wider and I would love to hear people’s thoughts and suggestions.

Most commercial fields have methods of gaining and assessing insight other than market research, for example testing products against standards or legal parameters, test launching, and crowd-funding. There are also a variety of approaches that although used by market researchers are not seen by the market place as exclusively (or even in some cases predominantly) the domain of market research, such as big data, usability testing, and A/B testing.

The mobile ecosystem (e.g. telcos, handset manufacturers, app providers, mobile services, mobile advertising and marketing, mobile shopping etc) employs a wide range of these non-market research techniques, and market researchers working in the field need to be aware of the strengths and weaknesses of these approaches. Market researchers need to understand how they can use the non-market research techniques and how to use market research to complement what they offer.

The list below covers techniques frequently used in the mobile ecosystem which are either not typically offered by market researchers or which are offered by a range of other providers as well as market researchers. Key items are:

  • Usage data, for example web logs from online services and telephone usage from the telcos.
  • A/B testing.
  • Agile development.
  • Crowdsourcing, including open-source development and crowdfunding.
  • Usability testing.
  • Technology or parameter driven development.

Usage data

The mobile and online worlds leave an extensive electronic wake behind users. Accessing a website tells the website owner a large amount about the user, in terms of hardware, location, operating system, and the language the device is using (e.g. English, French etc), and the owner might make an estimate of things like age and gender based on the sites visited and the answers picked. Use a mobile phone and you tell the telco who you contacted, where you were geographically, how long the contact lasted, and what sort of contact it was (e.g. voice or SMS). Use email, such as Gmail or Yahoo, and you tell the service provider who you contacted, which of your devices you used, and the content of your email. Use a service like RunKeeper or eBay or Facebook and you share a large amount of information about yourself and, in most cases, about other people too.

In many fields, market research is used to estimate usage and behaviour, but in the mobile ecosystem there is often at least one company who can see this information without using market research, and see it in much better detail. For example, a telco does not need to conduct a survey with a sample of its subscribers to find out how often they make calls or to work out how many texts they send, and how many of those texts are to international numbers. The telco has this information, for every user, without any errors.

Usage data tends to be better, cheaper, and often quicker than market research for recording what people did. It is much less powerful in working out why patterns are happening, and it is thought (by some people) to be weak in predicting what will happen if circumstances change. However, it should be noted that the advocates of big data and in particular ‘predictive analytics’ believe that it is possible to work out the answer to ‘what-if’ questions, just from usage/behaviour data.

Unique access to usage data
One limitation to the power of usage data is that in most cases only one organisation has access to a specific section of usage data. In a country with two telcos, each will only have access to the usage data for their subscribers, plus some cross-network traffic information. The owner of a website is the only company who can track the people who visit that site (* with a couple of exceptions). A bank has access to the online, mobile and other data from its customers, but not data about the users of other banks.

This unique access feature of usage data is one of the reasons why organisations buy data from other organisations and conduct market research to get a whole market picture.

* There are two exceptions to the unique access paradigm.
The first is that if users can be persuaded to download a tracking device, such as the Alexa.com toolbar, then that service will build a large, but partial picture of users of other services. This is how Alexa.com is able to estimate the traffic for the leading websites globally.

The second exception is if the service provider buys or uses a tool or service from a third party then some information is shared with that provider.

A complex and comprehensive example of this type of access is Google, which signs users up to its services (including Android), offers web analytics to websites, and serves ads to websites, which allows it to gain a large but partial picture of online and mobile behaviour.

Legal implications of usage data
Usage data, whether it is browsing, emailing, mobile, or financial, is controlled by law in most countries, although the laws tend to vary from one jurisdiction to another. Because the scale and depth of usage data is a new phenomenon, and because the tools to analyse it and the markets for selling/using it are still developing, the laws tend to lag behind practice.

A good example of the challenges that legislators and data owners face in determining what is permitted and what is not is provided by the problems that Google had in Spain and the Netherlands towards the end of 2013. The Dutch Government’s Data Protection Agency ruled in November 2013 that Google had broken Dutch law by combining data from its many services to create a holistic picture of users. Spain went one step further and fined Google 900,000 Euros (about $1.25 million) for the same offence. This is unlikely to be the end of the story: the laws might change, Google might change its practices (or the permissions it collects), or the findings might be appealed. However, these cases illustrate that data privacy and protection are likely to create a number of challenges for data users and legislators over the next few years.

A/B testing

The definition of A/B testing is a developing and evolving one, and it is likely to evolve and expand further over the next few years. At its heart A/B testing is based on a very old principle: create a test where two offers differ in only one detail, present these two choices to matched but separate groups of people to evaluate, and whichever is the more popular is the winner. What makes modern A/B testing different from traditional research is the tendency to evaluate the options in the real market, rather than with research participants. One high profile user of A/B testing is Google, who use it to optimise their online services. Google systematically, and in many cases automatically, select a variable, offer two options, and count the performance with real users. The winning option becomes part of the system.

Google’s A/B testing is now available to users of some of its systems, such as Google Analytics. There are also a growing range of companies offering A/B testing systems. Any service that can be readily tweaked and offered is potentially suitable for A/B testing – in particular virtual or online services.
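
For illustration, here is a minimal sketch of the kind of significance check that sits behind a basic A/B comparison; the traffic and conversion figures are invented, and real platforms will have their own, more sophisticated, methods:

```python
import math

def two_proportion_z(success_a, n_a, success_b, n_b):
    """z statistic for the difference between two observed conversion rates."""
    p_a, p_b = success_a / n_a, success_b / n_b
    p_pool = (success_a + success_b) / (n_a + n_b)
    se = math.sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))
    return (p_a - p_b) / se

# Illustrative split test: version A converts 5.2% of 10,000 visitors, version B 4.6%
z = two_proportion_z(520, 10000, 460, 10000)
print(round(z, 2))   # ~1.97, just above the conventional 1.96 threshold for 95% confidence
```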

The concept of A/B testing has moved well beyond simply testing two options and assessing the winner, for example:

  • Many online advertising tools allow the advertiser to submit several variations and the platform adjusts which execution is shown most often and to whom it is shown to maximise a dependent variable, for example to maximise click through.
  • Companies like Phillips have updated their direct mailing research/practice by developing multiple offers, e.g. 32 versions of a mailer, employing design principles to allow the differences to be assessed. The mailers are used in the market place, with a proportion of the full database, to assess their performance. The results are used in two ways. 1) The winning mailer is used for the rest of the database. 2) The performance of the different elements is assessed to create predictive analytics for future mailings.
  • Dynamic pricing models are becoming increasingly common in the virtual and online world. Prices in real markets, such as stock exchanges have been based for many years on dynamic pricing, but now services such as eBay, Betfair, and Amazon apply differing types of automated price matching.
  • Algorithmic bundling and offer development. With services that are offered virtually the components can be varied to iteratively seek combinations that work better than others.

The great strength of A/B testing is in the area of small, iterative changes, allowing organisations to optimise their products, services, and campaigns. Market research’s key strength, in this area, is the ability to research bigger changes and help suggest possible changes.

Agile development

Agile development refers to operating in ways where it is easy, quick, and cheap for the organisation to change direction and to modify products and services. One consequence of agile development is that organisations can try their product or service with the market place, rather than assessing it in advance.

Market research is of particular relevance when the costs of making a product are large, or where the consequences of launching an unsatisfactory product or service are large. But, if products and services can be created easily and the consequences of failure are low, then ‘try it and see’ can be a better option than classic forms of market research. Whilst the most obvious place for agile development is in the area of virtual products and services, it is also used in more tangible markets. The move to print on demand books has reduced the barriers to entry in the book market and facilitated agile approaches. Don Tapscott in his book Wikinomics talks about the motorcycle market in China, which adopted an open-source approach to its design and manufacture of motorcycles, something which combined agile development and crowdsourcing (the next topic in this section).

Crowdsourcing

Crowdsourcing is being used in a wide variety of ways by organisations, and several of these ways can be seen as an alternative to market research, or perhaps as routes that make market research less necessary. Key examples of crowdsourcing include:

  • Open source. Systems like Linux and Apache are developed collaboratively and then made freely available. The priorities for development are determined by the interaction of individuals and the community, and the success of changes is determined by a combination of peer review and market adoption.
  • Crowdfunding. One way of assessing whether an idea has a good chance of succeeding is to try and fund it through a crowdfunding platform, such as Kickstarter. The crowdfunding route can provide feedback, advocates, and money.
  • Crowdsourced product development. A great example of crowdsourcing is the T-shirt company Threadless.com. People who want to be T-shirt designers upload their designs to the website. Threadless displays these designs to the people who buy T-shirts and asks which ones people want to buy. The most popular designs are then manufactured and sold via the website. In this sort of crowdsourced model there is little need for market research as the audience get what the audience want, and the company is not paying for the designs, unless the designs prove to be successful.

Usability testing

Some market research companies offer usability testing, but there are a great many providers of this service who are not market researchers and who do not see themselves as market researchers. The field of usability testing brings together design professionals, HCI (human computer interaction), and ergonomics, as well as market researchers.

Usability testing for a mobile phone, or a mobile app, can include:

  • Scoring it against legal criteria to make sure it conforms to statutory requirements.
  • Scoring it against design criteria, including criteria such as disability access guidelines.
  • User lab testing, where potential users are given access to the product or service and are closely observed as they use it.
  • User testing, where potential users are given the product or given access to the service and use it for a period of time, for example two weeks. The usage may be monitored, there is often a debrief at the end of the usage period (which can be qualitative, quantitative, or both), and usage data may have been collected and analysed.

Technology or parameter driven

In some markets there are issues other than consumer choice that guide design and innovation. In areas like mobile commerce and mobile connectivity, there are legal and regulatory limits and requirements as to what can be done, so the design process will often be focused on how to maximise performance and minimise cost whilst complying with the rules. In these situations, the guidance comes from professionals (e.g. engineers or lawyers) rather than from consumers, which reduces the role for market research.

Future innovations

This section of the chapter has looked at a wide range of approaches to gaining insight that are not strengths of market research. It is likely that this list will grow as technologies develop and as the importance of the mobile ecosystem continues to grow.

As well as new non-market research approaches being developed it is possible, perhaps likely, that areas which are currently seen as largely or entirely the domain of market research will be shared with other non-market research companies and organisations. The growth in DIY or self-serve options in surveys, online discussions, and even whole insight communities are an indication of this direction of travel.


So, that is where the text is at the moment. Plenty of polishing still to do. But here are my questions:
  1. Do you agree with the main points?
  2. Have I missed any major issues?
  3. Are there good examples of the points I’ve made that you could suggest highlighting/using?

Nov 24, 2013
 

To help celebrate the Festival of NewMR we are posting a series of blogs from market research thinkers and leaders from around the globe. These posts will be from some of the most senior figures in the industry to some of the newest entrants into the research world.

A number of people have already agreed to post their thoughts, and the first will be posted later today. But, if you would like to share your thoughts, please feel free to submit a post. To submit a post, email a picture, bio, and 300 – 600 words on the theme of “Opportunities and Threats faced by Market Research” to admin@newmr.org.

Posts in this series
The following posts have been received and posted:

Nov 15, 2013
 

London, 14 November 2013: the ICG (the Independent Consultants Group) held their fourth Question Time event, where five leading lights of the MR industry were invited to answer questions posed by ICG members and the audience. I had the honour to be the chair of the session, and to ask the five luminaries the questions.

The five panel members were (quoting their description on the ICG site):

  • Ken Parker, AQR Chairman; founder – Discovery Research; sports research expert and football fanatic
  • Becky Rowe, MD of ESRO; an award-winning researcher for NHS ethnography work
  • Paul Edwards, Chief Strategy Officer, Hall & Partners; vastly experienced industry leader and ad planner
  • Janet Kiddle, Founder: Steel Magnolia and long-time ICG member; ex MD of TRBI
  • Mike Barnes, Consultant; ex Head of Research, RBS

As ever the session was a social success with lots of networking and discussion, including a chance for me to hear about Dinko Svetopetic’s success in promoting Insight Communities in Poland via his company MRevolution.

But, what I wanted to post here were my key takeaways from the session.

  1. Big Data is the topic of the moment. However, the general view is that Big Data is making relatively slow progress and will initially have a much bigger impact on the large agencies than on the independents and consultants. Indeed, Big Data may even be an opportunity for independents in that they can provide help on understanding the “Why?” and helping shape the “So what?”
  2. DIY is a threat to independents and consultants, but it is also an opportunity. When clients find they have bitten off more than they can chew, or when they get out of their depth, the independents and consultants are a great resource to help resolve issues.
  3. One challenge for independents (and clients) is how to stay up-to-date with the latest approaches, tools, and technology. The view of the panel was that nobody can stay fully informed about everything. The key for independents is to develop strengths, not an ever wider offering, and to support this with networks.
  4. Another threat to independents and consultants is competition from people supplying poor research, particularly in the context of faster/cheaper research. The general response of the room and the panel was that independents should continue to stress the need for good research, that analysis requires experience and time, and to focus on the clients who are looking for something more than ‘value’ or bulk research. Ken Parker was also able to report back (with his AQR hat on) on the moves being made to create suitable accreditation schemes for qual research and for recruiters. The ICG is involved in this initiative, so keep your eyes on their website.
  5. The hunt is clearly still on for a better way of presenting information. Becky Rowe made the case for hiring professional communicators/designers to improve the way we communicate in MR. Mike Barnes gave the client’s perspective that presenters need to have done their homework and identified what a particular audience expects and needs – one size does not fit all.
  6. In terms of key trends independents need to be aware of, the panel identified:
    • Online qual
    • DIY
    • Big Data
    • The need to combine asking questions (in qual and quant) with observational research

For me, one of the interesting nuggets was that over two-thirds of the room had delivered at least one ‘old fashioned’ written report (i.e. more than 10 pages of words), in the last year. To me this suggests that clients who are working with independents are looking for something different than the sort of ‘fast food’ they typically buy from the agencies.

Oct 20, 2013
 

Tens of thousands of new products are tested each year, as part of concept screening, NPD, and volumetric testing. Some products produce a positive result, and everybody is pretty happy, but many produce a negative result. A negative result might be that a product has a low stated intention to purchase or it might be that it fails to create the attitude or belief scores that were being sought.

Assuming that the research was conducted with a relatively accepted technique, what might the negative result mean?

A bad product/idea
One possibility is that the product is simply not good enough. This means that if the product is launched, as currently envisaged, it is very likely to fail. In statistical terms this is the true negative.

The false negative
The second possibility is that the result is a Type II error, i.e. a false negative. The product is good, but the test has not shown this. Designers and creatives seem to think this is true in a large proportion of cases, and there are many ways that this false negative result can occur.

The test was unlucky
If a test is based on a procedure with a statistical power of 80% (i.e. an 80% chance of detecting a genuinely good product), then one-in-five times a success will be recorded as a failure. A recent article in the New Scientist (19 October 2013) pointed out that since most tests focus on minimising Type I errors (false positives), typical power is often much less than 80%, meaning unlucky results will be more common.

The sample size was too small
If a stimulus produces a large effect, it will be obvious even with a small sample, but if the effect is small a large sample is needed to indicate it, and if it is not indicated, it will typically be called a failure. For example, if a sample of 1000 (randomly selected) people is used, the result is normally taken to be +/- 3%, which means relatively small benefits can be identified. However, if a sample size of 100 is used the same assumptions would imply +/- 10%, which means effects have to be much larger to be likely to be found.
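
The two figures quoted above can be reproduced with the standard (approximate) formula; a minimal sketch, assuming a proportion near 50% and simple random sampling:

```python
import math

def margin_of_error(n, p=0.5, z=1.96):
    """Approximate 95% margin of error for a proportion (simple random sampling assumed)."""
    return z * math.sqrt(p * (1 - p) / n)

for n in (1000, 100):
    print(n, round(margin_of_error(n) * 100, 1))
# 1000 -> ~3.1 points: a 5-point lift over a benchmark is comfortably detectable
# 100  -> ~9.8 points: the same 5-point lift sits well inside the noise
```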

The description does not adequately describe the product
If the product description is not good enough, then the result of the test is going to be unreliable, which could result in a good idea getting a bad result.

The product or its use can’t be envisaged
Some products only become appealing once they are used; apps and software often fall into this category, but so do products as varied as balsamic vinegar, comfy socks, and travel cards (such as the London Oyster). Some products only become appealing when other people start to use them, as Mark Earls has shown in “I’ll have what she’s having”. Generally, copying behaviour is hard to predict from market research tests, producing a large number of false positives and false negatives. In these cases, the purchase intention scale (and alternatives such as prediction markets) can be a very poor indicator of likely success.

In many cases people may be able to identify that they like a product, but are unable to reliably forecast whether they will actually buy and use it, i.e. they can’t envisage how the product will fit in their lives. For example, I have lost count of the number of holiday locations and restaurants I have been convinced that I would re-visit, only to be proved wrong. This is another situation where the researcher’s trusty purchase intention scale can be a very poor indicator.

The wrong people were researched
If the people who might buy the product were not researched, then the result of the test is unlikely to forecast their behaviour. For example, in the UK, energy drinks are less about sports people than office workers looking for a boost. Range Rovers are less for country folk than they are for Londoners.

So, how should a bad result be dealt with?
This is where science becomes art, and sometimes it will be wrong (but the science is also wrong some of the time). So, here are a few thoughts/suggestions.

  • If you expected the product/concept to fail, the test has probably told you what you already knew, so it is most likely safe to accept the negative finding.
  • If you have tested several similar products, and this is one of the weaker results, it is probably a good indication the product is weak.

In both of these cases, the role of the modern market researcher is not just to give bad news, it should also include suggesting recommendations for what might work, either modifications to the product/concept, or alternative ideas.

If you can’t see why it failed
If you can’t see why a product failed, try to find ways of understanding why. Look at the open-ended comments to see if they provide clues. Try to assess whether the idea was communicated. For example, did people understand the benefits, and reject them, or not understand the benefits?

Is the product/concept one where people are likely to be able to envisage how the product would fit in their life? If not, you might want to suggest qualitative testing, in-home use test, or virtual reality testing.

Some additional questions
To help understand why products produce a failing score, I find it useful to include the following in screening studies:

  • What sort of people might use this product?
  • Why might they like it/use it?
  • What changes would improve it for these people?

Aug 02, 2013
 

This post has been written in response to a query I receive fairly often about sampling. The phenomenon it looks at relates to the very weird effects that can occur when a researcher uses non-interlocking quotas, effects that I am calling unintentional quotas, for example when using an online access panel.

In many studies, quota controls are used to try to achieve a sample to match a) the population and/or b) the target groups needed for analysis. Quota controls fall into two categories, interlocking and non-interlocking.

The difference between the two types can be shown with a simple example, using gender (Male and Female) and colour preference (Red or Blue). If we know that 80% of Females prefer Red, if we know that 80% of Men prefer Blue, and if there are an equal number of Males and Females in our target population, then we can create interlocking quotas. In our example we will assume that the total sample size wanted is 200.

  • Males who prefer Red = 50% * 20% * 200 = 20
  • Males who prefer Blue = 50% * 80% * 200 = 80
  • Females who prefer Red = 50% * 80% * 200 = 80
  • Females who prefer Blue = 50% * 20% * 200 = 20

These quotas deliver the 200 people required, in the correct proportions.
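
For anyone who wants to see the arithmetic laid out, here is a minimal sketch of the same calculation (using the preferences assumed in the example):

```python
# Minimal sketch: turning assumed population proportions into interlocking quota cells.
total = 200
gender_share = {"Male": 0.5, "Female": 0.5}
colour_given_gender = {                      # assumed to be known in advance
    ("Male", "Red"): 0.2, ("Male", "Blue"): 0.8,
    ("Female", "Red"): 0.8, ("Female", "Blue"): 0.2,
}

quotas = {
    (gender, colour): round(total * gender_share[gender] * share)
    for (gender, colour), share in colour_given_gender.items()
}
print(quotas)
# {('Male', 'Red'): 20, ('Male', 'Blue'): 80, ('Female', 'Red'): 80, ('Female', 'Blue'): 20}
```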

The Problems with Interlocking Quotas
The problem with the interlocking quotas above is that it requires the researcher to know what the colour preference of Males versus Females is, before doing the research. In everyday market research the quotas are often more complex, for example: 4 regions, 4 age breaks, 2 gender breaks, 3 income breaks. This pattern (of region, age, gender, and income) would generate 96 interlocking cells, and the researcher would need to know the population data for each of these cells. If these characteristics were then to be combined with a quota related to some topic (such as coffee drinking, car driving, TV viewing etc) then the number of cells becomes very large, and it is very unlikely the researcher would know the proportions for each cell.

Non-Interlocking Quotas
When interlocking cells become too tricky, the answer tends to be non-interlocking cells.

In our example above, we would have quotas of:

  • Male 100
  • Female 100
  • Prefer Red 100
  • Prefer Blue 100

The first strength of this route is that it does not require the researcher to know the underlying interlocking structure of the characteristics in the population. The second strength is that it makes it simple for the sample to be designed for the researcher’s need. For example, if in the population we know that Red is preferred by 80% of the population, then a researcher might still collect 100 Red and 100 Blue, to ensure the Blue sample was large enough to analyse, and the total sample could be created by weighting the results (to down-weight Blue, and up-weight Red).
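
As a minimal sketch of that weighting step (using the 80/20 Red/Blue example; real weighting schemes are usually more involved):

```python
# Minimal sketch: weighting a 100/100 Red/Blue sample back to an 80/20 population split.
sample_counts = {"Red": 100, "Blue": 100}
population_share = {"Red": 0.8, "Blue": 0.2}
total = sum(sample_counts.values())

weights = {
    colour: (population_share[colour] * total) / sample_counts[colour]
    for colour in sample_counts
}
print(weights)   # {'Red': 1.6, 'Blue': 0.4} -- each Red answer counts 1.6x, each Blue 0.4x
```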

Unintentional Interlocking Quotas
However, non-interlocking quotas can have some very weird and unpleasant effects if there are differences in response rates in the sample. This is best shown by an example.

Let’s make the following assumptions about the population for this example:

  • Prefer Red 80%
  • Prefer Blue 20%
  • No differences in colour gender preferences, i.e. 80% of males and females prefer Red
  • Female response rate 20%
  • Male response rate 10%

The researcher knows that overall 80% of people prefer Red, but does not know what the figures are for males and females; indeed, the researcher hopes this project will throw some light on any differences.

The specification of the study is to collect 200 interviews, using the following non-interlocking quotas.

  • Male 100
  • Female 100
  • Prefer Red 100
  • Prefer Blue 100

A largish initial sample of respondents is invited, let’s assume 1000 males and 1000 females, noting that 1000 males at a 10% response rate should deliver 100 completes.

However!!!
After 125 completes have been achieved the pattern of completed interviews looks like this:

  • Female Red 67
  • Female Blue 17
  • Male Red 33
  • Male Blue 8

This is because the probability of each of the 125 interviews can be estimated by combining the chance it is male or female (a 10% male response rate and a 20% female response rate means that each complete is one-third likely to be a male and two-thirds likely to be a female) with the preference for Red (80%) and Blue (20%). This, to the nearest round percentages, gives us the following odds: Female Red 53%, Female Blue 13%, Male Red 27%, Male Blue 7%.

The significance of 125 completes is that the Red Quota is complete. No more Reds can be collected. This, in turn, means:

  • The remaining 75 completes will all be people who prefer Blue
  • 16 of the remaining interviews will be Female (we already have 84 Females, so the Female quota will close when we have another 16)
  • 59 of the remaining interviews will be Male; Male Blue will be the only cell left to fill
  • The rapid filling of the Red quota, especially with Females, has resulted in interlocking quotas being created for the Blue cells.

The final result from this study will be:

  • Female Red 67
  • Female Blue 33
  • Male Red 33
  • Male Blue 67

Although there is no gender bias to colour preference in the population, in our study we have created a situation where two-thirds of Males prefer Blue, and two-thirds of the Females prefer Red.
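
For anyone who wants to see the mechanism in action, here is a minimal simulation sketch of the example (using the example’s assumed response rates, preferences, and quotas; the exact counts will vary a little from run to run):

```python
import random
random.seed(1)

# Minimal simulation of the worked example: no gender difference in colour preference,
# but a 10% male vs 20% female response rate, with non-interlocking quotas of 100 each.
quotas = {"Male": 100, "Female": 100, "Red": 100, "Blue": 100}
cells = {(g, c): 0 for g in ("Male", "Female") for c in ("Red", "Blue")}
invites = {"Male": 0, "Female": 0}
response_rate = {"Male": 0.10, "Female": 0.20}

while sum(quotas.values()) > 0:
    gender = random.choice(["Male", "Female"])            # invites sent 50/50 to males and females
    invites[gender] += 1
    if random.random() >= response_rate[gender]:
        continue                                          # invited, but did not respond
    colour = "Red" if random.random() < 0.8 else "Blue"   # 80% prefer Red, regardless of gender
    if quotas[gender] > 0 and quotas[colour] > 0:         # accept only if both quotas are open
        quotas[gender] -= 1
        quotas[colour] -= 1
        cells[(gender, colour)] += 1

print(cells)     # roughly Male Red 33, Male Blue 67, Female Red 67, Female Blue 33
print(invites)   # the male invite count ends up in the low thousands, as the next paragraph calculates
```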

In this example we are going to have to invite a lot more Males. We started by inviting 1000 Males, and with a response rate of 10% we might expect to collect our 100 completes. But, we have ended up needing to collect 67 Male Blues, because of the unintentional interlocking quotas. We can work out the number of invites it takes to collect 67 Male Blues by dividing 67 by the product of the response rate (10%) and the incidence of preferring Blue (20%), which gives us 67 / (10% * 20%) = 3,350. The 1000 male invites need to be boosted, by another 2,350, to 3,350 to fill the cells. Most researchers will have noticed that the last few cells in a project are hard to fill; that is because they have created unintentional interlocking quotas, locking the hardest cells together, which makes them even harder.

This, of course, is a very simple example. We only have two variables, each with two levels, and the only varying factor is the response rate between Male and Female. In an everyday project we would have more variables, and response rates will often vary by age, gender, and region. So, the scale of the problem in typical projects using non-interlocking quotas is likely to be larger than in this example, at least for the harder cells to complete.

Improving the Sampling/Quota Controlling Process
Once we realise we have a problem, and with the right information, there is plenty we can do to remove or ameliorate the problem.

  • Match the invites to the response rates (see the sketch after this list). If, in the example above, we had invited twice as many Males as Females the cells would have completed perfectly.
  • Use interlocking cells. To do this you might run an omnibus before the main survey to determine what the cells targets should be.
  • Use the first part of the data collection to inform the process. So, in the example above we could have set the quotas to 50 for each of the four cells. As soon as one cell fills we look at the distribution of the data and amend the structure of the quotas, making some of them interlocking, perhaps relaxing (i.e. make bigger) some of the others, and invite more of the sorts of people we are missing. This does not fix the problem, but it can greatly reduce it, especially if you bite the bullet and increase the sample size at your expense.
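
As mentioned in the first bullet, matching invites to expected response rates is straightforward arithmetic; a minimal sketch using the example’s assumed figures:

```python
# Minimal sketch: phase invites to expected response rates (figures are the worked
# example's assumptions, not real panel data).
targets = {"Male": 100, "Female": 100}            # completes wanted per group
response_rate = {"Male": 0.10, "Female": 0.20}    # expected response rates

invites = {group: round(targets[group] / response_rate[group]) for group in targets}
print(invites)   # {'Male': 1000, 'Female': 500} -- males need twice as many invites per complete
```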

Working with panel companies. Tell the panel company that you want them to phase their invites to match likely response rates. They will know which demographics respond better. For the demographic cells, watch to see that they are advancing in step. For example, watch to see that Young Males, Young Females, Older Males, and Older Females are all filling at the same rate and shout if this is not happening.

It is a good idea to make sure that the fieldwork is not going to happen so fast that you won’t have time to review it and make adjustments. As a rule of thumb, you want to review the data when one of the cells is about 50% full. At that stage you can do something about it. This means you do not want the survey to start after you leave the office, if there is a risk of 50% of the data being collected before the start of the next day.


Questions? Is this a problem you have come across? Do you have other suggestions for dealing with it?