Bayesian Statistics – Conclusion

People groan whenever I bring up statistics in relation to marketing theory.  In reality, though, most marketing decisions are made based on numbers.  Without some level of smart statistical analysis, you can’t make an informed decision based on your data, and all of those research dollars are wasted.

Granted, the case study I posted yesterday was a relatively simple one.  Three products across three demographics.  I simplified things further by assuming each person in each segment rated each product.  In reality, you can never truly know who rated what.  Say, for example, you have a segment of 1,000 people.  In this group, you have 1,000 ratings for Product A, 700 ratings for Product B, and 300 ratings for Product C.  This is the situation in which you’d use a Bayesian weighting filter.  For all we know, 200 of the people who rated Product C also rated Product B, and the other 100 rated nothing else besides Product A.

So the groups we’re looking at share some overlap, but there’s no easy way to know how far that overlap extends or how much it skews our model.  Rather than taking a simple average of our numbers, we use a Bayesian average to normalize our samples, so that a rating backed by only a few responses counts for less than one backed by many.

Revising an example

Here’s the table of original ratings from yesterday’s case study.  Remember our segment sizes (Segment 1 = 5,000 people, Segment 2 = 25,000 people, and Segment 3 = 1,000 people):

             Product A    Product B    Product C
Segment 1    4.2          3.5          1.0
Segment 2    3.7          4.0          4.8
Segment 3    2.4          4.2          2.1

Last time, we made the assumption that, in any particular segment, the same number of people rated each product.  Let’s throw that assumption away and use the following information.  This table represents the total number of consumer ratings for each product by segment:

             Product A    Product B    Product C
Segment 1    4,825        2,100        4,000
Segment 2    21,200       15,900       7,525
Segment 3    725          350          980

Using these numbers, we can calculate the Bayesian average for each product rating by segment.  Remember, we’re normalizing our data within each segment first.  Then we’ll use a weighted average across the segments to find each product’s overall rating:

             Product A    Product B    Product C
Segment 1    3.30         2.92         2.12
Segment 2    4.02         4.12         4.26
Segment 3    2.77         3.09         2.64
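
One common way to write a Bayesian average can be sketched in Python as below.  This is only a sketch under stated assumptions: the prior is the simple mean of the segment’s three ratings, and it carries a weight equal to the segment’s total number of ratings.  The function name `bayesian_average` and that choice of prior are assumptions for illustration, not necessarily the exact calculation behind the table; under them, the Segment 2 and Segment 3 rows are reproduced to two decimals.

```python
def bayesian_average(ratings, counts):
    """Shrink each product's rating toward the segment's mean rating.

    Assumption (not stated in the case study): the prior is the simple
    mean of the segment's ratings, weighted by the segment's total
    number of ratings.
    """
    prior_mean = sum(ratings) / len(ratings)
    prior_weight = sum(counts)
    return [
        (prior_weight * prior_mean + n * r) / (prior_weight + n)
        for r, n in zip(ratings, counts)
    ]


# Segment 2: raw ratings and rating counts for Products A, B, and C
segment_2 = bayesian_average([3.7, 4.0, 4.8], [21200, 15900, 7525])
print([round(x, 2) for x in segment_2])  # [4.02, 4.12, 4.26]
```

Notice how Product C’s 4.8, which rests on the fewest ratings in the segment, is pulled furthest toward the segment mean.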

This gives us the following weighted average product ratings: Product A – 3.86, Product B – 3.89, and Product C – 3.86.  You can see how much the numbers change once we stop assuming a uniform response within each market segment.  Yesterday, we thought Product B was the lowest rated of our lines.  Considering we didn’t have all the information, that was the best analysis we could perform.  Armed with deeper knowledge of the market’s responses, though, we now know that Product B is our highest-rated product, though Products A and C aren’t far behind.
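
The roll-up from segment-level figures to a single number per product is a weighted average, with each segment weighted by its size.  A minimal Python sketch, using the segment sizes and Bayesian averages from the tables above (the variable names are only illustrative):

```python
# Segment sizes (people) for Segments 1, 2, and 3
segment_sizes = [5000, 25000, 1000]

# Bayesian averages per product, listed in segment order
bayesian = {
    "Product A": [3.30, 4.02, 2.77],
    "Product B": [2.92, 4.12, 3.09],
    "Product C": [2.12, 4.26, 2.64],
}

total = sum(segment_sizes)
overall = {
    product: sum(size * rating for size, rating in zip(segment_sizes, ratings)) / total
    for product, ratings in bayesian.items()
}

for product, rating in overall.items():
    print(f"{product}: {rating:.2f}")
# Product A: 3.86
# Product B: 3.89
# Product C: 3.86
```

Before rounding, Product A comes out at roughly 3.864 and Product C at roughly 3.863, which is why the two look tied at two decimals.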

If this case study were real, you’d have to make some hard decisions at this point.  Do we cut Product A or Product C?  Since the market as a whole rates them equally, we’d be forced to turn to the opinion of our largest market segment.  Based on the opinions of Segment 2’s 25,000 people, we’d most likely elect to cut Product A, whose 4.02 is the lowest rating in that segment (against 4.12 for Product B and 4.26 for Product C).

Conclusion

Statistics can be a powerful tool in the hands of a savvy marketer.  They can help you make decisions based on slim data sets or help you model the future behavior of an as-yet-untapped market.  On the one hand, you can think of statistics as your crystal ball for the market.

But always remember that, in the wrong hands, statistics can be worthless.  Just because you can take an average of a bunch of figures doesn’t mean that the number will tell you anything.  Don’t let yourself be conned by fancy words or shiny graphs – always make sure the numbers you’re looking at are relevant to the problem at hand, the decision you have to make, and the market you’re studying.