Market basket analysis

From Wikipedia, the free encyclopedia

Market basket analysis (MBA) applies association rule learning to purchase data with the goal of identifying cross-selling opportunities. Given a data set of transactions, the algorithm identifies product baskets and product association rules. Product baskets (referred to as item sets) are groups of products purchased together at checkout. Product association rules predict the purchase of one or more other products (the consequent) given the known presence of some products in a basket (the antecedent).

Example

Consider sitting in an English pub and buying a pint of beer but not a bar meal. While serving you, the barkeep asks whether you would like a bag of crisps as well. Why would the barkeep ask such a question? Because the barkeep's goal is, in some regards, to be profitable and to maximize the revenue per transaction. By asking whether you want crisps, the barkeep may earn a bigger tip or the bar may make more revenue. The barkeep knew to ask this question because there was a good chance (a high probability) that you would also take the crisps. This knowledge came from experience, specifically from previous interactions with customers.

Similarly, the association rule finding algorithm is trained on historical data, i.e. past transactions. The data contains checkout information and a list of the products purchased in each transaction, perhaps along with other information such as volume or sale amount (although in many cases just the presence or absence of a product in a transaction is sufficient). While training, the algorithm may identify a relationship (a form of association) between buying beer and not buying a bar meal, and predict that a customer who fits that pattern is more likely to buy crisps (US: chips) than one who does not.

Typically the relationship will be in the form of a rule such as:

IF {beer, no bar meal} THEN {crisps}

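The probability that a customer will buy beer without a bar meal (i.e. that the antecedent is true) is referred to here as the support for the rule; many treatments instead define support as the probability that the antecedent and the consequent occur together in a transaction. The conditional probability that a customer will purchase crisps, given that the antecedent is true, is referred to as the confidence of the rule.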
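As a rough illustration of these two measures, the following Python sketch computes them for the rule above on a small invented set of transactions. The data, the function names, and the choice to report both the antecedent-only support and the joint support are assumptions made for the example, not part of any particular library.

```python
# Toy transactions; the baskets are invented purely for illustration.
transactions = [
    {"beer", "crisps"},
    {"beer", "crisps"},
    {"beer", "bar meal"},
    {"beer"},
    {"bar meal", "wine"},
    {"beer", "crisps", "peanuts"},
    {"wine"},
    {"beer", "bar meal", "crisps"},
]


def beer_without_meal(basket):
    """Antecedent of the example rule: beer present, bar meal absent."""
    return "beer" in basket and "bar meal" not in basket


def rule_metrics(transactions, antecedent, consequent):
    """Return (antecedent support, joint support, confidence) for a rule.

    `antecedent` is a predicate on a basket; `consequent` is a set of items.
    """
    n = len(transactions)
    matching = [t for t in transactions if antecedent(t)]
    both = [t for t in matching if consequent <= t]
    antecedent_support = len(matching) / n
    joint_support = len(both) / n
    confidence = len(both) / len(matching) if matching else 0.0
    return antecedent_support, joint_support, confidence


antecedent_support, joint_support, confidence = rule_metrics(
    transactions, beer_without_meal, {"crisps"}
)
print(antecedent_support, joint_support, confidence)  # 0.5 0.375 0.75
```

In this toy data, four of the eight baskets satisfy the antecedent and three of those four also contain crisps, so the rule has a confidence of 0.75.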

Usage

In retailing, most purchases are made on impulse, according to models of consumer behavior[1]. Market basket analysis gives clues as to what a customer might have bought if the idea had occurred to them.

Market basket analysis can be used as a first step in deciding the location and promotion of goods inside a store. If, as has been observed, purchasers of Barbie dolls are more likely to buy candy, then high-margin candy can be placed near the Barbie doll display. Customers who would have bought candy with their Barbie dolls had they thought of it will now be suitably tempted. This, however, is only the first level of analysis.

Challenges

The algorithms for performing market basket analysis are fairly straightforward[2]. The complexities mainly arise in exploiting taxonomies, avoiding combinatorial explosions (a supermarket may stock 10,000 or more line items), and dealing with the large amounts of transaction data that may be available.
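To give a sense of the combinatorial explosion, the Python sketch below counts the possible two- and three-item sets among 10,000 line items, and then shows one common way of keeping the search tractable: a level-wise (Apriori-style) search that only extends item sets which already meet a minimum count. The function name, the threshold, and the sample baskets are assumptions for illustration; the text above does not prescribe a particular algorithm.

```python
from itertools import combinations
from math import comb

# Possible item sets of size 2 and 3 among 10,000 line items.
n_items = 10_000
n_pairs = comb(n_items, 2)    # 49,995,000
n_triples = comb(n_items, 3)  # 166,616,670,000


def frequent_itemsets(transactions, min_count):
    """Level-wise search for item sets bought together at least min_count times.

    A size-k candidate is counted only if every one of its size-(k-1)
    subsets is already frequent, which prunes most of the search space.
    """
    counts = {}
    for basket in transactions:
        for item in basket:
            key = frozenset([item])
            counts[key] = counts.get(key, 0) + 1
    current = {s for s, c in counts.items() if c >= min_count}
    result = {s: counts[s] for s in current}

    k = 2
    while current:
        candidates = set()
        for a in current:
            for b in current:
                union = a | b
                if len(union) == k and all(
                    frozenset(sub) in current
                    for sub in combinations(union, k - 1)
                ):
                    candidates.add(union)
        counts = {
            c: sum(1 for basket in transactions if c <= basket)
            for c in candidates
        }
        current = {s for s, c in counts.items() if c >= min_count}
        result.update({s: counts[s] for s in current})
        k += 1
    return result


# Invented sample baskets.
baskets = [
    {"beer", "crisps"},
    {"beer", "crisps", "peanuts"},
    {"beer", "peanuts"},
    {"wine"},
    {"beer", "crisps"},
]
frequent = frequent_itemsets(baskets, min_count=2)
```

Here the triple {beer, crisps, peanuts} is never counted, because its subset {crisps, peanuts} appears in only one basket and so is not frequent.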

A major difficulty is that a large number of the rules found may be trivial for anyone familiar with the business. Although the volume of data has been reduced, we are still asking the user to find a needle in a haystack. Requiring rules to have a high minimum support level and a high confidence level risks missing any exploitable result we might have found. One partial solution to this problem is differential market basket analysis.

Differential market basket analysis

Differential market basket analysis can find interesting results and can also reduce the problem of a potentially high volume of trivial results.

Differential analysis compares results between different stores, between customers in different demographic groups, between different days of the week, different seasons of the year, etc. If the results show that a rule holds in one store, but not in any other (or does not hold in one store, but holds in all others), then we can infer that there is something interesting about that store. Perhaps its clientele are different, or perhaps it has organized its displays in a novel and more lucrative way. Investigating such differences, via data mining or other methods, may yield useful insights which will improve company sales.
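The store-by-store comparison described above might be sketched in Python as follows. The store names, the baskets, the confidence threshold, and the helper names are all hypothetical; the sketch simply checks whether a rule holds in exactly one store, or fails in exactly one store, and it requires at least three stores so that "one versus the rest" is meaningful.

```python
def confidence(transactions, antecedent, consequent):
    """Confidence of antecedent -> consequent, or None if the antecedent never occurs."""
    matching = [t for t in transactions if antecedent <= t]
    if not matching:
        return None
    return sum(1 for t in matching if consequent <= t) / len(matching)


def odd_store_out(stores, antecedent, consequent, min_conf):
    """Return the one store where the rule behaves differently, or None."""
    holds = {}
    for name, transactions in stores.items():
        c = confidence(transactions, antecedent, consequent)
        holds[name] = c is not None and c >= min_conf
    yes = [name for name, h in holds.items() if h]
    no = [name for name, h in holds.items() if not h]
    if len(yes) == 1 and len(no) >= 2:
        return yes[0]  # rule holds only in this store
    if len(no) == 1 and len(yes) >= 2:
        return no[0]   # rule fails only in this store
    return None


# Invented baskets for three hypothetical stores.
stores = {
    "north": [{"doll", "candy"}, {"doll", "candy"}, {"doll"}, {"candy"}],
    "south": [{"doll"}, {"doll", "batteries"}, {"candy"}, {"doll", "candy"}],
    "east": [{"doll"}, {"doll"}, {"doll", "candy"}, {"batteries"}],
}
unusual = odd_store_out(stores, {"doll"}, {"candy"}, min_conf=0.6)
print(unusual)  # north
```

In this invented data the rule {doll} -> {candy} reaches a confidence of 2/3 in the "north" store but only 1/3 in the other two, so "north" is flagged for closer investigation.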

Other application areas

Although market basket analysis conjures up pictures of shopping carts and supermarket shoppers, it is important to realize that there are many other areas in which it can be applied. These include:

  • Analysis of credit card purchases.
  • Analysis of telephone calling patterns.
  • Identification of fraudulent medical insurance claims.
  • Analysis of telecom service purchases.

Despite the terminology, there is no requirement for all the items to be purchased at the same time. Algorithms can be adapted to look at a sequence of purchases (or events) spread out over time. Predictive market basket analysis can be used to identify sets of item purchases (or events) that generally occur in sequence, which is something of interest to direct marketers, criminologists, and others.
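One simple way to look at purchases spread out over time is to count, for each ordered pair of items, how many customers bought the first item at some point before the second. The Python sketch below does this for a handful of invented customer histories; the customer IDs, item names, and function name are assumptions for illustration, and real sequential pattern mining methods are considerably more elaborate.

```python
from collections import Counter


def sequential_pair_counts(histories):
    """Count customers for whom item A was bought some time before item B.

    `histories` maps a customer ID to that customer's purchases in time order.
    Each ordered pair is counted at most once per customer.
    """
    counts = Counter()
    for events in histories.values():
        seen_pairs = set()
        for i, first in enumerate(events):
            for later in events[i + 1:]:
                if first != later:
                    seen_pairs.add((first, later))
        counts.update(seen_pairs)
    return counts


# Invented purchase histories, oldest purchase first.
histories = {
    "c1": ["phone", "case", "charger"],
    "c2": ["phone", "charger"],
    "c3": ["case", "phone"],
    "c4": ["phone", "case"],
}
pair_counts = sequential_pair_counts(histories)
print(pair_counts[("phone", "case")], pair_counts[("case", "phone")])  # 2 1
```

Unlike an item set, the pair is ordered: in this toy data "phone then case" occurs for two customers, while "case then phone" occurs for only one.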

References

  1. ^ Underhill, Paco. Why We Buy: The Science of Shopping.
  2. ^ Berry and Linoff - a reasonable introductory resource