In data mining, association rules are useful for analyzing and predicting customer behavior. They play an important part in shopping-basket data analysis, product clustering, catalog design, and store layout. For example, if a grocery store wished to sell more 1-liter bottles of Coke, it could examine the transactions of customers who bought 1-liter bottles of Coke. By mining those transactions for associations, the store might discover that customers who bought 1-liter bottles of Coke often also bought 15-ounce bags of Lay’s Classic Potato Chips. Once this association is understood, the store could send out coupons for Lay’s Classic Potato Chips and offer a sale on potato chips. It could then reasonably expect sales of Coke to increase as well.
Although the use of data mining and association rules in criminal justice has nothing to do with selling either Coke or potato chips, the technique is the same: the goal of crime data analysis in law enforcement is to identify and visualize associations among criminal networks. For example, a finding that 80% of individuals released from a particular prison were involved in automobile thefts within six months of their release would be valuable information for the police to have. Similarly, if a crime analyst determined that, among young white men in their twenties who applied to purchase a handgun from a national chain store, rejection of that transaction was strongly associated with a subsequent murder, this information could lead to preventive efforts that might save lives.
The Associate Wizard (developed by Microsoft) is one example of software that helps a crime analyst create a data mining model using the Microsoft Association Rules algorithm. Such mining models are particularly useful for creating recommendation systems. The Microsoft Association Rules algorithm scans a data set composed of transactions or events and finds the combinations of items that frequently appear together. There can be many thousands of such combinations, but the algorithm can be customized to find more or fewer of them and to retain only the most probable ones.
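The general idea of scanning transactions for frequent combinations, and then keeping only the most probable ones, can be illustrated with a minimal Python sketch. This is not the Microsoft Association Rules algorithm itself. It is a brute-force illustration restricted to single items and pairs, and the function names, the thresholds `min_support` and `min_confidence`, and the sample baskets are all hypothetical, chosen only for this example.

```python
from collections import Counter
from itertools import combinations


def frequent_itemsets(transactions, min_support, max_size=2):
    """Return {itemset: support} for itemsets of up to max_size items
    that appear in at least min_support (a fraction) of the transactions."""
    n = len(transactions)
    counts = Counter()
    for transaction in transactions:
        items = sorted(set(transaction))  # ignore duplicates within a basket
        for size in range(1, max_size + 1):
            for combo in combinations(items, size):
                counts[frozenset(combo)] += 1
    return {s: c / n for s, c in counts.items() if c / n >= min_support}


def association_rules(itemsets, min_confidence):
    """Derive rules 'a -> b' from the frequent pairs.

    confidence(a -> b) = support({a, b}) / support({a}),
    i.e. the share of baskets containing a that also contain b.
    """
    rules = []
    for itemset, support in itemsets.items():
        if len(itemset) != 2:
            continue
        for a in itemset:
            (b,) = itemset - {a}
            # A frequent pair always has frequent single items,
            # so this lookup cannot fail.
            confidence = support / itemsets[frozenset([a])]
            if confidence >= min_confidence:
                rules.append((a, b, support, confidence))
    return sorted(rules)


# Hypothetical shopping baskets, invented for illustration only.
transactions = [
    ["coke", "chips"],
    ["coke", "chips", "bread"],
    ["coke", "bread"],
    ["chips", "salsa"],
    ["coke", "chips", "salsa"],
]

itemsets = frequent_itemsets(transactions, min_support=0.4)
for a, b, support, confidence in association_rules(itemsets, min_confidence=0.7):
    print(f"{a} -> {b}: support={support:.2f}, confidence={confidence:.2f}")
```

Raising `min_support` or `min_confidence` in this sketch returns fewer combinations and lowering them returns more, which mirrors the kind of customization described above. Production tools use far more efficient search strategies than enumerating every combination.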
This kind of association analysis can potentially be used to address several problems, including predicting who is likely to commit certain crimes and when these crimes might take place. Association rule mining can generate rules from a crime data set based on frequently occurring patterns, and those rules can in turn inform recommendations for preventive action. More generally, discovering association rules helps investigators recognize mutual implications among criminal occurrences.
Sequential pattern mining is a data mining approach concerned with finding statistically relevant patterns among data examples whose values are delivered in a sequence. Several key traditional computational problems are addressed within this field. These include building efficient databases and indexes for sequence information, extracting frequently occurring patterns, comparing sequences for similarity, and recovering missing sequence members. In general, sequence mining problems can be classified as string mining, which is typically based on string processing algorithms, and itemset mining, which is typically based on association rule learning. String mining has to do with understanding the sequence in a data set, identifying individual regions or structural units within each sequence, and then assigning a function to each structural unit. Itemset mining is used for discovering regularities among items that frequently co-occur in large sets of transactions. For example, by analyzing the records of parolees, a rule can be produced that reads, “If a parolee finds a full-time job within one month of being released from prison, he or she is likely to keep appointments with his or her parole officer.”
Frequent sequence mining is used to discover a set of patterns shared among objects that have a specific order between them. For instance, a retail shop may possess a transaction database that specifies which products were acquired by each customer over time. In this case, the store may use frequent sequence mining to find that 40% of its customers who bought the first volume of The Lord of the Rings came back to buy the second volume a month later. This kind of information may be used to support directed advertising campaigns or recommendation systems. In criminal justice, frequent sequence mining could help to determine the interval between certain types of crimes and the sequence of criminal offenses by offenders involved in burglary. (The question the crime analyst asks, and hopes to answer, might be: After a home break-in during which a violent crime occurred, what other crimes occurred, in what sequence, and with what time lapse between those offenses?) In effect, a huge number of possible sequential patterns are hidden in databases, and it is the job of crime analysts to mine those sequential patterns.
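A simplified version of frequent sequence mining can be sketched in Python as follows. The sketch assumes each record is a list of events already sorted by time, looks only for ordered pairs of events (an earlier event followed, at any later point, by another), and ignores the time lapse between them. The function name `frequent_ordered_pairs`, the `min_support` threshold, and the offense histories are hypothetical and invented for illustration. Established algorithms such as GSP or PrefixSpan handle longer patterns and time constraints, which this sketch does not attempt.

```python
from collections import Counter
from itertools import combinations


def frequent_ordered_pairs(sequences, min_support):
    """Return {(earlier, later): support} for ordered event pairs that
    occur in at least min_support (a fraction) of the sequences."""
    n = len(sequences)
    counts = Counter()
    for sequence in sequences:
        # combinations() preserves input order, so every pair it yields
        # is (earlier event, later event). The set ensures each pair is
        # counted at most once per sequence.
        pairs = {(a, b) for a, b in combinations(sequence, 2) if a != b}
        counts.update(pairs)
    return {p: c / n for p, c in counts.items() if c / n >= min_support}


# Hypothetical offense histories, one time-ordered list per offender.
histories = [
    ["burglary", "assault", "auto theft"],
    ["burglary", "auto theft"],
    ["vandalism", "burglary", "auto theft"],
    ["assault", "burglary"],
    ["burglary", "vandalism"],
]

for (earlier, later), support in frequent_ordered_pairs(histories, 0.4).items():
    print(f"{earlier} -> {later}: found in {support:.0%} of histories")
```

Because order matters here, a pair such as burglary followed by assault is counted separately from assault followed by burglary, which is what distinguishes sequence mining from the unordered itemset mining described earlier.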