What's Data Mining?


CHAPTER 1 INTRODUCTION

The explosive growth in the amount of data, and the challenge of finding interesting patterns in it, led to the emergence of data mining. Data mining is the process of extracting interesting (valid, novel, useful and understandable) patterns from large volumes of data; these patterns are actionable and may be used in an enterprise's decision-making process. Data mining is one of the core processes of knowledge discovery in databases. The basic types of data mining techniques are association rules, classification and clustering, web mining and sequential pattern mining.

Association rule mining is one of the basic and most important data mining techniques. It extracts interesting correlations, frequent patterns and associations among item sets that may be used in the decision-making process. For example, in a grocery store, an association rule can describe a set of items that customers buy together, such as "30% of the people who buy noodles also buy tomato ketchup". This pattern can be helpful for developing marketing strategies and advertisement plans. Association rules can be helpful in areas such as market and risk management, customer segmentation, finance, telecommunication networks, intrusion detection, web usage mining and bioinformatics.

Today, business enterprises store large amounts of data from their daily operations; this data consists mainly of transaction databases. Finding all interesting association rules in a large database is quite challenging. Most current approaches require multiple database scans and are very expensive. The goal is to build an efficient approach that requires less space and has lower computation overhead.

CHAPTER 2 PROBLEM STATEMENT

Consi... ... middle of paper ... ...date itemsets that are not expected to be large, thus avoiding unnecessary effort to count these itemsets. The AIS algorithm requires more effort for candidate set generation, and the candidate sets it generates are then further reduced.
Along with this main drawback, it also requires too many passes over the database.

3.1.2 Apriori Algorithm: The Apriori algorithm, given by Agrawal et al. [2], was an improvement on AIS. The FP-growth algorithm initially scans the transaction database to get the frequencies of the items (that is, the support of each single item). Items whose frequency is less than the given minimum support are discarded from the transactions. In addition, the items in each transaction are sorted in descending order of their frequency in the database. Descending order leads to a shorter execution time than ascending or random order.
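
The first scan described above can be illustrated with a minimal Python sketch. It covers only the preprocessing step (counting item frequencies, discarding infrequent items, and sorting each transaction in descending order of frequency), not the construction of the FP-tree itself. The function name preprocess_transactions, the sample transactions, and the choice to treat minimum support as an absolute transaction count are assumptions made for illustration; they do not come from the original text.

```python
from collections import Counter


def preprocess_transactions(transactions, min_support):
    """Sketch of the initial scan and reordering step (hypothetical helper).

    Assumption: min_support is an absolute number of transactions,
    not a fraction of the database.
    """
    # Scan the database once: count how many transactions contain each item.
    counts = Counter(item for t in transactions for item in set(t))

    # Items whose frequency is below the minimum support are discarded.
    frequent = {item for item, c in counts.items() if c >= min_support}

    prepared = []
    for t in transactions:
        kept = [item for item in set(t) if item in frequent]
        # Sort by descending frequency; ties are broken alphabetically
        # (an assumption, added only to make the order deterministic).
        kept.sort(key=lambda item: (-counts[item], item))
        prepared.append(kept)
    return counts, prepared


# Illustrative data only.
transactions = [
    ["noodles", "ketchup", "bread"],
    ["noodles", "ketchup"],
    ["noodles", "milk"],
    ["bread", "ketchup"],
]
counts, prepared = preprocess_transactions(transactions, min_support=2)
for original, reduced in zip(transactions, prepared):
    print(original, "->", reduced)
```

With a minimum support of 2 in this sample, "milk" occurs in only one transaction and is dropped, while the remaining items in each transaction are reordered so that the most frequent ones come first.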
