The Information Age: Data Mining

Chapter 1

1.1 Background
In the information age, large volumes of data are generated everywhere. With the arrival of information technology tools, these data are collected and wait to be converted into information and knowledge. The information industry thus supplies useful information to many areas, such as market analysis, science, decision-making and customer relationships. Data mining integrates analytical techniques with database systems. Previously, database systems offered only database queries, data processing and transaction processing, which are insufficient for users to understand the whole dataset at once. These facilities cannot answer complex questions, such as what relationships exist among the items in a database, yet the answers to such questions are the more valuable ones. Because of the huge amount of data, users' needs far exceed the capabilities of a database management system, so hidden patterns and knowledge have to be discovered by other means. Unfortunately, human ability is limited, and people cannot understand a very large dataset unaided. Powerful tools have therefore been invented to help people analyze large datasets. Without such tools, huge amounts of data are effectively garbage, because nobody would want to investigate them. The process of discovering hidden patterns or useful information in very large datasets is called “data mining”.
In a database, associations arise when many items appear together. The relationships between items can reveal interesting findings. For example, items that are purchased together can reflect customers' behavior, and patients who have flu and fever are likely to have a cough as well. Therefore, the information, which is derived from re...

... middle of paper ...

...he products in the store, but also talk about the events in some situations. I focus only on the product side: how market basket analysis is implemented in retail store databases. Data mining is likewise a broad area. It is the process of extracting useful information, namely correlations and patterns, from a huge dataset. The results can answer business questions that are usually time-consuming to resolve. I discuss only the algorithms related to market basket analysis, especially the Apriori algorithm. In addition, there are analytical tools for data mining. This research discusses only the Weka software as a tool for analyzing sales data from a retail store. I discuss how to use Weka with sales data to find useful information for the business, and how to interpret the results from Weka for business purposes.
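To make the idea of mining frequent itemsets with Apriori more concrete, the following is a minimal Python sketch. It is not Weka's implementation and is not taken from this research; the function names `apriori` and `confidence`, the `min_support` parameter and the example baskets are assumptions chosen purely for illustration. It shows the level-wise search (join, prune, count) and how the confidence of a rule such as "beer implies diapers" could be derived from the supports.

```python
from itertools import combinations


def apriori(transactions, min_support):
    """Return {itemset: support} for every itemset whose support
    (fraction of transactions containing it) is >= min_support."""
    transactions = [frozenset(t) for t in transactions]
    n = len(transactions)

    def support(itemset):
        # Fraction of transactions that contain every item in itemset.
        return sum(1 for t in transactions if itemset <= t) / n

    # Level 1: frequent single items.
    items = {item for t in transactions for item in t}
    current = {}
    for item in items:
        single = frozenset([item])
        s = support(single)
        if s >= min_support:
            current[single] = s

    frequent = dict(current)
    k = 2
    while current:
        # Join step: combine frequent (k-1)-itemsets into k-itemsets.
        candidates = {a | b for a, b in combinations(list(current), 2)
                      if len(a | b) == k}
        # Prune step: every (k-1)-subset of a candidate must be frequent.
        candidates = {c for c in candidates
                      if all(frozenset(sub) in current
                             for sub in combinations(c, k - 1))}
        # Count step: keep candidates that meet the support threshold.
        current = {}
        for c in candidates:
            s = support(c)
            if s >= min_support:
                current[c] = s
        frequent.update(current)
        k += 1
    return frequent


def confidence(frequent, antecedent, consequent):
    """Confidence of the rule antecedent -> consequent, computed as
    support(antecedent and consequent) / support(antecedent)."""
    a = frozenset(antecedent)
    return frequent[a | frozenset(consequent)] / frequent[a]


# Hypothetical sales data: each set is one customer's basket.
baskets = [
    {"bread", "milk"},
    {"bread", "diapers", "beer", "eggs"},
    {"milk", "diapers", "beer", "cola"},
    {"bread", "milk", "diapers", "beer"},
    {"bread", "milk", "diapers", "cola"},
]
frequent_itemsets = apriori(baskets, min_support=0.6)
rule_confidence = confidence(frequent_itemsets, {"beer"}, {"diapers"})
```

In this made-up data, beer appears in three of the five baskets and always together with diapers, so the rule "beer implies diapers" has support 0.6 and confidence 1.0. A tool such as Weka reports comparable measures for the rules it finds, although its exact output format and options are not shown here.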