Predictive Analysis Advantages And Disadvantages


Predictive Analysis
Predictive analysis can be defined as the practice of extracting information from existing data in order to predict something important about the future.
Different predictive models can be applied across different businesses to analyze current data and historical facts, in order to gain a better understanding of customers, products and partners and to identify possible risks and opportunities. Predictive analysis uses a number of techniques, including data mining, statistical modeling and machine learning, to help analysts make future business forecasts.
Decision Trees
One approach to developing a predictive classification model is a decision tree, also called a classification tree, which represents …

The major limitations include:
• Inadequacy in applying regression and predicting continuous values
• Possibility of spurious relationships
• Unsuitability for estimation tasks, where the goal is to predict the value of a continuous attribute
• Difficulty in representing functions such as parity, which require a tree of exponential size
• Possible duplication of the same sub-tree on different paths
• Limited to one output per attribute, and inability to represent tests that refer to two or more different objects

Induction of Decision Trees
The implementation of a decision tree involves a data structure consisting of nodes and edges (or links), in which one node is identified as the parent of other nodes (the children) that are connected to it via the edges. When traversing the tree, the choice of which path to take from any specific node depends on the answer to that node's question. At each step along the path from the root of the tree to the leaves, the set of records that conform to the answers along the way continues to shrink …
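The node-and-edge structure described above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the `Node` class, the `classify` and `records_on_path` helpers, and the `income` and `owns_home` attributes are all hypothetical names chosen for the example.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional


@dataclass
class Node:
    """One tree node: an internal node asks a question, a leaf holds a label."""
    question: Optional[Callable[[dict], str]] = None  # maps a record to an answer
    children: Dict[str, "Node"] = field(default_factory=dict)  # answer -> child
    label: Optional[str] = None  # set only on leaves


def classify(node: Node, record: dict) -> str:
    """Walk from the root to a leaf, following the answer at each node."""
    while node.label is None:
        node = node.children[node.question(record)]
    return node.label


def records_on_path(node: Node, records: List[dict], answers: List[str]) -> List[int]:
    """Return how many records remain after each answer along a path."""
    counts = [len(records)]
    for answer in answers:
        records = [r for r in records if node.question(r) == answer]
        node = node.children[answer]
        counts.append(len(records))
    return counts


# Hypothetical two-level tree (attributes are assumptions for illustration).
root = Node(
    question=lambda r: "high" if r["income"] >= 50_000 else "low",
    children={
        "high": Node(label="approve"),
        "low": Node(
            question=lambda r: "yes" if r["owns_home"] else "no",
            children={"yes": Node(label="approve"), "no": Node(label="decline")},
        ),
    },
)

sample = [
    {"income": 60_000, "owns_home": False},
    {"income": 30_000, "owns_home": True},
    {"income": 20_000, "owns_home": False},
    {"income": 25_000, "owns_home": False},
]
```

Calling `records_on_path(root, sample, ["low", "no"])` reports the record count at the root and after each answer, which shows the conforming set getting smaller as the path gets deeper.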

C4.5 uses gain ratio as its splitting criterion. Splitting ceases when the number of instances to be split falls below a certain threshold. Error-based pruning is performed after the growing phase. C4.5 can handle numeric attributes. It can also induce a tree from a training set that contains missing values by using a corrected gain ratio criterion.
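As a rough illustration of the gain ratio criterion, the sketch below computes it for a single categorical attribute. It assumes the usual definition (information gain divided by the entropy of the split itself) and ignores missing values and numeric attributes; the function names are illustrative only.

```python
import math
from collections import Counter


def entropy(items):
    """Shannon entropy (in bits) of a sequence of discrete values."""
    total = len(items)
    return -sum((c / total) * math.log2(c / total) for c in Counter(items).values())


def gain_ratio(values, labels):
    """Gain ratio of splitting `labels` by the categorical attribute `values`."""
    total = len(labels)
    # Group the class labels by attribute value.
    groups = {}
    for value, label in zip(values, labels):
        groups.setdefault(value, []).append(label)
    # Information gain: entropy before the split minus weighted entropy after.
    remainder = sum(len(g) / total * entropy(g) for g in groups.values())
    info_gain = entropy(labels) - remainder
    # Split information penalizes attributes with many distinct values.
    split_info = entropy(values)
    return info_gain / split_info if split_info > 0 else 0.0
```

For example, an attribute that separates four records perfectly into two groups scores 1.0, while an attribute with a distinct value for every record separates them just as well but scores only 0.5, because its split information is higher.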

CART (Classification and Regression Tree)
CART is characterized by the fact that it constructs binary trees; that is, each internal node has exactly two outgoing edges. The splits are selected using the twoing criterion, and the resulting tree is pruned by cost-complexity pruning.
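The twoing criterion can be sketched as follows, assuming the commonly cited form: the product of the left and right proportions, divided by four, times the squared sum of absolute differences between the class distributions of the two children. The function name and inputs are illustrative.

```python
from collections import Counter


def twoing(left_labels, right_labels):
    """Twoing value of a binary split; larger values indicate a better split."""
    n_left, n_right = len(left_labels), len(right_labels)
    if n_left == 0 or n_right == 0:
        return 0.0  # a split that leaves one side empty separates nothing
    n = n_left + n_right
    p_left, p_right = n_left / n, n_right / n
    left_counts, right_counts = Counter(left_labels), Counter(right_labels)
    classes = set(left_counts) | set(right_counts)
    # Sum over classes of |P(class | left) - P(class | right)|.
    diff = sum(
        abs(left_counts[c] / n_left - right_counts[c] / n_right) for c in classes
    )
    return (p_left * p_right / 4.0) * diff ** 2
```

A split that sends each class entirely to its own child maximizes the value, while a split whose children have identical class distributions scores zero.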
When they are provided, CART can take misclassification costs into account during tree induction. It also enables users to provide a prior probability distribution. An important feature of CART is its ability to generate regression trees. Regression trees are trees whose leaves predict a real number rather than a class. In the case of regression, CART looks for splits that minimize the squared prediction error (the least-squared deviation). The prediction in each leaf is based on the weighted mean for the node.
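The least-squares split search for a regression tree can be illustrated with one numeric attribute. This sketch assumes every record has equal weight, so each leaf predicts a plain mean rather than a weighted one, and it tries midpoints between consecutive distinct values as candidate thresholds; both choices are simplifications made for the example.

```python
def sse(values):
    """Sum of squared deviations of `values` from their mean."""
    if not values:
        return 0.0
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)


def best_split(xs, ys):
    """Return (total_sse, threshold) for the best binary split on xs, or None."""
    pairs = sorted(zip(xs, ys))
    best = None
    for i in range(1, len(pairs)):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # no threshold can separate identical attribute values
        threshold = (pairs[i - 1][0] + pairs[i][0]) / 2
        left = [y for _, y in pairs[:i]]   # records with x below the threshold
        right = [y for _, y in pairs[i:]]  # records with x above the threshold
        total = sse(left) + sse(right)
        if best is None or total < best[0]:
            best = (total, threshold)
    return best
```

With `xs = [1, 2, 3, 10, 11, 12]` and `ys = [1.0, 1.0, 1.0, 5.0, 5.0, 5.0]`, the search settles on the threshold 6.5, where both children are perfectly homogeneous and the total squared error is zero.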

CHAID (CHi-squared Automatic Interaction Detector)
CHAID performs multi-level splits when computing classification trees.
