Effective Classification Using Fuzzy Rough Theory

INTRODUCTION
Dimension reduction is the process of converting an n-dimensional problem into an r-dimensional problem, where r < n. It is an important preprocessing step when the input is described by a large number of features and only some of them are relevant or significant for the application. For example, in cancer diagnosis, expression levels of thousands of genes are collected, but only some of them are useful in diagnosing the disease, and keeping the remaining features may reduce the accuracy of the diagnosis. Hence dimension reduction is a useful tool in data mining, pattern recognition and machine learning, and it helps in devising efficient algorithms for classification [2]. The main aim of dimension reduction is to search for the optimal set of features, that is, the set that improves classification accuracy. Two basic strategies are used for this [2]:
Feature Selection Methods: These methods select some of the features from the original dataset according to some criterion and discard the redundant or least informative features. Techniques in this category include those based on criteria such as dependency, relevance and significance. The advantage of these methods is that they are easy and inexpensive to compute; the disadvantage is that some information is lost, which may reduce the accuracy of the classifier.
Feature Extraction Methods: These methods extract new features using the information contained in the features present in the original dataset. Some of the techniques included in this category are Principal Component Analysis, Independent Component Analysis and Linear Discriminant Analysis. Advantage of these methods is tha...
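
The difference between the two strategies can be illustrated with a small sketch. This is not the method of this paper: the toy dataset, the use of absolute Pearson correlation as the relevance criterion, and the helper names (select_features, first_principal_component, project) are all assumptions made for illustration. Feature selection keeps r of the original columns, while feature extraction (here, the first principal component found by power iteration) builds a new feature as a combination of all columns.

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation between two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy) if sx > 0 and sy > 0 else 0.0


def select_features(rows, labels, r):
    """Feature selection: keep the r original columns most correlated
    with the class label (an assumed relevance criterion)."""
    d = len(rows[0])
    scores = [abs(pearson([row[j] for row in rows], labels)) for j in range(d)]
    ranked = sorted(range(d), key=lambda j: scores[j], reverse=True)
    return sorted(ranked[:r])


def first_principal_component(rows, iterations=200):
    """Feature extraction: direction of maximum variance, found by
    power iteration on the sample covariance matrix."""
    n, d = len(rows), len(rows[0])
    means = [sum(row[j] for row in rows) / n for j in range(d)]
    centered = [[row[j] - means[j] for j in range(d)] for row in rows]
    cov = [[sum(c[a] * c[b] for c in centered) / (n - 1) for b in range(d)]
           for a in range(d)]
    v = [1.0] * d
    for _ in range(iterations):
        w = [sum(cov[a][b] * v[b] for b in range(d)) for a in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return means, v


def project(rows, means, direction):
    """Map each sample to a single new feature along the given direction."""
    d = len(direction)
    return [sum((row[j] - means[j]) * direction[j] for j in range(d))
            for row in rows]


# Hypothetical toy data: columns 0 and 2 depend on the label,
# columns 1 and 3 are pure noise.
rng = random.Random(0)
labels = [i % 2 for i in range(40)]
rows = [[y + rng.gauss(0, 0.1), rng.uniform(-1, 1),
         2 * y + rng.gauss(0, 0.1), rng.uniform(-1, 1)] for y in labels]

selected = select_features(rows, labels, r=2)   # indices of kept original columns
means, direction = first_principal_component(rows)
extracted = project(rows, means, direction)     # one new feature per sample

print("selected columns:", selected)
print("first extracted values:", [round(x, 3) for x in extracted[:3]])
```

In this sketch the selected features remain original, interpretable columns, whereas the extracted feature mixes all four columns and no longer corresponds to any single measured quantity.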

... middle of paper ...

...n., vol. 52, no. 3, pp. 408-426, Mar. 2011.
[6] A. Chouchoulas and Q. Shen, "Rough set-aided keyword reduction for text categorization," Appl. Artif. Intell., vol. 15, no. 9, pp. 843-873, Oct. 2001.
[7] R. Jensen and Q. Shen, "Semantics-preserving dimensionality reduction: Rough and fuzzy-rough-based approaches," IEEE Trans. Knowl. Data Eng., vol. 16, no. 12, pp. 1457-1471, Dec. 2004.
[8] Q. Hu, D. Yu, J. Liu, and C. Wu, "Neighborhood rough set based heterogeneous feature subset selection," Inf. Sci., vol. 178, no. 18, pp. 3577-3594, Sep. 2008.
[9] H. Peng, F. Long, and C. Ding, "Feature selection based on mutual information: Criteria of max-dependency, max-relevance, and min-redundancy," IEEE Trans. Pattern Anal. Mach. Intell., vol. 27, no. 8, pp. 1226-1238, Aug. 2005.
[10] http://datam.i2r.a-star.edu.sg/datasets/krbd/
[11] http://archive.ics.uci.edu/ml/
[12] http://www.cs.waikato.ac.nz/ml/weka/
