Recent Trends in Document Clustering with Evolutionary-Based Algorithms

:: 80 Works Cited
Length: 742 words (2.1 double-spaced pages)

Document clustering is the process of organizing an electronic corpus of documents into subgroups with similar text features. Previously, a number of statistical algorithms were applied to cluster data, including text documents. More recent work tries to improve clustering performance with optimization-based algorithms such as evolutionary algorithms. As a result, document clustering with evolutionary algorithms has become an emerging topic that has gained attention in recent years. This paper presents an up-to-date review devoted entirely to evolutionary algorithms designed for document clustering. It first provides a comprehensive inspection of the document clustering model, revealing its various components and related concepts. It then presents and analyzes the principal research work on this topic. Finally, it brings together and classifies the various objective functions found in the collected research papers. The paper ends by addressing some important issues and challenges that can be the subject of future work.

The objective function (or fitness function) is the measure that evaluates the optimality of the solutions an evolutionary algorithm generates in the search space. In the clustering domain, the fitness function measures the adequacy of the partitioning. Accordingly, it needs to be formulated carefully, taking into consideration that clustering is an unsupervised process.
Different objective functions generate different solutions even from the same evolutionary algorithm. The fitness can also be either a minimization or a maximization function. Moreover, the algorithm could be formulated with a single objective function or with multiple objective functions. To sum up, "choosing optimizati...
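As a rough illustration of the point that a fitness function can be posed either as a maximization or as a minimization, the sketch below scores one candidate partition in two ways. It is not taken from the reviewed papers: the function names (`cohesion_fitness`, `sse_fitness`), the toy term-frequency vectors, and the choice of cosine cohesion and sum of squared errors as objectives are all assumptions made for the example.

```python
import math
from collections import defaultdict


def centroids(vectors, labels):
    """Return the mean vector of each cluster, keyed by cluster label."""
    groups = defaultdict(list)
    for vec, label in zip(vectors, labels):
        groups[label].append(vec)
    return {
        label: [sum(col) / len(members) for col in zip(*members)]
        for label, members in groups.items()
    }


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def cohesion_fitness(vectors, labels):
    """Maximization objective: mean cosine similarity of each document to its own centroid."""
    cents = centroids(vectors, labels)
    total = sum(cosine(v, cents[l]) for v, l in zip(vectors, labels))
    return total / len(vectors)


def sse_fitness(vectors, labels):
    """Minimization objective: sum of squared Euclidean distances to own centroid."""
    cents = centroids(vectors, labels)
    return sum(
        sum((x - c) ** 2 for x, c in zip(v, cents[l]))
        for v, l in zip(vectors, labels)
    )


# Hypothetical toy corpus: four documents as term-frequency vectors over four terms.
docs = [[2, 1, 0, 0], [3, 1, 0, 0], [0, 0, 1, 2], [0, 0, 2, 3]]

# Two candidate solutions, each encoded as one cluster label per document.
good = [0, 0, 1, 1]  # groups documents that share vocabulary
bad = [0, 1, 0, 1]   # mixes unrelated documents

print(cohesion_fitness(docs, good), cohesion_fitness(docs, bad))
print(sse_fitness(docs, good), sse_fitness(docs, bad))
```

Both objectives prefer the `good` partition here, but one does so by scoring it higher and the other by scoring it lower, so an evolutionary algorithm's selection step has to know which direction counts as better. Neither function uses class labels, which reflects the unsupervised setting described above.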

... middle of paper ...


