Recent Trends in Document Clustering with Evolutionary-Based Algorithms

:: 80 Works Cited
Length: 742 words (2.1 double-spaced pages)

Document clustering is the process of organizing an electronic corpus of documents into subgroups that share similar text features. A number of statistical algorithms have previously been applied to cluster data, including text documents. More recently, there have been efforts to improve clustering performance with optimization-based algorithms such as evolutionary algorithms. As a result, document clustering with evolutionary algorithms has become an emerging topic that has gained attention in recent years. This paper presents an up-to-date review devoted entirely to evolutionary algorithms designed for document clustering. It first provides a comprehensive inspection of the document clustering model, revealing its various components and related concepts. It then presents and analyzes the principal research work on this topic. Finally, it brings together and classifies the various objective functions found in the collected research papers. The paper ends by addressing some important issues and challenges that can be the subject of future work.


The objective function (or fitness function) is the measure that evaluates the optimality of the solutions an evolutionary algorithm generates in the search space. In the clustering domain, the fitness function measures the adequacy of the partitioning. Accordingly, it needs to be formulated carefully, taking into consideration that clustering is an unsupervised process.
Different objective functions generate different solutions even from the same evolutionary algorithm. The fitness can be either a minimization or a maximization function. Moreover, the algorithm can be formulated with a single objective function or with multiple objective functions. To sum up, "choosing optimizati...
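
As an illustration of the idea of a clustering fitness function, the following is a minimal sketch and not a method taken from the reviewed papers. It assumes each document is a small term-frequency vector and each candidate solution is a list of cluster labels, one per document; the names `cosine`, `centroid`, and `fitness` are hypothetical. The fitness here is a maximization function: the mean cosine similarity between each document and the centroid of its assigned cluster.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def centroid(vectors):
    """Component-wise mean of a non-empty list of vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def fitness(docs, labels):
    """Mean similarity of each document to its cluster centroid (higher is better).

    `labels[i]` is the cluster assigned to `docs[i]`; this label list plays the
    role of one candidate solution (chromosome) in the evolutionary search.
    """
    clusters = {}
    for vector, label in zip(docs, labels):
        clusters.setdefault(label, []).append(vector)
    total = 0.0
    for members in clusters.values():
        center = centroid(members)
        total += sum(cosine(vector, center) for vector in members)
    return total / len(docs)


# Toy corpus (assumed data): two documents about one topic, two about another.
docs = [[2, 0, 1], [3, 0, 1], [0, 2, 2], [0, 3, 1]]
coherent = [0, 0, 1, 1]   # groups similar documents together
mixed = [0, 1, 0, 1]      # splits similar documents apart

print(fitness(docs, coherent) > fitness(docs, mixed))
```

Because no ground-truth labels are used, this score reflects only internal cohesion. One consequence is that placing every document in its own cluster trivially yields the maximum score, so a sketch like this would normally be used with a fixed number of clusters or combined with a second objective.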


... middle of paper ...





