Figure 1 shows the block diagram of the proposed system. The final output is a set of clusters, each consisting of the near-duplicates of that cluster's representative image.
Fig. 1. Block Diagram of the Proposed System
3.1 Image Preprocessing
Pre-processing methods, also called filtration, use a small neighborhood of a pixel in the input image to compute a new brightness value in the output image. Local pre-processing methods can be divided into two groups according to the goal of the processing. Smoothing suppresses noise or other small fluctuations in the image; it is equivalent to the suppression of high...
... middle of paper ...
...o cut. In brief, roughly half of the data is clustered with hierarchical clustering, and k-means then handles the remainder. To create super-rules, the hierarchical clustering is terminated when it generates the largest number of clusters.
1. Run a complete agglomerative hierarchical clustering on the data and record the number of clusters generated during the process.
2. Run the agglomerative hierarchical clustering again and stop the process when the largest number of clusters is generated.
3. Execute k-means clustering on the remaining data that were not processed in step 2, using the centroids of the clusters from step 2 as the initial centroids of the k-means algorithm.
After the clustering process is over, a set of clusters is obtained. Each cluster represents a set of near-duplicates with one representative image.
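
The three steps can be illustrated with a minimal Python sketch. It is not the implementation used in the proposed system and rests on several assumptions that the text does not state: images are already reduced to numeric feature vectors; the "number of clusters" counts only clusters with more than one member (the total count only shrinks during agglomeration, so it has no interior peak); centroid linkage with Euclidean distance is used; and k-means is run over all points, seeded with the step 2 centroids, which is one possible reading of step 3. The function names `agglomerate`, `kmeans`, and `hybrid_cluster` are hypothetical.

```python
import math


def centroid(points):
    """Mean of a list of equal-length feature vectors."""
    return [sum(col) / len(points) for col in zip(*points)]


def agglomerate(data, stop_at_step=None):
    """Naive centroid-linkage agglomerative clustering (assumed linkage).

    Returns (clusters, counts). clusters is a list of index lists, and
    counts[i] is the number of non-singleton clusters after merge i + 1.
    If stop_at_step is given, merging stops after that many merges.
    """
    clusters = [[i] for i in range(len(data))]
    counts = []
    while len(clusters) > 1:
        if stop_at_step is not None and len(counts) == stop_at_step:
            break
        cents = [centroid([data[i] for i in c]) for c in clusters]
        # Find the pair of clusters whose centroids are closest.
        a, b = min(
            ((p, q) for p in range(len(clusters))
             for q in range(p + 1, len(clusters))),
            key=lambda pq: math.dist(cents[pq[0]], cents[pq[1]]),
        )
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
        counts.append(sum(1 for c in clusters if len(c) > 1))
    return clusters, counts


def kmeans(data, seeds, iterations=20):
    """Lloyd's k-means starting from the given seed centroids."""
    centroids = [list(s) for s in seeds]  # copy so the seeds stay untouched
    labels = [0] * len(data)
    for _ in range(iterations):
        # Assign every point to its nearest centroid.
        labels = [
            min(range(len(centroids)), key=lambda k: math.dist(x, centroids[k]))
            for x in data
        ]
        # Recompute each centroid from its current members.
        for k in range(len(centroids)):
            members = [x for x, lab in zip(data, labels) if lab == k]
            if members:
                centroids[k] = centroid(members)
    return labels


def hybrid_cluster(data):
    # Step 1: full run, recording the cluster count after every merge.
    _, counts = agglomerate(data)
    # Step 2: rerun and stop at the first merge where the count peaks.
    best_step = counts.index(max(counts)) + 1
    clusters, _ = agglomerate(data, stop_at_step=best_step)
    seeds = [centroid([data[i] for i in c]) for c in clusters if len(c) > 1]
    # Step 3: k-means seeded with the step 2 centroids (assumed: all points).
    return kmeans(data, seeds)


# Toy feature vectors forming two well-separated groups.
features = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
labels = hybrid_cluster(features)
```

Each distinct label in `labels` corresponds to one cluster of near-duplicates. Choosing the representative image of a cluster (for example, the member closest to the centroid) is left out, since the text does not specify how it is selected.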