Clustering of Near Duplicate Images in the Web Search Essay


The overall objective is to cluster near-duplicate images. The user first submits a query to the search engine, which returns a set of query-related images. This set contains exact duplicates as well as near-duplicates. The main aim of this paper is to detect the near-duplicate images and cluster them. This is achieved in three steps: image preprocessing, feature extraction, and clustering. Preprocessing, the first step, consists of noise removal and image enhancement. Feature extraction then extracts key points and matches them between images. The matched key points are used to estimate an affine transform, based on an affine-invariant ratio of normalized lengths. Finally, clustering is performed, which includes supervised and unsupervised clustering. The result is a set of image clusters. Each cluster has one image as its representative, and the other images in the cluster are called its near-duplicates. A performance measure is then calculated to evaluate the accuracy of the algorithm.
Figure 1 shows the block diagram of the proposed system. The final output is a number of clusters, each consisting of the near-duplicates of that cluster's representative image.

Fig. 1. Block Diagram of the Proposed System
3.1 Image Preprocessing

Pre-processing methods use a small neighborhood of a pixel in the input image to compute a new brightness value in the output image; this is also called filtration. Local pre-processing methods can be divided into two groups according to the goal of the processing. Smoothing suppresses noise or other small fluctuations in the image; it is equivalent to the suppression of high...
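As a rough illustration of neighborhood-based smoothing, the sketch below replaces each pixel with the mean brightness of its surrounding window. The mean filter, the function name `mean_filter`, the `radius` parameter, and the handling of image borders (the window is simply clipped) are assumptions made for this example; the text does not state which smoothing filter the proposed system uses.

```python
def mean_filter(image, radius=1):
    """Smooth a grayscale image given as a list of rows of brightness values.

    Each output pixel is the mean of the input pixels inside a
    (2 * radius + 1) square window centred on it. Near the borders the
    window is clipped to the image (an assumption of this sketch).
    """
    height, width = len(image), len(image[0])
    output = [[0.0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            # Collect the brightness values in the clipped neighborhood.
            window = [
                image[j][i]
                for j in range(max(0, y - radius), min(height, y + radius + 1))
                for i in range(max(0, x - radius), min(width, x + radius + 1))
            ]
            output[y][x] = sum(window) / len(window)
    return output


# A single bright (noisy) pixel is spread out and reduced in amplitude.
noisy = [
    [0, 0, 0],
    [0, 9, 0],
    [0, 0, 0],
]
smoothed = mean_filter(noisy)
```

A larger `radius` suppresses more noise but also blurs more image detail, which is the usual trade-off with this kind of filter.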


... middle of paper ...


...o cut. In brief, roughly half of the data is clustered by Hierarchical clustering, and K-means then handles the remainder. In order to create super-rules, Hierarchical clustering is terminated when it generates the largest number of clusters.
Algorithm:
1. Run a complete agglomerative Hierarchical clustering on the data and record the number of clusters generated during the process.
2. Run the agglomerative Hierarchical clustering again and stop when the largest number of clusters has been generated.
3. Run K-means clustering on the remaining data not processed in step 2, using the centroids of the clusters from step 2 as the initial centroids.
After the clustering process is over, a set of clusters is obtained. Each cluster represents a set of near-duplicates with one representative image.
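The three steps above can be sketched in a few lines of Python. This is only one possible reading of the algorithm, under several assumptions that the text does not confirm: each image is represented by a numeric feature vector, clusters are merged by centroid distance, "number of clusters" means the number of clusters containing more than one image, hierarchically formed clusters keep their members during K-means, and the representative is the member closest to the cluster centroid. All function names are hypothetical.

```python
import math


def centroid(vectors):
    """Component-wise mean of a list of equal-length vectors."""
    return tuple(sum(column) / len(vectors) for column in zip(*vectors))


def agglomerate(points, stop_at=None):
    """Centroid-linkage agglomerative clustering over point indices.

    Returns (clusters, history), where history[i] is the number of
    multi-member clusters after merge i. If stop_at is given, merging
    stops as soon as that count is reached.
    """
    clusters = [[i] for i in range(len(points))]
    history = []
    while len(clusters) > 1:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                d = math.dist(centroid([points[i] for i in clusters[a]]),
                              centroid([points[i] for i in clusters[b]]))
                if best is None or d < best[0]:
                    best = (d, a, b)
        _, a, b = best
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
        count = sum(1 for c in clusters if len(c) > 1)
        history.append(count)
        if stop_at is not None and count == stop_at:
            break
    return clusters, history


def hybrid_cluster(points, max_iter=20):
    """Hierarchical clustering up to the peak cluster count, then K-means."""
    _, history = agglomerate(points)                      # step 1
    clusters, _ = agglomerate(points, stop_at=max(history))  # step 2
    fixed = [c for c in clusters if len(c) > 1]
    remaining = [c[0] for c in clusters if len(c) == 1]
    centroids = [centroid([points[i] for i in c]) for c in fixed]
    assign = {}
    for _ in range(max_iter):                             # step 3
        new_assign = {
            i: min(range(len(centroids)),
                   key=lambda k: math.dist(points[i], centroids[k]))
            for i in remaining
        }
        if new_assign == assign:
            break
        assign = new_assign
        centroids = [
            centroid([points[i] for i in fixed[k]] +
                     [points[i] for i in remaining if assign[i] == k])
            for k in range(len(fixed))
        ]
    return [sorted(fixed[k] + [i for i in remaining if assign[i] == k])
            for k in range(len(fixed))]


def representative(points, cluster):
    """Index of the member closest to the cluster centroid (an assumption)."""
    c = centroid([points[i] for i in cluster])
    return min(cluster, key=lambda i: math.dist(points[i], c))


# Toy feature vectors: two tight pairs plus two looser near-duplicates.
features = [(0.0, 0.0), (0.1, 0.0), (5.0, 5.0), (5.1, 5.0), (0.0, 1.0), (5.0, 6.0)]
groups = hybrid_cluster(features)
reps = [representative(features, g) for g in groups]
```

In this toy run the hierarchical pass forms the two tight pairs and stops, and K-means then assigns the two leftover vectors to the nearest pair. The pairwise search is quadratic per merge, so this sketch is meant for illustration on small inputs rather than for web-scale image sets.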



