Web Mining Using Machine Learning: A Survey

Abstract: Due to the exponential growth of information on the web, the web is becoming the primary source of information for everyone. As the number of users increases, organizations are making their data available on the web on a daily basis. The web contains structured and unstructured information that can provide meaningful insight into web data. Due to the sheer volume and dynamic nature of web data, manual extraction of knowledge is an uphill task; automatic information extraction (IE) techniques with good accuracy are therefore needed to utilize the full potential of the web as an information resource, and organizing and describing web content is essential. Machine learning approaches can be used to extract knowledge from web data efficiently and automatically, according to user requirements and interests. The objective of this paper is to review the literature and provide a critical evaluation of how to improve information retrieval and extract knowledge from web contents, along with the functionality and implementation of the proposed work. The paper focuses on the limitations of existing IE implementation methods, evaluation techniques, models, and their practices, and provides an overall summary.

Keywords: Web mining, Information Extraction, Information Retrieval

I. INTRODUCTION

At present the number of web pages on the World Wide Web is increasing significantly, and the web is becoming the primary source of information for everyone. The web is a popular and interactive medium for disseminating information today [1]. There is an exponential increase in the information available on the World Wide Web [10]. Currently the web contains 23.18* billion indexed web pages; with the huge amount of information available online, the World Wide Web is ...

... middle of paper ...

... “Using information filtering in web data mining”, IEEE 2007 (pp. 163-2169).

[8] L. Ma, N. Goharian, A. Chowdhury, (2003), “Extracting Unstructured Data from Template Generated Web Documents”, CIKM 2003 (pp. 163-2169).

[9] M. Tsukada, T. Washio, H. Motoda, “Automatic Web-Page Classification by Using Machine Learning Methods”

[10] I. Mahadevan, S. Karuppasamy, R. Ramasamy, “Resource Optimization in Automatic Web Page Classification Using Integrated Feature Selection and Machine Learning”

[11] Andrej, Maria, (2005), “Improving Adaptation in Web-Based Educational Hypermedia by Means of Knowledge Discovery”, HT'05, September 6-9, 2005.

[12] M. Ester, H. Kriegel, M. Schubert, (2002), “Web Site Mining: A New Way to Spot Competitors, Customers and Suppliers in the World Wide Web”, SIGKDD '02, 2002 (pp. 249-257).