Publication Type : Conference Paper
Publisher : 2017 International Conference on Advances in Computing, Communications and Informatics
Source : 2017 International Conference on Advances in Computing, Communications and Informatics (ICACCI) (2017)
Url : http://ieeexplore.ieee.org/document/8126051/
Keywords : Crawlers, Focused Web crawler, Google, Large scale integration, Page rank, Search engines, Semantic similarity, Semantics, Uniform resource locators, Vertical Search Engines, Web pages
Campus : Amritapuri
School : Department of Computer Science and Engineering, School of Engineering
Department : Computer Science
Year : 2017
Abstract : The main goal of a focused web crawler is to retrieve as many relevant pages as possible. However, most crawlers use the PageRank algorithm to order the pages in the crawler frontier. Because PageRank suffers from the “rich get richer” phenomenon, focused crawlers often fail to retrieve hidden relevant pages. This paper presents a novel approach for retrieving hidden, relevant pages by combining rank and semantic similarity information. The model is validated by crawling the real web on different topics, and the results are promising.
Cite this Research Publication :
K. Pavani and Dr. Sajeev G. P., “A Novel Web Crawling Method for Vertical Search Engines”, in 2017 International Conference on Advances in Computing, Communications and Informatics (ICACCI), 2017
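
The abstract describes ordering the crawler frontier by combining rank with semantic similarity, but this record does not give the paper's formula. The following is only a minimal sketch of one way such a combination could look, assuming a simple weighted sum. The names `Frontier`, `alpha`, and `cosine_similarity`, the bag-of-words similarity measure, and the example URLs are all hypothetical and are not taken from the paper.

```python
import heapq
import math
from collections import Counter


def cosine_similarity(text_a: str, text_b: str) -> float:
    """Cosine similarity between bag-of-words vectors (an assumed stand-in
    for whatever semantic similarity measure the paper actually uses)."""
    a = Counter(text_a.lower().split())
    b = Counter(text_b.lower().split())
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


class Frontier:
    """Priority queue of URLs ordered by a blended rank/similarity score."""

    def __init__(self, topic: str, alpha: float = 0.5):
        self.topic = topic
        self.alpha = alpha  # hypothetical weight given to the rank score
        self._heap = []

    def push(self, url: str, rank: float, context: str) -> float:
        """Score a URL and add it to the frontier; returns the score."""
        similarity = cosine_similarity(self.topic, context)
        score = self.alpha * rank + (1 - self.alpha) * similarity
        # heapq is a min-heap, so store the negated score to pop the best first.
        heapq.heappush(self._heap, (-score, url))
        return score

    def pop(self) -> str:
        """Remove and return the URL with the highest blended score."""
        return heapq.heappop(self._heap)[1]


# Usage: a low-rank but on-topic page can outrank a popular off-topic page.
frontier = Frontier("focused web crawler", alpha=0.3)
frontier.push("http://example.com/popular", 0.9, "celebrity news and gossip")
frontier.push("http://example.com/hidden", 0.1, "a focused web crawler tutorial")
print(frontier.pop())
```

With a weight below 0.5, the similarity term dominates, so a page with a low rank score can still be crawled early if its context matches the topic. That is the kind of "hidden" page the abstract refers to.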