
Information retrieval and processing system for news articles in English

Publication Type : Conference Paper

Publisher : 2019 9th International Conference on Advances in Computing and Communication (ICACC)

Authors : Sandhya Harikumar, Gnana Venkata Naga Sai Kalyan Karumudi, Rohit Sathyajit

Source : Proceedings of the 2019 9th International Conference on Advances in Computing and Communication, ICACC 2019, 2019, pp. 79–85, 8986223

Keywords : news scraping, information extraction, web crawling, named entity recognizer, latent dirichlet allocation, text similarity

Campus : Amritapuri

School : Department of Computer Science and Engineering

Department : Computer Science

Year : 2019

Abstract : The importance of news media is unquestionable. News articles contain a great deal of information embedded in their text. For analytics, extracting this information and organizing it so that conclusions can be drawn is very important. The objective of our work is to design a tool that extracts details from English news articles and presents them to the user in an organized manner. A predefined set of websites is crawled and the details are stored. The details extracted by the tool are named entities such as locations, persons, and organizations mentioned in the news, a news summary, and important keywords pertaining to each news article. We also equip the tool with an efficient search engine, along with database indexing for faster information retrieval.
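
The abstract mentions a search engine backed by indexing but gives no implementation details. As a rough illustration only, the sketch below shows one common way such retrieval can work: an in-memory inverted index with AND-style keyword search. It is not the authors' method; the function names (`tokenize`, `build_index`, `search`) and the sample articles are hypothetical and chosen purely for this example.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(articles):
    """Map each token to the set of article ids whose text contains it."""
    index = defaultdict(set)
    for article_id, text in articles.items():
        for token in tokenize(text):
            index[token].add(article_id)
    return index


def search(index, query):
    """Return ids of articles that contain every token in the query."""
    tokens = tokenize(query)
    if not tokens:
        return set()
    # Copy the first posting set so the index itself is never mutated.
    result = set(index.get(tokens[0], set()))
    for token in tokens[1:]:
        result &= index.get(token, set())
    return result


# Hypothetical sample data, not taken from the paper.
articles = {
    1: "Kerala floods: relief teams reach Kochi",
    2: "Tech firm opens new office in Kochi",
    3: "Relief fund announced for flood victims",
}
index = build_index(articles)
```

With this sample data, `search(index, "kochi relief")` matches only article 1, because matching is on exact tokens ("floods" and "flood" are treated as different words). A production system would more plausibly rely on a database's own index and some form of stemming or normalization.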
