Course Syllabus
Introduction to IR: Space Retrieval Models – Ranked Retrieval – Text Similarity Metrics – Tokenizing – Stemming – Evaluation on benchmark text collections – Components of an information retrieval system.

Indexing for IR: Inverted Indices – Postings lists – Optimizing indices with skip lists – Proximity and phrase queries – Positional indices – Dictionaries and tolerant retrieval – Dictionary data structures – Wild-card queries – n-gram indices – Spelling correction and synonyms – Edit distance – Index construction – Dynamic indexing – Distributed indexing – Real-world issues.

Relevance in IR: Parametric or fielded search – Document zones – Vector space retrieval model – tf-idf weighting – Queries as vectors – Computing scores in a complete search system – Efficient scoring and ranking – Evaluation in information retrieval: User happiness – Creating test collections: kappa measure – inter-judge agreement – Relevance feedback and query expansion: Query expansion – Automatic thesaurus generation – Sense-based retrieval.

Document Classification and Clustering: Introduction to text classification – Latent Semantic Indexing.