Consecutive Time Sensitive Queries Using TF-IDF
T. Satyanarayana1, CH. Raja Jacob2
Citation: M. Madhavi, CH. Raja Jacob, Consecutive Time Sensitive Queries Using TF-IDF, International Journal of Research Studies in Computer Science and Engineering 2014, 1(8): 53-60
Search engines use approaches that help identify how relevant the displayed results actually are to searchers, and how likely those results are to show variety when a query term covers a range of topics. For an important class of queries, termed time-sensitive queries, over frequently updated archives such as news archives, topic similarity alone is not sufficient for ranking. For such queries, the publication time of the documents is important and should be considered in conjunction with topic similarity to derive the final document ranking. To incorporate the time dimension, prior systems used an estimation algorithm that considers the publication date and time of the documents to locate time periods of interest. However, a document published on the same topic at a later date (e.g., a review article summarizing an event) may also be relevant. We propose to infer the temporal relevance of a document by analyzing its contents rather than relying solely on its publication date, thereby increasing the relevance of the results. To this end, we propose to use tf-idf (term frequency-inverse document frequency), a numerical statistic that reflects how important a word is to a document in a collection or corpus. We emulate the performance of the estimation algorithm in combination with tf-idf weights for detecting the important time intervals for a query over a news archive and for incorporating this information in the retrieval process. We show that our techniques are robust and significantly improve result quality for time-sensitive queries compared to state-of-the-art retrieval techniques.
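To make the tf-idf statistic mentioned above concrete, the following is a minimal Python sketch of one common variant: term frequency normalized by document length, multiplied by idf = log(N / df). It is an illustration under stated assumptions, not the weighting scheme used in this paper. The function name `tf_idf` and the toy corpus are hypothetical and do not come from the paper.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Return one {term: weight} dict per document.

    docs: list of token lists (already tokenized and lowercased).
    Assumed variant: tf = count / document length, idf = log(N / df).
    Other tf-idf variants exist and may behave differently.
    """
    n_docs = len(docs)

    # Document frequency: number of documents that contain each term.
    df = Counter()
    for tokens in docs:
        df.update(set(tokens))

    weights = []
    for tokens in docs:
        counts = Counter(tokens)
        total = len(tokens)
        weights.append({
            term: (count / total) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return weights


# Hypothetical toy corpus standing in for a news archive.
docs = [
    "earthquake hits city".split(),
    "city council votes".split(),
    "earthquake relief review".split(),
]
weights = tf_idf(docs)
```

In this toy corpus, a term that appears in only one document (such as "hits") receives a higher weight in that document than a term shared with other documents (such as "city"), which is the sense in which tf-idf reflects how important a word is to a document within a collection.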