Reverted indexes and true relevance feedback

I was fortunate enough at CIKM not only to meet the two bloggers cited in my thesis, namely Gene Golovchinsky and Jeremy Pickens of FXPAL (the latter now of Catalyst), but also to hear Jeremy present one of the more interesting papers of the conference, "Reverted Indexing for Feedback and Expansion" (co-authored with Gene and FXPAL's Matthew Cooper; the author's version is available for free download).

Gene gives an extended summary of the paper on the FXPAL blog. In brief, a reverted index maps from documents to the queries which retrieve those documents, whereas an inverted index maps from terms to the documents those terms occur in. Construction re-uses the infrastructure of the inverted index: each query is issued to the index, and result converted into a pseudo-document containing the docids of the top n = 1000 documents retrieved, with each docid repeated between 1 to 10 times, depending on its similarity score with the query. The resulting pseudo-corpus is then inverted, giving the document to query mapping.

Among the uses of a reverted index, and the use the authors put it to in their paper, is relevance feedback, where it identifies query concepts cognate to the feedback documents. The queries the authors use to construct the reverted index tested in their paper are simply the individual terms of the index vocabulary. Using this single-term reverted index, the authors achieve significantly improved effectiveness in true relevance feedback over strong baselines. In this post, I want to explore how this improved effectiveness is achieved.

The authors surmise that the reverted index is more effective because it suggests more selective expansion terms, and they reproduce example term sets as evidence. This explanation is convincing enough as far as it goes; but what is not explained is why the reverted index's expansion terms are more selective. The reason is not obvious. A single-term reverted index is not much more than a weighted direct index, mapping from documents to the terms that occur in them. Nor do the resulting term weights favour more selectivity. The (1, 10) normalization applied during reversion effectively removes their original IDF factor; and the maximum DF for terms coming out of the reverted index is capped at the n = 1000 result cutoff. If anything, reverted weights seem likely to reduce selectivity.

The reason for the enhanced selectivity of the reverted index is not, I think, the way expansion term weights are derived. Rather, the key is the n=1000 cutoff for the input retrieval results. The cutoff excludes from the reverted index altogether the great majority of occurrences of highly frequent, underly selective terms. It is not that selective terms are given more weight in the expansion; it is that unselective terms are not considered for expansion at all. This is suggested by a count of the number of top-15 expansion terms with a DF above 1000 across a sample of queries for each retrieval method, given in the table below and derived from Table 1 of the paper. The reverted index (RevPL2) suggests almost no such terms, and most of the ones that it does suggest were in the original query.

Topic Id Summary KL Bo1 RevPL2
172 "quit smoking" 11 10 2
341 "airport security" 11 10 2
407 "poaching wildlife" 7 7 3
419 "recycle tires" 10 10 1

The above discussion is in no way a criticism of the paper; the reverted index is a very clever and interesting idea, and it certainly enables a highly effective form of true relevance feedback. I would, though, be interested to see a follow-up study (or hear of existing studies) that directly addressed the question of excluding high-frequency expansion terms based on a fixed cutoff, and compared it with the seemingly more general approach of enhancing the IDF component of expansion term weighting.

4 Responses to “Reverted indexes and true relevance feedback”

  1. jeremy says:

    Hey William --

    Great to meet you at CIKM!

    Let me quickly correct one perception in your post. Gene will write a longer FXPAL blog response tomorrow, so definitely check that out. But in the meantime, here's a point to consider:

    The (1, 10) normalization applied during reversion effectively removes their original IDF factor;

    No. The original IDF factor is not removed through this process. Why? Because while a single docid's (1,10)-normalization is what effectively becomes the "TF" in the reverted index, the sum of all (1,10)-normalized scores for a particular (basis query = reverted document) becomes that reverted document's doclength. The higher the DF of a term in the standard index, the longer the basis query's (reverted document's) document length in the reverted index.

    Gene will write a longer post tomorrow talking about this, but I'll let that thought marinate for a while. It's doesn't completely dismiss your point about the n=1000 cutoff, but it does suggest that a docid that was retrieved at rank 1500 will be at that point because it had the 1500th largest retrieval score, so the 1499 scores above it contribute more to the "doclength" than it does.. meaning that it's relative (tf / dl) contribution is going to be much smaller than the docid of a basis query that is ranked 200th out of 200.

    Don't get me wrong; we probably should have done the experiment with different cutoff values. But one can make a logical argument that it probably wouldn't matter all that much. Natch?

  2. [...] difficult to answer all questions and comments thoroughly. For example, William Webber wrote up a post summarizing our work, in which he observed The authors surmise that the reverted index is more [...]

  3. [...] IREvalEtAl William Webber’s Research Blog « Reverted indexes and true relevance feedback [...]

  4. [...] on November 4th, 2010 Copied with permission from IREvalEtAl GD Star Ratingloading...Gene and Jeremy have disputed two assertions about index reversion made in my previous post. The first is that [...]

Leave a Reply