"Information Retrieval Experiment" (1981)

In preparation for writing the background chapter of my thesis, I have been reading some of the early literature on information retrieval evaluation. The classic text on this early work is the collection of essays edited by Karen Sparck Jones and published in 1981 under the title "Information Retrieval Experiment". Actually getting hold of the book, though, is a bit of a challenge. There is an online version in the SIGIR museum, but it's hidden behind a natty Digital Library-esque Flash interface (animated page turns, no less!) which crawlers don't seem able to pierce, so the book doesn't turn up in search results. (It is ironic that a classic of information retrieval is not retrievable by modern information retrieval engines.) (Update, 10th August 2009: a search for "sparck jones information retrieval experiment" now brings up the above online version as the first result on Google, and the second result on Bing, but that has only happened since I linked to it.) Be warned that the scanned PDFs are enormous: the full book is over three-quarters of a gigabyte in size. Also, the chapter links have an off-by-one error: clicking on chapter 2 will give you (a 50-megabyte download later) chapter 3, and so forth.

These minor obstacles aside, though, the book is well worth a browse, at least if you have an interest in the history of IR. I might blog a few specific items later, but the thing that strikes me most overall is how many of the essential insights were laid down quite early in the field, how the same concerns recur over time, and how little progress there has been in the science and experimental methodology of IR since. By the end of the 1970s, TF-IDF, probabilistic models, term clustering, automatic relevance feedback, and document clustering had all been investigated. Retrieval performance seemed to have peaked at around the 50% precision, 50% recall level. There were already calls to move beyond system-centric to user-centric studies, and to move beyond descriptive and evaluative to predictive and explanatory experiments, both of which were topical at this year's SIGIR. All of which again raises the question: has the apparent success of the information retrieval research community in developing a strong experimental methodology around the Cranfield paradigm and the TREC collections been as much an obstacle as an aid to real progress in the field?
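(As an aside for readers less familiar with these measures: the following is a minimal sketch of my own, not taken from the book, of how set-based precision and recall are computed in a Cranfield-style evaluation. The document IDs and relevance judgments are invented purely for illustration, and the numbers are chosen to land on the 50%/50% plateau mentioned above.)

    # A minimal sketch (mine, not from the book) of set-based precision and
    # recall as used in Cranfield-style evaluation. Document IDs and
    # relevance judgments below are invented purely for illustration.

    def precision_recall(retrieved, relevant):
        """Return (precision, recall) for a single query.

        retrieved -- set of document IDs the system returned
        relevant  -- set of document IDs judged relevant for the query
        """
        hits = retrieved & relevant
        precision = len(hits) / len(retrieved) if retrieved else 0.0
        recall = len(hits) / len(relevant) if relevant else 0.0
        return precision, recall

    # Four of the eight retrieved documents are relevant, and the collection
    # holds eight relevant documents in total, so both measures come out at
    # 0.5: the sort of plateau the early experiments reported.
    retrieved = {"d1", "d2", "d3", "d4", "d5", "d6", "d7", "d8"}
    relevant = {"d1", "d2", "d3", "d4", "d9", "d10", "d11", "d12"}
    print(precision_recall(retrieved, relevant))  # (0.5, 0.5)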

2 Responses to “"Information Retrieval Experiment" (1981)”

  1. I agree with you that the IR community has been stuck in an evaluation rut. It's unclear, however, how to break the addiction to the Cranfield paradigm when hard numbers are considered the sine qua non of evaluation. More in my blog post "Is TREC good for Information Retrieval research?".

  2. Stefano Mizzaro says:

    William, concerning your question "has the apparent success of the information retrieval research community in developing a strong experimental methodology around the Cranfield paradigm and the TREC collections been as much an obstacle as an aid to real progress in the field?", I'd answer "Yes, at least to some extent".

    Indeed I wrote in a paper last year: "Pushing this line of reasoning to its extreme consequences, one might wonder if evaluation somehow hinders development in the IR field. Indeed, when a paradigm shift is going to happen, it is quite unlikely that current evaluation methodologies can cope with the new scenario: they will evaluate the revolutionary system on the basis of the current evaluation techniques, that could be not appropriate for such a system and could lead to negative results. [...] This is an extremist’s position, to be taken with caution, since evaluation is fundamental for new systems [...]"

    Having said this, since you and I are working mainly on IR evaluation, perhaps we shouldn't publicize this too much :-)

    S.
