When did the Cranfield tests become the "Cranfield paradigm"?

It is common these days to see the traditional method of evaluating an information retrieval system against a test collection referred to as the "Cranfield paradigm". For instance, Emine Yilmaz and Javed Aslam in their 2006 CIKM paper, Estimating Average Precision with Incomplete and Imperfect Judgments, denote "the test collection methodology adopted by TREC" as "the Cranfield paradigm", and similar uses can be found in recent papers by Sakai, Scholer et al., Harman and Hiemstra, and many others besides. It is such a distinctive usage that I came to wonder when it was introduced.

The phrase "Cranfield paradigm" does not, of course, appear in any of the Cranfield reports themselves, nor in the early literature describing the experiments at Cranfield. Contributors to Sparck Jones's 1981 book Information Retrieval Experiment speak of work done in the same "tradition" as Cranfield (Sparck Jones, page 2), of "the 'normal' or archetypal retrieval test" of which Cranfield is an (but not the only) example (Robertson, page 19), or a "body of practice" based on Cranfield and later investigations (Tague, p 59), but nowhere are paradigms mentioned, nor is Cranfield even treated in a particularly paradigmatic way (despite a chapter being devoted to the Cranfield tests, and the book being dedicated to Cleverdon, the director of those tests). By the time of the 1992 Information Processing and Management special issue on information retrieval evaluation, the word "paradigm" had entered the lexicon, with Donna Harman observing in the introduction that "the test collection paradigm has ... caused some major problems", Tague-Sutcliffe declaring that "a paradigmatic shift has occurred in the research front, to user-centered from system-centered models" (page 467), and Michael Keen noting that "there is no perfect paradigm for the laboratory test" (page 491). Robertson and Hancock-Beaulieu even talk about the lack of "any kind of paradigm or consensus" regarding the concept of relevance (page 458). However, while a reference to Cranfield often lurks nearby, none of these authors actually use the phrase "Cranfield paradigm" directly. It seems that it had not yet entered mainstream usage.

The first usage of the phrase "Cranfield paradigm" appears (judging in part from Google Scholar) to be in an early, little-cited paper by B. C. Brookes, presented at SIGIR in 1980. Brookes sets out to "question the continued usefulness of what I call the `Cranfield paradigm'", a formulation that suggests that Brookes is introducing what he believes to be a novel usage. Brookes' paper is a discursively theoretical one, reflecting on the theory of science, Shannon's definition of information, whether "information retrieval" should actually be called "document retrieval", whether it should be measured on a linear or a logarithmic scale, as well as philosophical monism and dualism, the nineteenth century debate between the vitalist and physicalist schools of organic chemistry, and other such matters. He cites Bishop Berkeley, Socrates, and Einstein, describes Karl Popper's World 3, and quotes Thomas Kuhn at length (of whom more later). Brookes ends in a manner not frequently repeated in later SIGIR papers by stating that "we need a firmer metaphysic for our studies".

Brookes's paper did not cause a revolution in the science of information retrieval, nor does it seem to have popularised the phrase "Cranfield paradigm" (which he repeats in a 1983 paper in the Journal of Information Science). The next usage appears to be in Towards an information logic, a paper presented by Keith van Rijsbergen at SIGIR 1989. This has a section entitled "The Cranfield Paradigm", which van Rijsbergen defines as one in which relevance is treated as a hidden variable, only indirectly accessible by collecting data from the user, a method which "represents an extreme descriptivist approach to the science of IR" (page 78). Van Rijsbergen also goes on to claim (rather boldly, given subsequent history) that the use of IR techniques in multimedia-rich environments means that "one might say that we have come to the end of the empirical era in IR" (page 79).

Van Rijsbergen does not cite Brookes, so whether his use of the phrase derives from Brookes, or is an independent coinage, or is derived from somewhere else, is unclear. An author (one of the few) who does cite Brookes' work is David Ellis, in his 1984 article Theory and explanation in information retrieval research, but without reference to the "Cranfield paradigm". However, paradigms later become a recurrent theme in Ellis's work, beginning in 1992 with The Physical and Cognitive Paradigms in Information Retrieval Research. Ellis traces the concept of a paradigm back to Kuhn, and then attempts to describe what makes Cranfield a paradigm in the Kuhnian sense. For Ellis, Cranfield is a "physical paradigm": a model of the information retrieval system as a physical machine, and of the retrieval experimentation as a physical experiment. Ellis quotes Cleverdon's description of the Cranfield approach as being like testing in a wind-tunnel to underline this point.

According to Ellis (page 52), by 1992 references to the "Cranfield paradigm" had become commonplace:

In the literature of information retrieval research it is not uncommon to find references to the `Cranfield paradigm' or to the `cognitive paradigm', often accompanied by arguments for the rejection of the former and acceptance of the latter.

He does not, however, cite examples, and, as we have seen, while talk of paradigms is frequent in the 1992 IPM special issue, the phrase "Cranfield paradigm" does not itself appear. Indeed, a search on Google Scholar suggests that it was not until some time later that the phrase first took hold, as the following graph illustrates.

Number of publications per year containing the phrase "Cranfield paradigm", as reported by Google Scholar.

We can see from the above figure that the phrase "Cranfield paradigm" is hardly to be found in the 80s and early 90s (we have discussed above all occurrences that I have been able to locate). The phrase begins to appear occasionally in 1999 and 2000, but usage does not really take off until 2004. The earlier uses are in infrequently cited papers, although the diversity of the occurrences, and the lack of cross-references, suggests that the phrase had spread in informal contexts. It is used by Stephen Robertson in 1999, and makes its reappearance in SIGIR in 2000; but neither of these papers is much cited.

The real spur to the widespread use of the phrase "Cranfield paradigm" appears to be its adoption by Ellen Voorhees. Voorhees first refers to the phrase in a paper at CLEF 2001, The philosophy of information retrieval evaluation (the proceedings were published in 2002, so this reference has been counted in that year). There, she initially refers to the "Cranfield evaluation paradigm", but then switches to simply the "Cranfield paradigm", an expression used throughout the paper (I count 9 occurrences), including as a section heading, without citation to any earlier use. Voorhees explicitly describes TREC and its offspring workshops, CLEF and NTCIR, as being examples of the Cranfield paradigm. Exactly what this paradigm consists of is not defined, although a number of assumptions underlying it (completeness of relevance judgments, representativeness of a single set of relevance judgments, approximation of relevance by topical similarity) are identified and discussed.

Voorhees's CLEF 2001 paper is quite widely cited, particularly for a workshop paper -- over 100 times, according to Google Scholar. More widely cited, though, is her joint paper with Chris Buckley, Retrieval evaluation with incomplete information, appearing at SIGIR 2004. This paper mentions the "Cranfield paradigm" only once, in the opening sentence, with a reference back to Voorhees' CLEF publication. However, the SIGIR 2004 paper is a very influential one, the initiator of a whole line of work (including some of my own) on dealing with incomplete relevance judgments during system evaluation. It is likely, therefore, that it is this usage that had the greatest part to play in the proliferation of references to the "Cranfield paradigm" from 2004 onwards.

Voorhees's (re-)formulation and usage of the phrase "Cranfield paradigm" was significant not just because of the impact of her publications. An equally important consideration is that she is the current director of the TREC project, which has been the dominant force in information retrieval evaluation over the past decade and a half. By explicitly stating, as she does at CLEF 2001, that TREC is following the Cranfield paradigm, she gave concrete form to this paradigm -- as well as placing Cranfield's imprimatur on the TREC effort and test collections. The implication is that TREC forms part of a continuous legacy, going back to the Cranfield experiments of the late 1950s and early 1960s. However, as we have seen, the phrase "Cranfield paradigm" itself is a neologism, rarely used prior to the last decade, and, when it was used, it appeared in quite a different intellectual context.

The very fact that the phrase "Cranfield paradigm" has been so widely adopted since being introduced into mainstream usage by Voorhees suggests that there was a strong, latent demand for the term -- that there was a concept or phenomenon that was widespread but lacked an agreed tag. However, such phrases are more than simple, contentless tags: they carry with them deeper meanings and implications. In particular, the term "paradigm" has a particular significance in the theory of science, as the occasional earlier references to a "Cranfield paradigm" suggest. Using the phrase requires being aware that it is making a claim. At the very least, a citation is indicated, and not (as is sometimes done) back to the original Cranfield experiments, but rather (as other authors do) to Voorhees's introduction of the phrase into recent usage.

As to what the significance of the phrase "Cranfield paradigm" is, and whether, when, and how the phrase should be used ... well, this entry is long enough already. But I hope to address the question in a later post.

5 Responses to “When did the Cranfield tests become the "Cranfield paradigm"?”

  1. You can take the man out of the history department . . .

  2. william says:

    Wait, wait, I'm going somewhere with this...

  3. Having an easy to reference label (meme?) in some ways makes it easier to justify structuring your research. "I didn't make this stuff up; that other guy did." This in turn makes it easier to establish (intentionally or by accretion of precedent) a paradigm the way Kuhn meant it, that is a way of thinking about a problem that eclipses other ways. I think that to some extent that's what has happened in IR (and in SIGIR particularly) with the near-dominance of a single evaluation methodology.

  4. william says:

    Gene,

    Yes, I agree it is useful to have a handy label for "what we do with TREC collections", and also that our method has become rigidified. Perhaps "the Cranfield stereotype" is more appropriate than "the Cranfield paradigm"? Anyway, I'll blog some more later about the significance and appropriateness of the "paradigm" moniker.

  5. How about "The Cranfield Orthodoxy?" :-)
