Cranfield experiments

"The Cranfield indexing experiments in the 1960s are often cited as the beginning of the modern era of computer-based retrieval system evaluation (Cleverdon, Mills and Keen, 1966). In the Cranfield studies, retrieval experiments were conducted on a variety of test databases in a controlled, laboratory-like setting. In the second series of experiments, known as Cranfield II, alternative indexing languages constituted the performance variable under investigation. The aim of the research was to find ways to improve the relative retrieval effectiveness of IR systems through better indexing languages and methods (Cleverdon, 1970). The components of the Cranfield experiments were: a small test collection of documents, a set of test queries, and a set of relevance judgments, that is a set of documents judged to be relevant to each query. Human searchers, their interaction with the system, their interpretation of the query, and their process-formed relevance judgments were not factors included in these experiments. For purposes of performance comparisons, it was necessary to select quantitative measures of relevant documents output by the system under various controlled conditions. The measures used in the Cranfield II experiments are recall and precision, derivatives of the concept of relevance. "  (Hildreth, 2001).

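The recall and precision measures named in the passage above can be illustrated with a minimal sketch. It uses the standard definitions: precision is the share of retrieved documents that are relevant, and recall is the share of relevant documents that are retrieved. The function name, query identifier, document identifiers and judgments below are hypothetical and are not taken from the Cranfield data.

```python
from typing import Dict, List, Set, Tuple


def precision_recall(retrieved: List[str], relevant: Set[str]) -> Tuple[float, float]:
    """Return (precision, recall) for a single query.

    retrieved: document ids output by the system for the query
    relevant:  document ids judged relevant to the query
    """
    retrieved_set = set(retrieved)
    hits = len(retrieved_set & relevant)  # relevant documents that were retrieved
    precision = hits / len(retrieved_set) if retrieved_set else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical relevance judgments: query id -> documents judged relevant.
judgments: Dict[str, Set[str]] = {"q1": {"d1", "d4", "d7", "d9"}}

# Hypothetical output of one indexing language for the same query.
system_output: Dict[str, List[str]] = {"q1": ["d1", "d2", "d4", "d5", "d7"]}

for query_id, retrieved in system_output.items():
    p, r = precision_recall(retrieved, judgments[query_id])
    print(f"{query_id}: precision={p:.2f} recall={r:.2f}")
```

In this invented example three of the five retrieved documents are relevant (precision 0.60) and three of the four relevant documents are retrieved (recall 0.75). Comparing such figures across indexing languages, with the collection, queries and judgments held fixed, is the kind of comparison the laboratory setting is designed to support.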
 

"Theoretically, the Cranfield model relies almost entirely on the attractive, but troublesome concept of relevance. Furthermore, two key assumptions underlie the Cranfield model: users desire to retrieve documents relevant to their search queries and don’t want to see documents not relevant to their queries, and document relevance to a query is an objectively discernible property of the document. Neither of these two assumptions has stood the test of time, experience and astute analysis."   (Hildreth, 2001).

 

Hildreth also describes the Cranfield model as a classic example of the system-oriented approach to the evaluation of IR system effectiveness (cf. Physical paradigm).

 

In 1992 a series of experiments known as TREC (Text REtrieval Conference) began; it may be seen as the most important continuation of the experimental approach to information retrieval.

 

 

Literature:

 

Cleverdon, C. W. (1970). The effect of variations in relevance assessments in comparative experimental tests of index languages. Cranfield, UK: Cranfield Institute of Technology. (Cranfield Library Report No. 3)

 

Cleverdon, C. W., Mills, J. & Keen, E. M. (1966). Factors determining the performance of indexing systems. Cranfield, UK: Aslib Cranfield Research Project, College of Aeronautics. (Volume 1: Design; Volume 2: Results)

 

Harter, S. P. & Hert, C. A. (1997). Evaluation of information retrieval systems: approaches, issues, and methods. Annual Review of Information Science and Technology, 32, 3-94.

 

Hildreth, C. R. (2001). Accounting for users' inflated assessments of on-line catalogue search performance and usefulness: an experimental study. Information Research, 6(2). Available at: http://InformationR.net/ir/6-2/paper101.html

 

 

See also: Information retrieval evaluation; TREC; Test Collections

 

 

 

 

Birger Hjørland

Last edited: 02-05-2006
