Boolean search
Boolean retrieval is based on Boolean algebra, named after the English
mathematician George Boole (1815-1864). Boolean algebra has had much influence
in computer science and plays a very important role in information retrieval
(IR). Often it is taken as something given that cannot be questioned. The
following quote by Thiel demonstrates, however, that fundamental questions have
been raised in the history of logic:
"Boole's algebraic methods of development and elimination led to modern normal forms and decision procedures, and designers of digital computers
use improved Boolean methods for realizing logical functions by minimal chains
of operations. Edmund Husserl's criticism of Schröder for favoring extensional
logic opened a controversy about a "logic of content" (Inhaltslogik), a
controversy which was however ultimately bypassed by another style of doing
logic with "implicational" calculi. It seems that metaphysical and even
ontological considerations have played a motivational part in this development,
and separated logicians according to their interpretation of logical systems
(the most interesting question being that of "existential import"), but that
they have not influenced the development of (Boolean or general) algebra of
logic in any significant way". (Thiel, 1991).
Boolean retrieval is commonly considered one of the three most important models in information retrieval (IR), the other two being the vector model and the probabilistic model.
In Boolean retrieval, words, strings or other symbols are organized in sets, which are combined with the logical operators "AND", "OR" and "NOT" (the Boolean operators). Each operator, and a complete search profile, can be described visually using Venn diagrams. The Boolean operators correspond, respectively, to the following set operations: AND: "intersection" ( • or ∧ or ∩ ); OR: "union" ( + or ∪ or ∨ ); and NOT: "difference" ( - or ¬ or ¯ , the last symbol placed above the expression being negated).
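As a minimal sketch of how the three operators map onto set operations, assume a tiny hypothetical collection in which each search term points to the set of identifiers of the documents containing it. The terms, the identifiers and the name `postings` are invented for illustration.

```python
# Hypothetical postings: each term maps to the set of ids of the
# documents that contain it.
postings = {
    "cats": {1, 2, 4},
    "dogs": {2, 3, 4},
    "birds": {4, 5},
}

# AND corresponds to set intersection.
cats_and_dogs = postings["cats"] & postings["dogs"]

# OR corresponds to set union.
cats_or_birds = postings["cats"] | postings["birds"]

# NOT corresponds to set difference.
cats_not_dogs = postings["cats"] - postings["dogs"]

print(sorted(cats_and_dogs))   # [2, 4]
print(sorted(cats_or_birds))   # [1, 2, 4, 5]
print(sorted(cats_not_dogs))   # [1]
```

A complete search profile is then simply a nested combination of these three operations.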
The NOT operator is known to be problematic to use. If a document is titled "Everything about domestic animals except dogs" and the search "domestic animals NOT dogs" is issued, this document will not be retrieved, because it contains the word "dogs".
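The problem can be reproduced in a few lines. The sketch below assumes a deliberately simplified matcher that lowercases each title and splits it on whitespace. The three titles and the function name `matches` are hypothetical.

```python
def matches(title, required, excluded):
    """Return True if the title has every required word and no excluded word."""
    words = set(title.lower().split())
    has_all_required = all(w in words for w in required)
    has_any_excluded = any(w in words for w in excluded)
    return has_all_required and not has_any_excluded


titles = [
    "Everything about domestic animals except dogs",
    "Domestic animals in the household",
    "Training dogs and other domestic animals",
]

# Query: domestic AND animals NOT dogs
hits = [t for t in titles if matches(t, ["domestic", "animals"], ["dogs"])]
print(hits)  # ['Domestic animals in the household']
```

The first title is rejected for the same reason as the third, although it is about exactly what the searcher asked for: the operator tests for the presence of a word, not for what the document says about it.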
The Boolean search technique is a form of exact match, in that documents or document representations either fulfill or fail to fulfill the demands expressed in the search logic.
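Exact match can be illustrated by treating a query as a yes/no test applied to each document. The following sketch assumes a hypothetical representation in which a query is either a single term or a tuple of an operator followed by its sub-queries. The function name `evaluate` and the sample documents are invented for illustration.

```python
def evaluate(query, terms):
    """Evaluate a nested Boolean query against one document's set of terms."""
    if isinstance(query, str):
        return query in terms
    op, *args = query
    if op == "AND":
        return all(evaluate(a, terms) for a in args)
    if op == "OR":
        return any(evaluate(a, terms) for a in args)
    if op == "NOT":
        return not evaluate(args[0], terms)
    raise ValueError(f"unknown operator: {op}")


# Hypothetical document representations (sets of index terms).
docs = {
    "d1": {"boolean", "retrieval", "logic"},
    "d2": {"vector", "retrieval"},
    "d3": {"boolean", "algebra"},
}

# boolean AND (retrieval OR logic)
query = ("AND", "boolean", ("OR", "retrieval", "logic"))

retrieved = sorted(d for d, terms in docs.items() if evaluate(query, terms))
rejected = sorted(d for d, terms in docs.items() if not evaluate(query, terms))
print(retrieved)  # ['d1']
print(rejected)   # ['d2', 'd3']
```

Every document ends up in exactly one of the two lists, and the result carries no information about how close a rejected document came to matching.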
Criticism:
It has been said that a serious limitation of Boolean searching is that it is based on a two-valued logic. Documents are divided into two sets: relevant documents and non-relevant documents. In real life, however, most documents are more or less relevant. It has also been mentioned as a limitation that Boolean logic concerns only the relevance of individual documents, not their mutual relations.
Hancock-Beaulieu (1990) writes:
"The limitations and inadequacy of Boolean operators and command languages for retrieval have long been recognized. However, the experimentation and exploration of alternative approaches which have been undertaken have had no discernible effect on the commercial hosts to date". (Hancock-Beaulieu, 1990).
However, the problem concerning Boolean search and exact match is not dealt with in the book from whose foreword this quote is taken. Hancock-Beaulieu's statement expresses a common criticism of the Boolean technique. The question is, however, how well documented this view is (cf. Turtle & Croft, 1998).
"Extended Boolean" is a method that combines a weighting system (best match) with Boolean technique in such a way that the documents/representations that fulfill all demands in the search profile get the highest ranking.
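One well-known formulation of this idea is the p-norm model associated with Salton, Fox & Wu (1983). The sketch below is a minimal illustration under stated assumptions, not a reproduction of any particular system: term weights are assumed to lie between 0 and 1, all query terms are treated as equally important, and the function names and example weights are hypothetical.

```python
def pnorm_or(weights, p=2.0):
    """Score an OR query: higher when more terms match with higher weight."""
    n = len(weights)
    return (sum(w ** p for w in weights) / n) ** (1.0 / p)


def pnorm_and(weights, p=2.0):
    """Score an AND query: 1.0 only when every term has full weight."""
    n = len(weights)
    return 1.0 - (sum((1.0 - w) ** p for w in weights) / n) ** (1.0 / p)


# Hypothetical weights of two query terms in three documents.
docs = {
    "both terms": [1.0, 1.0],
    "one term": [1.0, 0.0],
    "neither term": [0.0, 0.0],
}

# Rank documents for the query "term1 AND term2", best match first.
ranking = sorted(docs, key=lambda d: pnorm_and(docs[d]), reverse=True)
print(ranking)  # ['both terms', 'one term', 'neither term']
```

The document that fulfills all demands is ranked first, while a document matching only one term still receives a score above zero instead of being rejected outright. In this formulation the parameter `p` controls strictness: with `p = 1` the AND and OR scores both reduce to the average weight, and as `p` grows the scores approach strict Boolean behaviour.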

Literature:
Bednarek, A. R. (1970). Boolean Algebras. IN: Encyclopedia of Library and Information Science. Vol. 3. Ed. by Allen Kent & Harold Lancour. New York: Marcel Dekker, 88-98.
Hancock-Beaulieu, M. (1990). Foreword. P. vii in: Ellis, David: New Horizons in Information Retrieval. London: Library Association.
Davis, C. H. (1995). Beyond Boole: The Next Logical Step. Bulletin of the American Society for Information Science, 21(5), 17-20.
Paris, L. A. H. & Tibbo, H. R. (1998). Freestyle vs. Boolean: A comparison of partial and exact match retrieval systems. Information Processing & Management, 34(2-3), 175-190.
Salton, G.; Fox, E. A. & Wu, H. (1983). Extended Boolean Information Retrieval. Communications of the ACM, 26(11), 1022-1036.
Salton, G.; Fox, E. A. & Voorhees, E. (1985). Advanced Feedback Methods in Information Retrieval. Journal of the American Society for Information Science, 36(3), 200-210.
Thiel, C. (1991). Boolean Algebra. IN: Handbook of Metaphysics and Ontology. Vol. 1-2. Ed. by Hans Burkhardt & Barry Smith. Munich: Philosophia Verlag, Vol. 1, pp. 96-98.
Turtle, H. & Croft, B. (1998). Boolean query languages are not dead. http://www.cpe.ku.ac.th/~arnon/Mirror/ir-p/ir6/sld011.htm
Frants, V. I.; Shapiro, J.; Taksa, I. & Voiskunskii, V. G. (1999). Boolean Search: Current State and Perspectives. Journal of the American Society for Information Science, 50(1), 86-95.
Wikipedia, the free encyclopedia (2006a). Boolean logic. http://en.wikipedia.org/wiki/Boolean_logic
Wikipedia, the free encyclopedia (2006b). Venn diagram. http://en.wikipedia.org/wiki/Venn
Birger Hjørland
Last edited: 31-01-2007