Vector space model (VSM)

The vector space model is an algebraic model used for information retrieval. In the VSM, natural-language documents are represented formally as vectors in a multi-dimensional space, where each dimension typically corresponds to an index term. It was developed by Gerard Salton in the 1960s and used in the SMART system.
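
As a rough illustration of the representation described above, the following sketch maps a few documents to vectors of raw term counts, with one dimension per distinct term. The tokenizer (lowercasing and splitting on whitespace), the sample texts, and the function names are assumptions made for this example; they are not taken from the SMART system.

```python
from collections import Counter


def tokenize(text):
    # Assumption: a deliberately naive tokenizer (lowercase, split on whitespace).
    return text.lower().split()


def build_vocabulary(documents):
    # One dimension per distinct term, in a fixed (sorted) order.
    return sorted({term for doc in documents for term in tokenize(doc)})


def to_count_vector(text, vocabulary):
    # Component i holds the number of times vocabulary[i] occurs in the text.
    counts = Counter(tokenize(text))
    return [counts[term] for term in vocabulary]


# Hypothetical sample documents, used only for illustration.
documents = [
    "information retrieval with vectors",
    "vectors in a vector space",
    "retrieval of information",
]

vocabulary = build_vocabulary(documents)
vectors = [to_count_vector(doc, vocabulary) for doc in documents]

for doc, vector in zip(documents, vectors):
    print(vector, doc)
```

Every document ends up with a vector of the same length, so documents and queries can be compared component by component.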

 

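Salton's classic weighting is given by the following equation:

Term weight = w_i = tf_i * log(D / df_i)

where

tf_i = term frequency, the number of times term i occurs in a document

df_i = document frequency, the number of documents containing term i

D = the number of documents in the collection

Many models that extract term vectors from documents and queries are derived from this equation.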
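
The weighting can be sketched in code. The example below assumes the common tf-idf reading of Salton's weighting, w_i = tf_i * log(D / df_i), where tf_i is the count of term i in a document, df_i is the number of documents containing term i, and D is the number of documents. The natural logarithm, the whitespace tokenizer, the function name term_weights, and the sample documents are assumptions made for this illustration.

```python
import math
from collections import Counter


def term_weights(documents):
    """Return one {term: weight} mapping per document, using tf * log(D / df)."""
    # Assumption: naive tokenization (lowercase, split on whitespace).
    tokenized = [doc.lower().split() for doc in documents]
    num_docs = len(tokenized)  # D

    # df: number of documents in which each term occurs at least once.
    document_frequency = Counter()
    for tokens in tokenized:
        document_frequency.update(set(tokens))

    weights = []
    for tokens in tokenized:
        term_frequency = Counter(tokens)  # tf within this document
        weights.append({
            term: tf * math.log(num_docs / document_frequency[term])
            for term, tf in term_frequency.items()
        })
    return weights


# Hypothetical sample documents, used only for illustration.
documents = [
    "salton vector model",
    "vector space model",
    "vector retrieval",
]

weights = term_weights(documents)
for doc, doc_weights in zip(documents, weights):
    print(doc, doc_weights)
```

In this sketch a term that occurs in every document (here "vector") receives weight 0, because log(D / df_i) = log(1) = 0, while a term found in only one document receives the largest weight.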

 

 

 

 

Literature:

 

Kemp, C. & Ramamohanarao, K. (2002). Long-Term Learning for Web Search Engines. In Proceedings of the 6th European Conference on Principles of Data Mining and Knowledge Discovery (PKDD 2002). Berlin: Springer-Verlag. (pp. 263-274). Available at: http://web.mit.edu/~ckemp/www/papers/KempRao.pdf

 

Salton, G. (1968). Automatic Information Organization and Retrieval. New York: McGraw-Hill.

 

Salton, G.; Wong, A. & Yang, C. S. (1975). A vector space model for automatic indexing. Communications of the ACM, 18(11), 613-620.
 

Raghavan, V. V. & Wong, S. K. M. (1986). A critical analysis of vector space model for information retrieval. Journal of the American Society for Information Science, 37(5), 279-287. Available at: http://66.249.93.104/search?q=cache:ZPI5kJOzndsJ:www-ufrima.imag.fr/FORMATION/FILIERE/MASTER/SI/SiteMasterSI/Documents/MORI/raghavan.pdf++%22critical+analysis+of+vector+space%22&hl=da&gl=dk&ct=clnk&cd=4

 

Wikipedia, the free encyclopedia. (2005). Vector space. http://en.wikipedia.org/wiki/Vector_space

 

http://www.kuropka.net/files/TVSM.pdf

 

See also: Latent semantic indexing

 

Birger Hjørland

Last edited: 13-05-2006
