Latent Semantic Indexing technology

Posted on 2005-3-17 21:55:39
In its latest update, Google may also have rolled in some
latent semantic indexing (LSI) technology.

LSI is described well here:
http://javelina.cet.middlebury.edu/lsa/out/cover_page.htm
Here is a snippet describing how LSI works:
Latent semantic indexing adds an important step to the document
indexing process. In addition to recording which keywords a
document contains, the method examines the document collection
as a whole, to see which other documents contain some of those
same words. LSI considers documents that have many words in
common to be semantically close, and ones with few words in
common to be semantically distant. This simple method correlates
surprisingly well with how a human being, looking at content,
might classify a document collection. Although the LSI algorithm
doesn't understand anything about what the words mean, the
patterns it notices can make it seem astonishingly intelligent.
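
To make the "many words in common" idea concrete, here is a minimal sketch in Python. It only shows the word-overlap intuition from the snippet above (cosine similarity over plain word counts). It is not full LSI, which also runs a singular value decomposition over the whole term-document matrix, and it is certainly not what Google runs. The three documents and the function names are made up for illustration.

```python
import math
import re
from collections import Counter


def term_counts(text):
    """Lowercase the text and count how often each word occurs."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine_similarity(a, b):
    """Cosine between two word-count vectors; 0.0 means no shared words."""
    shared = set(a) & set(b)
    dot = sum(a[w] * b[w] for w in shared)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Hypothetical three-document collection, made up for illustration.
docs = {
    "cars": "used cars and trucks for sale",
    "autos": "cheap used cars for sale by owner",
    "recipes": "easy pasta recipes with tomato sauce",
}
vectors = {name: term_counts(text) for name, text in docs.items()}

# Documents sharing many words score higher than documents sharing few.
sim_close = cosine_similarity(vectors["cars"], vectors["autos"])
sim_far = cosine_similarity(vectors["cars"], vectors["recipes"])
print(f"cars vs autos:   {sim_close:.2f}")
print(f"cars vs recipes: {sim_far:.2f}")
```

The two car pages share four words and score well above the recipe page, which shares none. Real LSI goes further by comparing documents in a reduced space, so pages can come out as related even when they use different words for the same topic.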

Some people debate to what extent LSI is being used, but that is
the way things are eventually heading. Generally, though, if you
write somewhat naturally and mix your anchor text with various
semantically similar terms, you should do fine.

If you have a site that will naturally acquire a bunch of links
using a particular anchor text variation, you may want to avoid
using that variation too much when you are building links yourself.
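
One rough way to check whether a single variation is dominating is to tally the anchor text of your inbound links and look at each variation's share. The sketch below assumes you already have the anchor texts as a plain list. The list and the function name are hypothetical, and nothing here says what share counts as "too much".

```python
from collections import Counter


def anchor_text_shares(anchors):
    """Return (anchor text, share of all links) pairs, most common first."""
    counts = Counter(a.strip().lower() for a in anchors)
    total = sum(counts.values())
    return [(text, n / total) for text, n in counts.most_common()]


# Hypothetical anchor texts of inbound links, made up for illustration.
anchors = [
    "blue widgets", "blue widgets", "blue widgets", "Blue Widgets",
    "widget buying guide", "handmade widgets shop",
    "where to buy widgets", "blue widgets",
]
shares = anchor_text_shares(anchors)
for text, share in shares:
    print(f"{share:5.0%}  {text}")
```

If one phrase accounts for most of the list, that is the variation to ease off on when you build further links.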

In addition to the link to the LSI site, I also added a link to
the Topic-Sensitive PageRank research paper. I felt I probably
should link to it as well, as it is certainly an interesting and
useful research paper.
http://www.stanford.edu/~taherh/ ... e-pagerank-tkde.pdf