Latent Semantic Indexing for dummies.

Have you ever been amazed by the results Google gives you for a query? Do you think Google just lucky-guesses the answers to your questions? Is it omniscient? Or is there a deeper mathematical algorithm which uses certain methods and processes your requests?

We may be onto something.

Search engine optimizers (SEOs) have been talking about latent semantic indexing for quite some time now. Is it real or just fantasy? Since Google offers no real clues, only pieces of data, we have tried to give a deeper insight into the topic and determine whether computer processing of semantics can help you rank better in the SERPs.


What is Latent Semantic Indexing? (LSI)

Latent Semantic Indexing (LSI) comes from Latent Semantic Analysis (LSA), a technique in natural language processing (NLP), the field concerned with programming computers to process and analyze large amounts of natural language. Broadly, NLP connects human language with artificial intelligence (AI) through tasks such as speech recognition, natural language understanding, and natural language generation. LSI is the application of LSA to information retrieval, where the two fundamental problems it addresses are synonymy (different words with similar meanings) and polysemy (a single word or term with several meanings).

LSI uses an indexing and retrieval method in which a mathematical technique (Singular Value Decomposition, SVD) is used “to identify patterns in the relationships between the terms and concepts contained in an unstructured collection of text. LSI is based on the principle that words that are used in the same contexts tend to have similar meanings. A key feature of LSI is its ability to extract the conceptual content of a body of text by establishing associations between those terms that occur in similar contexts” (Wikipedia).

Semantic taxonomy of the Simpsons. (Image source: Moz)

This means that with the help of this method, a computer algorithm can process documents (texts) and make connections between them based on the synonyms and related terms they use.
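To make the idea more concrete, here is a minimal sketch of LSI on a toy corpus, assuming Python with NumPy. The four documents, the function names, and the choice of two latent dimensions are all made up for illustration; this is not how Google or any particular search engine implements it. The sketch builds a term-document count matrix, applies SVD, and compares words in the reduced "concept" space.

```python
import numpy as np

# Hypothetical toy corpus: documents 0-1 are about cars, 2-3 about cooking.
# Note that "car" and "automobile" never appear in the same document.
docs = [
    "car engine repair",
    "automobile engine repair",
    "pasta sauce recipe",
    "tomato sauce recipe",
]

def build_term_document_matrix(documents):
    """Return (vocabulary, matrix) where matrix[i, j] counts term i in doc j."""
    vocab = sorted({word for doc in documents for word in doc.split()})
    index = {word: i for i, word in enumerate(vocab)}
    matrix = np.zeros((len(vocab), len(documents)))
    for j, doc in enumerate(documents):
        for word in doc.split():
            matrix[index[word], j] += 1
    return vocab, matrix

def lsi_term_vectors(matrix, k):
    """Project terms into a k-dimensional latent space via truncated SVD."""
    u, s, _ = np.linalg.svd(matrix, full_matrices=False)
    return u[:, :k] * s[:k]  # keep only the k strongest "concepts"

def cosine(a, b):
    """Cosine similarity between two vectors."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

vocab, A = build_term_document_matrix(docs)
row = {word: i for i, word in enumerate(vocab)}

# In the raw count matrix the two synonyms share no document at all.
raw_sim = cosine(A[row["car"]], A[row["automobile"]])

# In the latent space they end up close, because they share context words.
latent = lsi_term_vectors(A, k=2)
sim_synonyms = cosine(latent[row["car"]], latent[row["automobile"]])
sim_unrelated = cosine(latent[row["car"]], latent[row["pasta"]])

print(f"raw car/automobile:    {raw_sim:.2f}")
print(f"latent car/automobile: {sim_synonyms:.2f}")
print(f"latent car/pasta:      {sim_unrelated:.2f}")
```

In this sketch, "car" and "automobile" look unrelated in the raw counts but become near neighbors after the SVD step, because both occur alongside "engine" and "repair". That is the "same contexts, similar meanings" principle from the quote above, on a very small scale.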


More than a decade ago, some SEOs noticed a connection between LSI and search engine optimization, and the term has been a buzzword ever since, one that is either denied outright or praised with almost religious zeal.

Recently, Search Engine Journal published an article by Clark Boyd in which he tried to prove that LSI does not help SEO and even argued that there is not enough evidence or research which supports the connection between the two. He concluded, however, that the concept of LSI keywords would probably not harm your optimization efforts, but could “start to erode trust and lead the way to more fallacies in future.”

So, is there any truth in his words?

There may be. Still, a connection between the two notions can be seen in Google Search features, which rely on structured data to help the search engine understand the content of a page and apply content-specific features that present or highlight additional information related to your query.

This means that Google indeed uses some sort of advanced mathematical algorithm based on the principles of semantic search, structured data, and latent (hidden) analysis of the meaning of words (semantics) to make connections between them. In other words, the algorithm breaks down the content of a document, then analyzes and parses its content words (LSI keywords, as some call them) and attaches meaning to them within a certain context.

If this sounds quite vague to you, that is completely fine, because you are not a computer engineer, and neither are we. The main point is that Google does use principles of semantic search and you can apply them to your website.

Here is how.

Semantic search

Semantic search principles can help you research keywords and generate content around them.

Semantic web entails the following things (as recommended by Google):

All of these combined with markup, as recommended by our SEO experts, will help your website get to the top of Google search. But be careful when marking up the data.
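As a rough illustration of what such markup can look like, here is a minimal sketch that generates a schema.org `Article` description in JSON-LD, a structured data format Google supports. The helper names and every field value are placeholders for illustration, not recommendations from the article; check Google's structured data documentation for the properties your page type actually needs.

```python
import json

def build_article_jsonld(headline, author_name, date_published):
    """Return a schema.org Article description as a JSON-LD string.

    The set of properties here is an assumption for illustration only.
    """
    data = {
        "@context": "https://schema.org",
        "@type": "Article",
        "headline": headline,
        "author": {"@type": "Person", "name": author_name},
        "datePublished": date_published,
    }
    return json.dumps(data, indent=2, ensure_ascii=False)

def wrap_in_script_tag(jsonld):
    """Wrap JSON-LD in the script tag that goes into the page's HTML."""
    return '<script type="application/ld+json">\n' + jsonld + "\n</script>"

# Placeholder values: replace with the real details of your own page.
snippet = wrap_in_script_tag(
    build_article_jsonld(
        headline="Example article headline",
        author_name="Example Author",
        date_published="2020-01-01",
    )
)
print(snippet)
```

Generating the markup from a small function like this, rather than writing it by hand, is one way to stay careful: the output is always valid JSON, and the values come from the same data that appears on the page.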

What you need to do is research keywords with a tool such as LSI Graph, Keyword Tool, or Keyword Planner (note that Google has announced a rebranding of the AdWords platform, which may in turn bring changes to Keyword Planner as well), and use the findings in your website content.

Keep in mind that this is only the beginning. If you are into this area of optimization, there is more to learn every day. If you already know something about this or would like to recommend a learning resource or important findings on LSI and the semantic web, please write your comment in the section below.


Vesna Savić

Dedicates her time to learning about better means of communication, translating knowledge into practice, and is a passionate reader.