The relationships between entities, words, and how people search
To understand how Google currently parses content and works out what it is about, you need to know that Google is putting a great deal of effort and money into things like neural matching and natural language processing, which seek to understand what people are talking about when they talk. This ties back to the evolution of search toward being more conversational. Google is spending a lot of time trying to understand the relationships between entities, between words, and how people use words to search.
Understanding salience
A core component of natural language processing is understanding salience. Salience is a one-word way to sum up the extent to which a piece of content is about a specific entity. Google is really good at extracting entities from a piece of content. Entities are basically nouns: people, places, and things, whether proper nouns or regular nouns. Once Google has said, “Okay, here are all of the entities contained within this piece of content,” salience attempts to understand how they’re related to each other, because this is what Google is really trying to understand when it crawls a page.
Natural Language Processing (NLP) APIs
Fortunately, there are now a number of different APIs that you can use to explore natural language processing:
- IBM has one: https://www.ibm.com/watson/services/natural-language-understanding/
- Google has a natural language processing API as well: https://cloud.google.com/natural-language/
You can test it out by pasting in a piece of content to see (a) what entities Google is able to extract from it, and (b) how salient Google feels each of these entities is to the piece of content as a whole. Again, to what degree is this piece of content about this thing?
It is essential for SEOs to understand that salience is the future of related keywords. We’re past the point where optimizing for chocolate chip cookie recipes means choosing keyword variants such as chocolate recipe or chocolate chips. Instead, we need to understand the entity sources Google draws on, such as Freebase, and recognize that when Google sees entities co-occur at a high enough rate, it can feel reasonably confident that a piece of content on one entity is salient to the other.
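The check described above (which entities does the API extract, and how salient is each one?) can also be run from a script. What follows is a minimal sketch in Python, assuming the `documents:analyzeEntities` REST method of Google's Cloud Natural Language API and an API key of your own. The helper names (`build_request_body`, `analyze_entities`, `rank_by_salience`) and the sample response are made up for illustration; the salience numbers in the sample are invented, not real API output.

```python
import json
import urllib.request

# REST endpoint of the Cloud Natural Language API (v1) for entity analysis.
ENDPOINT = "https://language.googleapis.com/v1/documents:analyzeEntities"


def build_request_body(text):
    """Build the JSON body for an analyzeEntities request on plain text."""
    return {
        "document": {"type": "PLAIN_TEXT", "content": text},
        "encodingType": "UTF8",
    }


def analyze_entities(text, api_key):
    """POST the text to the API and return the decoded JSON response.

    Requires network access and a valid API key (you supply your own).
    """
    request = urllib.request.Request(
        f"{ENDPOINT}?key={api_key}",
        data=json.dumps(build_request_body(text)).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def rank_by_salience(response):
    """Return (name, type, salience) tuples, most salient entity first."""
    rows = [
        (entity["name"], entity.get("type", "UNKNOWN"), entity.get("salience", 0.0))
        for entity in response.get("entities", [])
    ]
    return sorted(rows, key=lambda row: row[2], reverse=True)


# Illustrative response shape only: the entities and scores are invented.
SAMPLE_RESPONSE = {
    "entities": [
        {"name": "butter", "type": "OTHER", "salience": 0.08},
        {"name": "chocolate chip cookies", "type": "CONSUMER_GOOD", "salience": 0.71},
        {"name": "oven", "type": "OTHER", "salience": 0.21},
    ],
    "language": "en",
}

if __name__ == "__main__":
    # Swap SAMPLE_RESPONSE for analyze_entities(your_text, your_key) to hit the API.
    for name, kind, salience in rank_by_salience(SAMPLE_RESPONSE):
        print(f"{salience:.2f}  {kind:<14} {name}")
```

Sorting by salience puts the entity the content is most "about" at the top, which is the quickest way to see whether a page is actually focused on the topic you intended.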