Essentially, all keywords have a certain density on landing pages.
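To make "density" concrete: it is simply the number of occurrences of a keyword divided by the total number of words on the page. A minimal sketch in Python (the function name and tokenization rule are illustrative, not taken from any particular tool):

```python
import re

def keyword_density(text: str, keyword: str) -> float:
    """Share of words in `text` that match `keyword` (case-insensitive)."""
    words = re.findall(r"\w+", text.lower())
    if not words:
        return 0.0
    return words.count(keyword.lower()) / len(words)

# "mobile" appears 2 times in this 8-word snippet, so the density is 2/8.
print(keyword_density("Mobile phones drive most mobile traffic these days", "mobile"))
```

Real analyzers differ in how they tokenize and whether they count word forms, but the ratio itself is this simple.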
There are often sites in the TOP10 of the SERPs that we must discard from our recommendations, as they may be there due to other factors: for example, Wikipedia or YouTube.
For example, say we are writing an article for a blog, and within the TOP10 for the keywords we are targeting, one of the pages is from YouTube with only 50 words. It makes no sense to include this page in our recommendations. If we were to replicate this type of page, with only 50 words, we would never see our page in the TOP10, because the video ranks on other factors and is added to the TOP10 of the SERPs to give variety to the content offered.
And there's a second problem. Imagine we optimize our page for the extreme value of keyword density. If, the next day, the site with that extreme density drops out of the TOP10, the density range for sites in the TOP10 shrinks, and the system will once again ask you to redo your page.
Therefore, it is much easier to remove the extreme values from our recommendations.
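One simple way to remove such extreme values is Tukey's 1.5×IQR fence. The sketch below assumes we already have the ten density values (the numbers are made up for illustration):

```python
import statistics

def trim_outliers(densities: list[float]) -> list[float]:
    """Drop densities outside the 1.5*IQR fence (Tukey's rule)."""
    q1, _, q3 = statistics.quantiles(densities, n=4)  # quartiles
    iqr = q3 - q1
    lo, hi = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    return [d for d in densities if lo <= d <= hi]

# Nine ordinary pages plus one page with an extreme density value.
densities = [1.1, 1.3, 1.2, 1.4, 1.0, 1.2, 1.3, 1.1, 1.2, 6.5]
print(trim_outliers(densities))  # the 6.5 outlier is dropped
```

With the outlier gone, the recommended range stays stable even if that one unusual page later falls out of the TOP10.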
Here's a simplified graph that shows how many sites in the TOP10 contain each density of this word.
When we need to place several key phrases on one page, we must find the overall confidence interval for each word within these key phrases.
For example, if we want to place the phrases "mobile phones" and "mobile traffic" on one page, then we need to find the overall confidence interval for the word "mobile", which is included in both key phrases.
If the densities of this word on the TOP10 pages are similar, then an overall confidence interval can be found:
If the densities of this word on the TOP10 pages are very different, then there will be no overall confidence interval:
In this case, it is undesirable to use the average value, since that density will not be enough for one key phrase and will be too high for the other, meaning there is a risk of being lowered in the SERP for over-optimization.
Remember that we are calculating what density we need to use for a word because this word is part of several key phrases. And if the density doesn't fit, then one of the phrases on the landing page will have less chance of getting into the TOP10 of the SERP.
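The two cases described above amount to intersecting intervals. A minimal sketch, assuming each phrase's confidence interval is a (low, high) pair of density values (the numbers are illustrative):

```python
def interval_overlap(a: tuple[float, float], b: tuple[float, float]):
    """Intersection of two (low, high) intervals, or None if they are disjoint."""
    lo, hi = max(a[0], b[0]), min(a[1], b[1])
    return (lo, hi) if lo <= hi else None

# Interval for "mobile" derived from "mobile phones" pages vs "mobile traffic" pages:
print(interval_overlap((1.0, 2.5), (1.8, 3.0)))  # similar texts: overlap exists
print(interval_overlap((0.5, 1.0), (2.0, 3.0)))  # very different texts: no overlap
```

When the function returns None, no single density satisfies both phrases, which is exactly the problem case above.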
There are several reasons that can lead to this situation:
Finally, an explanation of why you should never use averages. Below is a graph of a commonly seen distribution of keyword density across sites in the TOP10:
To provide variety, the search engine often adds sites with different types of content in order to best satisfy the user. For example, an overview site may be added to the search results alongside online stores.
This is often barely noticeable to a user, but it makes a big difference when analyzing the numbers to get an average. Let's say there are 9 blog sites in the search results and 1 site with a huge price list that has a very high keyword density. This single site will pull the average away from the majority group, skewing the results.
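The 9-blogs-plus-one-price-list scenario is easy to reproduce with made-up numbers. The mean is dragged toward the one unusual site, while the median stays with the majority group:

```python
import statistics

# Nine blog pages with similar densities, plus one huge price list.
densities = [1.2, 1.1, 1.3, 1.2, 1.0, 1.4, 1.2, 1.1, 1.3, 9.0]

print(statistics.mean(densities))    # pulled far above the typical blog page
print(statistics.median(densities))  # stays representative of the majority
```

This is why robust statistics (the median, or trimming extreme values first) are preferable to a plain average when building recommendations.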
It is also important to note that a page doesn't always need to be made exactly in accordance with what we see in the TOP10.
Firstly, there are categories of sites that lag behind others in the quality of the features and content they offer. In cases like this, it is more useful to improve the site than to follow what others have done, in order to get ahead of the competition.
Secondly, if there is a lot of competition, doing something different may be the only thing that sets your site apart. At the same time, however, you must clearly understand how the site should be built to give yourself a chance of reaching the TOP10 in Google.