Statistics that make your content more findable

Andreas Voniatis • 4 minutes

Having audited over 3,000 sites using machine learning, we’ve found that consumer search behaviours and ranking factors vary across different industries and countries.

So much so that few general rules of SEO and content strategy are hard and fast. However, the basics never change, and in this article we share some interesting stats that go beyond the generic advice and might help convince your colleagues or clients to make changes to optimise your web content.

Know thy user intent

One of the challenges facing content strategists and SEOs alike is knowing the answers to questions such as:

  • Which pages should we map our keywords to?
  • How do we know if the user wants the same thing from Keyword A and Keyword B?
  • If we get this right, how much engagement, conversions and Google rankings do we stand to gain?

The answer can be found by computing the similarity in search intent between keywords. Several providers in the SEO industry already attempt to do this using TF-IDF (Term Frequency-Inverse Document Frequency). In essence, TF-IDF strips out the stop words (like ‘the’, ‘to’, etc.) and gives more weight to words that are rare across the set of web pages, which helps users determine how similar the pages are. It’s not perfect, as it doesn’t take consumer behaviour into account, but it’s a start.
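To make the idea concrete, here is a minimal sketch of TF-IDF similarity in plain Python. It is not the method used by any particular SEO provider: the stop-word list, the smoothed IDF formula, the use of cosine similarity and the three example texts are all assumptions chosen for illustration.

```python
import math
from collections import Counter

# Tiny illustrative stop-word list (an assumption); real lists are much longer.
STOP_WORDS = {"the", "to", "a", "an", "and", "in", "of", "for", "is"}


def tokenize(text):
    """Lowercase, split on whitespace, and drop stop words."""
    return [w for w in text.lower().split() if w not in STOP_WORDS]


def tfidf_vectors(docs):
    """Return one {term: weight} dict per document."""
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)
    # Document frequency: in how many documents each term appears.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens)
        vectors.append({
            # Term frequency times a smoothed inverse document frequency,
            # so rarer terms carry more weight.
            term: (count / total) * (math.log((1 + n_docs) / (1 + df[term])) + 1)
            for term, count in counts.items()
        })
    return vectors


def cosine_similarity(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Made-up page texts, for illustration only.
pages = [
    "luxury wedding venues in Birmingham",
    "wedding venues in Birmingham",
    "cheap black jeans for sale",
]
vecs = tfidf_vectors(pages)
sim_similar = cosine_similarity(vecs[0], vecs[1])
sim_different = cosine_similarity(vecs[0], vecs[2])
print(round(sim_similar, 2), round(sim_different, 2))
```

The first two texts share most of their words and score well above zero, while the third shares none and scores zero. Real pages would need proper tokenisation and a much larger set of documents for the weights to be meaningful.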

The heatmap generated by ARTIOS technology above shows how consumer attitudes compare when searching for wedding venues. Where the tiles are a darker shade of blue, the keywords have a near-identical ‘search intent’. For example, ‘wedding venues Birmingham’ has the same intent as ‘luxury wedding venues Birmingham’ but not as ‘cheap wedding venues Birmingham’. The machine learning based algorithm is effectively looking for keywords whose search intent is 80% or more similar, based on factors beyond TF-IDF such as content layout and the images used.

This is visualised further below, using machine learning to map keywords into groups:

Although luxury and general wedding venues in Birmingham have a similar intent, it can be seen that this is not the case in Richmond. So it cannot be taken for granted that people will have the same attitudes towards different cities.

Statistically, we’re generally aiming for 80% similarity in search intent.
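As a rough illustration of how an 80% threshold could turn pairwise similarity scores into keyword groups, consider the sketch below. The scores are made-up placeholders rather than ARTIOS output, and grouping by linked pairs (a union-find) is our assumption for the example, not a description of the actual algorithm.

```python
# Hypothetical pairwise intent-similarity scores (0 to 1); placeholders only.
similarity = {
    ("wedding venues birmingham", "luxury wedding venues birmingham"): 0.90,
    ("wedding venues birmingham", "cheap wedding venues birmingham"): 0.40,
    ("luxury wedding venues birmingham", "cheap wedding venues birmingham"): 0.30,
}
THRESHOLD = 0.80


def group_keywords(similarity, threshold):
    """Group keywords that are linked by a score at or above the threshold."""
    parent = {}

    def find(k):
        # Return the representative keyword of k's group.
        parent.setdefault(k, k)
        while parent[k] != k:
            parent[k] = parent[parent[k]]  # path halving
            k = parent[k]
        return k

    for (a, b), score in similarity.items():
        root_a, root_b = find(a), find(b)
        if score >= threshold and root_a != root_b:
            parent[root_a] = root_b  # merge the two groups

    groups = {}
    for keyword in parent:
        groups.setdefault(find(keyword), set()).add(keyword)
    return sorted(groups.values(), key=lambda g: sorted(g))


groups = group_keywords(similarity, THRESHOLD)
for group in groups:
    print(sorted(group))
```

With these placeholder scores, the general and luxury keywords land in one group and the cheap keyword sits on its own, which mirrors the Birmingham example above.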

Naturally, stakeholders will want to know what the benefit of optimising content is on the website. An example is given below:

The graph shows a number of websites operating in the wedding planning ecommerce affiliate space, plotting their Google rankings against their IA (Information Architecture) Score. The IA Score measures the extent to which a site’s content structure is matched to the ideal search intent. The size of each circle is the estimated number of clicks per month from SEO, based on the click-through rate (which depends on the ranking position) and the monthly search impression volume.
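The click estimate behind the circle sizes is simple arithmetic: click-through rate at a given position multiplied by monthly impressions. The sketch below shows the calculation only; the click-through rates are hypothetical values we made up for the example, not the curve used in the graph.

```python
# Hypothetical click-through rates by ranking position. Real curves vary by
# industry and country and should be measured rather than assumed.
ASSUMED_CTR_BY_POSITION = {1: 0.30, 2: 0.15, 3: 0.10, 4: 0.07, 5: 0.05}
DEFAULT_CTR = 0.01  # assumed rate for any position not listed above


def estimated_monthly_clicks(position, monthly_impressions):
    """Estimated clicks = CTR at this ranking position x monthly impressions."""
    ctr = ASSUMED_CTR_BY_POSITION.get(position, DEFAULT_CTR)
    return ctr * monthly_impressions


def total_clicks(keywords):
    """Sum the estimate over (position, monthly_impressions) pairs."""
    return sum(estimated_monthly_clicks(pos, imp) for pos, imp in keywords)


# Two made-up keywords: one ranking 2nd, one ranking 14th.
print(total_clicks([(2, 1000), (14, 5000)]))
```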

We can see that an optimised version of Bridebook’s content mappings would increase Google rankings by 12 positions on average and effectively double the number of SEO clicks to 3,000 per month, based on just 14 keywords. This makes sense, as IA is not the only component of SEO: other factors, such as links from authoritative sites and user experience, also play a part.

Staying close to the home page

Advice about making all of your content reachable within 2 clicks of the home page is nothing new. The graph below shows the average Google rank of the site pages (for their target search phrases), along with their variation (measured by the standard deviation, in grey dashed lines), against the site level. The home page has a site level of 1, so any page linked to directly from the home page has a site level of 2, and so on.

The graph above shows that moving pages to within 2 clicks of the home page will increase Google rankings by 25 positions on average. This comes with caveats, of course, as the statistics will vary depending on the nature of the search query, such as the long tail. For example, ‘buy black versace jeans washed denim’ will inevitably be further down the site hierarchy, so with the best will in the world, further analysis should be done to take into account the search string tail length.
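The site level described in this section can be computed from a site’s internal links with a breadth-first search starting at the home page. The sketch below assumes the links are already available as a simple mapping from each page to the pages it links to. The URLs are invented for the example.

```python
from collections import deque


def site_levels(links, home):
    """Breadth-first search from the home page, which has site level 1."""
    levels = {home: 1}
    queue = deque([home])
    while queue:
        page = queue.popleft()
        for target in links.get(page, []):
            if target not in levels:  # first visit is the shortest path
                levels[target] = levels[page] + 1
                queue.append(target)
    return levels


# Invented internal-link map: page -> pages it links to.
links = {
    "/": ["/venues", "/blog"],
    "/venues": ["/venues/birmingham"],
    "/venues/birmingham": ["/venues/birmingham/luxury"],
    "/blog": ["/venues/birmingham"],
}
levels = site_levels(links, "/")

# Within 2 clicks of the home page means a site level of 3 or less.
deep_pages = [page for page, level in levels.items() if level > 3]
print(deep_pages)
```

Pages that end up in `deep_pages` are the candidates for being linked from higher up the hierarchy.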

A thousand words

A picture may tell a thousand words; however, we often find that a thousand words is generally the word count required to maximise searchability in Google. The graph below shows that there is an average of 5 ranking positions to be gained from increasing the word count from 560 words to over 1,000.

An average gain of 5 positions may not seem like much. However, in competitive industries small gains can make quite a difference in traffic, especially if they move a page from the second page of the search engine results to the first.

Keep it quick

It’s generally accepted that website pages that load faster on mobiles, tablets and desktop computers will be more searchable in Google. But by how much?

The chart below plots individual Google PageSpeed desktop scores against Google rankings for the pages’ target keywords:

From the chart, we can see that sites generally have 20 ranking positions to gain in Google, on average, when they achieve scores of 90 or above. That’s all very well, but how do you increase PageSpeed scores?

The chart below shows a statistical analysis of the different components that make up the Pagespeed scores across an example website. The R-squared values are also displayed.

The R-squared value tells us how much of the variation in PageSpeed scores can be explained by a given component. For example, the number of JavaScript files (Number JS Resources – Desktop) scores 0.02, which means that the number of JavaScript files found on site pages can only explain about 2% of the variation in PageSpeed performance.

Image Response Bytes, by contrast, can explain 96% of the variation in performance, so in this case the website developers should prioritise the SEO recommendation to increase PageSpeed scores by reducing image file sizes.
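For readers who want to see where an R-squared value comes from, the sketch below fits a straight line of one component against the score and reports the share of variation explained. Using a simple one-variable linear fit is our assumption about the kind of analysis involved, and the numbers are invented purely to show the two extremes.

```python
def r_squared(x, y):
    """R-squared of a least-squares straight line fitted to y against x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # Variation left unexplained by the line vs total variation in y.
    ss_res = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1 - ss_res / ss_tot


# Invented per-page data, for illustration only.
image_kilobytes = [100, 200, 300, 400, 500]
scores_tracking_images = [95, 85, 75, 65, 55]  # falls steadily as images grow
js_file_counts = [1, 2, 3, 4, 5]
scores_ignoring_js = [70, 90, 60, 90, 70]      # no linear trend with JS count

r2_images = r_squared(image_kilobytes, scores_tracking_images)
r2_js = r_squared(js_file_counts, scores_ignoring_js)
print(round(r2_images, 2), round(abs(r2_js), 2))
```

A value close to 1 means the component explains almost all of the variation in the score, and a value close to 0 means it explains almost none, which is how the two components in the example website can be prioritised.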


The statistics and their benchmarks, which can be applied to any industry, are summarised below:

Do note, however, that the ranking gains will inevitably vary by industry and by country targeted, even for the same industry targeted across two countries.

Have benchmarks which are statistically proven to increase rankings, but keep in mind that the gains will vary. Some machine learning analysis would also be required to work out what that gain is.

If you’re working on a large ecommerce site, then using algorithms makes sense, as they will outperform a human professional. When optimising search intent, for example, this holds even if the marketer has an excellent command of English (or whatever language they’re working in), is highly knowledgeable about the client industry, and is an expert in content strategy and SEO. Even on a small website, machine learning would still make sense, as the output is more likely to be reliable and won’t suffer from bias based on the marketer’s personal experiences.

From a wider perspective, these statistics demonstrate how machine learning can help make SEO and content strategy more predictive.

About the Author

Andreas Voniatis

SEO Scientist, Artios

© GatherContent 2019 WeWork, 2nd Floor, 115 Mare Street, London E8 4RU, United Kingdom. VAT No.: GB140105279. Company No.: SC400199