What is crawl budget in SEO

31/08/2021
  • Let's analyze a critical concept for understanding how SEO works and how search engine bots interact with your site: the crawl budget.

  • Once you are clear on what the crawl budget is and how it works, it will be much easier for you to understand why the indexation of different sites proceeds at a different rhythm, with totally different crawl frequencies.
  • What is crawl budget?

  • Before explaining how it affects our positioning, we should explain what the crawl budget is.

    Let's say that search engines do not give us unlimited attention. A "budget" is granted according to the relevance the algorithm determines and the capacity of our site to be crawled without generating errors.

    The latter is very important: we must not cause bottlenecks due to an inefficient information architecture.

    Therefore, we must be clear that Google will only dedicate a crawl budget to us for a specific period of time. When that period is over, it will stop and move on to another website until the next visit.
  • All URLs (pages, CSS, JavaScript, PDFs...) beyond the assigned crawl budget are simply not crawled. This means that not everything gets indexed properly. Not a trivial matter, is it?

    Remember that bots crawl your website just as a user would, going from one link to another.

    Therefore, you must make sure that the way the information is structured is logical and that you do not generate, for example, self-referencing links that add nothing to navigation or, even worse, what we could call "trap links", such as those generated by a booking calendar that extends indefinitely into the future (a site where the bot can very easily get "hooked" and give up before seeing what it should).
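    To see why such traps matter, here is a minimal sketch of a budget-limited crawler getting hooked on an endless calendar. The URLs and link structure are invented for illustration; real bots are far more sophisticated, but the budget mechanic is the same:

    ```python
    from collections import deque

    def crawl(start_url, get_links, crawl_budget=50):
        """Breadth-first crawl that stops once the budget is spent,
        mimicking a bot that allocates a finite number of requests."""
        queue = deque([start_url])
        seen = set()
        crawled = []
        while queue and len(crawled) < crawl_budget:
            url = queue.popleft()
            if url in seen:
                continue
            seen.add(url)
            crawled.append(url)           # one request consumed
            queue.extend(get_links(url))  # follow links, just as a user would
        return crawled

    # A "trap link": a booking calendar whose "next month" link never ends.
    def calendar_links(url):
        if url.startswith("/calendar"):
            month = int(url.rsplit("=", 1)[-1]) if "=" in url else 0
            return [f"/calendar?month={month + 1}"]  # infinite chain
        # a normal page linking to real content plus the calendar trap
        return ["/products", "/about", "/calendar"]

    pages = crawl("/", calendar_links, crawl_budget=20)
    # Most of the budget is burned on /calendar?month=N URLs,
    # leaving little room to discover real pages.
    ```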
  • How can I know the crawl budget of my eCommerce?

  • The truth is that Google does not usually bother to give details about its inner workings. In fact, it often does not even deign to confirm an algorithm update, nor has it ever told us the exact weight of each ranking factor.

    In the case of the crawl budget, Google does not tell us exactly whether we have been awarded a budget of 1 or 10. But there is a way to get a rough idea of it through Google Search Console (you know, the console that centralizes your website's information with respect to Google).

    We assume that you already have the property registered. If not, you should do it right away.

    Then, you will need to access the crawl stats report, found under the Settings > Crawl Stats path. There you will find a huge amount of information about crawl requests, broken down by:

    • Response type: pages responding with 200 (OK), 4xx errors and redirects.
    • File type: HTML, JavaScript, JSON...
    • Purpose: here we can see how many pages have been re-crawled and how many have been newly discovered in that period. Interesting for knowing whether the crawl budget is enough to reach the new resources you have added.
    • Type of robot: there are two crawlers we are particularly interested in: desktop and mobile. Pay special attention to the latter, since, as you know, SEO has shifted towards mobile-first indexing.
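    The breakdowns above are essentially group-by counts over the individual crawl requests. A toy sketch with invented records (the field names and values are assumptions for illustration, not the report's actual data model) shows the idea:

    ```python
    from collections import Counter

    # Invented records mimicking the individual requests behind the report.
    requests = [
        {"url": "/",         "response": "200 (OK)", "file_type": "HTML",       "purpose": "refresh"},
        {"url": "/app.js",   "response": "200 (OK)", "file_type": "JavaScript", "purpose": "refresh"},
        {"url": "/old-page", "response": "404",      "file_type": "HTML",       "purpose": "refresh"},
        {"url": "/new-post", "response": "200 (OK)", "file_type": "HTML",       "purpose": "discovery"},
    ]

    # One tally per dimension, just like the report's breakdowns.
    for field in ("response", "file_type", "purpose"):
        print(field, dict(Counter(r[field] for r in requests)))
    ```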

    With all this you have plenty of information, but it is not the best way to gauge your crawl budget. For that, let's look at the graph that appears first in the report.
  • Here you can see how many crawl requests the system has processed in the last 90 days. In the example shown here, it is a small site that varies a lot: it ranges from 5 requests to peaks of more than 70.

    To get more concrete figures, the best thing to do is to export this data and work with it in a spreadsheet. You simply have to obtain the daily average because, as you can see, it is not at all linear.

    In the example we are discussing, there are 1,129 requests in total which, divided by the 90 days analyzed, give us an average crawl budget of around 12.5. That is the number of resources crawled per day, on average.
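    The same arithmetic can be done in a few lines instead of a spreadsheet. A small sketch, assuming a CSV export with a `Total crawl requests` column (the column name is an assumption; adjust it to match the headers of your own export):

    ```python
    import csv
    import io

    def average_crawl_budget(csv_text, count_column="Total crawl requests"):
        """Average daily crawl requests over the exported period."""
        rows = list(csv.DictReader(io.StringIO(csv_text)))
        total = sum(int(r[count_column]) for r in rows)
        return total / len(rows)

    # Tiny fabricated sample mimicking an export (a real one spans ~90 days).
    sample = """Date,Total crawl requests
    2021-08-01,5
    2021-08-02,70
    2021-08-03,12
    2021-08-04,9
    """
    print(round(average_crawl_budget(sample), 1))
    # With the article's figures: 1,129 requests / 90 days ≈ 12.5 per day.
    ```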
  • It is true that, as we said, it is not linear. This is because the algorithm evaluates crawling needs according to:

    • Level of popularity/relevance of the pages (internal links, external links and the number of searches for which they rank).
    • Updating or freshness: the more often the content of a URL is updated, the more likely the search engine is to revisit it. This, indirectly, also depends on the type of page.
  • Is your eCommerce crawl budget under control? Do you think you could optimize it? Share your thoughts with us.

  • Images | Unsplash, Google Search Console.

Jordi Ordóñez


Jordi Ordóñez is an eCommerce and SEO consultant with more than 16 years of experience in online projects. He has advised clients such as Castañer, Textura, Acumbamail, Kartox and Casa Ametller. He writes on the official blogs of Prestashop, BrainSINS, Marketing4ecommerce, Photography eCommerce, Socialancer, eCommerce-news.es and SEMRush, among others. He is an editor on the Oleoshop blog.
