Google says a ‘substantial’ group of site owners needs to care about crawl budget, but the ‘vast majority’ doesn’t.
In the latest episode of the ‘Search Off the Record’ podcast, Google’s Search Relations team says most sites don’t have to think about crawl budget.
Google’s Gary Illyes addressed the subject at length, saying the team has historically pushed back on crawl budget concerns, while acknowledging that a ‘substantial group’ of sites does need to worry about it.
On the majority of sites, however, crawl budget should not be a problem, Illyes explains:
“Historically, we’ve been pushing back on crawl budget, usually telling people you don’t have to worry about it.
And I’m standing my ground and I’m still saying most people don’t have to worry about it. We do recognize there is a substantial part of the ecosystem that has to care about it.
… but I also maintain – and I’m trying to explain that here – that it doesn’t matter to the vast majority of the population.”
So who should care about crawl budget, and who shouldn’t?
When to Care About Crawl Budget (and When Not To)
SEOs usually want a hard number when it comes to crawl budget – that a site needs X number of pages before crawl budget becomes a concern.
“Well, it’s not quite like that. You can do stupid stuff on your website and then Googlebot starts to crawl like crazy. Or you can do some other kind of stupid stuff, and then Googlebot is going to stop crawling altogether.”
If forced to give a number, Illyes says roughly one million URLs is the baseline before a site owner really needs to think about crawl budget.
Factors Affecting Crawl Budget
For sites with more than a million URLs, these are some of the factors that can lead to crawl budget issues.
Factor 1: Pages that haven’t been crawled in a long time
“What am I going to look at? Probably the URLs that have never been crawled. That’s a good indicator of how well the site is being discovered and how well it’s being crawled …
So I’d look at pages that were never crawled. You’ll want to check your server logs for the absolute truth.”
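Checking server logs for never-crawled URLs can be automated. The sketch below is a minimal illustration (not a Google tool): it assumes access logs in the common Combined Log Format and a list of URLs you expect to be crawled (for example, extracted from your sitemap); the sample log lines and URL paths are invented stand-ins, and real logs would be read from files.

```python
# Illustrative sketch: find sitemap URLs that Googlebot has never requested,
# assuming logs in Combined Log Format. Sample data stands in for real files.
import re

access_log = """\
66.249.66.1 - - [10/Oct/2023:13:55:36 +0000] "GET /products/widget HTTP/1.1" 200 5316 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
66.249.66.1 - - [10/Oct/2023:13:56:02 +0000] "GET /about HTTP/1.1" 200 2048 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
203.0.113.7 - - [10/Oct/2023:14:01:11 +0000] "GET /blog/post-1 HTTP/1.1" 200 4096 "-" "Mozilla/5.0"
"""

# Hypothetical set of URLs you expect to be crawled (e.g. from your sitemap).
sitemap_urls = {"/products/widget", "/about", "/blog/post-1", "/blog/post-2"}

# Capture the request path and the user-agent string from each log line.
line_re = re.compile(r'"(?:GET|HEAD) (\S+) HTTP/[\d.]+" \d{3} \S+ "[^"]*" "([^"]*)"')

crawled_by_googlebot = set()
for line in access_log.splitlines():
    m = line_re.search(line)
    if m and "Googlebot" in m.group(2):
        crawled_by_googlebot.add(m.group(1))

never_crawled = sitemap_urls - crawled_by_googlebot
print(sorted(never_crawled))  # → ['/blog/post-1', '/blog/post-2']
```

Note that a user-agent string alone can be spoofed; for a real audit you would also verify that the requesting IPs belong to Google.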
Factor 2: Changed pages that haven’t been refreshed in a long time
“And I’d also be looking at refresh rates. If you see that some parts of the site haven’t been refreshed for a long time – say, months – even though you’ve made changes to the pages in that section, then you probably want to start thinking about crawl budget.”
Fixing the crawl budget issue
First, try removing non-essential pages. Every page Googlebot has to crawl eats into the budget available for other pages. As a result, an excessive amount of “gibberish” content can mean important content doesn’t get crawled.
“And if you remove the unnecessary pages from the crawl, then Google will have time to focus on the important pages that are actually valuable for users.”
Illyes’ second recommendation is to stop sending “back-off” signals to Googlebot. He said,
“Back-off signals are certain server status codes that tell Googlebot to slow down or stop crawling the site. If you send us back-off signals, that’s going to influence Googlebot’s crawl. So if your servers can handle it, you want to make sure you don’t send us 429 or 50X status codes, and that your server responds quickly.”
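In HTTP terms, the back-off signals Illyes mentions are the 429 (Too Many Requests) status and the 5xx server-error range. A minimal sketch of scanning recent response codes for these signals, using invented sample data rather than real server output:

```python
# Illustrative sketch (not Google's spec): treat HTTP 429 and any 5xx
# response as a "back-off" signal that can slow down Googlebot's crawling.
def is_backoff_signal(status: int) -> bool:
    """Return True for status codes that ask crawlers to back off."""
    return status == 429 or 500 <= status <= 599

# Hypothetical sample of recent response codes from your logs.
recent_statuses = [200, 200, 503, 200, 429, 301]
backoffs = [s for s in recent_statuses if is_backoff_signal(s)]
print(backoffs)  # → [503, 429]
```

If a scan like this turns up frequent 429 or 5xx responses to crawler requests, that is a sign the server is telling Googlebot to slow down.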