7 Ways to Improve the Site-wide Crawl Frequency of Your Site
You’ve probably always wanted Google to index your new blog posts the second they go live, but that doesn’t simply happen. Expecting Googlebot to reside permanently on your site is unrealistic, but there are various ethical ways to make Googlebot come back to your site more often and to get your new pages indexed quickly, even if you’re not the New York Times or Mashable.
So, without further ado, here are some ways you can improve the site-wide crawl frequency of your site:
1. Share Your New Content on Google+
A lot of people have been telling me that they’ve managed to get new pages indexed pretty quickly, even if they’re from fairly unpopular sites, by just sharing them on their Google+ profiles.
The theory is interesting. It says that because Google+ is owned and operated by Google itself, Google can access recently posted data on the platform better and faster than on any other social networking or bookmarking site. So, it suggests that Google actually uses data from Google+ to find newly posted content on the web faster.
2. Maintain A Regular Posting Frequency
Studies suggest that Googlebot crawls sites based on their activity trends. So, Googlebot will crawl a website that gets updated 100 times a day more often than one that gets updated only once per day or once per week.
Maybe Google’s advanced machine-learning mechanisms play some role here: they learn the behaviours of websites and then crawl them accordingly.
I’ve tested this theory on many of my sites, multiple times in the past. Even small changes like switching from a weekly to a twice-weekly posting schedule resulted in increased crawl rates.
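If you want to run the same kind of test on your own site, one option is to count Googlebot requests per day in your server’s access log and compare the numbers before and after you change your posting schedule. Here’s a minimal sketch of that idea. It assumes your server writes the common "combined" log format; the function name and the sample log lines are made up for illustration. Keep in mind that a user agent string can be faked, so treat the counts as a rough indicator.

```python
import re
from collections import Counter
from datetime import datetime

# Pulls the timestamp and the trailing user-agent field out of a
# combined-log-format line (assumed format, adjust for your server).
LOG_LINE = re.compile(r'\[(?P<ts>[^\]]+)\].*"(?P<agent>[^"]*)"$')


def googlebot_hits_per_day(lines):
    """Return {ISO date: number of requests whose user agent mentions Googlebot}."""
    hits = Counter()
    for line in lines:
        match = LOG_LINE.search(line.strip())
        if not match or "Googlebot" not in match.group("agent"):
            continue
        stamp = datetime.strptime(match.group("ts"), "%d/%b/%Y:%H:%M:%S %z")
        hits[stamp.date().isoformat()] += 1
    return dict(hits)


# Invented sample data, not taken from a real log.
GOOGLEBOT_UA = "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
SAMPLE_LOG = [
    f'66.249.66.1 - - [10/Mar/2014:06:12:01 +0000] "GET /post-a HTTP/1.1" 200 5120 "-" "{GOOGLEBOT_UA}"',
    f'66.249.66.1 - - [10/Mar/2014:18:40:27 +0000] "GET /post-b HTTP/1.1" 200 4870 "-" "{GOOGLEBOT_UA}"',
    '203.0.113.9 - - [10/Mar/2014:19:02:11 +0000] "GET /post-a HTTP/1.1" 200 5120 "-" "Mozilla/5.0 (Windows NT 6.1)"',
    f'66.249.66.1 - - [11/Mar/2014:07:55:43 +0000] "GET /post-c HTTP/1.1" 200 6011 "-" "{GOOGLEBOT_UA}"',
]

if __name__ == "__main__":
    for day, count in sorted(googlebot_hits_per_day(SAMPLE_LOG).items()):
        print(day, count)
```

To use it on a real log, pass an open file object instead of `SAMPLE_LOG`, since the function only needs something it can loop over line by line.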
3. Make Your Site More ‘Crawlable’
What I mean by this is: fix crawl errors, make your website faster, optimize the various performance areas of your site, and do everything else you can think of to make the crawling ‘job’ easier for search engine spiders.
Noticing tons of server and DNS errors in Google Webmaster Tools? Now might be a good time to switch servers. On the other hand, if you see a bunch of 404 errors, you should focus on actually fixing them.
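Besides the reports in Google Webmaster Tools, your own access log can show which URLs are returning errors to the crawler. The sketch below groups the paths requested by Googlebot into 404s and 5xx server errors. It’s only an illustration: it assumes the "combined" log format, the helper name `crawl_errors` is hypothetical, and the sample lines are invented.

```python
import re
from collections import defaultdict

# Captures the request path and the HTTP status code from a
# combined-log-format line (assumed format).
REQUEST = re.compile(r'"(?:GET|HEAD|POST) (?P<path>\S+) [^"]*" (?P<status>\d{3}) ')


def crawl_errors(lines, agent_marker="Googlebot"):
    """Group paths requested by the crawler into 404s and 5xx server errors."""
    errors = defaultdict(list)
    for line in lines:
        if agent_marker not in line:
            continue  # skip requests from other visitors
        match = REQUEST.search(line)
        if not match:
            continue  # skip lines that don't look like a request
        status = int(match.group("status"))
        if status == 404:
            errors["not_found"].append(match.group("path"))
        elif status >= 500:
            errors["server_error"].append(match.group("path"))
    return dict(errors)


# Invented sample data, not taken from a real log.
BOT = "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
SAMPLE_LOG = [
    f'66.249.66.1 - - [10/Mar/2014:06:12:01 +0000] "GET /old-page HTTP/1.1" 404 312 "-" "{BOT}"',
    f'66.249.66.1 - - [10/Mar/2014:06:13:10 +0000] "GET /feed HTTP/1.1" 503 0 "-" "{BOT}"',
    f'66.249.66.1 - - [10/Mar/2014:06:14:45 +0000] "GET /new-post HTTP/1.1" 200 5120 "-" "{BOT}"',
    '203.0.113.9 - - [10/Mar/2014:06:15:02 +0000] "GET /missing HTTP/1.1" 404 312 "-" "Mozilla/5.0"',
]

if __name__ == "__main__":
    for kind, paths in crawl_errors(SAMPLE_LOG).items():
        print(kind, paths)
```

A long `server_error` list points at hosting problems, while the `not_found` list gives you the URLs to fix or redirect.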
The main thing is to encourage Googlebot to crawl your site more often.
- Indexation and Accessibility – The Advanced Guide to SEO
- The Basics of Search Engine Friendly Design and Development
4. Build ‘High Quality’ Links to Your Site
Matt Cutts is right about high quality backlinks improving the crawl rate, and thus the indexation speed, of your site. He has said many times in his webmaster help videos that Google basically sets the crawl frequency of web pages according to their PageRank.
However, as links that are low quality but high PageRank can get your site penalized, you should aim for at least decent, if not high quality, links to influence the crawling frequency of your site.
5. Try to Get Social Shares
While there’s no evidence that social shares directly influence search rankings, in my personal experience they do help a site’s new content get indexed very quickly.
Sites like Facebook and Twitter don’t allow spiders to crawl all of their pages containing fresh content. For example, Facebook doesn’t allow bots to crawl anything that’s not publicly available (and that makes sense). Similarly, Twitter doesn’t allow bots to crawl its real-time search results or, in fact, any kind of search results. You can verify this by checking their robots.txt files.
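If you’d rather not read a robots.txt file by eye, Python’s standard library can interpret the rules for you. The sketch below uses `urllib.robotparser` on a small, made-up rule set. The rules and the `social.example` URLs are assumptions for illustration only, not the actual robots.txt of Facebook, Twitter, or any other site.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules that block search result pages but allow everything else.
SAMPLE_ROBOTS_TXT = """\
User-agent: *
Disallow: /search
Allow: /
"""


def is_allowed(robots_txt, user_agent, url):
    """Return True if the given rules let user_agent fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for url in (
        "https://social.example/search?q=seo",  # a search results page
        "https://social.example/some_user",     # a public profile page
    ):
        print(url, is_allowed(SAMPLE_ROBOTS_TXT, "Googlebot", url))
```

To check a live site instead, you can point the parser at the real file with `parser.set_url(...)` followed by `parser.read()`, and then call `can_fetch` in the same way.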
Even then, crawlers such as Googlebot and Bingbot are able to crawl people’s profiles on these social networks and access publicly available information. So, they’re still able to find a link you’ve recently shared on your Twitter account, or shared publicly on your Facebook account. That’s why getting a decent number of social shares for your content does help it get crawled and indexed faster.
Social bookmarking sites such as Reddit, Digg, and StumbleUpon also help in the process. In fact, Reddit in particular gives the search engines a huge source of fresh content that people find interesting, especially because it’s easy to crawl, thanks to its structure and open nature.
6. Unlinked Mentions & Co-citations Help, Too…
People tend to care about links so much that they almost ignore unlinked brand mentions and co-citations. The reality is that they’re signals of a real brand.
A real brand has unlinked brand mentions alongside linked ones. You might have noticed that I have mentioned quite a few social sites in this post without linking to them, because people are already familiar with them and I don’t have to remind them of the URLs of those sites.
A real brand also has decent co-citations, i.e. the text placed around the links to it. Say TechTage has been doing a good job posting SEO related articles and guides. It’ll obviously have links to the homepage, as well as to internal pages, and those links will be surrounded by blocks of text about SEO.
What that does is prove TechTage’s topical authority on SEO to Google. This isn’t confirmed, but Google is believed to have been working on ways to determine and utilize the topical authority of sites to provide better search results. Again, there’s no evidence for this at the moment and it’s based on my experience.
7. Post Unique Content
Lastly, you have to post unique content. Before you shout at me saying, “you’re just a younger version of the stupid Matt Cutts”, let me explain the relation between unique content and a site’s crawl rate.
What I mean by unique content is content that hasn’t been scraped or duplicated from other sites. Google has got really smart in the last few years at detecting duplicate and scraped content, so I wouldn’t suggest posting even manually re-written content.
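To get a feel for why re-written content can still look like a duplicate, here’s a small sketch of one textbook way to measure overlap between two texts: break each into overlapping three-word "shingles" and compare the sets with Jaccard similarity. To be clear, this is not Google’s method (which isn’t public). It’s just a common technique, and the function names and sample sentences are my own assumptions.

```python
import re


def shingles(text, size=3):
    """Return the set of overlapping word sequences of the given size."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    return {tuple(words[i:i + size]) for i in range(len(words) - size + 1)}


def jaccard_similarity(text_a, text_b, size=3):
    """Share of shingles the two texts have in common (0.0 to 1.0)."""
    set_a, set_b = shingles(text_a, size), shingles(text_b, size)
    if not set_a and not set_b:
        return 0.0  # nothing to compare
    return len(set_a & set_b) / len(set_a | set_b)


# Invented sample sentences for illustration.
ORIGINAL = "fix crawl errors and make your website faster for search engine spiders"
REWRITTEN = "fix crawl errors and make your site quicker for search engine spiders"
UNRELATED = "our bakery sells fresh sourdough bread every morning"

if __name__ == "__main__":
    print("rewritten:", round(jaccard_similarity(ORIGINAL, REWRITTEN), 2))
    print("unrelated:", round(jaccard_similarity(ORIGINAL, UNRELATED), 2))
```

The lightly re-written sentence still shares a good chunk of its shingles with the original, while the unrelated one shares none, which is the kind of signal that makes swapping a few words an unreliable way to produce "unique" content.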
Now, why is that bad for your site’s crawl rate? Well, over time Google will identify your site as a copy-paster that brings nothing new to the table. Then they’ll slowly reduce the crawl frequency and, if they notice no improvement on your side, stop crawling it altogether. This is the opposite of what happened at the time of the Caffeine update. Now, even for popular search terms, the total number of search results (displayed by Google) has been reduced by as much as 20-50%.
So, they’ve definitely found it useless to index sites that just publish re-hashed content. On the other hand, posting unique content not found anywhere else on the internet will make Google more interested in your site, and subsequently, the crawl rates will improve as well.
Having a great crawl frequency can have many advantages. With your newly published content getting indexed almost as soon as it’s published, you’ll get more search traffic and a better chance of building an initial buzz around your content, potentially making it go viral.
So, what other techniques do you leverage to improve the crawl rate of your website?