Google Indexing Scrapers First?

By Daniel Scocco

Yesterday I published a guest post from Abhijeet Mukherjee titled Do You Know Your Visitors? 5 Points to Consider. A couple of hours later Abhijeet messaged me on Gtalk to let me know that Google was not indexing my backlinks to his blog, but rather the link from a scraper site that had copied part of the post.

This made me curious, so I went to check for myself. The first thing I wanted to know was whether my post had already been indexed by Google. I copied one sentence from the post and searched for it in Google, with quotation marks to find only exact matches. The result was pretty surprising: Google had already indexed 2 scraper sites, but my original post was not in its index yet, as the image below illustrates:

[Image: google indexing scrapers?]

I repeated the search query today, and my post is now showing in the first position. Regardless, I find it pretty weird that Google would index scraped material first and the original source only afterwards. The same thing was happening with the indexing of the backlinks.
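
The exact-match check described above is easy to repeat for any post. Here is a minimal sketch in Python, assuming the usual `q` query parameter of Google's search page. The function name `exact_match_url` and the sample sentence are hypothetical, and the script only builds the search URL; it does not fetch or parse any results.

```python
from urllib.parse import quote_plus


def exact_match_url(sentence: str) -> str:
    """Build a Google search URL that looks for the sentence as an exact phrase."""
    # Wrapping the sentence in double quotes asks for exact matches only.
    phrase = '"' + sentence.strip() + '"'
    # quote_plus percent-encodes the quotes and turns spaces into '+'.
    return "https://www.google.com/search?q=" + quote_plus(phrase)


if __name__ == "__main__":
    # Hypothetical sentence copied from a freshly published post.
    print(exact_match_url("a sentence copied from the post"))
```

Opening the printed URL in a browser shows which pages, original or scraped, are currently indexed for that sentence.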

Anyone know what could be the cause for this flaw?




49 Responses to “Google Indexing Scrapers First?”

  • Kasper Larsen

    I would say it’s because the scraper sites are being indexed more often by Google because of their constant traffic. That is also why your site is now placed where it was supposed to be and the scraper site has been removed. I think Google’s technology is just not good enough to tell which sites really get heavy traffic and which sites are just scraping material and getting the high ranking in the beginning.

  • tony

    Nice tips from the contributors to this site. I have learned something new about how Google indexes pages, especially about submitting sitemaps, as this will speed up the indexing of your pages.

  • Susan

    For some reason, Google stopped indexing my posts and I can’t figure out what’s wrong. The only things that Google indexes are category and page views, but none of the individual posts / permalinks show up in any of the search results. Even if I use a very specific word-for-word search with quotation marks, nothing shows up. This only happened in the last 1-2 weeks and it’s driving me crazy.

    Do you have any idea why that might be? Other search engines like Yahoo, Live, Ask, etc… work fine.

  • Richard

    I think it might have more to do with how often you update your site. The Google Search Appliance that many companies use to index content on their intranets automatically determines how often to crawl a URL based on how often that URL has changed in the past. I don’t know if this is what is happening in your case, but it’s likely that the content on the scraper sites changes more often than the content on your site, so the Google-bots don’t check your site as often for changes.

  • Bluetooth

    Yes, I completely agree with SEO and WordPress Design, but generally new content is indexed above already-indexed sites with similar content for only a few days. Later on, as soon as the traffic and backlinks to the scraper and its content diminish, it gradually drops back to its original position.

    Take Ezine articles as an example. As soon as you publish a new article it is indexed at the top for a few days, and after that it is nowhere.

    Selena

  • Salif

    We are having the exact same problem with one of our sites. In fact, we think that we may have been penalized because of it. In some cases, our content is never listed, but the scraper’s copy still stays in Google. It is very frustrating to have scrapers get their pages listed before your original content is listed.

  • Dingexx

    I did not know about this. Thanks for the info, guys. It can help me a lot.

  • romano

    Same problem, but with social bookmarking sites: I see my blog in the SERPs with my post below Technorati or Digg…

  • boostranks

    Sorry if this is covered already; I haven’t read all the comments. Next time it happens, let Google know via the “Dissatisfied” link at the bottom of the search results. But seeing that you got to number one eventually, I don’t know if there is anything they can do.

    I have a blog I post to a few times a month. Last time I checked, it took a few minutes before my post got in the index.

  • Jarkko

    Karl, there are already some sophisticated blog comment scrapers, advanced to the degree that they slightly morph original wording. 😉

  • Daniel Scocco

    @Loius Gross, exactly, I also think they would need to balance the algo here.

    @Aseem, got it.

    @Joe, yeah apart from the indexing time I also noticed that even after 1 year scraper sites were still outranking me for very narrow search queries. I wrote about it here:

    http://www.dailyblogtips.com/google-ranking-scrapped-material-on-top/

    @Stephan, I suppose the same could happen with Squidoo and company, but here you have the authority factor.

    @Erik, I think they should delay the indexing of pages by a few minutes, so as to get a better picture of what is going on.

    @Ben, yeah thankfully after a while it gets fixed, but it is annoying nonetheless, and I suspect the effect could last longer for blogs with less juice.

    @Karl, it would be difficult to find a scraper site with many backlinks.

    @Ivan, that is not always the case, the system is still flawed for exact search matches, check my link above.

    @SEO and WordPress Design (what a name…), Yeah the timing issue is clear now, still I think it is flawed.

  • Karl Hardisty

    Which sounds suspiciously like what someone said above

  • SEO and WordPress Design

    The reason is simple. Scraper sites scrape several sites, which means they update several times a day, compared to a regular blog which updates usually once a day.

    The more frequently a site is updated the more frequently it is crawled by the search engines.

    Also, the fact that your site is ranked above scraper sites doesn’t mean that Google considers you the original source. For example, Technorati blog pages usually rank higher than the original posts.
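
Several commenters above suggest that crawl frequency follows update frequency. The toy sketch below, in Python, illustrates that idea only; it is not Google's actual algorithm. The function `next_crawl_interval`, its halve-or-double rule, the one-hour and one-week bounds, and the sample change histories are all assumptions made for illustration.

```python
def next_crawl_interval(current_hours: float, changed: bool,
                        min_hours: float = 1.0, max_hours: float = 168.0) -> float:
    """Return the wait before the next crawl of a URL (hypothetical rule).

    If the page changed since the last visit, come back twice as soon;
    if it did not, wait twice as long. The result is clamped to
    [min_hours, max_hours].
    """
    proposed = current_hours / 2 if changed else current_hours * 2
    return max(min_hours, min(max_hours, proposed))


def simulate(change_history, start_hours: float = 24.0) -> float:
    """Apply the rule over a sequence of crawl outcomes (True = page had changed)."""
    interval = start_hours
    for changed in change_history:
        interval = next_crawl_interval(interval, changed)
    return interval


if __name__ == "__main__":
    # A scraper site with new content at every visit vs. a blog that rarely changes.
    print("scraper:", simulate([True] * 5))
    print("blog:", simulate([False, False, True, False, False]))
```

Under these assumed rules, a page that has changed at every visit is soon rechecked every hour, while a page that rarely changes drifts toward one visit per week, which is the kind of timing gap the commenters describe.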
