How do I Handle Content Scrapers? Can They Hurt My Rankings?

Daniel Scocco

This post is part of the weekly Q&A section. Just use the contact form if you want to submit a question.

Arun Basil asks:

Daniel,
Recently I have been getting some backlinks to my articles from sites that look like genuine sites. These backlinks come within about 2-3 hours of posting content. My blog is not a very popular blog, so I don't think that these guys found my latest posts anywhere online (like Google or social sites). I used to think that these bloggers had found my posts while surfing randomly. But these things happen very often now, and from different sites. The good thing is that these sites publish only excerpts from my blog and a link to the main article. But I do not get visitors from any of these sites.

My questions are:
1. How did they find my post within 3 hours of posting the content?
2. Will backlinks from spam sites affect my rankings?
3. Should I ask them to remove links to my site?
4. If someone else publishes excerpts from my blog, will Google consider them as copies of the same content?
5. These sites have page ranks of 0 or 1. Will link backs from such sites help me improve my PR..?

It looks like you are talking about content scrapers. Those are people that create websites on specific niches, and for the content part they just scrape other blogs or sites around the web. One method to scrape that content is via the RSS feed of blogs. There are many plugins and scripts that will automatically grab an RSS feed and output its content as new blog posts.

Scrapers who republish 100% of the content that they find on other websites are obviously violating copyrights, and you could try to bring them down. Scrapers that only republish excerpts, however, are probably protected under the “fair use” clause, so there isn’t much you can do about them (except forcing them to link back to you, as I will show below).

Now let’s answer the 5 questions.

1. How did they find my post within 3 hours of posting the content?

As I mentioned before, it is likely that those pseudo blogs simply added your RSS feed to their script, so every time you publish a new post they will get notified about it, and the script will automatically write about your post on the scraping blog (either with an excerpt or with the full content).
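To make the mechanism concrete, here is a minimal sketch of how such a script could notice a new post. It is an illustration only, not the code of any particular scraping tool: the sample feed, the example.com URLs and the function name find_new_items are all made up for this example, and a real script would download the feed from the blog on a schedule instead of reading a string.

```python
import xml.etree.ElementTree as ET

# Hypothetical feed contents; a real script would fetch this from the blog's feed URL.
SAMPLE_FEED = """<?xml version="1.0"?>
<rss version="2.0">
  <channel>
    <title>Example Blog</title>
    <item>
      <title>New post</title>
      <link>https://example.com/new-post</link>
      <guid>https://example.com/new-post</guid>
    </item>
    <item>
      <title>Old post</title>
      <link>https://example.com/old-post</link>
      <guid>https://example.com/old-post</guid>
    </item>
  </channel>
</rss>"""


def find_new_items(feed_xml, seen_guids):
    """Return (title, link) for every feed item whose guid has not been seen yet."""
    root = ET.fromstring(feed_xml)
    new_items = []
    for item in root.iter("item"):
        # Fall back to the link when the feed does not provide a guid.
        guid = item.findtext("guid") or item.findtext("link")
        if guid not in seen_guids:
            seen_guids.add(guid)  # remember it so the next poll skips this item
            new_items.append((item.findtext("title"), item.findtext("link")))
    return new_items


# The script already knows the old post, so only the new one is reported.
seen = {"https://example.com/old-post"}
print(find_new_items(SAMPLE_FEED, seen))
```

Run on a timer, a loop like this picks up every new post shortly after it appears in the feed, which would explain links showing up within a few hours.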

2. Will backlinks from spam sites affect my rankings?

If you mean affect your rankings negatively, the answer is no. Links pointing to your site from external websites will almost never hurt your search rankings. This is a necessary measure for Google and other search engines, else it would be too easy to sabotage competing websites.

Notice that I said “almost never,” however, because under some situations the external links could end up hurting a site’s ranking. But here I am talking about elaborate linking patterns built to make it look as if you were manipulating Google’s index or engaging in spam activities. In other words, this would only happen if an expert SEO were deliberately trying to hurt your rankings, not as a result of content scrapers.

Linking out to bad neighborhoods and spam websites can hurt you a lot, though, so keep an eye out for the pingbacks and trackbacks that those sites will send to you. If they get approved, they show up on your posts as links pointing to the scraper sites.

3. Should I ask them to remove links to my site?

As long as those links are not generating pingbacks and trackbacks, I wouldn’t worry too much about them. In fact there are some chances that those links might be passing link juice to your site and helping with your search engine optimization.

Secondly, those links also help Google identify the original source of the content. Making sure that scraping sites link back to the original post is therefore a way to protect your site from search penalties.

If you want to make sure that people scraping your RSS feed will link back to your original post, you just need to use the RSS Footer plugin.
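The idea behind a feed footer can be sketched in a few lines. This is not the plugin's own code (the plugin runs inside WordPress); it is a hypothetical Python illustration, and the function name add_feed_footer, the footer wording and the example.com URL are assumptions made for the example. The point is simply that the link is appended to the content of every feed item, so anyone republishing the feed republishes the link too.

```python
from html import escape


def add_feed_footer(item_html, post_title, post_url, site_name):
    """Append an attribution paragraph, linking to the original post, to a feed item."""
    footer = '<p>This post, <a href="{url}">{title}</a>, was originally published on {site}.</p>'.format(
        url=escape(post_url, quote=True),  # escape so the URL is safe inside the attribute
        title=escape(post_title),
        site=escape(site_name),
    )
    return item_html + "\n" + footer


# Example: the footer travels with the item wherever the feed is republished.
print(add_feed_footer("<p>Hello readers.</p>", "My Post", "https://example.com/my-post", "Example Blog"))
```

A scraper that republishes only a short excerpt may cut the footer off, so this works best against sites that copy the full feed content.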

4. If someone else publishes excerpts from my blog, will Google consider them as copies of the same content?

No. Google’s definition of duplicate content is: “substantive blocks of content within or across domains that either completely match other content or are appreciably similar.”

Excerpts are obviously not substantive blocks of content.

5. These sites have page ranks of 0 or 1. Will link backs from such sites help me improve my PR..?

Possibly. It depends on the number of links that those sites will send to you, on whether or not the links are nofollowed, and on the overall quality and relevancy of those websites.
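If you want to check the nofollow part yourself, you can look at the rel attribute of the links on the scraper's page. Below is a rough sketch using only Python's standard library; the names BacklinkFinder and find_backlinks, the sample HTML and the example.com domain are assumptions for the illustration, and you would need to save or download the scraper's page yourself before feeding it in.

```python
from html.parser import HTMLParser
from urllib.parse import urlparse


class BacklinkFinder(HTMLParser):
    """Collect links that point at a given domain and note whether each is nofollowed."""

    def __init__(self, my_domain):
        super().__init__()
        self.my_domain = my_domain
        self.backlinks = []  # list of (href, is_nofollow) tuples

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attributes = dict(attrs)
        href = attributes.get("href") or ""
        host = urlparse(href).hostname or ""
        # Match the domain itself and any subdomain such as www.
        if host == self.my_domain or host.endswith("." + self.my_domain):
            rel_values = (attributes.get("rel") or "").lower().split()
            self.backlinks.append((href, "nofollow" in rel_values))


def find_backlinks(page_html, my_domain):
    finder = BacklinkFinder(my_domain)
    finder.feed(page_html)
    return finder.backlinks


# Hypothetical scraper page with one nofollowed and one normal link to the blog.
page = (
    '<p><a href="https://example.com/post" rel="external nofollow">excerpt</a> '
    '<a href="https://www.example.com/other">another</a> '
    '<a href="https://elsewhere.org/">unrelated</a></p>'
)
for href, is_nofollow in find_backlinks(page, "example.com"):
    print(href, "nofollow" if is_nofollow else "followed")
```

This only tells you how the link is marked up; how much weight a search engine gives it is a separate matter.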

Don’t expect to get a huge PR boost from scrapers though.

35 Responses to “How do I Handle Content Scrapers? Can They Hurt My Rankings?”

  • medyum

    Interesting article, I have been noticing more and more of these scraper sites sometimes within hours of me starting a new site!

  • Web Designing Quotes

    Content scrapers have always been a pain but they really don’t hurt if one stays alert about things happening around the blog.

  • Tyrone

    Nice Q&A, The article is really very informative especially for the beginners.

  • Ajay

    From experience I can tell you that a content scraper can rank higher than your site using your own content. Google rank is not only determined by who posts first and who owns it but also by how relevant that content is to your entire website/blog. If you normally write about “wine tasting” and suddenly post about “cars”, and a content scraper that has a specialized blog on “cars” picks that up, you may find him ranking higher than you. If your own post has a lot of comments and trackbacks, the chances of that are reduced, but the point I am making is that it is possible.

    thanks,

  • Dean Saliba

    Very good article.

    I think most of us have fallen victim to these scrapers at some time and it is bloody annoying.

    Good to see that if they link to my blog I might get better results in search engines and potential to get a better page rank.

  • Daniel Scocco

    @Rahul, not yet.

  • Niche

    Interesting. As long as they include all my links and send me some backlinks, who cares. If I am indexed first, no risk to my PR and more traffic all round

    Sounds like a win win to me

  • Randy

    @Daniel Thanks! The RSS Footer plugin is working fine for me. I think there’s no need for a DMCA for now.

    @Arun Thanks for the link.

  • Rahul Jadhav

    Hi Daniel, I have a Q.

    You have said before that using paid links is against Google policy and we get a PR penalty for it. However I still see lots of blogs using Paid Blog Reviews. Isn’t that similar to paid links, if not the same? You pay for a blog review and you get a link to your site in the post. What is your opinion?

  • Rahul Jadhav

    Hey Daniel, I had sent you a link to a blog which was scraping your blog content. Did you contact him?

  • Tom – Stay At Home Business

    Good post! There will always be people who want to piggy-back on others who write original and useful content. I would not worry about it too much since you are the writer of the original article and you cannot possibly be responsible if someone decides to scrape your content and put it up on their blog.

  • Bacterial Diseases

    I see these things as a bonus. I mean, what you’re really looking at is free backlinks and your content having more chances to be seen.

  • Arun Basil Lal

    @Randy:

    I have some light into that question on how to take down a blog here:

  • Daniel Scocco

    @Randy, you need to send them or the hosting company a DMCA, and if necessary get lawyers in the middle.

  • Randy

    It answers my questions about those stupid scrapers. Also, thanks for mentioning Yoast’s footer plugin.

    I still have one question though. How can I take down the scraper blog. In my case they are publishing the whole post and not just an excerpt. What should I do about it?

    TIA Daniel!

    -Randy

  • Sam Duvall

    If there are just a couple of sites, then you could try blocking their servers’ IP addresses from accessing your site/feed.

  • diabetes man

    thanks…… sharing about link building information, well from your explanation….external link is not bad on se eyes…..

  • Arun Basil Lal

    Daniel,

    That clears the mist I had in my mind regarding such spammers. As you said, some trackbacks came through Akismet and I never cared about them. Also, the plugin you mentioned is a definite work-around.

    And yes, as you said, the sites that publish excerpts are sending in some traffic.

    (btw, this was a surprise for me, I had asked this Q a long time back. Thanks daniel! )

    @Calvin Loh: I disagree with you. First, RSS doesn’t expand as Really Simple Syndication. It’s Rich Site Summary. (Refer: http://www.whatisrss.com/ )
    Second, I don’t think RSS was designed for other publishers, it was meant for readers who would like to know when their favorite site is updated. I would call RSS a Reader Subscription Service!

    @Marita: I have been using Google Alerts too. I use All-in-one SEO pack to make sure that all my titles have the site name. But I have never got an alert for a copied page, even if it has the name of the site in it. Maybe Google is not indexing them anymore.

    Cheers

  • Make Money Online

    Interesting article, I have been noticing more and more of these scraper sites sometimes within hours of me starting a new site!

  • Marita

    Scrapers can find a site relevant to their niche very easily and quickly by using Google Alerts.

    But so can you! If you want to identify possible scraper sites, set up a Google Alert with the main keywords of your blog (incl. your domain) and Google Alerts will spit out a list of all pages using these keywords, which will also include scraper sites.

  • Calvin Loh

    If you use RSS Footer, remember to also state the Terms and Conditions of using your RSS Feeds clearly on your blog. In other words – don’t just say “Don’t republish my blog posts.” in RSS Footer, make sure you state this in your Terms and Conditions on an easily found page in your blog.

    Remember that RSS was designed to let other publishers grab your stuff easily. That’s why it is called Really Simple Syndication. When you push out your content on RSS, you are implicitly saying “Here I am! Publish me! Publish me!”

  • SEO Tips

    Excellent article, nice Q&A very informative.

  • Daniel Scocco

    @Rich, thanks for sharing.

  • eleena

    Are you still working your way through all the questions you received under your previous Q&A format or are you just selecting general questions that you believe are of interest?

  • Mayank

    Hey Daniel – that indeed is a nice and informative article that clears up many doubts many newbies or amateur blogger have. Content scrapers have always been a pain but they really don’t hurt if one stays alert about things happening around the blog.

  • Rich

    Another informative Q and A here, Daniel. Incidentally, Google Adsense has just reminded its publishers to report to them if they find a site that is illegally copying and scraping content.

    And of course, these sites should also have Google Adsense on them.

    Here is the link to their blog just in case someone needs it –

  • joe comp

    I don’t know about RSS Footer because I don’t use WordPress. This week Google is going to update PageRank again, and my blog is one that dropped 🙁

  • Rarst

    From my experience such scrapers don’t really target specific blogs, they rely on third party services such as Technorati that track posts and combine headlines by topics.

    I just ignore such sites and kill their trackbacks.

Comments are closed.