
What to Do Before & After Your Blog Content is Scraped/Stolen/Copied


Dealing With Content Thieves


You’ve spent months or years building your website and its content, and you’re proud of it. One day you find that some thrown-together website has copied your content and tried to pass it off as its own work.

This is incredibly frustrating. Worse, this duplicate content forces search engines to decide which website is the original and should rank higher. The copied content on the other website may even show up above yours in search results.

Here at AboutUs.org / AboutUs.com, our original content is regularly scraped and republished on other websites without any attribution. I want to share what we’ve learned, so you can deal with any site that steals your content.

 

How can I find out if someone steals my content?


It’s a great idea to track people who are talking about you. You can set up a Google Alert for your business name, your website name, and for the titles of important pieces of content or blog posts on your site. Once you’ve done this, you will get an email when a word or phrase for which you’ve created an alert shows up on the web.

Another great way is to search in Google or Bing for a unique sentence from something you wrote, placing quotes around it so that the search engine searches for web pages that mention that exact phrase.

For example, if I wanted to check if some other site had copied this blog post, I might set up a Google Alert for “What to Do Before & After Your Blog Content is Scraped/Stolen/Copied” (the name of this post) and search in Google or Bing for a sentence from the post like “Another great way is to search in Google or Bing for a unique sentence from something you wrote” (with the quotes).
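
If you check for copies regularly, the exact-phrase search can be scripted. Here is a minimal sketch in Python that wraps a sentence in quotes and builds the matching search links. The function name is made up for illustration, and it assumes the usual `q` query parameter on the Google and Bing search pages.

```python
from urllib.parse import urlencode


def exact_phrase_search_urls(sentence):
    """Build exact-phrase search links for a sentence from your own post.

    The function name is hypothetical. The surrounding quotes ask the
    search engine to match the sentence word for word.
    """
    quoted = f'"{sentence.strip()}"'
    query = urlencode({"q": quoted})  # percent-encodes quotes, turns spaces into "+"
    return {
        "google": f"https://www.google.com/search?{query}",
        "bing": f"https://www.bing.com/search?{query}",
    }


if __name__ == "__main__":
    sentence = "Another great way is to search in Google or Bing for a unique sentence"
    for engine, url in exact_phrase_search_urls(sentence).items():
        print(engine, url)
```

Open each printed link in a browser. Any result that isn’t your own site is worth a closer look.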

 

Can I prevent my content from being scraped?


Not really. You can try to keep bad actors out of your site, but the tools to do this are really gentlemen’s agreements — and bad actors don’t usually honor these.

You can give it a try, though. The robots.txt file on your website lets you request that certain bots or spiders not crawl your site. You can tell search engine spiders it’s okay to crawl your site, and ask all other bots not to.  Here’s where the gentlemen’s agreement comes in. Robots.txt is code that courteous websites and bots respect. But any website that would send out bots to grab content, and then republish that content without attribution, is unlikely to respect your robots.txt code.

There’s one more thing to keep in mind. Disallowing all bots other than search engine bots can be risky. Sometimes search engines change the names of their bots, and you would have no way of knowing that unless you’re trawling the SEO blogs like a crazy person. If you don’t update the names of the bots you’re allowing, you could end up banning search engines from your site – and you wouldn’t know it until you notice your traffic has plummeted or that you’re not showing up in Google or Bing.
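
To make this concrete, here is a rough sketch of an allow-list robots.txt, checked with Python’s standard `urllib.robotparser` module. Googlebot and Bingbot are the crawler names Google and Bing use; the other two bot names, the example.com URL, and the `can_crawl` helper are invented for illustration. Treat it as an illustration of the idea, not a recommended configuration.

```python
from urllib.robotparser import RobotFileParser

# Example robots.txt: welcome two search engine bots, ask everyone else to stay out.
ROBOTS_TXT = """\
User-agent: Googlebot
Allow: /

User-agent: Bingbot
Allow: /

User-agent: *
Disallow: /
"""


def can_crawl(user_agent, url="https://www.example.com/blog/some-post"):
    """Return True if a bot that honors robots.txt would fetch the URL.

    Hypothetical helper; example.com is a placeholder domain.
    """
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # "SomeScraperBot" and "RenamedSearchBot" are made-up names.
    for bot in ("Googlebot", "Bingbot", "SomeScraperBot", "RenamedSearchBot"):
        print(bot, can_crawl(bot))
```

A polite bot with an unlisted name is turned away, which is exactly what would happen to a search engine bot after a rename. A scraper that ignores robots.txt is not affected at all.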

 

One thing you can do if you have a WordPress site


Many scrapers are not so smart and will just copy all the text within your blog post via your RSS feed and then publish it as their own. They rarely include the “By _____” byline, so someone reading your content on their website would assume it was written by the copycat site.

[Image: RSS section of Yoast's WordPress SEO plug-in, where you can add extra text/links for dumb scrapers to copy]

If you have a WordPress blog or website, there is something you can do using a free plug-in called WordPress SEO by Yoast. In the RSS section, you can specify some text/links to accompany your blog post’s content in the RSS feed. For example, “[Name of post] was written by [author] on [name of site].” In that example, the name of the post would show up as the anchor text of a link to the original post on your site.

This way, if a dumb scraper steals your content, they will unintentionally give you credit and a backlink, which helps your SEO and makes it easier for Google to tell who the original source is. Note: Some scrapers are more sophisticated, and sometimes a human is involved who might spot the credit line and take it out.
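
If you aren’t on WordPress, you can add the same kind of credit line to your feed items yourself. Below is a small Python sketch of the “[Name of post] was written by [author] on [name of site]” pattern. The function name and its parameters are hypothetical, and this is not how the Yoast plug-in is implemented; it only shows what the appended HTML could look like.

```python
from html import escape


def rss_credit_line(post_title, post_url, author, site_name):
    """Build an attribution footer to append to a post's RSS content.

    Hypothetical helper: the post title becomes the anchor text of a
    link back to the original post.
    """
    link = f'<a href="{escape(post_url, quote=True)}">{escape(post_title)}</a>'
    return f"<p>{link} was written by {escape(author)} on {escape(site_name)}.</p>"


if __name__ == "__main__":
    # All values below are placeholders.
    print(rss_credit_line(
        "My Example Post",
        "https://www.example.com/my-example-post",
        "Jane Doe",
        "Example Blog",
    ))
```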

 

So your content has been copied. Now what?


 

Contact the site and ask nicely – but firmly – for attribution or removal

Look for contact information on the website itself. If that doesn’t work, check the site’s whois record on a site like DomainTools.com. The whois record will tell you either who owns the site or who registered it, or both.

If you can’t get in touch with the website owner, or if the responsible person doesn’t respond appropriately, you’ll need to complain to a higher authority.

 

Talk to the people that control their website

Contact the website’s registrar or hosting company to let them know what this site has done. Explain that the owner hasn’t responded to your polite request.

 

Report the duplicate content to Google

    • If you are confident that you have a copyright case, you can report the copyright infringement to Google.
    • If a site is violating a law other than copyright, you can submit a legal removal request. This applies “if you have a court order establishing that a site is in violation of the law, or if you have identified a clear case of a legal violation for which Google has a removal responsibility,” according to Google.
    • You can also try reporting spam to Google. If someone has copied your content without attribution, check the box next to “duplicate site or pages.” Note: Google doesn’t read every spam report. The company normally focuses its attention and action on larger offenders, to have the biggest possible impact on improving search.

 

Shine a light on the miscreants

  • Let everyone know what happened to you, and about the site that grabbed your content. Talk about it in public venues like Twitter and your blog.
  • Make sure the offending site’s online reputation reflects their bad behavior. Give the site a red rating and add an account of what they’ve done on MyWOT.com, a community that monitors website reputation. Try other consumer sounding board sites such as ComplaintsBoard.com, RipoffReport.com and SiteJabber.com.

Make sure your site is fully visible to search engines

  • If you optimize your site well for search engines, you’ll have a much better chance of outranking a content thief in search results.  For free tips on how to outrank, read our SEO articles.  If you’d like an in-depth analysis of your website’s SEO, search engine rankings, and social media presence — all compared to your top 3 competitors — check out what we offer.

 

The Internet Thrives on Trust


We love the openness, vibrancy and ever-changing nature of the web, and we love sharing content. All our content is available under an open license, so long as you attribute it to us and include a link back to our site.

Bad actors who take content and fail to attribute break the trust that makes the web a great place to converse, learn and share. Make sure you help by calling people on their bad behavior, and publicizing it. Don’t forget to praise people who do great things on the web, too.

 


This article was written by Kristina Weis of AboutUs. 

Kristina is customer service and social media lead for AboutUs.  She helps website owners who are trying to promote their businesses online.  Her personal blog is at KristinaWeis.com and she tweets at @KristinaWeis.

 

11 Comments
  1. I hate it when that happens…websites populate their pages with other peoples content or work and then surround it with ads like Google Adsense…oh wait…isn’t that what you folks do?

    BAZINGA!

    • Davil – Yes, on our website directory AboutUs.org we used to include small snippets (a paragraph or less) from the website itself with a note saying it was excerpted from the site, and with a link back to the page where it came from. We were easy to contact and we would remove or edit pages for people, and people could also edit the pages themselves. Many people use AboutUs.org to help promote their sites, and we even offer DoFollow links.

      What I’m talking about in this blog post are people/bots that scrape entire articles and try to pass them off as completely their own, without credit or a link. And it’s usually difficult or impossible to contact them to take it down or give some sort of attribution. And if I left a comment on the copycat’s post it would never be visible because they deleted it or never approved it.

      I think the two are pretty different. If you have any questions or feedback for me, feel free to email kristina@aboutus.org.

  2. Thanks, Kristina. You’re right to include the benefits of using WordPress SEO by Yoast. He’s a cool geek. A very smart cool geek. In fact, so smart that the word “cool” may need to be removed from the previous two sentences.

    But in all seriousness, the act of scraping is so unethical, I don’t know how these people sleep at night. If there was a way to name them & shame them publicly by putting them on a “Site of Shame”. Or how about a billboard in the real world. Or maybe put them in the stocks so people could throw rotten fruit & veg at them like in days gone by.

    Thanks again, Kristina, lots of good advice. Keep up the good work.
    Sincerely,
    Michael.

    • Michael – Indeed. Yoast is pretty great, and he’s even been nice and helpful on Twitter when I asked questions.

      The hard part is that many of the worst scrapers have their whois protected and don’t list any contact info. They’re hiding. So it would be hard to shame the real people behind the site… and they don’t seem to care if their site gets a bad reputation, because they can just burn it and start one on a new domain :-/

      I have had some good experiences, though. A few times I got in touch with someone at the site, told them what happened, and they apologized, removed the post or gave proper credit, and explained that an intern or guest author had supplied the article and they just trusted that it was theirs.

  3. You shouldn’t have to worry too much about duplicate content (on other sites) if your own site is an authority. Usually Google has indexed (or at least crawled) your new content before it is even published on the copycats.

  4. Do not worry about this, because Google generally controls these things.

  5. Very helpful tips for dealing with copied contents. It is becoming more annoying when people just filled with stupid copy and paste their “creative” WordPress pages. 5 Stars of 5 and Tweet for this tips :)

  6. You did not mention the rather useful Copyscape service. You can use it for finding pages where they have taken your content and put it through a content ‘spinner’ that tries to make it unique by changing certain key words for synonyms.

    The big advantage to the copyright holder in this case is that the copied and spun article is usually total gibberish, and unlikely to do them much good, but I’ve found quite a few of my posts on article sites stolen this way.

  7. It is true, that’s not a problem at all if references and links are provided. We should give some credit to Google, it most certainly knows who the primary source is. From the SEO point of view, it’s not a big problem either if the references aren’t provided, for the same aforementioned reason. It just sucks, that’s all.

  8. Anything can happen on the internet. But Google’s latest update has a new algorithm to kick out duplicate content, so we must create original content. Is that right?

