Content scrapers are the scum of the earth in the blogging world. They take the hard work and dedication of others and try to capitalize on it for their own monetary gain. Instead of actually doing some work, they decide to try to profit off yours. Content scrapers are as sleazy as a hit and run and as annoying as road rash…so basically…they are just the scum of the blogging world. Most of the time you are going to ignore content scrapers as they lift your content through your RSS feed or copy/paste, but there are going to be times in your blogging when you need to deal with a content scraping situation and handle it accordingly.
What Impact Can A Content Scraper Have?
A content scraper, when the situation gets serious enough, can have several adverse effects on your blog that are not easily quantified.
- Traffic loss in search engines
- Brand confusion (same content in two places with different sites/logos)
- Duplicate content issues within search engines
- Copyright violation if your content carries those rights
- In low competition markets, they can rank well for your content
When Should I Plan Action Against A Content Scraper?
There are going to be very few times you really need to stand up and take serious action against content scrapers. Luckily, Google and other search engines have gotten pretty good at sniffing out who has the real content and who is just trying to fake it. Once they figure that out, they penalize the copycat sites through their algorithms by pushing them down the list. I recently had a unique situation that put one scraper of my content on my radar, and I had to do something about it.
When I changed domain names from mtbtrailreview.com to Bike198.com at the beginning of the year, I knew that Google was going to have to catch up and spider my thousands of pages. I was prepared for the change and knew it would all get sorted out. At the same time, someone decided to scrape my site and put my full-feed articles and pictures on a site of their own. The only content they were using was mine.
So what happened?
They started to rank for my content that had not been fully spidered by search engines yet. Big problem.
In this situation, I needed to take action to protect my brand and my content that I have worked so hard to build up.
How Do I Stop A Content Scraper From Stealing My Content?
When you run into this situation or another one that requires action, you have to handle it very carefully.
First, go to the site and see if they have a contact form. 99 times out of 100, content scraper sites do not have any kind of contact form or email address available on the site, so you are going to have to get it stopped by other means. If they do…send an email requesting removal of your content and see what happens. If you get no response or an ugly one, continue.
Go to whois.net and do a whois lookup on the domain that is stealing your content. When you do this, you will usually get the registrar and the name servers for the domain, and the name servers typically point to the company that hosts the site. Once you have that name, Google the hosting company to contact them directly.
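If you would rather do the lookup from your own machine than through a website, the same step can be roughed out in a few lines of Python. This is only a sketch under some assumptions: it relies on the system `whois` command being installed, the function names `run_whois` and `parse_whois` are made up for this example, and the field labels it looks for ("Registrar", "Name Server", an abuse contact email) are common ones that not every registry uses, so expect to adjust it.

```python
import subprocess


def run_whois(domain: str) -> str:
    """Return raw whois text for a domain.

    Assumption: the `whois` command-line tool is installed on this machine.
    """
    result = subprocess.run(
        ["whois", domain], capture_output=True, text=True, timeout=30
    )
    return result.stdout


def parse_whois(raw: str) -> dict:
    """Pick out the fields that help you find who to contact.

    Whois output is not standardized, so the labels matched here are
    assumptions that fit many (not all) registries.
    """
    info = {"registrar": None, "name_servers": [], "abuse_emails": []}
    for line in raw.splitlines():
        # Split each "Label: value" line on the first colon only.
        key, sep, value = line.partition(":")
        key, value = key.strip().lower(), value.strip()
        if not sep or not value:
            continue
        if key == "registrar" and info["registrar"] is None:
            info["registrar"] = value
        elif key in ("name server", "nserver"):
            # Name servers often hint at the hosting company.
            server = value.split()[0].lower()
            if server not in info["name_servers"]:
                info["name_servers"].append(server)
        elif "abuse" in key and "email" in key:
            if value not in info["abuse_emails"]:
                info["abuse_emails"].append(value)
    return info
```

Calling `parse_whois(run_whois("scrapersite.com"))` (using the placeholder domain from this post) gives you a small dictionary to start from. If the name servers belong to a DNS or proxy service rather than the host, you will still have to dig further.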
When you contact the hosting company, keep it short and simple. Here is what I wrote.
You have a domain hosted on your servers, scrapersite.com, that is scraping copyrighted content from my blog network, Bike198.com.
Since there is no contact information provided on the site itself, I am contacting you to shut off the content.
If the company is reputable at all (like this one was), you should get a response that looks something like the following.
Rest assured we at [removed].com and it’s parent company [removed] take this type of infringement very seriously. I have passed this message on to the account holder and have given him 24 hours to respond. Failure to respond to this within 24 hours will cause an immediate suspension of the account. Once a response has been made, I will forward the same to you to reach a settlement.
From that point forward, you should be able to get your content removed from the site.
Is It Ever That Easy?!
Well…not always. Every now and then, you will get a site owner that knows he is wrong, admits it and moves on. However, remember who you are dealing with. Content scrapers are stealers of copyrighted content, so a lot of times they do not go down without a fight. Here are some excuses you might hear from a typical content scraper as you go through the process. You need to ignore the excuses (You don’t have to explain anything! They are stealing your content!) and stick to your goal of content removal, but here is why each one does not hold up.
- “But I Link Back To Your Site” – Holds zero weight for you. A pile of one-way links carries almost zero weight with Google, and in the meantime they are stealing traffic off of your content because most readers will not know the difference.
- “RSS Is Built For This” – Nope. RSS is designed for easy digestion of content…not to make it easy to publish copyrighted content on the web. Now, if your content is Creative Commons licensed, republishing with credit may be allowed under the terms of that license.
- “I’ll Just Stop Using Your Site, I Don’t Have Time To Go Through Everything” – Your time is not my problem. You stole my content and it needs to be removed.
Now…there are a couple of things you need to keep at the forefront of your brain as you deal with content scrapers and getting your content off of their site.
- Keep a level head and keep emotion out of the picture – You have just caught them stealing, so they are going to battle their way out. If you really want to see this issue resolved, you have to keep a level, business head without interjecting emotion no matter what they say to you.
- Work with the hosting company – The hosting company has the power to shut down the site, so keep a good working relationship with them as you go through the process. They are going to be your partner in this. If you piss them off, things are going to be harder.
- Do not make threats you cannot keep – Do not throw the idea out there that you are going to call your lawyers if you do not have lawyers. Stick to what you can do and stick to your guns. If you have lawyers…by all means throw the book at them if you need to.
- Don’t believe their threats – In the process, the content scraper is going to throw out threats they cannot back up. If they were really that good or had that good of a lawyer, they wouldn’t need to steal your content for a couple of bucks a month.
In the end and about 17 emails later, my content was removed and things are back to normal. Will I need to spend my time hunting down every single content scraper going forward? No. Will this be the last time I have to deal with this? No. I deal with content scraping when it makes sense to invest the time into making sure the situation gets resolved.
Basically, if they hit that hard on my radar, it is time to do something about it.
Scum Image by ۞japaneseblues
Sometimes it’s really hard to deal with content scrapers.
Last month someone copied my whole article and gave me a single link back to it. And that time I did nothing to remove my content from his site…lolz !!!
Anyways, Thanks for sharing this great Post Robb :). Really awesome work man.
Great advice. I’m happy you shared this experience. I often find entire articles copied and pasted (even including my spacer.gif), not to mention all my copyright images. It’s always such a pain and sites like this make it impossible to reach them. I had emails from one lady saying she would make sure that the content was removed. It never was. Now she just ignores my emails. So frustrating. I have yet to go through the host. I will now. 🙂
One way to discover content thieves is to use Google Alerts. I have 2 Google Alerts for both my name and my site so anytime that my content is stolen, I’ll definitely find out about it.
But what if, when sharing an article found in a magazine or on another website, I quote who wrote it or give the address for the full information? That’s giving the main source credit and not stealing it.
Good tips. Most of the time, I just send a friendly email or leave a short comment that asks them to remove my content or I’ll contact their host and domain registrar. Usually it works.
What really annoys me is when they have the whois private, no email and their comments are closed. That’s when it gets really interesting to find out who their host is.
Agreed that these people are scum of the earth. It’s one thing to review a site, or be inspired, but to try and make someone else’s intellectual property your own is extremely wrong.
Great advice, Robb. I’d also recommend the RSS Footer plugin for anyone who uses WordPress. It enables you to put a URL link back to your site (such as “Original article by X”) so that if it does get pulled in by a scraper site it will be more obvious where the original content came from.
Excellent tips. If you blog for any length of time, you’re going to deal with this issue, and it just sucks, no two ways about it. I agree with the advice to keep a cool head and not to engage directly with the owner. It almost never works. I even had one experience where the guy started stalking me online. On par with an annoying mosquito, to be sure, but mosquitos suck. (Ha. Literally. I made a punny.)
Sounds like you had a really annoying one. Everything get resolved eventually?
Eventually it was fine – after a lawyer buddy sent a sternly worded letter.