Monday, May 12, 2014

Post Scraping

I think Revit OpEd is part of a pretty narrow slice of life on the internets. Having that opinion, I find it bewildering how much comment spam it receives. Even more bewildering is that there is another blog out there that just scrapes my stuff from here and posts it on their own site as their own.

I used to have a link on my list of BIM blogs that pointed to the site; I removed it today. Beginning in 2013 it apparently shifted from writing original work to just using mine. There isn't any older work, copied or original, there now either.

Since the comments are most likely generated by web-bots, I imagine this is also true of the blog in question. I'm writing this to see if they scrape this post and add it to their own too. If they do, it will prove to me that there isn't an actual person doing the work of copying my posts.

3 comments:

JB said...

Steve,
My wife had this same issue on her personal blog. It was an exact copy down to names and images of our family.
I filed a DMCA case about it. Gave them clear evidence (only in writing) and their site was terminated.
It was completely free for me to file this by the way.

On a side note, we also had to turn the RSS feed from our site to NOT show the entire post. Many tools will copy and paste your posts into their site just by using an RSS script. We did want to continue to have an RSS feed but that's up to you. (I hope you continue to provide an RSS feed)

Anyways, it caused a lot more stress than it really warranted, as the evidence was pretty irrefutable. Hope that helps.

daveedwards said...

Would going away from Blogger to a self-hosted WordPress site help?

Steve said...

Thanks for the info. I doubt who hosts the content would matter much. If it is online it can be scraped. I might have a few more options to affect how much can be easily copied with self-hosting.

As JB wrote, it might help to change the RSS feed to not show the entire post. I hate that format personally, but I'll consider doing it if the other site isn't taken down.