I've been helping some friends and colleagues block some of the site scraping bots that are feeding "AI" models. Decided to take some of my notes and make something others could use too. It's a work-in-progress. Happy to add to or correct things.
https://neil-clarke.com/block-the-bots-that-feed-ai-models-by-scraping-your-website/
AI #companies should respect an opt-in #policy for #authors, not force authors to opt out. #Copyright must be respected; anyone who does otherwise is simply a #thief or a #pirate.
@elijax respect? from this lot? lol @clarkesworld
@mensrea @clarkesworld I can't really understand your comment. You can explain if you wish!
@elijax just that none of those companies (and others like openai) have any chance of respecting anyone. and they have all specifically shown contempt for creative production @clarkesworld
@mensrea @clarkesworld
I do agree! The exploitation of copyrighted material began with Google and YouTube making it impossible for authors to monetize their work.
@elijax it began before google. emi, random house, paramount, ... all attempted to do the same thing. the likes of google, amazon, apple, ... have just taken the next step @clarkesworld
@mensrea @elijax @clarkesworld Yup, and at this point that includes creative companies like Disney. Some companies actually steal art from artists directly.
Honestly wish these companies could be punished. :(