@FediPact@cyberpunk.lol
LEAKED: A New List Reveals Top Websites Meta Is Scraping of Copyrighted Content to Train Its AI (Including Many Fediverse Instances!!!)
"The tech giant is sidestepping guardrails that websites use to prevent being scraped, data show, in a move whistleblowers say is unethical and potentially illegal."
ARTICLE: https://www.dropsitenews.com/p/meta-facebook-tech-copyright-privacy-whistleblower
FULL PDF: https://www.dropsitenews.com/api/v1/file/b3555944-e204-4f5e-9a64-e44281b19a82.pdf
#FediPact #meta #threads #AI
@ophiocephalic@kolektiva.social
Rather than scraping from sites directly, many of the addresses on Meta's leaked list belong to Content Delivery Networks (CDNs) that are used by websites to cache and store information to improve site performance.
This is a critical point. An instance or website can defend itself in numerous different ways, including actively adversarial strategies, and still succumb to extraction - if they're using Cloudflare.
cc: @subMedia@kolektiva.social