I'm out of the loop, AI
So I often see AI bots scraping websites at high rates. What's the actual go here: are they training without storing the scraped data? Or is this after training, when people are using them (requests to supplement a query, e.g. tool/MCP usage)? Or are people inflating "scraping the site 7 times a second" to mean the whole site when it's just one page?
I'm not doubting the extra load, I'm just curious why it's not a scrape-once-and-then-they're-done kind of deal.
@xssfox@cloudisland.nz Gonna guess that with the rate of people asking about stuff, and what they ask about being all over the place, caching isn't as effective.
Combine the desire to keep data fresh with cache fall-off (so the cache doesn't grow to infinite size), and the fact that there are 3+ major companies offering "general purpose" NLP search front-ends in addition to existing search engines (Google, etc.) ...
and it's a mess. One big hype-bubbled mess. The pop is gonna be glorious but ugly.
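To make the caching guess above a bit more concrete, here's a rough sketch of a bounded cache with expiry, the kind of thing I'm imagining sits in front of these fetchers. This is purely an assumption for illustration: `TTLCache`, `count_origin_fetches`, and the numbers are made up, and I have no idea what any of these companies actually run. The point is just that a freshness limit plus a size limit means the same page gets re-fetched from the origin over and over once the queries are spread wide enough.

```python
import time
from collections import OrderedDict


class TTLCache:
    """Hypothetical bounded cache: entries expire after ttl_seconds,
    and the least recently used entry is evicted when the cache is full."""

    def __init__(self, max_entries, ttl_seconds, clock=time.monotonic):
        self.max_entries = max_entries
        self.ttl_seconds = ttl_seconds
        self.clock = clock  # injectable so expiry can be simulated
        self._store = OrderedDict()  # url -> (fetched_at, body)

    def get(self, url):
        entry = self._store.get(url)
        if entry is None:
            return None  # never fetched, or already evicted
        fetched_at, body = entry
        if self.clock() - fetched_at >= self.ttl_seconds:
            del self._store[url]  # too stale to serve
            return None
        self._store.move_to_end(url)  # mark as recently used
        return body

    def put(self, url, body):
        self._store[url] = (self.clock(), body)
        self._store.move_to_end(url)
        while len(self._store) > self.max_entries:
            self._store.popitem(last=False)  # drop least recently used


def count_origin_fetches(cache, requests):
    """Count how many requests miss the cache and would hit the real site."""
    fetches = 0
    for url in requests:
        if cache.get(url) is None:
            cache.put(url, "page body for " + url)  # stand-in for a real fetch
            fetches += 1
    return fetches
```

With a tiny cache and requests scattered across many pages, nearly every request ends up as an origin fetch. Multiply that by several independent companies, each with its own cache, and "scrape once and done" stops looking realistic.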