Wiki under heavy load by new wave of scrapers

Ted Hess thess at kitschensync.net
Sun Apr 20 08:58:26 PDT 2025


Hi all -

Yes, it has been frustrating lately trying to chase down and deal with
the most recent DDoS attacks. I briefly tried rate-limiting the entire
site for everyone. I'm not sure I was strict enough - it didn't
alleviate the load while still presenting reasonable service. I removed
it but am thinking of trying something slightly different. BTW - it
takes 19-20 URL fetches to bring up the home page, so some sort of
burst allowance must be made.
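
For illustration, one common way to express a burst allowance is a
token bucket per client. This is only a sketch, not what was actually
deployed on the wiki: the names (TokenBucket, allow_request) and the
numbers (burst of 25, refill of 2 requests/second) are assumptions,
chosen so that a single 19-20 fetch page load passes untouched.

```python
import time


class TokenBucket:
    """Allow `burst` requests at once, then refill at `rate` per second."""

    def __init__(self, rate, burst, now=None):
        self.rate = rate                  # assumed refill rate (requests/second)
        self.burst = burst                # assumed capacity; must cover one page load
        self.tokens = float(burst)        # start full so the first page load succeeds
        self.updated = time.monotonic() if now is None else now

    def allow(self, now=None):
        now = time.monotonic() if now is None else now
        # Refill for the time elapsed since the last request, capped at burst.
        elapsed = max(0.0, now - self.updated)
        self.tokens = min(self.burst, self.tokens + elapsed * self.rate)
        self.updated = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False


# Hypothetical per-client table, keyed by whatever identifies a client.
buckets = {}


def allow_request(client_key, now=None, rate=2.0, burst=25):
    bucket = buckets.get(client_key)
    if bucket is None:
        bucket = buckets[client_key] = TokenBucket(rate, burst, now)
    return bucket.allow(now)
```

Keying the bucket by source IP is the obvious choice, but it is likely
to be of limited use when the requests are spread thinly over a very
large number of addresses, so treat the key as something to experiment
with.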

I don't think Anubis is the ideal solution - it will be one more
configuration profile to manage and keep up-to-date. Additionally, it
places a heavy requirement on browser/client capability, which is
probably something we don't need to add to the list of things we will
need to respond to from unsatisfied users. I'm willing to try it anyway
-- if you all think it would help.

The only other solution(s) that come to mind are more costly.

/ted


On 4/17/2025 6:01:26 PM, "Baptiste Jonglez" 
<baptiste at bitsofnetworks.org> wrote:

>Hello,
>
>The wiki has been under heavy load for a few days because of a new kind of
>scraper (thank you, dear LLM companies).
>
>Requests come from a huge number of residential IP addresses,
>predominantly from Brazil but also from many other countries.
>
>The requests use legitimate-looking User-Agent strings, but they are very
>likely made up (among classical ones, there is dubious stuff like Windows 98,
>MacOS PowerPC, Internet Explorer 6...)
>
>As a result, this traffic is extremely difficult to rate-limit or block.
>
>I'm pretty certain that the people behind these residential IPs are being
>paid to serve as proxies for LLM companies' scraping, precisely to make the
>traffic very hard to block.
>
>This looks related: https://community.openai.com/t/tips-experience-how-i-used-residential-proxies-to-collect-training-data-for-ai/1230577
>
>Ideas welcome...
>
>Baptiste




More information about the openwrt-adm mailing list