Off topic: Why is there a “gift” code and various tracking parameters in the URL?
The URL does seem to work without them: https://www.theatlantic.com/politics/archive/2025/03/trump-administration-accidentally-texted-me-its-war-plans/682151/
IP-based blocking is complicated once you are big enough
It’s literally as simple as importing an ipset into iptables and refreshing it from time to time. There are even predefined tools for that.
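For anyone who hasn't done this before, here is a rough sketch of the "refresh it from time to time" part. It turns a plain-text blocklist (one IP or CIDR per line) into input for `ipset restore`, filling a temporary set and swapping it in so the live set is never empty during a refresh. The function name, the set name `blocklist` and the list format are my own assumptions, not taken from any particular tool, and it only handles IPv4.

```python
import ipaddress


def build_ipset_restore(lines, set_name="blocklist"):
    """Return text suitable for piping into `ipset restore`.

    `lines` is an iterable of strings, one IP or CIDR per line.
    `#` starts a comment. Invalid and non-IPv4 entries are skipped.
    The set name and the input format are illustrative assumptions.
    """
    tmp = set_name + "-tmp"
    nets = set()
    for raw in lines:
        entry = raw.split("#", 1)[0].strip()
        if not entry:
            continue
        try:
            # strict=False normalizes host bits, e.g. 192.0.2.5/24 -> 192.0.2.0/24
            net = ipaddress.ip_network(entry, strict=False)
        except ValueError:
            continue  # ignore junk lines instead of aborting the refresh
        if net.version == 4:
            nets.add(net)

    out = [
        f"create {set_name} hash:net family inet",
        f"create {tmp} hash:net family inet",
        f"flush {tmp}",
    ]
    out += [f"add {tmp} {net}" for net in sorted(nets)]
    # Swap the freshly filled set into place, then drop the old contents.
    out += [f"swap {tmp} {set_name}", f"destroy {tmp}"]
    return "\n".join(out) + "\n"
```

You would pipe the result into `ipset -exist restore` from a cron job (the `-exist` flag keeps it from failing when the set is already there), and reference the set once from a rule such as `iptables -I INPUT -m set --match-set blocklist src -j DROP`. Very large lists may need a bigger `maxelem` on the `create` lines.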
While AI crawlers are a problem, I’m also kind of astonished that so many projects don’t use tools like rate limiters or IP blocklists. These are pretty simple to set up, cause little or no additional load, and don’t cause collateral damage for legitimate users who just happen to use a different browser.
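To show how little is involved, here is a minimal per-IP token bucket in Python. It is only a sketch under my own assumptions (the class name and the `rate`/`burst` parameters are made up for illustration); in practice you would more likely use whatever your reverse proxy or firewall already ships with.

```python
import time


class TokenBucketLimiter:
    """Per-client token bucket: `rate` tokens per second, up to `burst` stored."""

    def __init__(self, rate, burst, clock=time.monotonic):
        self.rate = rate
        self.burst = burst
        self.clock = clock      # injectable so the limiter can be tested
        self.buckets = {}       # ip -> (tokens, time of last update)

    def allow(self, ip):
        """Return True if a request from `ip` may proceed right now."""
        now = self.clock()
        tokens, last = self.buckets.get(ip, (self.burst, now))
        # Refill according to elapsed time, capped at the burst size.
        tokens = min(self.burst, tokens + (now - last) * self.rate)
        if tokens >= 1:
            self.buckets[ip] = (tokens - 1, now)
            return True
        self.buckets[ip] = (tokens, now)
        return False
```

Something like `TokenBucketLimiter(rate=2, burst=20)` lets a normal visitor load a page with its assets in one go while throttling a client that hammers the server continuously. The dictionary grows with the number of distinct IPs, so a real deployment would need to expire old entries.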
Can’t wait for all the other horror stories to get posted here :D