You might have guessed it already: We are struggling with excessive crawling today. We have once again blocked several large IP ranges, but have not yet been able to identify the new actor.
We are working on restoring service availability and fine-tuning our rate-limiting.
If anyone is interested in implementing improved native rate limiting in #Forgejo that would also protect other instances from abusive crawlers, please reach out.
@Codeberg Just wondering what you folks use in front of forgejo. I experience abusive crawling as well, but my instance is a small personal one on my homelab, so it's really annoying to be losing bandwidth to abusive actors. I'm considering putting a self-hostable WAF in front of my homelab services.
@fmartingr@fosstodon.org A simple solution for some small services is using rate-limiting controls within a reverse proxy, or setting up fail2ban to monitor the forgejo/webserver logs... I assume the SSH module can also be used against SSH-based clone abuse.
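For illustration, here is a minimal sketch of the reverse-proxy approach using HAProxy. The frontend/backend names, the bind line, the certificate path, and the 20-requests-per-10-seconds threshold are placeholder assumptions, not anyone's actual configuration:

```
frontend fe_forgejo
    bind :443 ssl crt /etc/haproxy/certs/example.pem
    # Track the HTTP request rate per source IP over a 10-second window.
    stick-table type ip size 100k expire 10m store http_req_rate(10s)
    http-request track-sc0 src
    # Reject clients that exceed the threshold; tune the limit to your traffic.
    http-request deny deny_status 429 if { sc_http_req_rate(0) gt 20 }
    default_backend be_forgejo

backend be_forgejo
    server forgejo 127.0.0.1:3000
```

A fail2ban jail watching the same logs can complement this by banning repeat offenders at the firewall level.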
@fmartingr We're using haproxy and have a custom blacklist loaded here: https://codeberg.org/Codeberg-Infrastructure/scripted-configuration/src/commit/bef038ca91cb928e0b865ada4bc6d579b2bc857e/hosts/kampenwand/etc/haproxy/haproxy.cfg#L265
It's not public (yet), but we should probably consider opening it. We would need to check that it only contains publicly known IP addresses, though. I'm not fully up to date on how the law treats publishing IP ranges of bad actors. ~f
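For anyone curious about the general shape of such a blocklist in HAProxy, here is a sketch with placeholder paths and names; the real rules live in the haproxy.cfg linked above:

```
frontend fe_forgejo
    bind :443 ssl crt /etc/haproxy/certs/example.pem
    # blocklist.lst holds one IP address or CIDR range per line.
    acl blocked_src src -f /etc/haproxy/blocklist.lst
    http-request deny deny_status 403 if blocked_src
    default_backend be_forgejo
```

The list file is read when HAProxy starts or reloads, so blocking a new range is just an edit to the file plus a reload.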
@Codeberg I was considering trying something like CrowdSec, but I'm unsure how they handle the bad IP ranges and what they consider "bad actors". If we could have something like that, but with community-maintained lists the way ad blockers have, it would be nice. I'll take a look. Thanks, and I hope you resolve the issue soon!