Future of git.openwrt.org [Was: Re: Moving git.openwrt.org behind Fastly CDN]

Petr Štetiar ynezz at true.cz
Thu Dec 5 21:59:15 PST 2024


Hannu Nyman <hannu.nyman at iki.fi> [2024-12-04 17:23:39]:

Hi,

tl;dr CDN is not going to cut it, we need some other solution

> a) setting feeds.conf.default to point to the actual root feeds at GitHub
> (e.g. https://github.com/openwrt/packages) instead of the git.openwrt.org
> mirror?

Yes, using more powerful Git mirrors is one of the options.

It would need to be git.builds.openwrt.org or similar, so that we keep it under
our control. It would also need to be hosted somewhere other than GitHub
(sanctions, no IPv6, etc.). The following options have been circulating for
some time:

 - codeberg.org
 - sourcehut.org

>   Or alternatively, set feeds.conf.default to point to the new
> git.cdn.openwrt.org?

CDNs are not able to handle/cache the Git smart HTTP protocol yet, so it
wouldn't help with Git fetch operations; this would basically still be
passthrough mode. The CDN is going to lower the load from gitweb-based scrapers
which are not using a proper bot user agent in HTTP headers (those are already
rate limited).

I looked into this more closely and there were actually multiple issues going
on simultaneously: 

 1. sudden spikes of requests from various gitweb-based scrapers, usually
    requesting source code tarballs (a heavy CPU and I/O operation) of random projects

    * bots using proper user agent identification are already denied these requests

    * bots not using proper user agent identification are a PITA because you can't
      distinguish them from humans
 
 2. strange vulnerability scanners, generating a lot of concurrent requests 

 3. relatively high numbers of concurrent builds starting at the same time

    * probably some build farms and/or CI jobs (Hi Qualcomm! :))

This was saturating CPU and I/O on the box, causing a long backlog of
requests, resource exhaustion and 500s, which hugely impacted our buildbot
builds.

As a quick fix, I've done the following over the past few days:

 - disabled tarballs for everyone with 403
 - enabled IP based rate limits on everyone

   * heavy projects like luci.git, packages.git and openwrt.git 

     - after 5r/m, additional requests are delayed, up to 15r/m, then 429 sorry

   * other requests: 15r/m, delayed after the 8th request, up to 30r/m, then 429 sorry
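
To make the "pass, then delay, then 429" behaviour concrete, here is a minimal
Python sketch of a per-IP limiter with two thresholds. It is not the actual
server configuration: the class name, the sliding one-minute window and the
mapping of the heavy-project numbers onto free_per_min=5 / max_per_min=15 are
assumptions made purely for illustration.

```python
from collections import defaultdict


class TwoStageLimiter:
    """Per-client limiter: pass, then delay, then reject (hypothetical sketch)."""

    def __init__(self, free_per_min, max_per_min, window=60.0):
        self.free = free_per_min    # requests per window served immediately
        self.max = max_per_min      # requests per window before rejecting
        self.window = window        # window length in seconds
        self._hits = defaultdict(list)  # client -> timestamps of accepted requests

    def check(self, client, now):
        """Return "pass", "delay" or "reject" for a request arriving at `now`."""
        cutoff = now - self.window
        # Keep only the requests that are still inside the window.
        recent = [t for t in self._hits[client] if t > cutoff]
        if len(recent) >= self.max:
            self._hits[client] = recent
            return "reject"  # the caller would answer with HTTP 429
        recent.append(now)
        self._hits[client] = recent
        return "pass" if len(recent) <= self.free else "delay"


# Assumed mapping of the heavy-project limits: 5 immediate, up to 15 in total.
limiter = TwoStageLimiter(free_per_min=5, max_per_min=15)
results = [limiter.check("198.51.100.7", now=float(i)) for i in range(17)]
```

With one request per second from a single address, the first five pass, the
next ten are marked for delay and the rest are rejected until older requests
fall out of the window.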

Seems to work, VPS can manage the load, no git fetch issues on buildbots, thus
we can focus on the long term solution:

 A. outsource Git operations

    - this is the git.builds.openwrt.org idea described above, which leads to
      the following (shortened) diff

        --- a/feeds.conf.default
        +++ b/feeds.conf.default
        -src-git packages https://git.openwrt.org/feed/packages.git
        -src-git luci https://git.openwrt.org/project/luci.git
        -src-git routing https://git.openwrt.org/feed/routing.git
        -src-git telephony https://git.openwrt.org/feed/telephony.git
        +src-git packages https://git.builds.openwrt.org/feed/packages.git
        +src-git luci https://git.builds.openwrt.org/project/luci.git
        +src-git routing https://git.builds.openwrt.org/feed/routing.git
        +src-git telephony https://git.builds.openwrt.org/feed/telephony.git

        --- a/include/download.mk
        +++ b/include/download.mk
        -PROJECT_GIT = https://git.openwrt.org
        +PROJECT_GIT = https://git.builds.openwrt.org
 
        --- a/package/boot/uboot-bcm4908/Makefile
        +++ b/package/boot/uboot-bcm4908/Makefile
        -PKG_SOURCE_URL:=https://git.openwrt.org/project/bcm63xx/u-boot.git
        +PKG_SOURCE_URL:=https://git.builds.openwrt.org/project/bcm63xx/u-boot.git

    - another option is to keep using git.openwrt.org and handle this via HTTP
      redirects, which should probably work as well

 B. improve scripts/feeds

    - add some kind of --retry backoff mechanism to Git operations
    - add a fallback list of additional Git repository mirrors, so that if one
      fails, another is used, etc.

 C. upgrade the box

    - this means $$ which IMO would be better spent on funding/improving projects like
      codeberg.org or sourcehut.org
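
Option B above (retry with backoff plus a fallback list of mirrors) could look
roughly like the following. This is only a sketch in Python to show the control
flow, not a patch for scripts/feeds; the function name, its parameters and the
shallow-clone command are assumptions. The `run` and `sleep` arguments exist
only so the logic can be exercised without network access.

```python
import subprocess
import time


def fetch_with_fallback(mirrors, dest, retries=3, base_delay=2.0,
                        run=subprocess.run, sleep=time.sleep):
    """Clone from the first mirror that works; return the URL that succeeded.

    Each mirror is tried `retries` times, waiting base_delay * 2**attempt
    seconds between attempts, before moving on to the next mirror.
    """
    for url in mirrors:
        for attempt in range(retries):
            result = run(["git", "clone", "--depth", "1", url, dest])
            if result.returncode == 0:
                return url
            if attempt < retries - 1:
                # Exponential backoff: 2s, 4s, ... with the default base_delay.
                sleep(base_delay * 2 ** attempt)
    raise RuntimeError("all mirrors failed: " + ", ".join(mirrors))
```

A caller would pass the mirrors in order of preference, for example the
git.openwrt.org URL of a feed first and the corresponding
git.builds.openwrt.org URL second.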

Cheers,

Petr 
