Packages buildbot is erratic, both master and 23.05 packages fail often

Petr Štetiar ynezz at true.cz
Thu Jun 1 22:35:14 PDT 2023


Hannu Nyman <hannu.nyman at iki.fi> [2023-06-01 19:11:13]:

Hi,

> * zabbix x13 (and borgbackup plus other python packages before it) seem to
> be the typical last lines before the timeout failures...  Random failures or
> something due to recent changes ???

IIUC, from the buildbot master's perspective it looks like the compilation
got stuck somewhere for 1 hour, which is indeed quite strange. I don't remember
seeing those errors previously.

Anyway, as a quick fix attempt, I've now changed the build workers' CPU
affinity, so each worker uses CPUs from the same NUMA node:

 NUMA node0 CPU(s):               0,2,4,6,8,10,12,14,16,18,20,22
 NUMA node1 CPU(s):               1,3,5,7,9,11,13,15,17,19,21,23

 -    cpuset: 0-5
 +    cpuset: 0,2,4,6,8,10

We have 2 such hosts, each with 2x Xeon X5680 CPUs and 4 build workers, with
6 CPUs assigned to each build worker.
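
To illustrate why the old cpuset was a problem, here is a minimal Python
sketch. It parses the two lscpu lines quoted above, shows that "0-5" straddles
both NUMA nodes, and splits each node into groups of 6 CPUs. The function
names are made up for this example, and the layout it prints for the other
three workers is an assumption; only the "0,2,4,6,8,10" value comes from the
actual change.

```python
import re

# The two NUMA lines from lscpu, as quoted in this mail.
LSCPU_NUMA = """\
NUMA node0 CPU(s):               0,2,4,6,8,10,12,14,16,18,20,22
NUMA node1 CPU(s):               1,3,5,7,9,11,13,15,17,19,21,23
"""


def expand_cpu_list(spec):
    """Expand a cpuset string such as "0-5" or "0,2,4" into a list of ints."""
    cpus = []
    for part in spec.split(","):
        if "-" in part:
            first, last = part.split("-")
            cpus.extend(range(int(first), int(last) + 1))
        else:
            cpus.append(int(part))
    return cpus


def parse_numa_nodes(text):
    """Map NUMA node number -> list of CPU ids, from lscpu-style lines."""
    nodes = {}
    for line in text.splitlines():
        match = re.match(r"\s*NUMA node(\d+) CPU\(s\):\s*(\S+)", line)
        if match:
            nodes[int(match.group(1))] = expand_cpu_list(match.group(2))
    return nodes


def nodes_spanned(cpuset, nodes):
    """Return the sorted NUMA node numbers that a cpuset string touches."""
    wanted = set(expand_cpu_list(cpuset))
    return sorted(n for n, cpus in nodes.items() if wanted & set(cpus))


def worker_cpusets(nodes, cpus_per_worker):
    """Split each node's CPUs into cpuset strings that never cross nodes.

    Hypothetical helper: the real worker layout beyond the first cpuset
    is not stated in this mail.
    """
    result = []
    for node in sorted(nodes):
        cpus = nodes[node]
        for i in range(0, len(cpus) - cpus_per_worker + 1, cpus_per_worker):
            chunk = cpus[i:i + cpus_per_worker]
            result.append(",".join(str(c) for c in chunk))
    return result


if __name__ == "__main__":
    numa = parse_numa_nodes(LSCPU_NUMA)
    print("old cpuset 0-5 spans nodes:", nodes_spanned("0-5", numa))
    print("new cpuset spans nodes:", nodes_spanned("0,2,4,6,8,10", numa))
    for cpuset in worker_cpusets(numa, 6):
        print("cpuset:", cpuset)
```

With 12 CPUs per node and 6 CPUs per worker this yields 4 cpusets per host,
which matches the 4 build workers mentioned above.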

So let's see if it improves the situation. I'll try to look at those build
workers more closely now.

> * with 23.05 the main problem seems to be "No space left on device"...

I've noticed that as well. 23.05 is probably greedier for disk space and the
current 60GB allocation is not enough, so I'm going to increase the disk size
to fix that.

I've just added 2x8 temporary build workers to clean up the build queue and
refresh the packages.

Cheers,

Petr



More information about the openwrt-devel mailing list