Wednesday, 8 March 2023

DEP8 timeouts on install phase


Taking this as a sample:

It looks like something got slower a few days ago, and tests started being killed by a timeout (the "kind: install" timeout, which defaults to 3000s in autopkgtest).

By the end of February 2023, the passing tests were taking about 20min. That crept up to 26min on March 2nd, and then they all started to fail.

Looking at the logs, we see apt-get install being interrupted by the autopkgtest timeout mid package download:
Get:242 http://ftpmaster.internal/ubuntu lunar/universe ppc64el apksigner all 31.0.2-1 [438 kB]
Get:243 http://ftpmaster.internal/ubuntu lunar/main ppc64el openjdk-17-jdk-headless ppc64el 17.0.6+10-1 [217 MB]
autopkgtest-virt-ssh [04:23:09]: ------- nova console-log 766e17f7-5365-4cdf-8202-35b6b88a7cdc (adt-lunar-ppc64el-fdroidserver-20230303-012357-lrg-root5) ------
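
To see which downloads dominate the install phase in a log like the one above, something along these lines could help. It is only a rough sketch: the function name largest_downloads is made up for illustration, the regex assumes the "Get:N URL suite arch package pkgarch version [size unit]" layout seen in the two lines quoted above, and it assumes apt's units are decimal (1 kB = 1000 B) and that any commas in the size are thousands separators.

```python
import re

# Assumption: apt reports sizes in decimal units.
UNITS = {"B": 1, "kB": 1000, "MB": 1000**2, "GB": 1000**3}

# Matches apt download lines shaped like the ones quoted in the log excerpt.
GET_LINE = re.compile(
    r"Get:(?P<n>\d+) \S+ \S+ \S+ (?P<package>\S+) \S+ (?P<version>\S+) "
    r"\[(?P<size>[\d.,]+) (?P<unit>B|kB|MB|GB)\]"
)


def largest_downloads(log_text, top=5):
    """Return up to `top` (size_in_bytes, package, version) tuples, largest first."""
    found = []
    for m in GET_LINE.finditer(log_text):
        # Assumption: commas, if present, are thousands separators.
        size = float(m.group("size").replace(",", "")) * UNITS[m.group("unit")]
        found.append((int(size), m.group("package"), m.group("version")))
    return sorted(found, reverse=True)[:top]
```

Run against the full test log, this would show whether a few very large packages (such as the 217 MB openjdk-17-jdk-headless above) account for most of the download, or whether the time goes into hundreds of small ones.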

When the tests were passing (in the 20min range), they were also downloading hundreds of packages. So either something got slower since then, or this is an isolated case.

We can add fdroidserver/ppc64el to "big_packages", or perhaps "long_tests", if this is expected or if it will take a long time to fix. Otherwise I don't know what to do here, as retrying will only burn more energy without any clear indication that it will work.

FYI, fdroidserver/ppc64el is currently one of the packages blocking the openjdk-* migration.

I ran a check with ./retry-autopkgtest-regressions --log-regex '\(kind: install\)'
That returned 82 URLs[1]:
s390x: 32
ppc64el: 31
arm64: 19
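
A per-architecture tally like the one above can be produced with a few lines of Python. This is a sketch under one assumption: that each URL carries the architecture in an "arch" query parameter (as in a request URL of the form ...?release=...&arch=...&package=...). If the URLs are shaped differently, the extraction step needs adjusting. The function name count_by_arch is made up for illustration.

```python
from collections import Counter
from urllib.parse import parse_qs, urlparse


def count_by_arch(urls):
    """Count URLs per architecture, reading the (assumed) 'arch' query parameter."""
    counts = Counter()
    for url in urls:
        query = parse_qs(urlparse(url).query)
        # URLs without an 'arch' parameter are grouped under 'unknown'.
        arch = query.get("arch", ["unknown"])[0]
        counts[arch] += 1
    return counts
```

Feeding it the saved output of the script, one URL per line, and printing counts.most_common() gives the breakdown sorted by number of affected tests.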

Should we wait and see if it gets better, assuming there is ongoing work in the DEP8 infra? Or bump the timeout? Or something else?