Wednesday 9 June 2021

state of the autopkgtest cloud

Hi all,

we are currently severely limited because all arm64 bos01 workers
are turned off, and bos01 networking is now broken as well.

I disabled the bos01 arm64 workers on May 27 because the cloud was
a bit unstable, xnox wanted to run kernel tests on bos02 only,
and the queues were empty. Last week the queues filled up again,
and we have since fallen far behind. We know that machines in bos01
have qemus crashing on reboot, as well as boot failures where
init fails with exitcode=0x0005 or something like that.

Yesterday, we started one instance again, but unfortunately
discovered that networking is completely broken, so the instance
has yet to actually run any tests, and we still don't know how
unstable the bos01 cloud is.

I think the way things are going, we should reenable all bos01
workers ASAP once the networking works again, even if they're
super flaky. We are clearly limiting our resources, as shown
by the difference in how fast we reduce the queue :D

We need to get the infrastructure issues resolved too, though,
to make sure tests actually work.

--
debian developer - deb.li/jak | jak-linux.org - free software dev
ubuntu core developer i speak de, en

--
ubuntu-devel mailing list
ubuntu-devel@lists.ubuntu.com