Monday 5 December 2022

+1 maintenance report

Unfortunately due to interrupts and illness I didn't manage to get much
actual +1 maintenance done in my shift last week. Another big factor was
that I chose to dig into obscure autopkgtest failures to a depth I've
not gone before. This, together with waiting on some very long queues,
meant I spent most of my time trying to figure out how things work
rather than actually making progress in the queues. Hopefully
documenting my learning and progress here will help everyone, though.
But I don't really have any conclusions here; just mysteries that I'd
appreciate someone who understands clearing up, and magic behaviour
that I'd like to see documented.

vorlon sent me a ping suggesting I work on getting perl through, so I
focused on that. Most of vorlon's perl autopkgtest retries were still
queued, especially for arm64.

I found one spurious failure on apache2 and retried it.

Then I tried to find other things to unblock in the sea of red
autopkgtest failures listed against perl. But with the queues appearing
so large, I was reluctant to throw more retests at them in case those
retests were already queued by others. I spent some time trying to
figure out how I could detect whether a retry was already queued or
running before submitting it. https://autopkgtest.ubuntu.com/running is
helpful there, but the output was so long that it was not practical to
match a given entry against the architecture heading it appeared under.
I thought I recalled a colleague mentioning something about this in
ubuntu-helpers, but there seemed to be nothing relevant in its
README.md. Some hours later[2] I was directed to lp-test-isrunning,
which is exactly what I needed, and I added an entry to README.md about
it.
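
For reference, this is roughly the check I wanted to be able to do, as
a shell sketch. It assumes autopkgtest.ubuntu.com also serves
machine-readable views of the queues and the running tests alongside
the HTML pages (I believe queues.json and static/running.json exist,
but that's an assumption on my part; lp-test-isrunning is the proper
tool and what I'd actually use):

    # Crude check for whether a package already appears in the queues or
    # in the currently-running set before submitting another retry.
    # Matching on the bare package name will give false positives; this
    # is only a sketch of the idea behind lp-test-isrunning.
    pkg=perl
    curl -s https://autopkgtest.ubuntu.com/queues.json \
        | grep -q "$pkg" && echo "$pkg appears somewhere in the queues"
    curl -s https://autopkgtest.ubuntu.com/static/running.json \
        | grep -q "$pkg" && echo "$pkg appears to be running"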

Then, it seemed that nearly every failure I looked at matched a known
pattern for which vorlon had already submitted retries that were still
queued. I thought it would be helpful to find failures that did *not*
match the known patterns, but I couldn't find any tool to identify
those either. I spent some time studying existing tooling, and
concluded that retry-autopkgtest-regressions probably has some code I
could reuse to write one; a sketch of what I had in mind is below. But
by the time I started the next day, only the arm64 queues were left, so
rather than still trying to solve this problem, it was easier to just
focus on the non-arm64 failures.
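
To make the idea concrete, this is the shape of the tool I wanted, as a
hypothetical sketch. Both input files are things I would have had to
build by hand: known-patterns.txt holding regexes for the failure
patterns vorlon had already identified and retried, and
failing-logs.txt holding URLs of the failing log.gz files.

    # List failing logs that do NOT match any already-known failure
    # pattern, i.e. the ones worth a closer look. Inputs are
    # hypothetical:
    #   known-patterns.txt - one extended regex per line
    #   failing-logs.txt   - one log.gz URL per line
    while read -r url; do
        if ! curl -s "$url" | zcat | grep -qEf known-patterns.txt; then
            echo "unmatched: $url"
        fi
    done < failing-logs.txt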

I found htslib (an rdep of perl) failing on amd64 for what seemed to be
an unrelated reason, so I focused on that. This led me down a rabbithole
of undocumented behaviour, finding and inspecting code just to figure
out how to reproduce the failure at all, since the test succeeded
locally for me.

The actual code that converts an autopkgtest request into a run of the
autopkgtest command in infrastructure is
https://git.launchpad.net/autopkgtest-cloud/tree/charms/focal/autopkgtest-cloud-worker/autopkgtest-cloud/worker/worker.
Here you can see the handling of arguments like "triggers",
"all-proposed" and magic like "migration-reference/0" and suchlike. This
took a while to find - there's no mention of it from
autopkgtest.ubuntu.com. "Docs for admins" looked promising but neglects
to link it.

As used by infrastructure, --setup-commands
/home/ubuntu/autopkgtest/setup-commands/setup-testbed is presumably
equivalent to /usr/share/autopkgtest/setup-commands/setup-testbed as
shipped by the autopkgtest package itself.

As used by infrastructure, --setup-commands
/home/ubuntu/autopkgtest-cloud/worker-config-production/setup-canonical.sh
can be found at
https://git.launchpad.net/autopkgtest-cloud/tree/charms/focal/autopkgtest-cloud-worker/autopkgtest-cloud/worker-config-production/setup-canonical.sh
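
Putting those pieces together, the invocation the worker ends up
constructing looks roughly like this. I've reconstructed it by hand
from the worker code and the log in [3], so treat the exact flags and
values as illustrative rather than authoritative:

    # Approximate shape of the infrastructure's autopkgtest invocation
    # for an htslib run triggered by openssl (reconstructed, not copied
    # verbatim from a log; the trigger version is taken from the libssl3
    # version mentioned below, so it may not be exactly right).
    autopkgtest \
        --setup-commands /home/ubuntu/autopkgtest/setup-commands/setup-testbed \
        --setup-commands /home/ubuntu/autopkgtest-cloud/worker-config-production/setup-canonical.sh \
        --apt-upgrade \
        --apt-pocket=proposed=src:openssl \
        --env=ADT_TEST_TRIGGERS='openssl/3.0.5-2ubuntu2' \
        --output-dir=... \
        htslib \
        -- ssh ...   # virtualization server arguments elided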

The mechanism that translates triggers into the magic "just these from
proposed" behaviour is the --apt-pocket=proposed=... autopkgtest command
line option, so the actual apt pinning is done by autopkgtest itself.
The use of --env=ADT_TEST_TRIGGERS= etc. was a red herring for me: I
couldn't find anything in autopkgtest itself that uses this variable,
and only saw it being used during test result collection. So what does
ADT_TEST_TRIGGERS= do, anyway?
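
Locally, you can let autopkgtest do that pinning for you rather than
setting anything up by hand. Something like this sketch (the release,
image and trigger names are my choices) should, as I understand it,
leave apt pinning in the testbed that holds lunar-proposed back except
for packages built from the named source:

    # Local run that asks autopkgtest itself to pin proposed down to
    # just the openssl source, mirroring what the infrastructure
    # requests. --shell-fail drops you into a shell in the testbed on
    # failure, where the generated pinning is visible under
    # /etc/apt/preferences.d/.
    autopkgtest \
        --apt-upgrade \
        --apt-pocket=proposed=src:openssl \
        --shell-fail \
        htslib \
        -- lxd ubuntu-daily:lunar/amd64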

However, sometimes you might find --apt-pocket=proposed=... *not* used,
even when triggers appear to exist. I didn't get to the bottom of this.
Maybe it's something to do with the magic fallback behaviour I've heard
mentioned (an automatic retry with different conditions?), but I
couldn't find anything documented, nor any implementation of it. For
example, looking at [1], vorlon's test dated 2022-11-28 19:26:59 UTC
and my subsequent retest dated 2022-11-30 10:20:55 UTC were otherwise
identical, but looking at the logs, his was called with
--apt-pocket=proposed with no pinning defined, whereas mine did have
pinning defined. All I did was hit the retry button against vorlon's
test request and failure, and mine got run differently! I'd appreciate
someone explaining this to me.

I'm not certain it's related, but it turned out that in infrastructure,
libssl3 was being upgraded to the version in the proposed pocket before
the test was run[3]. I never figured out how to reproduce that. But if
I arranged the same start condition, by providing autopkgtest with an
environment where libssl3 was already on 3.0.5-2ubuntu2 (the version it
was upgraded to in the autopkgtest log from infrastructure), then I
could see what appeared to be the same problem. Even though openssl was
specified as a trigger, autopkgtest didn't seem to want to take
libssl-dev from the proposed pocket to satisfy test dependencies, and
the release pocket's libssl-dev has a hard versioned dependency on the
matching libssl3 version. However, it's not clear if this was *the*
problem. Two questions I ran out of time trying to answer: 1) Why was
libssl3 being upgraded from proposed before the test began? Maybe
something to do with NotAutomatic and pinning? I couldn't reproduce
this locally. 2) Why was libssl-dev not being pulled from proposed even
when --apt-pocket=proposed=openssl was being specified?
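
For completeness, this is roughly the sort of setup I mean by
"arranging the same start condition", sketched with lxd rather than my
exact commands; the container and image alias names are made up, and
the archive URL may need adjusting for your mirror:

    # Build a lunar container where libssl3 has already been upgraded to
    # the proposed version, publish it as an image, and point
    # autopkgtest at it. This reproduces the starting state seen in the
    # log in [3], not the mechanism by which the infrastructure got
    # there.
    lxc launch ubuntu-daily:lunar lunar-prep
    lxc exec lunar-prep -- sh -c '
        echo "deb http://archive.ubuntu.com/ubuntu lunar-proposed main universe" \
            > /etc/apt/sources.list.d/proposed.list &&
        apt-get update &&
        apt-get install -y --only-upgrade libssl3 &&
        rm /etc/apt/sources.list.d/proposed.list &&
        apt-get update'
    lxc stop lunar-prep
    lxc publish lunar-prep --alias lunar-libssl3-proposed
    autopkgtest --apt-pocket=proposed=openssl htslib \
        -- lxd local:lunar-libssl3-proposed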

In investigating the above, I chose to use lxd, since a package
dependency resolution issue should reproduce just fine in a container,
and working with containers is usually much quicker. But when I
couldn't reproduce against ubuntu-daily:lunar, I tried
images:ubuntu/lunar/amd64 for a minimal base image, only to find that
it doesn't exist, which surprised me given that images:debian/sid/amd64
does. So I tried upgrading from images:ubuntu/kinetic/amd64 instead,
and that worked (and didn't take long, the image being minimal). It
also helped me notice the libssl3 thing, because openssl happens to be
higher in kinetic-updates at the moment than in the lunar release
pocket, since it hasn't migrated from lunar-proposed yet.
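
The kinetic-to-lunar upgrade of the minimal image amounted to something
like this (the container name is arbitrary, and the sed assumes the
image still uses a one-line-per-entry /etc/apt/sources.list):

    # Work around the missing images:ubuntu/lunar/amd64 by upgrading the
    # minimal kinetic image to lunar in place.
    lxc launch images:ubuntu/kinetic/amd64 lunar-minimal
    lxc exec lunar-minimal -- sh -c '
        sed -i s/kinetic/lunar/g /etc/apt/sources.list &&
        apt-get update &&
        apt-get -y full-upgrade'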

Since then, jawn-smith, vorlon and costamagnagianfranco have all
retried this particular test with various parameters, and all of those
runs have passed. The original failure condition went away, I guess.
But all the logs and command lines are there, and I'm still none the
wiser about this magic behaviour.

[1] https://autopkgtest.ubuntu.com/packages/h/htslib/lunar/amd64
[2] Consider that "some hours" is like 10% or more of my week's entire
shift!
[3] https://autopkgtest.ubuntu.com/results/autopkgtest-lunar/lunar/amd64/h/htslib/20221130_102055_f8a4e@/log.gz