Thursday 4 June 2020

+1 maintenance June 2nd-5th

Hello everyone,
as some of you have already seen from the mails of others, I was
on +1 duty [17] this week.
This is a summary of what I've done in that regard.

## Tuesday

#### spice FTBFS

I started the week late on Tuesday (public holiday FTW) by looking at an FTBFS in groovy-proposed. I had looked at the same package a week earlier and things were fine then. That meant the cause of the FTBFS might affect many more packages, so it was worth investigating as part of +1 maintenance as well.
The armhf/arm64 builds of spice FTBFS [1], although a test build [2] worked fine at the end of last week.

I did some work on arm64 canonistack, but the builds worked for the old and new version of spice in focal, groovy and groovy-proposed.
Rebuilding the actual archive-builds worked as well.
Hmpf, some wasted time and a bad feeling about what might have happened.
But eventually things got unblocked, so minor-positive, right? :-)

---

#### azure-cli

I found that azure-cli had a failing test that blocked a bunch of other packages in -proposed.
I checked with the Debian maintainer and realized that we had synced the set over the last week, but some tests ran while other components were still missing (not the best cross dependencies). I found that logan had already unblocked python-azureclient, azure-cli and knacc, which have migrated already.
But a bunch of others (six, python-jmespath, javaproperties, python-tz) ran against the partial set and now needed retriggering with properly constructed triggers.

The results that came in so far worked with this setup - some are still waiting as the test queues are rather long at the moment.
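
A retrigger against the full set boils down to a request URL that carries one `trigger` parameter per package that must come from -proposed. The following is only a minimal sketch of how such a URL could be put together by hand. The helper names `encode_trigger` and `request_url` are made up for illustration, the azure-cli version is a placeholder, and only `/` and `+` are encoded here.

```shell
# URL-encode the two characters that commonly show up in a
# "package/version" trigger and must not appear raw: '/' and '+'.
encode_trigger() {
    printf '%s' "$1" | sed -e 's|/|%2F|g' -e 's|+|%2B|g'
}

# Usage: request_url RELEASE ARCH PACKAGE TRIGGER [TRIGGER...]
# Prints one autopkgtest request URL with all triggers appended.
request_url() {
    release=$1; arch=$2; package=$3
    shift 3
    url="https://autopkgtest.ubuntu.com/request.cgi?release=${release}&arch=${arch}&package=${package}"
    for t in "$@"; do
        url="${url}&trigger=$(encode_trigger "$t")"
    done
    printf '%s\n' "$url"
}

# The azure-cli version below is a placeholder, not the real one.
request_url groovy amd64 six six/1.15.0-1 azure-cli/1.0+ds-1
```

Listing every member of the set as a trigger is what keeps the test from running against a partial set again.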

---

#### perl

RikMills made me aware of another perl micro-version bump.
That will require at least the no-change rebuilds listed in [3] and maybe more down the road.
But when I checked, riscv64 wasn't ready yet - multiple dependent builds were blocked on perl 5.30.3 on riscv64.
I kept rechecking the build and, once it completed, issued no-change rebuilds for libpar-packer-perl, libdevel-cover-perl, libclass-xsaccessor-perl and libcommon-sense-perl.
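
For packages that are in sync with Debian, a no-change rebuild conventionally only bumps (or adds) a `buildN` suffix on the version. A minimal sketch of that version arithmetic follows. The function name `next_rebuild_version` is made up for illustration, and it deliberately ignores versions that carry an `ubuntuN` delta or anything after the build number.

```shell
# Usage: next_rebuild_version VERSION
# Prints the version a no-change rebuild of a Debian-synced package would get.
next_rebuild_version() {
    case "$1" in
        *build[0-9]*)
            base=${1%build*}      # everything before the last "build"
            n=${1##*build}        # the number after it
            printf '%sbuild%s\n' "$base" "$((n + 1))"
            ;;
        *)
            # No rebuild so far: start counting at build1.
            printf '%sbuild1\n' "$1"
            ;;
    esac
}

next_rebuild_version 1.19-3build4
next_rebuild_version 1.19-3
```

With these inputs the two calls print `1.19-3build5` and `1.19-3build1`.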

It was clear that later this week we might spot more packages that need similar handling.
Note: Once the rebuilds are completed I'll trigger all these tests again to get a clear view of what is really left broken.


## Wednesday

Since the test queue was so long, most of the things I had started before were still ongoing.
Also there wasn't much sense in picking up another task with a zillion related tests.

I added the bigger tasks that I found in excuses, but considered not "actionable" due to the overloaded test queue, to the status page [4] for others to pick up later.

---

#### plinth

The next best small thing that caught my eye and seemed un-owned was plinth.
It had a test failing in update-motd.

That was interesting as the history showed that this had been the case for the last 4 versions of plinth.
Due to that it didn't seem likely to resolve itself without some help.

This became an interesting case, with an Ubuntu-only package (update-motd) depending on 'freedombox' in its tests.
And freedombox in turn is very odd: not only is it unclear what the test needs it for,
it also pulls in a lot of dependencies and, to top things off, places /etc/apt/sources.list.d/freedombox2.list which enables buster-backports.

I filed an 'update-excuse' tagged Ubuntu bug [5] to avoid others having to re-debug this - and a Debian bug to please reconsider this behavior.

---

#### perl

The perl related tests have the problem that 1/5 run into the known dependency issues that I have issued the rebuilds for.
Due to the same long test queues these rebuilds can't get through the queue and migrate to be used automatically.
Therefore we know:
a) most of the perl related tests will fail and need to be retried with the extra deps as triggers
b) there are a lot of perl triggered tests in the queue

If the queue were empty one could wait until they fail, but in these times the wasted test time is worse.
So I asked around whether anyone could help cancel the currently queued tests so that I could insert them with the right triggers to run only once. But gladly it turned out not to be the hundreds/thousands of test restarts that it seemed to become yesterday.
Overnight most of the tests worked; the rebuilds that I had brought into groovy-proposed allowed them to pass.
The resolver will pick these from -proposed before giving up - tests now for example use the new "libclass-xsaccessor-perl s390x 1.19-3build5".
So I had based my estimate of to-be-cancelled runs on a skewed sample :-)

I retriggered 46 already failed tests and we can recheck another day - once the queue is drained - what is left to tackle.

---

#### php-horde

Next was a look at php-horde-*, which is not only split into many packages but due to that also has plenty of autopkgtests.
10/10 tests that I checked were blocked on the same issue: "E: Package 'php-horde-test' has no installation candidate"
If there is another issue, then this one needs to be resolved first for me to be able to see it :-)

It turned out to be rather easy: php-horde-test isn't in groovy-release - not even an older version.
Due to that the tests fail to find anything.

104 source packages and counting :-). The reason for all that is that the state of the packages caused mixed feelings before [6] and they were removed from focal.
It was removed from Debian as well [7] and the current flurry of builds & tests is caused by re-uploads to bring it back.
Some bits are still hanging in Debian's NEW queue, like the core "php-horde 5.2.21+debian1-1" itself.

We should give it a chance now, but if it looks as bad with proper test triggers it likely should be removed until this has resolved to a proper state in Debian (gladly the package was adopted, so this will become better over time).

For now we can't go on, this will need to wait until php-horde passes the new queue and is in groovy-proposed.
Then we want to run something like the following to properly restart the tests.

  $ wget https://people.canonical.com/~ubuntu-archive/proposed-migration/update_excuses.html
  $ for p in $(grep -Hrn '>php-horde-.*</a> (- to <a href=' update_excuses.html | sed -e 's/.*>php-horde-/php-horde-/' | sed -e 's/<\/a>.*//' ); do retry-autopkgtest-regressions --series groovy --blocks "${p}"; done | sed -e 's/$/\&trigger=php-horde-test%2F2.6.3%2Bdebian0-5\&trigger=php-horde%2F5.2.21%2Bdebian1-1/'
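
One detail matters when query parameters are appended with sed as in the pipeline above: an unescaped `&` in a sed replacement stands for the matched text, so a literal ampersand has to be written as `\&`. The small demonstration below uses a made-up example URL to show both variants.

```shell
# Hypothetical URL, only used to demonstrate the sed behaviour.
url='https://example.invalid/request.cgi?package=php-horde-core'

# Unescaped '&' expands to the (empty) match of '$', so the ampersand is lost.
unescaped=$(printf '%s\n' "$url" | sed -e 's/$/&trigger=php-horde%2F5.2.21%2Bdebian1-1/')

# Escaped '\&' appends a literal ampersand as intended.
escaped=$(printf '%s\n' "$url" | sed -e 's/$/\&trigger=php-horde%2F5.2.21%2Bdebian1-1/')

printf 'unescaped: %s\nescaped:   %s\n' "$unescaped" "$escaped"
```

Only the escaped form yields a URL in which the trigger is a separate query parameter.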

---

#### azure-cli / python-azure

The azure-cli issue of yesterday resolved as expected.
But once the view cleared it showed that "python-azure", which belongs to the same overall set, is still failing.
It has been doing so for quite a while, but resolving this is another step in getting "six", which is used by many python packages, out of proposed.

It turns out the old test run was using the former version '20200130+git-3', but since my efforts of yesterday unblocked all related azure packages this can now be solved with a simple test retry.
Before doing so I ran the test locally in qemu to be sure it would work.
It did, so I scheduled the retries into the long autopkgtest queue.

---

#### probert

Another bit hanging in proposed is probert. That is due to two component mismatches.
I talked with Odd_bloke so he can prepare a new upload and update [8] to get this eventually resolved.

---

#### Haskell

There are a lot of haskell-* in proposed right now.

After analyzing the initial state there were about 160 FTBFS packages [11] as well as a similar amount that built but are unable to migrate as they depend on others.

The FTBFS issues came down to two cases:
  - issues due to a new haddock-interface
  - further build dependency issues due to the above

Eventually this appeared too confusing to expect it to work well (which later turned out to be true).
So I sent a mail [9] to the Debian maintainers summarizing what I saw and asking about the current state.

---

## Thursday

#### perl

First I checked what state the perl tests were in.
The re-triggered tests of yesterday completed at least on a few arches, and the former dependency issues are resolved as planned.

A few more tests failed - but all causes seemed random and transient.
- dependency issues on Recommends causing the install to be skipped and the test to fail later (libppi-perl, libfile-changenotify-perl)
- infrastructure issues to (re)boot a test system, which needed restarts (liblocales-perl)
- time based tests running into timeouts which could be due to the high load (libfurl-perl, libio-async-loop-mojo-perl, libdevel-nytprof-perl)
...

Those were just a handful, often working on all but one architecture; they just had to be restarted and should be fine next time.
If a few of them turn out to be reproducible on the retry we can take a deeper look into those.

Overall this is in good shape for now, we just need to wait for it to churn through more of the tests.

---

#### plinth

The plinth bug I filed already sees action. After review of what is planned things LGTM.
The current version will stay stuck in groovy-proposed but the coming upload will have a fix that skips the parts that broke us if not running on Debian.
Other packages triggering plinth will test the former 20.3 version that was ok.
The 'update-excuse' bug marks it on the update-excuses page so that everyone knows.
No need to touch it anymore, it will resolve without further +1 help.

---

#### Haskell

After my initial triage of the state of Haskell yesterday I reached out to the Debian maintainers.
It turned out that this is an ongoing, longer set of changes due to a haddock-interface change.
It will take a while until things can be expected to work fine, and there is next to no gain in trying to fix it in Ubuntu right now.

We can leave them in proposed as-is and continue syncing from Debian.

I asked for a ping once the set of haskell-* packages is again expected to build fine together, as we then might need to retrigger a few builds and/or tests to get to the same state.

I documented the state on [4] to pick it up later myself or by whoever is on +1 duty by then.

---

#### azure-cli / python-azure

The python-azure tests I worked on yesterday completed as planned.
Due to that `six` migrated [10] to -release.
Nothing is left to do in that context.

---

#### perl

Of the four rebuilds we needed so far, libclass-xsaccessor-perl triggered some test errors in `lintian`.
Since that is eventually needed to complete perl I took a look at it.

The test history is not looking too bad, at least this kind of issue isn't common.
The issue occurred the same way on multiple architectures, but we only had a no-change rebuild.

Signature of the error is like:
  Not a HASH reference at /tmp/autopkgtest.AaMEBl/build.Cm4/src/lib/Lintian/Pool.pm line 156.
  # Looks like your test exited with 255 before it could output anything.

I retried locally with/without the rebuilt libclass-xsaccessor-perl and it was reproducible.
But interestingly the good one was with --apt-pocket=proposed=src:libclass-xsaccessor-perl and the bad one the as-is test.
Was this just flaky after all? I needed some more local re-runs to tell.

good  bad
 1     0     all-proposed
 3     0     proposed=src:libclass-xsaccessor-perl
 0     3     groovy-release as-is

This actually turned out to be a dependency issue in regard to libclass-xsaccessor-perl and a custom retry trigger will fix it, queued ...
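
The three configurations in the table above differ only in the `--apt-pocket` argument passed to autopkgtest. The following sketch prints (it does not run) the three calls. The helper name `autopkgtest_cmd` and the image path are assumptions for illustration, and the exact options of the local runs may well have differed.

```shell
# Usage: autopkgtest_cmd IMAGE [POCKET_SPEC]
# Prints the autopkgtest call for one row of the table, without running it.
autopkgtest_cmd() {
    image=$1
    pocket=${2:-}
    if [ -n "$pocket" ]; then
        printf 'autopkgtest --apt-pocket=%s lintian -- qemu %s\n' "$pocket" "$image"
    else
        printf 'autopkgtest lintian -- qemu %s\n' "$image"
    fi
}

image=/path/to/autopkgtest-groovy-amd64.img   # hypothetical image path

autopkgtest_cmd "$image" proposed                                   # all-proposed
autopkgtest_cmd "$image" proposed=src:libclass-xsaccessor-perl      # only the rebuild from -proposed
autopkgtest_cmd "$image"                                            # groovy-release as-is
```

Printing the commands first makes it easy to double check which pocket selection belongs to which row before spending time on the actual runs.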

---

#### ntirpc / nfs-ganesha

This was looked at before and nfs-ganesha needed an update.
To get that it was turned into a sync a few days ago, but that lost us a delta we need to keep, which now triggers a component mismatch:
  nfs-ganesha/amd64 unsatisfiable Depends: daemon

I prepared a (re-)merge [12] of our delta which will resolve this situation, and a PR to Debian [13] so that it can maybe become a sync eventually.

After getting that approved I uploaded it; that will at least fix the component mismatch.
I'll need to check test results once the queue has got to it.


---

#### ppc64 autopkgtest infrastructure errors

I found that there were a bunch of tests failing on infrastructure issues.
They looked like: "Creating nova instance ...ERROR: autopkgtest"
After chatting with the others it became clear that a few were already spotted and resolved.
I rescanned for these cases and compared them against queues.json [16] to avoid triggering things that were already enqueued.
Eventually only two more (sepia / pinto) were needed.
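
The comparison against the queue is essentially a set difference: candidates minus what is already enqueued. A minimal sketch follows, assuming both lists were already extracted into plain files with one package name per line. The function name `not_yet_queued` and the placeholder entry for an already queued package are made up for illustration.

```shell
# Usage: not_yet_queued CANDIDATES_FILE QUEUED_FILE
# Prints the candidates that do not appear (as whole lines) in the queued list.
not_yet_queued() {
    grep -v -x -F -f "$2" "$1"
}

tmp=$(mktemp -d)
# "some-already-queued-pkg" is a placeholder, not a real finding.
printf '%s\n' sepia some-already-queued-pkg pinto > "$tmp/candidates"
printf '%s\n' some-already-queued-pkg > "$tmp/queued"

not_yet_queued "$tmp/candidates" "$tmp/queued"
```

The `-x` and `-F` options make grep compare whole lines literally, so a package name that is merely a substring of another one is not filtered out by accident.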

---

#### Jinja2 / oca-core

This was in proposed for 37 days already but blocked by a test failure:
  jinja2 (2.10.1-2 to 2.11.1-1) in proposed for 37 days
  Regressions
  oca-core/11.0.20180730-1: amd64 (log, history), arm64 (log, history), armhf (log, history), ppc64el (log, history), s390x (log, history)

Other oca-core tests before and after worked fine according to the logs on autopkgtest.u.c.
Oca-core is essentially unchanged since forever and the jinja update is a minor upgrade that shouldn't break it.

The issue was reproducible locally and only occurred with the new jinja.
I found I can switch in/out the error condition by switching the installed version of python3-jinja2 between 2.11.1-1 and 2.10.1-2.

The server starts fine in both cases, but with the new version I got this in the server's error log:
  ERROR ? odoo.service.server: Failed to load server-wide module `web`
Followed by a python trace.

TL;DR oca-core really is incompatible with the new jinja2, jinja is the broken one and 2.11 will fix it.
I filed [14] to summarize the case and make it visible from update-excuses.

---

## Friday

This was a short day for me due to some private undertakings. But I wanted to at least ensure that all the ongoing +1 aspects I'm on are still continuing to completion.

---

#### ntirpc / nfs-ganesha

The merge I submitted yesterday built and migrated - and as expected that unblocked ntirpc as well.
This is completed, I'm removing it from the wiki's ongoing items [4]

---

#### php-horde

Still waiting in Debian's NEW queue; due to that there is still nothing to do.

---

#### perl

A bunch of cases failed overnight, the rest of those that completed are good.
I checked the failing cases and found three classes of issues:
- infrastructure issues with nova fails, these just need a retry
  kopanocore,libdist-zilla-plugin-makemaker-awesome-perl,
  libfeed-find-perl,libpdl-netcdf-perl,libsgmls-perl,
  libx11-protocol-perl
- dependency issues, I'll restart those with triggers for the known hard version dependencies.
  If they still fail after running over the weekend we need to reproduce them and
  check which further packages need treatment to work together more nicely
  libfurl-perl,libfile-changenotify-perl,libhttp-cookies-perl,
  libmojolicious-perl,libppi-perl,libsearch-elasticsearch-perl,
  libsys-syscall-perl,libterm-filter-perl,libtest-mocktime-perl,
  libxml-rpc-fast-perl
- timeouts and URL issues of some sorts, test history suggests at least some of them might just be flaky
  lintian,mysql-8.0,rspamd

But for some extended joy with all of this we now got [15] which means all these tests will be restarted for the new version anyway.
Due to that I have -not- restarted the old ones I have analyzed above.

To help the test queue, could someone with the ability to do so please cancel the remaining 7329 tests for the older version?
The regexp that matches those entries in an unescaped [16] would be:
  '\\n{"triggers": \["perl/5.30.3-1"\]'
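
Before cancelling anything it can help to count what such a pattern would hit. The sketch below counts matching entries in a local dump, using a fixed-string search as a simpler stand-in for the regexp. The sample lines are made up and only mimic the entry shape the regexp implies (package name, a literal `\n`, then the JSON with the triggers), one entry per line. The function name `count_queued_for_trigger` is hypothetical as well.

```shell
# Usage: count_queued_for_trigger FILE TRIGGER
# Counts lines containing the literal text  \n{"triggers": ["TRIGGER"]
count_queued_for_trigger() {
    grep -c -F "\\n{\"triggers\": [\"$2\"]" "$1"
}

sample=$(mktemp)
# Made-up entries; the quoted 'EOF' keeps the backslashes literal.
cat > "$sample" <<'EOF'
libfoo-perl\n{"triggers": ["perl/5.30.3-1"]}
libbar-perl\n{"triggers": ["perl/5.30.3-2"]}
libbaz-perl\n{"triggers": ["perl/5.30.3-1"]}
EOF

count_queued_for_trigger "$sample" perl/5.30.3-1
```

Because the closing bracket is part of the search string, tests that list additional triggers next to perl are not counted and would not be cancelled by the same pattern.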

I've pinged Laney asking to have those removed from the test queue.
 
---

## Misc

Along with all that I retriggered plenty of known-flaky tests that showed up in excuses, just to get things moving.
Those are not worth mentioning individually.

---

[1]: https://launchpad.net/ubuntu/+source/spice/0.14.3-1ubuntu1
[2]: https://launchpad.net/~ci-train-ppa-service/+archive/ubuntu/4078/+packages
[3]: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=962012
[4]: https://wiki.ubuntu.com/PlusOneMaintenanceTeam/Status
[5]: https://bugs.launchpad.net/ubuntu/+source/plinth/+bug/1881860
[6]: https://bugs.launchpad.net/ubuntu/+source/php-horde/+bug/1868281
[7]: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=942282
[8]: https://bugs.launchpad.net/ubuntu/+source/probert/+bug/1830347
[9]: https://lists.debian.org/debian-haskell/2020/06/msg00003.html
[10]: https://launchpad.net/ubuntu/+source/six/1.15.0-1
[11]: https://paste.ubuntu.com/p/mfRXfQPwJp/
[12]: https://code.launchpad.net/~paelzer/ubuntu/+source/nfs-ganesha/+git/nfs-ganesha/+merge/385106
[13]: https://salsa.debian.org/debian/nfs-ganesha/-/merge_requests/1
[14]: https://bugs.launchpad.net/ubuntu/+source/jinja2/+bug/1882095
[15]: https://launchpad.net/ubuntu/+source/perl/5.30.3-2
[16]: http://autopkgtest.ubuntu.com/queues.json
[17]: https://wiki.ubuntu.com/PlusOneMaintenanceTeam

--
Christian Ehrhardt
Staff Engineer, Ubuntu Server
Canonical Ltd