Thursday, 8 September 2016

Handling qemu machine types for migration

Thanks for your patience, and sorry for the wall of text; the complexity of the topic requires it.

## TL;DR ##

- Qemu machine types have been in a bad state since Wily, potentially
  affecting Live Migration, Save/Restore and Upgrades.
- This is a proposal to clean them up and to define how to handle
  machine types and vmstate changes in the future.
- Target is that all live migrations from supported to newer
  releases will work when the default machine type is used
  - except for current Xenial/Wily hosts which will need at
    least one reboot first to clean up the current issues

## Introduction ##

Live migration has been generally available for quite some time. Despite the complexity of testing it across a variety of releases, machine types, and architectures, it continues to get better. Yet it still sometimes causes issues, especially when migrating between different host versions. For example, in the past we ran into issues [1][2] related to that, and even worse, a few of them seem to have resurfaced recently.

The purpose of this proposal is to focus on and improve the quality and user experience of live migration in Ubuntu. There is a good summary of the general steps taken during a migration at [3] if one wants a refresher for this discussion (especially the chapters on vmstate, updating devices and their subsections).

I went through several discussions about the topic, trying to collect the different POVs, and I realized that one has to be careful not to end up in a contradictory state. I hope this proposal will lay the base for a good discussion and eventually properly define how we handle this in the Ubuntu scope.

I have to admit that machine types have not been handled well recently: the definition of wily seems incorrect, there is no type for xenial, and there are no types for non-x86 architectures at all. So we are in about the worst state now, and it is time to clean up. This is exactly why we are trying to revise this once more and do better this time. Yes, it is urgent, as every further day with the broken wily default in xenial could cause issues later on, but we still have to do it right or we make it worse.

Just in case anyone asks: aside from this discussion about how Ubuntu shall handle machine types, we also spun off an effort to increase QA and test coverage on this particular topic. If you want to be part of this endeavor, please send me a direct mail without a reply-all to this thread.

## Use Cases ##

Consistency - so far we only added Distribution specific machine types to the x86 pc-i440fx type, even though most changes that caused us to do so actually affected the pc-q35 type as well. The same can be true for non-x86 architectures. So it is up for discussion, but I think everything discussed below should apply to all supported major server architectures (amd64/i386, arm64, ppc64el, s390x).

There are two use-cases that drive the need. First, we'd like users who have deployed VMs on Ubuntu LTSes to be able to live migrate their VMs to the next LTS. This means that a qemu VM launched with a machine type of 'pc-1.0' should be the same on LTS and LTS-next. In the past, upstream 'pc-XX' types have changed, and there is still no requirement that they stay unchanged between qemu releases [1], thus we could not rely upon unversioned upstream machine names. So in the past we introduced a downstream release name [4]. Unlike other downstreams [5], we did so on top of the upstream types. For backward compatibility we are more or less obliged to keep any released types around as long as they are supported.

OTOH Debian and Fedora just have the machine classes as-is from upstream.

We are an outlier in the sense that we keep the upstream types and only add our own ones.

But today's upstream "versioned" machine names should be fairly safe since 2.x [6], at least if all devices made a full transition to vmstate. That is only true, however, if we as a downstream add or backport no patches affecting that.

This is not as rare as one might hope. An example of such a change, which even applies across all architectures that use SCSI, can be seen at [7] in the patch for CVE 6351.

Without thinking about SRUs yet (see below), up to today the default machine type is:


Each following release will keep the previously defined aliases to the specific types for compatibility. Whoever adds a delta has to make sure not only to add a new type, but also to maintain compatibility for the old type. Once our spun-off effort to test this better is in place, it could hopefully be used to verify that.

In general we want to add a Distribution specific type on every release. One could argue that we should instead evaluate whether there is a diff that actually causes any divergence from the usual types. But doing so severely increases maintenance effort and skill requirements on one hand, and on the other hand it makes the type overview very inconsistent, as in "where is the one type missing in between those releases?".

The second use-case where the distro release machine type helps is when we introduce (SRU) new functions within the same release that require an update to the machine type. An example here is ppc64el, where we're backporting a feature from qemu 2.6 which adds a new hardware device that users need to have available by default when creating a machine. If this feature were added to the pseries-2.5 type, we would have the same issue again: an 'old' VM with type 'pseries-2.5' does not match an updated qemu where 'pseries-2.5' now has a new element; migration will fail, and even updates might.
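To make the failure mode concrete, here is a toy model, not how qemu actually compares machine state. It assumes a machine type can be reduced to a set of device names; `can_migrate`, the device names and the type name `pseries-xenial-1` are all hypothetical and only serve the illustration.

```python
def can_migrate(source_defs, dest_defs, machine_type):
    """Toy check: both hosts must define the machine type identically."""
    return (
        machine_type in source_defs
        and machine_type in dest_defs
        and source_defs[machine_type] == dest_defs[machine_type]
    )


# Hypothetical device sets; real machine types carry far more state.
BASE = frozenset({"base-devices"})
WITH_BACKPORT = BASE | {"backported-device"}

# Host still running the qemu from before the backport.
old_host = {"pseries-2.5": BASE}
# Variant 1: the backport changes the existing type in place.
changed_in_place = {"pseries-2.5": WITH_BACKPORT}
# Variant 2: the backport adds a new type (hypothetical name) and
# leaves the old definition untouched.
new_type_added = {"pseries-2.5": BASE, "pseries-xenial-1": WITH_BACKPORT}

print(can_migrate(old_host, changed_in_place, "pseries-2.5"))  # False
print(can_migrate(old_host, new_type_added, "pseries-2.5"))    # True
```

In this model the guest started as 'pseries-2.5' only finds a matching definition on the destination if the backport was delivered as an additional type.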

This second case drives the need for a "point-release" element in the downstream names. This is not tied to a usual Ubuntu LTS point release, but to anything introducing a delta to machine type / vmstate. Similar to CentOS/RHEL, we want:


There are actually two kinds of SRU/Backports one has to consider differently here.

One is a feature backport that should be added into an LTS release. Such things are a planned task and should be batched together to match the sub-releases of an LTS. This will avoid proliferation of those subtypes. Of course if there was no change on any given dot release there is no need to add a new incremented type.

The other case is an SRU for a security issue or severe bug. These are usually unplanned and have to be taken as an emergency measure. In that case users are usually encouraged to restart their workload anyway to pick up the change, just as you know it from e.g. some kernel fixes. VENOM [8], for example, affected the floppy device; note in particular the resolution details, which indicate the need to run the new binary via stopping/starting or via migration (which invokes the new binary). I think this supports exactly the case here: when fixing a CVE, it is desirable to retain the same machine type to support a no-downtime "restart" of the binary. Also, there is neither a need for nor any good in keeping the old "broken" machine type around; you don't want the ability to say "hey, I can still start this with the CVE not fixed". So for these cases there will be no bump to the machine type. Users will be unable to migrate from an old broken system to a new fixed one, but they are supposed to restart anyway. This again prevents a proliferation of types, but more importantly ensures we won't be forced to keep broken types around.

Finally, at some point one has to stop adding to an ever growing delta. So the thought is to clean out old machine types once they are no longer supported and leave the migration paths roughly matching the supported upgrade paths. That means an LTS unifies former releases and upgrades have to "go through them".
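The cleanup rule could be sketched roughly as follows. This is one possible reading, not a settled policy: it assumes non-LTS definitions are still shipped in the next LTS itself (so that migration into that LTS works) and are dropped in releases after it, and that LTS definitions are dropped once unsupported. The `Release` class, `releases_to_keep` and the placeholder release names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Release:
    name: str
    is_lts: bool
    supported: bool


def releases_to_keep(history):
    """Return names of releases whose machine types stay defined.

    history is ordered oldest to newest; the last entry is the release
    currently being built and is always kept.
    """
    keep = []
    for i, rel in enumerate(history[:-1]):
        if rel.is_lts:
            # Former LTS types stay only while that LTS is supported.
            if rel.supported:
                keep.append(rel.name)
        else:
            # Non-LTS types go away once an LTS sits between them and
            # the release being built.
            superseded = any(r.is_lts for r in history[i + 1:-1])
            if not superseded:
                keep.append(rel.name)
    keep.append(history[-1].name)
    return keep


# Placeholder releases with illustrative support flags.
history = [
    Release("lts-old", is_lts=True, supported=False),
    Release("lts-prev", is_lts=True, supported=True),
    Release("r1", is_lts=False, supported=False),
    Release("r2", is_lts=False, supported=True),
    Release("lts-new", is_lts=True, supported=True),
]

print(releases_to_keep(history))
print(releases_to_keep(history + [Release("r3", False, True)]))
```

With these inputs the build of `lts-new` still carries `r1` and `r2`, while the build of `r3` only carries the two supported LTS releases plus itself.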

## Summarized Proposal ##

Handle machine types by:
- Add Distribution release specific suffix to the default type(s)
  of each major arch; examples with xenial
  - x86: pc-i440fx-xenial and pc-q35-xenial
  - s390x: s390-ccw-virtio-xenial
  - ppc64el: pseries-xenial
  - arm64: virt-xenial
- Feature backports will add a -%d to the affected types
  - To avoid a proliferation of those types such changes will
    be bundled along LTS dot releases.
  - The -%d suffix will match the related dot release it was
    released with
- Bugfix/security SRUs affecting this will not add an increment
  - They will either not affect it anyway (no-op)
  - Or are so important that users have to restart the guests
    anyway to pick up the fix
- Default if no machine type is specified will always point to
  the latest Distribution specific machine type
- We are not dropping upstream types, they are provided as-is
  without further guarantees
  - Cross vendor/downstream migrations might work for upstream
    types, but are considered not supported
  - This has always been the case, but the package doc or similar
    might need to be updated to reflect this.
- Cleanup matching the usual supported Distribution upgrade paths
  - Drop former non LTS release definitions after next LTS
  - Drop former LTS release definitions when out of support

## Example ##

An example flow through releases and upgrades:

A release that has a machine type / vmstate diff for all x86 based machines, but none for others.


Gets a xenial feature backport SRU on LTS dot release, but it only affects q35 based machines


Gets an SRU for a CVE, users are supposed to restart to pick up the fix

 <no change>

Gets another Feature backport SRU that affects all types on next dot release


In a more visual overview this can be seen at [9].
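The SRU steps of the flow above can also be walked through in a few lines. This is a sketch under assumptions: the release is taken to be xenial, the two feature backports are assumed to land in dot releases 1 and 2, every base type is assumed to start with a plain `-xenial` type, and `apply_sru` is a hypothetical helper.

```python
BASES = ["pc-i440fx", "pc-q35", "pseries", "s390-ccw-virtio", "virt"]


def apply_sru(types, release, kind, dot_release=None, affected=()):
    """Return the {base: [type names]} mapping after one SRU.

    kind is "feature" (adds a -<dot_release> type to each affected base)
    or "security" (changes no type names at all).
    """
    if kind == "security":
        return types
    updated = {base: list(names) for base, names in types.items()}
    for base in affected:
        updated[base].append(f"{base}-{release}-{dot_release}")
    return updated


# Release time: one distribution specific type per base.
types = {base: [f"{base}-xenial"] for base in BASES}
# Feature backport on the first dot release, q35 only.
types = apply_sru(types, "xenial", "feature", 1, ["pc-q35"])
# CVE fix: no new type, guests are expected to be restarted.
types = apply_sru(types, "xenial", "security")
# Feature backport on the next dot release, all types.
types = apply_sru(types, "xenial", "feature", 2, BASES)

# The default for each base is the newest entry.
defaults = {base: names[-1] for base, names in types.items()}
for base in BASES:
    print(base, types[base], "default:", defaults[base])
```

Under these assumptions only pc-q35 ends up with a `-1` type, and every base ends up with a `-2` type as its default.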

I encourage you to participate in the discussion now, to avoid overhauling this once again in the near future.


Old bugs on the topic: