Thursday 13 May 2021

Re: Packaging policy discussion: After=network-online.target

On Thu, May 13 2021 at 17:34:58 +0100, Dimitri John Ledkov
<dimitri.ledkov@canonical.com> wrote:
> On Thu, May 13, 2021 at 4:12 PM Steve Langasek
> <steve.langasek@ubuntu.com> wrote:
>>
>> Hi there,
>>
>> On Wed, May 12, 2021 at 05:52:07PM +1000, Christopher James Halse
>> Rogers wrote:
>> > There's an nfs-utils SRU¹ hanging around waiting for a policy decision on
>> > use of the After=network-online.target systemd unit dependency. I'm not an
>> > expert here, but it looks like part of my SRU rotation today is starting the
>> > discussion on this so we can resolve it one way or another!
>>
>> > I am not an expert in this area, but as I understand it, the tradeoff here
>> > is:
>> > 1. Without a dependency on After=network-online.target there is no guarantee
>> > that the network interface(s) will be usable at the time the nfs-utils unit
>> > triggers, and nfs-utils will fail if the relevant network interface is not
>> > usable, or
>> > 2. With a dependency on After=network-online.target nfs-utils will reliably
>> > start, but if there are any interfaces which are configured but do not come
>> > up this will result in the boot hanging until the timeout is hit.
>>
>> > In mitigation of (2), there are apparently a number of default packages
>> > which already have a dependency on After=network-online.target, so boot
>> > hanging if interfaces are down is the status quo?
>>
>> From one of the comments in the bug report, I gathered that systemd upstream
>> (who, specifically?) was taking a position that distributions should not use
>> After=network-online.target. I think this is entirely unhelpful; the target
>> exists for this purpose, it is not required for systemd internally to get
>> the system up but exists only for other services to depend on.
>>
>> There are risks of services not starting on boot because the network-online
>> target is not reached. However, that is not the same thing as a "hung
>> boot", because other services will still start on their own, and things like
>> gdm and tty don't depend on network-online.target, *unless* you're in a
>> situation where you've introduced a dependency between the filesystem and
>> network-online. This is possible when we're talking about nfs, because the
>> same system might both export nfs filesystems and mount them from localhost.
>> But I'm not sure it should block this specific change.
>>
>> > The obvious thing to do here would be to follow Debian, but as far as I can
>> > tell there is not currently a Debian policy about this - the best I can find
>> > is an ancient draft of a best-practises-guide² suggesting that packages
>> > SHOULD handle networking dynamically, but if they do not MUST have a
>> > dependency on After=network-online.target
>>
>> > As far as I understand it, handling networking dynamically requires upstream
>> > code changes (although maybe fairly simple code changes?).
>>
>> It does require upstream code changes; not always simple. And it's not
>> always *correct* to make upstream code changes instead of simply starting
>> the service when the system is "online"; you can find a number of examples
>> in Ubuntu of services that it only makes sense to start once your network is
>> "up" - e.g. apt-daily.service, update-notifier, whoopsie, ...
>>
>>
>> There are issues with the network-online target, to be sure. There is not a
>> clear definition of the target, and there have definitely been
>> implementation bugs in what does/does not block the target. I've had
>> discussions with the Foundations Team in the past about this but it has yet
>> to result in a specification.
>>
>> My working definition of what network-online.target SHOULD mean is:
>>
>> - at least one interface is up, with routes
>> - all interfaces which are 'optional: no' (netplan sense) are up
>>   - including completion of ipv6 RA and ipv4 link-local if enabled on the
>>     interface
>> - there is a default route for at least one configured address family
>> - attempts to discover default routes for other configured address families
>>   have completed (success or fail)
>> - DNS is configured
>>
>> Things that must not block the network-online target:
>> - interfaces that are marked 'optional: yes'
>> - address sources that are listed in 'optional-addresses' for an interface
>> - default route for an address family for which no interfaces have addresses
>>
>> At least historically, neither networkd nor NetworkManager has fulfilled
>> this definition. It would be nice to get there, but the first step is
>> having some agreed definition such as the above so that we can treat
>> deviations as bugs.
>>
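
To make sure I'm reading the working definition above the same way as
everyone else, here it is as a rough Python predicate. This is only a
sketch: the Interface and NetworkState types and all of their field names
are invented for illustration, and neither networkd nor NetworkManager
exposes state in this shape. I have also assumed that "at least one
interface is up" may be satisfied by any interface, optional or not, which
the definition does not actually say.

```python
from dataclasses import dataclass


@dataclass
class Interface:
    # Hypothetical model of one configured interface.
    name: str
    optional: bool = False                       # netplan 'optional:' flag
    up: bool = False
    has_routes: bool = False
    families: frozenset = frozenset()            # families with an address, e.g. {"ipv4"}
    pending_sources: frozenset = frozenset()     # e.g. {"ipv6-ra", "ipv4-ll"} not yet done
    optional_addresses: frozenset = frozenset()  # netplan 'optional-addresses:'


@dataclass
class NetworkState:
    # Hypothetical snapshot of the whole system.
    interfaces: list
    default_routes: frozenset = frozenset()      # families that have a default route
    discovery_pending: frozenset = frozenset()   # families still looking for one
    dns_configured: bool = False


def is_online(state):
    # At least one interface is up, with routes.
    if not any(i.up and i.has_routes for i in state.interfaces):
        return False
    # Every non-optional interface is up, and has finished every address
    # source that is not listed in its optional-addresses.
    for i in state.interfaces:
        if i.optional:
            continue
        if not i.up or (i.pending_sources - i.optional_addresses):
            return False
    # Only address families that some interface actually has an address
    # for are considered; other families must not block.
    configured = frozenset().union(*(i.families for i in state.interfaces))
    if not (state.default_routes & configured):
        return False
    # Default-route discovery for the remaining configured families has
    # finished, whether it succeeded or failed.
    if state.discovery_pending & configured:
        return False
    return state.dns_configured
```

Writing it down like this mostly shows where the definition still needs
decisions, e.g. whether an up optional interface alone is enough to count
as "online".
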
>
> If netplan.io can implement that, it would be nice. I.e. either
> synthetically (by generating a service unit on the fly that calls
> systemd-networkd-wait-online with extra arguments specifying all the
> non-optional interfaces), or by creating a new binary,
> "netplan-wait-online", which will be wanted by network-online.target
> and perform all of the above.
>
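
A sketch of what the synthetic variant might compute, in Python. The
assumptions are mine: the netplan YAML has already been parsed into a
plain dict, netplan IDs are the same as kernel interface names (not true
once match:/set-name: are involved), and the function name is made up.
The --interface= and --timeout= options are the ones documented for
systemd-networkd-wait-online.

```python
WAIT_ONLINE = "/lib/systemd/systemd-networkd-wait-online"


def wait_online_argv(netplan, timeout=30):
    """Build a wait-online command line naming every non-optional interface."""
    network = netplan.get("network", {})
    required = []
    for section in network.values():
        # Skip scalar keys such as 'version' and 'renderer'; device
        # sections (ethernets, wifis, bridges, ...) are mappings.
        if not isinstance(section, dict):
            continue
        for name, cfg in section.items():
            # Interfaces are required unless marked 'optional: true'.
            if not (cfg or {}).get("optional", False):
                required.append(name)
    argv = [WAIT_ONLINE, f"--timeout={timeout}"]
    argv += [f"--interface={name}" for name in sorted(required)]
    return argv


example = {
    "network": {
        "version": 2,
        "ethernets": {
            "eth0": {"dhcp4": True},
            "eth1": {"dhcp4": True, "optional": True},
        },
        "wifis": {"wlan0": {"optional": True}},
    }
}
print(" ".join(wait_online_argv(example)))
```

This only covers the "non-optional interfaces are up" part; the
default-route and DNS conditions would still need something beyond what
wait-online checks.
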
>> > It seems unlikely that, whatever we decide, we'll immediately do a full
>> > sweep of the archive and fix everything, so it looks like our choice is
>> > between:
>>
>> > 1. The long-term goal is to have no After=network-online.target dependencies
>> > in default boot (stretch goal: in main). Whenever we run into a
>> > package-fails-if-network-is-not-yet-up bug, we patch the code and submit
>> > upstream. Over time we audit existing users of After=network-online.target
>> > and patch them for dynamic networking, as time permits.
>>
>> > 2. We don't expect to be able to reach no After=network-online.target
>> > dependencies in the default boot, so it's not a priority to avoid them.
>> > Whenever we run into a package-fails-if-network-is-not-yet-up bug, we add an
>> > After=network-online.target dependency.
>>
>> 3. We expect to reach network-online.target in the common case, but accept
>> that there are systems for which it will ordinarily not be reached on boot
>> (i.e. offline systems). Services which depend on network-online.target
>> should be those which it is reasonable to not start if the system is not
>> connected to the Internet. This includes systems that are connected to a
>> local network, but have no default route.
>>
>
> So from my point of view, a short-term fix such as adding
> After=network-online.target, or even
>
> [Unit]
> After=systemd-resolved.service
> [Service]
> ExecStartPre=-/lib/systemd/systemd-networkd-wait-online --any --timeout 30
>
> is fine to be SRUed.
>
> However, I still have the same question - what if network connectivity
> drops & gets re-established? Should we bounce the
> network-online.target (aka restart it)? We can declare that units be
> restarted when network-online.target is restarted, if they are
> otherwise incapable of dynamically detecting networking loss &
> networking resumption.

Hah! I've actually received a reply off-list relevant to this. They
found network-online.target to be unreliable for nfs & xdmcp.
Apparently, because of the spanning tree search by their network
switches, the interface would be briefly available, activating the
relevant systemd unit dependencies, and then not work for about 25 seconds.
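
A one-shot "is it up yet?" check can't catch that kind of flap. One
possible workaround is a pre-start helper that only succeeds once the
link has stayed usable for some period. Below is a minimal Python sketch
of that idea; the function, its parameters and the probe callback are all
hypothetical, and the probe would have to be something service-specific
(e.g. can I reach the NFS server).

```python
import time


def wait_until_stable(probe, stable_for=30.0, timeout=120.0, interval=1.0,
                      clock=time.monotonic, sleep=time.sleep):
    """Return True once probe() has succeeded continuously for stable_for
    seconds, or False if that has not happened when timeout expires."""
    deadline = clock() + timeout
    up_since = None
    while True:
        now = clock()
        if probe():
            if up_since is None:
                up_since = now      # start of the current "up" streak
            if now - up_since >= stable_for:
                return True
        else:
            up_since = None         # a flap resets the streak
        if now >= deadline:
            return False
        sleep(interval)
```

With stable_for set a bit above the 25 seconds reported, the brief
initial "up" would be ignored and the service would only start once the
port is really forwarding.
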

>
>>
>> If we use this as the standard, it's easy to see that *in principle*
>> nfs-utils shouldn't depend on there being a route to the global Internet.
>> It does, however, at least give us a framework for understanding the
>> behavior, and for users to modify the behavior if they have different
>> requirements.
>>
>>
>> None of this makes it any safer for an SRU, since at the end of the day if
>> users have such a config that is impacted if you set
>> After=network-online.target for nfs-utils, it would still be a regression.
>>
>> --
>> Steve Langasek                 Give me a lever long enough and a Free OS
>> Debian Developer                 to set it on, and I can move the world.
>> Ubuntu Developer                                 https://www.debian.org/
>> slangasek@ubuntu.com                                   vorlon@debian.org
>> --
>> ubuntu-devel mailing list
>> ubuntu-devel@lists.ubuntu.com
>> Modify settings or unsubscribe at:
>> https://lists.ubuntu.com/mailman/listinfo/ubuntu-devel
>
>
>
> --
> Regards,
>
> Dimitri.
>
