Tuesday, 14 February 2017

Re: netplan and post-up/pre-down scripts

On Mon, Feb 13, 2017 at 8:21 AM, Mathieu Trudel-Lapierre <mathieu.trudel-lapierre@canonical.com> wrote:
I agree that booting should continue for any device that is down, even a configured one, when it is marked as fine to still be down. I think rather than trying to skip a long delay that would happen in some other cases, we should revisit whether it makes sense for it to take 5 minutes; or even make this configurable. 5 minutes is a lot; 1 minute is acceptable in many configurations, 30 seconds is ideal in some. All this only makes sense for DHCP-configured interfaces where the link is up but no DHCP response has been received. If the link is down, there is no point in ever waiting. This may need to be fixed in networkd and NM to allow configuring the DHCP timeout.

Yes; perhaps even a kernel parameter would be good. That way it could be customized depending on the purpose of the deployment.

For example, I might want a 1-minute wait time before declaring interfaces "up" if I'm deploying a high availability web proxy with multiple bonded interfaces. Or I might want it to not wait at all if I'm deploying a network infrastructure device, where I would expect that the interface configuration needs to be flexible enough to never prevent me from booting the system.
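
To make that policy concrete, here is a rough sketch in Python. It is purely illustrative: the function name, its parameters, and the idea of a per-deployment timeout value are all hypothetical, and this is not how networkd or NetworkManager behave today. The 300-second fallback is just the 5-minute figure mentioned above.

```python
from typing import Optional

# Fallback taken from the 5-minute delay discussed in this thread (an assumption,
# not a value read from any real tool's configuration).
DEFAULT_DHCP_TIMEOUT = 300.0


def boot_wait_seconds(link_up: bool, uses_dhcp: bool, has_lease: bool,
                      dhcp_timeout: Optional[float] = None) -> float:
    """Return how long boot may still wait on one interface (hypothetical policy)."""
    if not link_up:
        # No carrier: there is no point in ever waiting.
        return 0.0
    if not uses_dhcp or has_lease:
        # Static configuration, or DHCP already answered: nothing to wait for.
        return 0.0
    if dhcp_timeout is None:
        return DEFAULT_DHCP_TIMEOUT
    # A timeout of 0 means "never block boot on this interface".
    return max(0.0, dhcp_timeout)
```

With this sketch, the high availability proxy case would pass dhcp_timeout=60.0, and the network infrastructure device would pass dhcp_timeout=0.0 so that boot is never held up.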

(In general, though, I still think we should move toward eliminating the timeout altogether as a long-term goal, even if that may be somewhat idealistic.)

I am not aware that any of the different network management tools were planned to address the administrative status of an interface. It may in fact even require kernel work (I haven't looked yet) to allow an interface's state to be set to administratively down.

In my (very limited) testing, I can do something like "ip link set dev <device> <up|down>" and when I then look at "ethtool <device>" I see "Link detected: <yes|no>", respectively. (But I'm not sure what the initial state of the interface is before ifupdown touches it.)
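
For checking the initial state without touching the interface, something along these lines might work. It is only a sketch, assuming the usual Linux sysfs layout under /sys/class/net (the "flags", "carrier", and "operstate" files); the function name and the sysfs_root parameter are my own, added so it can be pointed at a test directory.

```python
from pathlib import Path

IFF_UP = 0x1  # administrative "up" bit, as defined in <linux/if.h>


def link_state(ifname: str, sysfs_root: str = "/sys/class/net") -> dict:
    """Report administrative state and carrier for one interface via sysfs."""
    base = Path(sysfs_root) / ifname
    # "flags" holds the interface flags as a hex string such as "0x1003".
    flags = int((base / "flags").read_text().strip(), 16)
    admin_up = bool(flags & IFF_UP)
    try:
        carrier = (base / "carrier").read_text().strip() == "1"
    except OSError:
        # On the kernels I have seen, reading "carrier" fails while the
        # interface is administratively down, so treat that as no link.
        carrier = False
    operstate = (base / "operstate").read_text().strip()
    return {"admin_up": admin_up, "carrier": carrier, "operstate": operstate}
```

Calling link_state("<device>") early in boot, before ifupdown, networkd, or NetworkManager has run, would show whether the interface starts out administratively up or down.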

Please file a bug about this. I'll review and look at how we can specify this in netplan, and how we can drive it in the various backends... But I expect it will require work in both networkd and NetworkManager before it works correctly. Servers aren't typically used in such a way, and setting that option seems like it might be quite intrusive. (I mean, of course the default wouldn't be for interfaces to be administratively down, but I can see issues coming up from it. One of them is how to do it in the first place, and another is that it would only happen once the renderer (networkd/NM) is up and running, so potentially you've already sent/received packets on the interfaces... ideally, that shouldn't happen, but maybe I'm overthinking it.)

Right; it might make a difference what the initial state of the interface is (before ifupdown, networkd, or NetworkManager takes control of it). I'm pretty sure I've observed ifupdown bringing interfaces online, and leaving interfaces offline if they aren't mentioned in /etc/network/interfaces (though I'd have to double-check what state they start in). So it might be that this is already happening, at least to some extent.

But I agree; ideally the service which renders the network configuration will bring the interfaces online, to avoid the link coming up before it's ready. (But if we were to iterate on this, I would say that the interface coming up in a 'default up' state for a short time would be a small price to pay on the path toward getting this right.)

I'll sleep on it and try to get a bug filed tomorrow.

Of course. If something is administratively down, we must think of it in terms of *powered down*. Otherwise it's simply not configured. 

Indeed, from the user perspective, it should seem to be powered down (in that if an Ethernet cable is connected which would otherwise show a link light, you would see nothing appear to happen). I think setting the interface link down should be enough to accomplish that; hopefully the driver would know that it could be placed into a power save mode or similar.