Tuesday, 31 May 2016

Re: ANN: DNS resolver changes in yakkety

On Tue, May 31, 2016 at 09:38:51PM +0200, Martin Pitt wrote:
> Hello Stéphane,
> Stéphane Graber [2016-05-31 11:23 -0400]:
> > So in the past there were two main problems with using resolved, I'd
> > like to confirm both of them have now been taken care of:
> >
> > 1) Does resolved now support split DNS?
> > That is, can Network Manager instruct it that only *.example.com
> > should be sent to the DNS servers provided by a given VPN?
> resolved has a D-Bus API SetLinkDomains(), similar in spirit to
> dnsmasq. However, NM does not yet know about this, and only indirectly
> talks to resolved via writing /etc/resolv.conf (again indirectly via
> resolvconf). So the functionality on the resolved side is there, but we
> don't use it yet. This is being tracked in the blueprint.

Ok and does it support configuring this per-domain thing through
configuration files?

That's needed so that LXC, LXD, libvirt, ... can ship a file defining a
domain for their bridge which is then forwarded to their dnsmasq.

I don't believe we do this automatically anywhere but it was planned to
do it this cycle for LXD and quite possibly for LXC and libvirt too (so
you can resolve <container>.lxd or <vm>.libvirt).
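
To make the per-domain forwarding concrete, here is a rough Python sketch
of the lookup a split-DNS resolver has to do: pick the servers registered
for the longest matching domain suffix, else fall back to the defaults.
This is not resolved's or dnsmasq's actual code; the ROUTES table, the
pick_servers() name and every address in it are assumptions made up for
illustration.

```python
# Hypothetical routing table: domain suffix -> DNS servers for that suffix.
# Suffixes and addresses are illustrative only.
ROUTES = {
    "lxd": ["10.0.3.1"],
    "libvirt": ["192.168.122.1"],
    "example.com": ["10.8.0.1"],
}
DEFAULT_SERVERS = ["192.0.2.53"]


def pick_servers(name, routes=ROUTES, default=DEFAULT_SERVERS):
    """Return the servers for the longest routing suffix matching name."""
    labels = name.rstrip(".").lower().split(".")
    # Try the full name first, then progressively shorter suffixes,
    # so matching only ever happens on whole labels.
    for i in range(len(labels)):
        suffix = ".".join(labels[i:])
        if suffix in routes:
            return routes[suffix]
    return default
```

With such a table, pick_servers("c1.lxd") would go to the bridge's dnsmasq
while pick_servers("www.ubuntu.com") would use the default servers. A
configuration file shipped by LXC, LXD or libvirt would in effect add one
entry to that table.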

> > 2) Does resolved now maintain a per-uid cache or has caching been
> > disabled entirely?
> No, it uses a global cache.
> > In the past, resolved would use a single shared cache for the whole
> > system, which would allow for local cache poisoning by unprivileged
> > users on the system. That's the reason why the dnsmasq instance we spawn
> > with Network Manager doesn't have caching enabled and that becomes even
> > more critical when we're talking about doing the same change on servers.
> Indeed Tony mentioned this in today's meeting with Mathieu and me --
> this renders most of the efficiency gain of having a local DNS
> resolver moot. Do you have a link describing the problem? This was
> requested in LP: #903854, but neither that bug nor the referenced
> blueprint explain that.
> How would an unprivileged local user change the cache in resolved? The
> only way to get a result into resolved's cache is through a
> response from the forwarding DNS server. If a user can do that, what
> stops her from doing the same for non-cached lookups?
> The caches certainly need to be dropped whenever the set of
> nameservers *changes*, but this already happens. (But this is required
> for functioning correctly, not necessarily a security guard).
> If you have some pointers to the attack, I'm happy to forward this to
> an upstream issue and discuss it there (or file an issue yourself,
> that'd be appreciated). If this is an issue, it should be fixed
> upstream, not downstream by disabling caching completely.

I seem to remember it being a timing attack. An unprivileged user can
control when the initial DNS query happens, simply by doing a local DNS
query, and can see which upstream server will be hit by looking at
/etc/resolv.conf. With both of those, they can generate fake DNS replies
locally (DNS is UDP, so the source can trivially be spoofed) which arrive
before the real reply and end up in the cache, letting them override any
record they want.

For entries that are already cached, you can just query their TTL and
time the attack to begin exactly as the cached record expires.

This would then let an unprivileged user hijack just about any DNS
record unless you have a per-uid cache, in which case they'd only hurt
themselves.

Anyway, you definitely want to talk to the security team :)
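
The difference between a shared cache and a per-uid cache discussed above
can be shown with a toy model. The Python sketch below is only an
illustration under stated assumptions: DnsCache and its methods are
hypothetical names, it does no networking at all, and it is not how
resolved stores records. It only shows who gets to see an entry once it
has landed in the cache, and that entries expire with their TTL.

```python
import time


class DnsCache:
    """Toy TTL cache; with per_uid=True entries are keyed by (uid, name)."""

    def __init__(self, per_uid):
        self.per_uid = per_uid
        self.entries = {}

    def _key(self, uid, name):
        return (uid, name) if self.per_uid else name

    def store(self, uid, name, address, ttl, now=None):
        # Record an answer on behalf of the user whose query triggered it.
        now = time.monotonic() if now is None else now
        self.entries[self._key(uid, name)] = (address, now + ttl)

    def lookup(self, uid, name, now=None):
        now = time.monotonic() if now is None else now
        entry = self.entries.get(self._key(uid, name))
        if entry is None or entry[1] <= now:
            return None  # miss or expired: a real resolver would go upstream
        return entry[0]
```

If uid 1000 manages to get a forged answer stored, a cache built with
per_uid=False hands that answer to every other user, root included, until
the TTL runs out. With per_uid=True only uid 1000 ever sees it.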

> > Additionally, what's the easiest way to undo this change on a server?
> Uninstall libnss-resolve, or systemctl disable systemd-resolved, I'd
> say.
> > I have a few deployments where I run upwards of 4000 containers on a
> > single system. Such systems have a main DNS resolver on the host and all
> > containers talking to it. I'm not too fond of adding an extra 4000
> > processes to such systems.
> I don't actually intend this to be in containers, particularly as
> LXC/LXD already sets up its own dnsmasq on the host. That's why I only
> seeded it to ubuntu-standard, not to minimal. The
> images.linuxcontainers.org images (rightfully) don't have
> ubuntu-standard, so they won't get libnss-resolve and an enabled
> resolved.

But our recommended images are the cloud images and they sure do include
ubuntu-standard:

root@xenial:~# dpkg -l | grep ubuntu-standard
ii ubuntu-standard 1.361 amd64 The Ubuntu standard system

The images.linuxcontainers.org images are tiny images which some of our
users prefer over the recommended official ones published on the Ubuntu
infrastructure. But if the intent is for this change not to affect
containers, then it also must deal with our recommended images.

> Thanks,
> Martin

Stéphane Graber
Ubuntu developer