Tuesday, 19 January 2016

Re: RFC on Cloud Images: Make /tmp a tmpfs

On Sat, Jan 16, 2016 at 7:49 PM, Clint Byrum <clint@ubuntu.com> wrote:
> Excerpts from Dustin Kirkland's message of 2016-01-16 04:25:58 -0800:
>> On Fri, Jan 15, 2016 at 2:25 AM, Seth Arnold <seth.arnold@canonical.com> wrote:
>> > On Thu, Jan 14, 2016 at 12:27:58PM +0200, Dustin Kirkland wrote:
>> >> Moreover, just 'sudo apt-get install swapspace' and watch as swapfiles
>> >> are created/deleted as needed. If your root disk is lvm-encrypted,
>> >> then obviously such swap files are encrypted, too.
>> >
>> > I've been severely skeptical of the swapspace package:
>> >
>> > - Swap is used when the system is already under pressure; a few hundred
>> > megs is great and probably for the best but if the system is actively
>> > pushing beyond that then it's being pushed too hard.
>> >
>> > - If the swap space is going to be allocated on the fly, that means the
>> > disk blocks have to be zeroed on the fly, when the system is under
>> > pressure, rather than at some leisurely time beforehand.
>> >
>> > - If the swap space is allocated on a filesystem, it's probably being
>> > allocated from a fragmented filesystem that's 90% full rather than a
>> > nice contiguous block of space as it would with a swap partition.
>> >
>> > - Accessing further into a file may involve loading multiple indirect
>> > blocks from disk into unswappable kernel memory. A swap partition does
>> > not require indirection blocks.
>> >
>> > - If the swap space allocated from a filesystem pushes the filesystem to
>> > 95% full (or whatever is left after accounting for reserved blocks),
>> > programs will error and almost nothing handles "disk full" errors
>> > gracefully. Swap partitions do not cause surprise gigabyte losses in
>> > free space.
>> >
>> > - Swap files can't be allocated from btrfs filesystems and probably
>> > shouldn't be allocated from zfs filesystems either. (Swap on zvols,
>> > maybe.)
>> >
>> > Perhaps the swapspace package uses some tasteful tunables to mitigate
>> > against my concerns but the end result is that it contributes extra load,
>> > extra IO pressure, and extra uncertainty at a time when the system is
>> > already experiencing too much load, too much IO pressure, and too much
>> > uncertainty.
>> >
>> > The risks and downsides of swapspace feel like a lot compared to the
>> > slight hassle of having the installer make a swap partition.
>>
>> I count 4 "if's", 3 "probably's", 2 "should/would's", and 1 "maybe" in
>> that reply :-)
>> Perhaps try it out?
>> I've been running it and /tmp on tmpfs for several years (since before
>> ~precise) on my desktop on an encrypted LVM partition. My machine has
>> a lot of memory (16GB, though I do push it hard), and I have never
>> noticed a swapspace-related problem. I've also used this combination
>> on hundreds of servers, and several production systems.
>
> The 'if' and 'probably' are missing in your anecdotal evidence though.
> If you use the servers the way you have, it will probably work fine.
> Also we're talking about cloud instances, not "servers", which have
> quite different use and performance profiles.
> I'd like to see even some rudimentary experiments done with realistic
> workloads before saying this is a better idea than leaving things as
> they are. We've all speculated and provided anecdotal evidence enough to
> warrant such an investigation for those who speculate it will be a
> worthwhile change.

Sure, done! You can find a detailed statistical analysis, as well as
the raw data for your download and treatment at:


Based on a statistical analysis of 502 physical and virtual servers
running production and test services at Canonical (including
databases, websites, OpenStack, ubuntu.com, launchpad.net, et al.),
96.6% of them could fit all of the data they currently have in /tmp,
entirely in half of the free memory available in the system. That
ratio goes up to 99.2% of the systems surveyed (i.e., all but 4) when
one takes into account both free available memory and available swap.
The remaining 4 systems are currently using [101 GB, 42 GB, 13 GB,
and 10 GB] of swap, respectively, and are themselves somewhat special

Moreover, Ubuntu is hardly the first Linux/UNIX distribution that has
considered putting /tmp on tmpfs by default. Solaris has used a tmpfs
since 1994. Fedora moved to /tmp on tmpfs in 2012, as did Arch Linux.
Things seem to be working okay there...

As a recap, the benefits of /tmp on tmpfs are:
- Performance: reads, writes, and seeks are insanely fast in a tmpfs;
as fast as accessing RAM (I tested 1.4GB/s writes and 1.1GB/s reads
to/from tmpfs)
- Security: data leaks to disk are prevented (especially when swap is
disabled), and since /tmp is its own mount point, we should add the
nosuid and nodev options (and motivated sysadmins could optionally add
noexec, if they desire)
- Energy efficiency: disk wake-ups are avoided
- Reliability: fewer NAND writes to SSD disks
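
To see whether a given system already mounts /tmp with the options
suggested in the security point above, a small Python sketch along these
lines should do. The function names are hypothetical, and it assumes the
usual /proc/mounts layout (device, mount point, filesystem type,
comma-separated options, then two numeric fields).

```python
def mount_options(mounts_text, mount_point="/tmp"):
    """Return (fstype, options) for mount_point, or None if not mounted.

    If the mount point appears more than once, the last entry wins,
    since later mounts shadow earlier ones.
    """
    found = None
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) >= 4 and fields[1] == mount_point:
            found = (fields[2], set(fields[3].split(",")))
    return found


def missing_hardening(mounts_text, mount_point="/tmp",
                      wanted=("nosuid", "nodev")):
    """List the wanted options absent from mount_point's mount entry.

    Returns None when mount_point is not a separate mount at all.
    """
    entry = mount_options(mounts_text, mount_point)
    if entry is None:
        return None
    _fstype, options = entry
    return sorted(set(wanted) - options)
```

Feeding it the contents of /proc/mounts returns an empty list when both
nosuid and nodev are set, the names of whichever are missing otherwise,
and None when /tmp simply lives on the root filesystem. Add "noexec" to
the wanted tuple if you are one of the motivated sysadmins.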

Dustin Kirkland
Ubuntu Product & Strategy
Canonical, Ltd.