Thursday 21 January 2016

Re: RFC on Cloud Images: Make /tmp a tmpfs

On 21/01/16 02:52, Dustin Kirkland wrote:
> On Wed, Jan 20, 2016 at 5:45 PM, Robie Basak <robie.basak@ubuntu.com> wrote:
>> On Wed, Jan 20, 2016 at 05:27:51AM +0100, Dustin Kirkland wrote:
>>>> I'd like to see even some rudimentary experiments done with realistic
>>>> workloads before saying this is a better idea than leaving things as
>>>> they are. We've all speculated and provided anecdotal evidence enough to
>>>> warrant such an investigation for those who speculate it will be a
>>>> worthwhile change.
>>>
>>> Sure, done! You can find a detailed statistical analysis, as well as
>>> the raw data for your download and treatment at:
>>
>> This is useful. Thank you for this research!
>>
>>> http://blog.dustinkirkland.com/2016/01/data-driven-analysis-tmp-on-tmpfs.html
>>
>> Are we sure that using /dev/zero is a fair test? I hope this isn't
>> shortcutted somehow in the tmpfs case.
>
> No, I'm not sure. But I also can't find any "faster" source of data
> than /dev/zero :-) /dev/urandom can't keep up, nor can any /dev/
>
> Have a look at this page to understand how benchmarking with dd works,
> and what it actually does:
>
> https://romanrm.net/dd-benchmark

For file system benchmarking, I generally use fio [1] with some
hand-crafted fio profiles that mimic various kinds of file system I/O
activity.

My "standard" fio scripts can be found here:
http://kernel.ubuntu.com/git/cking/fs-test-proto.git/tree/fio-tests/jobs

I use these to check file system performance for regressions, so I
believe they are a good starting point for fio-based testing.
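
For a quick one-off test against /tmp, a minimal fio invocation along
these lines can also be handy (the job name, sizes, job count and run
time here are purely illustrative, not one of the profiles from the
tree above):

fio --name=tmp-randwrite --directory=/tmp --rw=randwrite --bs=4k \
    --size=256M --numjobs=4 --runtime=60 --time_based --group_reporting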

However, to stress test kernels and make sure they can handle various
combinations of I/O patterns without breaking, I have developed
stress-ng [2], which has a few file system related stressors that may
be useful for these kinds of experiments.

For example:
stress-ng --temp-path /tmp --hdd 0 --hdd-bytes 1G --hdd-opts wr-rnd,rd-rnd

See the stress-ng manual page for all the different types of
--hdd-opts one can use; it is quite versatile.

There are also chdir, chmod, dentry, dir, fallocate, flock, fstat,
link, rename, symlink, utime and xattr stressors in stress-ng that may
be interesting to try as well.
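
For instance, something along these lines exercises a few of them at
once (the worker counts and the timeout are only illustrative and
would need tuning for a real experiment):

stress-ng --temp-path /tmp --dentry 4 --fallocate 2 --rename 2 \
    --symlink 2 --timeout 60s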

[1] https://github.com/axboe/fio
[2] http://kernel.ubuntu.com/~cking/stress-ng/

>
> I've also copied Colin King, who is the king of performance
> benchmarking, in my book :-)
>
>>> Based on a statistical analysis of 502 physical and virtual servers
>>> running production and test services at Canonical (including
>>> databases, websites, OpenStack, ubuntu.com, launchpad.net, et al.),
>>> 96.6% of them could fit all of the data they currently have in /tmp,
>>> entirely in half of the free memory available in the system. That
>>> ratio goes up to 99.2% of the systems surveyed (i.e., all but 4) when
>>> one takes into account both free available memory and available swap.
>>> The remaining 4 systems are currently using [101 GB, 42 GB, 13 GB,
>>> and 10 GB] of swap, respectively, and are themselves somewhat special
>>> cases.
>>
>> Even if they are special cases, surely that's something we need to
>> consider for our users? If your data is representative, isn't that
>> around 1% of users who will be impacted or broken somehow by this change
>> in defaults?
>
> In fact it's our job as Ubuntu developers to make intelligent,
> informed, data-driven decisions about opinionated defaults that cover
> the vast majority of our users and clearly document solutions to
> problems affecting the remainder. We do this all the time, in all
> aspects of Ubuntu, from upstream project versions, to default package
> sets and default configuration options!
>
> It's important that we do so carefully, tastefully, scientifically,
> and that we course correct gracefully when we're wrong ;-)
>
>> What would be the guidance for 1) users; and 2) upstreams; if they want
>> large temporary filesystem space after this change? Would that be to use
>> /var/tmp in all relevant cases? And for upstreams, is this something
>> that they will accept that they can do universally, or is it behaviour
>> that they have to differentiate depending on the distro upon which they
>> are running?
>
> Good question... Solutions to insufficient available space in /tmp on
> tmpfs include any and all of the following:
>
> (a) commenting out the "tmpfs /tmp tmpfs rw,nosuid,nodev" line in /etc/fstab
> (b) setting $TMPDIR to /var/tmp (or elsewhere) in your shell profile
> (c) pointing your application at /var/tmp (or elsewhere)
> (d) allocating sufficiently large swap partition(s) or swap file(s)
> to overflow into
> (e) using the swapspace package to dynamically grow/shrink swap on demand
>
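
On (b), for anyone who wants to try that out quickly, a per-session
override in a bash-style shell is simply (just a sketch; applications
that hard-code /tmp will of course ignore it):

export TMPDIR=/var/tmp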

It would also be interesting to see what happens with tmpfs in use
when there is a lot of memory pressure. Again, stress-ng can help with
this. One can force large chunks of memory to be allocated with
stress-ng to the point where the kernel OOMs the process (and
stress-ng just restarts it).

Check out the --brk, --bigheap and --stack stressors; they use up all
the free pages in a pathologically excessive way, so they are great
for seeing what happens to tmpfs when memory gets really low.
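
For instance, something like the following (the worker counts and
timeout are picked arbitrarily):

stress-ng --brk 2 --bigheap 2 --stack 2 --timeout 60s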

If you just want to allocate large chunks of memory and leave a
little bit free, the --mmap stressor (with the --mmap-bytes option to
specify the amount of memory to allocate) may be useful.
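
For example (the 1G here is only a placeholder; the size would need
tuning per machine so that only a small amount of memory is left
free):

stress-ng --mmap 1 --mmap-bytes 1G --timeout 60s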


>>> Moreover, Ubuntu is hardly the first Linux/UNIX distribution that has
>>> considered putting /tmp on tmpfs by default. Solaris has used a tmpfs
>>> since 1994. Fedora moved to /tmp on tmpfs in 2012, as did ArchLinux.
>>> Things seem to be working okay there...
>>
>> This is really useful to know, thanks.
>>
>> Robie


--
ubuntu-devel mailing list
ubuntu-devel@lists.ubuntu.com
Modify settings or unsubscribe at: https://lists.ubuntu.com/mailman/listinfo/ubuntu-devel