Tuesday, 11 July 2023

Re: Reducing initramfs size and speeding up the generation

On Tue, Jul 11, 2023, Seth Arnold wrote:
> On Mon, Jul 10, 2023 at 10:55:06AM +0200, Adrien Nader wrote:
> > There is a little-known but very interesting property of LZMA: its
> > decompression speed does not depend on the uncompressed size but only
> > on the compressed size. What this means is that if you compress a
> > 100MB file down to 20MB, it will take roughly twice as long to
> > decompress as if you compress it down to 10MB. In other words, higher
> > compression means faster decompression.
>
> This makes a certain amount of sense -- so much of a computer's
> operational time is spent waiting for data to arrive from memory into
> the processor, refilling cache lines, etc.
>
> You nerd-sniped me into testing a bunch of algorithms on the
> firefox_115.0+build2.orig.tar from our archive.
>
> I only ran each of these once, and quite a few of them ran while
> another was still running, but this system (dual Xeon E5-2630v3) has
> enough processors and memory that it probably didn't matter much.
>
> Times in seconds, with lower level on the left, higher on the right:
>
>              1     3     5     9
> compression:
> gzip        39    46    73   211
> zstd         8    12    23    54
> bzip2      228   237   249   265
> lzma       154   294   643   945
> xz         159   298   644   945
>
> decompression:
> gzip        16    15    15    15
> zstd         3     3     3     3
> bzip2       68    73    74    75
> lzma        41    37    35    33
> xz          36    32    31    30
>
> xz of course absolutely dominates the end file sizes:
>
> 2989486080  original
>
>  515273416  xz -9
>  625958113  zstd -9
>  647365812  xz -1
>  666820870  zstd -5 (seemed like a sweet spot in the timings)
>
> Anyway it's fun to see that gzip and zstd have consistent decompression
> speeds, lzma and xz get faster as they go smaller, and bzip2 just gets
> slower the more it has to think.

There's one more dimension to explore, and since it interacts with the
other settings, it makes things quite a bit more difficult.

With xz you can easily tweak the dictionary size (it's roughly
equivalent to a "window size"):

xz -v --lzma2=preset=0,dict=1536M -k firefox_115.0+build2.orig.tar

Two comments: a) xz has a -0 preset, b) 1536M is the maximum dictionary
size for LZMA2.
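
As an aside, if you want to know how much memory decompressing a given
.xz file will need (a useful sanity check when raising dict=), xz can
report it, e.g.:

xz --list --verbose --verbose firefox_115.0+build2.orig.tar.xz

The "Memory needed" line in that output is the decompressor's
requirement for that specific file.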

Zstd has --zstd=windowLog=n where the window size is 2^n bytes. Memory
usage during compression can reach at least 2*2^n; since n <= 30, that
is at most 2GB. I'm not sure about the usage during decompression.
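
For reference, a sketch of the invocations I mean (note that zstd by
default refuses to decompress frames whose window exceeds 128MB, so
windowLog values above 27 need --long=n, or --memory=..., at
decompression time):

zstd -9 --zstd=windowLog=30 -k firefox_115.0+build2.orig.tar
zstd -d --long=30 firefox_115.0+build2.orig.tar.zst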

Some more data points (obviously the CPU times aren't comparable to
Seth's since this is a different machine):

Compressor          Output size   Comp. time   Decomp. time
------------------------------------------------------------
xz -0                 704087984          92s            27s
xz -0,dict=1.5G       617397528         168s            22s
xz -1                 647365812         118s            24s
xz -9                 515273416         753s            20s
xz -9,dict=1.5G       485393536         946s            19s
zstd -1               764765167           5s            <3s
zstd -5               666820870          17s            <3s
zstd -5,win=18        712243629          18s            <3s
zstd -5,win=19        694314044          18s            <3s
zstd -9               625958113          34s            <3s
zstd -9,win=30        588361999          61s            <3s
zstd -19,win=30       509251904        1238s            <3s

The presets for xz and zstd are very general-purpose. They're meant to
gradually provide better compression at the expense of CPU and memory
usage. I'm sure we can tweak them and achieve very interesting results.
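
As an illustration of the kind of tweaking I mean (the 32MiB figure
below is just an example, to be picked per target hardware), you can
keep a preset's match-finder settings but override just the dictionary:

xz -v --lzma2=preset=9,dict=32MiB -k firefox_115.0+build2.orig.tar

This keeps -9's compression effort while bounding decompression memory
to a little more than the dictionary size.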

At the scale of Ubuntu, we might want to deviate from the presets simply
because we have a good enough idea of which hardware runs Ubuntu. I know
that the memory usage of xz during decompression was a hot topic ... 15
years ago: NetBSD people had issues on machines that didn't have 64MB of
memory available. We're past that except for maybe some very specific
hardware.
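
For what it's worth, xz already has a knob for such constrained systems:
you can cap decompression memory, and xz will refuse to decompress
rather than exceed the limit:

xz -d --memlimit-decompress=64MiB firefox_115.0+build2.orig.tar.xz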

For specific hardware with low memory we might also want to deviate from
the presets, but in the opposite direction. In my tests above, zstd -5
with windowLog=18 uses 13MB of memory while with windowLog=19 it uses
20MB; zstd -1 uses 16MB.
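
(If anyone wants to reproduce these memory figures, the maximum resident
set size reported by GNU time is a simple way to get them:

/usr/bin/time -v zstd -5 --zstd=windowLog=18 -k firefox_115.0+build2.orig.tar

and read the "Maximum resident set size" line of the output.)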

All this is different from the initial topic but I really believe
there's a lot of potential there for everything we compress.

--
Adrien

--
ubuntu-devel mailing list
ubuntu-devel@lists.ubuntu.com
Modify settings or unsubscribe at: https://lists.ubuntu.com/mailman/listinfo/ubuntu-devel