Monday, 12 March 2018

Re: zstd compression for packages

On Mon, Mar 12, 2018 at 09:30:16AM -0400, Neal Gompa wrote:
> On Mon, Mar 12, 2018 at 9:11 AM, Daniel Axtens wrote:
> > Hi,
> >
> > I looked into compression algorithms a bit in a previous role, and to be
> > honest I'm quite surprised to see zstd proposed for package storage. zstd,
> > according to its own github repo, is "targeting real-time compression
> > scenarios". It's not really designed to be run at its maximum compression
> > level; it's designed to really quickly compress data coming off the wire -
> > things like compressing log files being streamed to a central server, or I
> > guess writing random data to btrfs where speed is absolutely an issue.
> >
> > Is speed of decompression a big user concern relative to file size? I admit
> > that I am biased - as an Australian and with the crummy internet that my
> > location entails, I'd save much more time if the file was 6% smaller and
> > took 10% longer to decompress than the other way around.
> >
> > Did you consider Google's Brotli?
> >
> I can't speak for Julian's decision for zstd, but I can say that in
> the RPM world, we picked zstd because we wanted a better gzip.
> Compression and decompression times are rather long with xz, and the
> ultra-high-efficiency from xz is not as necessary as it used to be,
> with storage becoming much cheaper than it was nearly a decade ago
> when most distributions switched to LZMA/XZ payloads.

I want zstd -19 as an xz replacement due to its higher decompression speed.
It also requires about 1/3 less memory when compressing, which should
be nice for _huge_ packages.
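
For anyone who wants to measure the size versus decompression-speed trade-off
on their own payloads, here is a rough sketch. It only uses codecs from the
Python standard library (zlib as a gzip stand-in, lzma for xz); zstd is left
out because I am not assuming any particular binding is installed, but it
could be added as another entry in the dict. The sample payload and the
function names are made up for illustration.

```python
import lzma
import time
import zlib


def measure(name, compress, decompress, data):
    """Return (name, compressed/original size ratio, decompression seconds)."""
    packed = compress(data)
    start = time.perf_counter()
    unpacked = decompress(packed)
    elapsed = time.perf_counter() - start
    if unpacked != data:
        raise ValueError(f"{name}: round trip failed")
    return name, len(packed) / len(data), elapsed


def compare(data):
    """Run every codec over the same payload and collect the measurements."""
    codecs = {
        # zlib level 9 stands in for gzip -9
        "zlib -9": (lambda d: zlib.compress(d, 9), zlib.decompress),
        # lzma preset 6 matches the default preset of the xz tool
        "xz -6": (lambda d: lzma.compress(d, preset=6), lzma.decompress),
    }
    return [measure(n, c, d, data) for n, (c, d) in codecs.items()]


if __name__ == "__main__":
    # Hypothetical, highly repetitive payload; real package data will behave differently.
    sample = b"Package: example\nDepends: libc6\n" * 20000
    for name, ratio, seconds in compare(sample):
        print(f"{name}: {ratio:.2%} of original, {seconds * 1000:.2f} ms to decompress")
```

Timings from a single run like this are noisy, so treat them as a rough
indication and repeat the measurement on real package contents.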

> I don't know for sure if Debian packaging allows this, but for RPM, we
> switch to xz payloads when the package is sufficiently large that
> the compression/decompression speed isn't really going to matter
> (e.g. game data). So while most packages may not necessarily be using
> xz payloads, quite a few would. That said, we've been using xz for all
> packages for a few years now, and the main drag is the time it takes
> to wrap everything up to make a package.

We could. But I don't think it matters much.

> As for Google's Brotli, its average compression ratio isn't as high as
> zstd's, and it is markedly slower. With these factors in mind, the obvious
> choice was zstd.
> (As an aside, rpm in sid/buster and bionic doesn't have zstd support
> enabled... Is there something that can be done to make that happen?)

I'd open a wishlist bug in the Debian bug tracker if I were you. If
we were to introduce a delta, we'd have to maintain it...

debian developer - | - free software dev
ubuntu core developer i speak de, en

ubuntu-devel mailing list