Re: [PATCH 0/4] crypto: Add new compression modes for zlib and IAA

Andre Glover <andre.glover@xxxxxxxxxxxxxxx> · Mon, 01 Apr 2024 13:46:23 -0700

Hi Eric,

Thank you for reviewing the patch. Please see responses to your
questions inline below.

On Thu, 2024-03-28 at 19:46 -0700, Eric Biggers wrote:
> On Thu, Mar 28, 2024 at 10:44:41AM -0700, Andre Glover wrote:
> > The 'canned' compression mode implements a compression scheme that
> > uses a statically defined set of Huffman tables, but where the
> > Deflate
> > Block Header is implied rather than stored with the compressed
> > data. 
> 
> This already exists in standard DEFLATE; it's called fixed mode.  See
> section
> 3.2.6 of RFC1951
> (https://datatracker.ietf.org/doc/html/rfc1951#page-12).
> 
> I think that what's going on is that you've implemented a custom
> variant of
> DEFLATE where you set the fixed Huffman codes to something different
> from the
> ones defined in the standard.
> 
> Is that correct, or are there other differences?
> 

We view it as a variant of dynamic block Deflate as opposed to a
variant of fixed. In particular, it is compressing the input with
static Huffman tables (i.e. ones that do not vary with the input), and
where the Deflate block header (which is a constant) is not stored with
the compressed data. If the missing block header were to be prepended
to the compressed data, the result would be a valid dynamic compressed
block.

One might think of this as vaguely similar to dictionary compression.
In that case, the dictionary is not stored with the compressed data but
is agreed to by the compressor and decompression. In the case of canned
compression, the header is not stored with the compressed data but is
agreed to by both entities.

> Actually, looking at your zlib_tr_flush_block(), it looks instead of
> using the
> reserved block type value (3) or redefining the meaning of the fixed
> block type
> value (1), you actually deleted the BTYPE and BFINAL fields from the
> data stream
> entirely.  So the stream no longer stores the type of block or the
> flag that
> indicates whether the block is the final one or not.
> 
> That has the property that there cannot be any standard blocks, even
> uncompressed blocks, included in the data stream anymore.  Is that
> intentional?
> 

Conceptually, there is a valid dynamic block header associated with the
compressed data, it is just not stored with the data in order to save
space (since it is effectively an unchanging constant). In this usage,
it is envisioned that the output would consist of a single Deflate
block (i.e. the implied header is marked as BFINAL). In theory, one
could have the implied block header not marked as BFINAL, and so the
compressed data would need to contain at least two blocks (i.e. the
body of the initial block, an EOB, and a normal block with header), but
this would be counterproductive to the intended usage.

> Maybe this is why you're using the name "canned", instead of going
> with
> something more consistent with the existing "fixed" name, like
> "custom-fixed"?
> 

Again, we view this as a variant of dynamic compression rather than as
a variant of fixed compression. We use the term "static" to describe
compression with a dynamic block where the tables are unchanging (i.e.
not dependent on the input data) . This can allow the compression to be
done in one pass rather than two. The "canned" compression uses a
static table, but is different in that the header is implied rather
than stored.

> I wonder what the plan is for when the next hardware vendor tries to
> do this and
> chooses their own Huffman codes, different from yours.  Or what if
> Intel decides
> the Huffman codes they chose aren't the best ones anymore and
> releases new
> hardware that uses different codes.  Will we perhaps be getting a
> tinned mode
> too?
> 

The Huffman tables are not built into the hardware. The Huffman
tables/block header is in this case built into the software. The same
tables are embedded in the zlib portion of the patch for software
compression/decompression and in the hardware driver portion of the
patch for IAA compression/decompression. The hardware itself can work
with any desired set of tables.

Thanks,
Andre