On Fri, Apr 30, 2021 at 1:46 PM Vineet Gupta <Vineet.Gupta1@xxxxxxxxxxxx> wrote: > > I've hit a mainline gcc 10.2 (also gcc 9.3) bug which triggers at -O3 > causing wrong codegen. So it does seem to be a gcc bug or at least mis-feature where gcc gets the aliasing decision wrong when vectorizing the code. That said, the fact that gcc even tries to vectorize the code shows how pointless it was for us to (years ago) try to make it use 16-bit accesses in the first place. So can you try this patch that basically reverts some of those kernel-specific changes to zlib's inffast.c - and does so by just upgrading it to a newer version from a more modern zlib (which in this case still means "from 2017", but that's the most recent version there is). This is a fairly quick hack, and I really tried to keep it to just inffast.c and inftrees.c with a few minimal fixups to inflate.c itself. Most of the changes are for naming (which seems to be at least partly for C++ namespace reasons, ie "this" -> "here"), but there's some cleanup wrt maximum table sizes etc too. NOTE! I have not tested this patch in any other way than "it compiles with allmodconfig". The diff looks reasonable to me, but that's all I'm really going to say. Does this avoid the gcc -O3 problem for you? It would be lovely if somebody also took a look at some of the other zlib code, like inflate.c itself. But some of that code has rather subtle changes for things like s390 hardware support, and we have thihngs like our fallthrough rules etc, so it gets a bit hairier. Which is why I did just this fairly minimal part. Note that the commit message has a "Not-yet-signed-off-by:" from me. That's purely because I wanted it to be obvious that this needs a lot more testing - I'm not comfy with this patch unless somebody takes it upon themselves to actually test it under different loads (and different architectures). Linus
From 3ea219a14bd6f36dfed5e461067349ab86a02935 Mon Sep 17 00:00:00 2001 From: Linus Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxx> Date: Mon, 3 May 2021 10:27:34 -0700 Subject: [PATCH] Update zlib inffast code to more modern zlib This updates part of our zlib inflate code to the most current zlib sources as of zlib 1.2.11 (released by Mark Adler back in 2017). This is only a partial update to a more modern zlib base, mainly to get the current 'inffast' and 'inftrees' code. In the process, this effectively reverts some of our local kernel optimizations that have caused problems. In particular, see https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100363 which is (I argue) a gcc bug or at least mis-feature, but it does show how the 16-bit access optimizations that do not exist in upstream zlib have caused problems. The effectively reverted commits are 51ea3f6a4571 ("inflate_fast: sout is already a short so ptr arith was off by one.") e69eae65523b ("zlib: make new optimized inflate endian independent") 6846ee5ca68d ("zlib: Fix build of powerpc boot wrapper") ac4c2a3bbe5d ("zlib: optimize inffast when copying direct from output") that do not get carried forward by this update to a more recent version of intffast.c. Note that I've tried to keep the changes minimal wrt the original zlib sources, but there are some changes: - Removal of some zlib-internals: - no more ASMINF - ZLIB_INTERNAL - no Tracevv() tracing calls - Removal of portability crud for ancient compilers: - "FAR" pointer markings - turned "z_const" into just "const" - convert function prototypes to "modern" ANSI C rather than K&R syntax - removed unused "gz_headerp" from inflate_state Also note that this does change inflate.c minimally: in the years since we syncronized with upstream zlib, the '->write' member of 'struct inflate_state' has been renamed to '->wnext'. But this does *not* update all of inflate.c to a newer version, because we do have alot of local changes for static workspace allocations, and even special s390 hardware support. This will need lots of testing. Not-yet-signed-off-by: Linus Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxx> --- lib/zlib_inflate/inffast.c | 162 ++++++++++++++++-------------------- lib/zlib_inflate/inflate.c | 25 +++--- lib/zlib_inflate/inflate.h | 66 ++++++++------- lib/zlib_inflate/inftrees.c | 107 +++++++++--------------- lib/zlib_inflate/inftrees.h | 31 +++---- 5 files changed, 180 insertions(+), 211 deletions(-) diff --git a/lib/zlib_inflate/inffast.c b/lib/zlib_inflate/inffast.c index ed1f3df27260..e3d6fb10d198 100644 --- a/lib/zlib_inflate/inffast.c +++ b/lib/zlib_inflate/inffast.c @@ -1,5 +1,5 @@ /* inffast.c -- fast decoding - * Copyright (C) 1995-2004 Mark Adler + * Copyright (C) 1995-2017 Mark Adler * For conditions of distribution and use, see copyright notice in zlib.h */ @@ -8,24 +8,16 @@ #include "inflate.h" #include "inffast.h" -#ifndef ASMINF - -union uu { - unsigned short us; - unsigned char b[2]; -}; - -/* Endian independed version */ -static inline unsigned short -get_unaligned16(const unsigned short *p) -{ - union uu mm; - unsigned char *b = (unsigned char *)p; - - mm.b[0] = b[0]; - mm.b[1] = b[1]; - return mm.us; -} +/* + * Imported from zlib 1.2.11 with minimal changes: + * - removed ASMINF + * - removed ZLIB_INTERNAL + * - removed "FAR" pointer markings + * - removed Tracevv() tracing calls + * - turned "z_const" into just "const" + * - removed "gz_headerp" from inflate_state + * - converted to "modern" ANSI C rather than K&R syntax + */ /* Decode literal, length, and distance codes and write out the resulting @@ -62,13 +54,13 @@ get_unaligned16(const unsigned short *p) requires strm->avail_out >= 258 for each loop to avoid checking for output space. - - @start: inflate()'s starting value for strm->avail_out + - @start: inflate()'s starting value for strm->avail_out */ void inflate_fast(z_streamp strm, unsigned start) { struct inflate_state *state; const unsigned char *in; /* local strm->next_in */ - const unsigned char *last; /* while in < last, enough input available */ + const unsigned char *last; /* have enough input while in < last */ unsigned char *out; /* local strm->next_out */ unsigned char *beg; /* inflate()'s initial strm->next_out */ unsigned char *end; /* while out < end, enough space available */ @@ -77,7 +69,7 @@ void inflate_fast(z_streamp strm, unsigned start) #endif unsigned wsize; /* window size or zero if not using window */ unsigned whave; /* valid bytes in the window */ - unsigned write; /* window write index */ + unsigned wnext; /* window write index */ unsigned char *window; /* allocated sliding window, if wsize != 0 */ unsigned long hold; /* local strm->hold */ unsigned bits; /* local strm->bits */ @@ -85,7 +77,7 @@ void inflate_fast(z_streamp strm, unsigned start) code const *dcode; /* local strm->distcode */ unsigned lmask; /* mask for first level of length codes */ unsigned dmask; /* mask for first level of distance codes */ - code this; /* retrieved table entry */ + code here; /* retrieved table entry */ unsigned op; /* code bits, operation, extra bits, or */ /* window position, window bytes to copy */ unsigned len; /* match length, unused bytes */ @@ -104,7 +96,7 @@ void inflate_fast(z_streamp strm, unsigned start) #endif wsize = state->wsize; whave = state->whave; - write = state->write; + wnext = state->wnext; window = state->window; hold = state->hold; bits = state->bits; @@ -122,17 +114,17 @@ void inflate_fast(z_streamp strm, unsigned start) hold += (unsigned long)(*in++) << bits; bits += 8; } - this = lcode[hold & lmask]; + here = lcode[hold & lmask]; dolen: - op = (unsigned)(this.bits); + op = (unsigned)(here.bits); hold >>= op; bits -= op; - op = (unsigned)(this.op); + op = (unsigned)(here.op); if (op == 0) { /* literal */ - *out++ = (unsigned char)(this.val); + *out++ = (unsigned char)(here.val); } else if (op & 16) { /* length base */ - len = (unsigned)(this.val); + len = (unsigned)(here.val); op &= 15; /* number of extra bits */ if (op) { if (bits < op) { @@ -149,14 +141,14 @@ void inflate_fast(z_streamp strm, unsigned start) hold += (unsigned long)(*in++) << bits; bits += 8; } - this = dcode[hold & dmask]; + here = dcode[hold & dmask]; dodist: - op = (unsigned)(this.bits); + op = (unsigned)(here.bits); hold >>= op; bits -= op; - op = (unsigned)(this.op); + op = (unsigned)(here.op); if (op & 16) { /* distance base */ - dist = (unsigned)(this.val); + dist = (unsigned)(here.val); op &= 15; /* number of extra bits */ if (bits < op) { hold += (unsigned long)(*in++) << bits; @@ -180,12 +172,34 @@ void inflate_fast(z_streamp strm, unsigned start) if (dist > op) { /* see if copy from window */ op = dist - op; /* distance back in window */ if (op > whave) { - strm->msg = (char *)"invalid distance too far back"; - state->mode = BAD; - break; + if (state->sane) { + strm->msg = + (char *)"invalid distance too far back"; + state->mode = BAD; + break; + } +#ifdef INFLATE_ALLOW_INVALID_DISTANCE_TOOFAR_ARRR + if (len <= op - whave) { + do { + *out++ = 0; + } while (--len); + continue; + } + len -= op - whave; + do { + *out++ = 0; + } while (--op > whave); + if (op == 0) { + from = out - dist; + do { + *out++ = *from++; + } while (--len); + continue; + } +#endif } from = window; - if (write == 0) { /* very common case */ + if (wnext == 0) { /* very common case */ from += wsize - op; if (op < len) { /* some from window */ len -= op; @@ -195,17 +209,17 @@ void inflate_fast(z_streamp strm, unsigned start) from = out - dist; /* rest from output */ } } - else if (write < op) { /* wrap around window */ - from += wsize + write - op; - op -= write; + else if (wnext < op) { /* wrap around window */ + from += wsize + wnext - op; + op -= wnext; if (op < len) { /* some from end of window */ len -= op; do { *out++ = *from++; } while (--op); from = window; - if (write < len) { /* some from start of window */ - op = write; + if (wnext < len) { /* some from start of window */ + op = wnext; len -= op; do { *out++ = *from++; @@ -215,7 +229,7 @@ void inflate_fast(z_streamp strm, unsigned start) } } else { /* contiguous in window */ - from += write - op; + from += wnext - op; if (op < len) { /* some from window */ len -= op; do { @@ -237,54 +251,22 @@ void inflate_fast(z_streamp strm, unsigned start) } } else { - unsigned short *sout; - unsigned long loops; - from = out - dist; /* copy direct from output */ - /* minimum length is three */ - /* Align out addr */ - if (!((long)(out - 1) & 1)) { - *out++ = *from++; - len--; - } - sout = (unsigned short *)(out); - if (dist > 2) { - unsigned short *sfrom; - - sfrom = (unsigned short *)(from); - loops = len >> 1; - do -#ifdef CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS - *sout++ = *sfrom++; -#else - *sout++ = get_unaligned16(sfrom++); -#endif - while (--loops); - out = (unsigned char *)sout; - from = (unsigned char *)sfrom; - } else { /* dist == 1 or dist == 2 */ - unsigned short pat16; - - pat16 = *(sout-1); - if (dist == 1) { - union uu mm; - /* copy one char pattern to both bytes */ - mm.us = pat16; - mm.b[0] = mm.b[1]; - pat16 = mm.us; - } - loops = len >> 1; - do - *sout++ = pat16; - while (--loops); - out = (unsigned char *)sout; - } - if (len & 1) - *out++ = *from++; + do { /* minimum length is three */ + *out++ = *from++; + *out++ = *from++; + *out++ = *from++; + len -= 3; + } while (len > 2); + if (len) { + *out++ = *from++; + if (len > 1) + *out++ = *from++; + } } } else if ((op & 64) == 0) { /* 2nd level distance code */ - this = dcode[this.val + (hold & ((1U << op) - 1))]; + here = dcode[here.val + (hold & ((1U << op) - 1))]; goto dodist; } else { @@ -294,7 +276,7 @@ void inflate_fast(z_streamp strm, unsigned start) } } else if ((op & 64) == 0) { /* 2nd level length code */ - this = lcode[this.val + (hold & ((1U << op) - 1))]; + here = lcode[here.val + (hold & ((1U << op) - 1))]; goto dolen; } else if (op & 32) { /* end-of-block */ @@ -329,7 +311,7 @@ void inflate_fast(z_streamp strm, unsigned start) inflate_fast() speedups that turned out slower (on a PowerPC G3 750CXe): - Using bit fields for code structure - Different op definition to avoid & for extra bits (do & for table bits) - - Three separate decoding do-loops for direct, window, and write == 0 + - Three separate decoding do-loops for direct, window, and wnext == 0 - Special case for distance > 1 copies to do overlapped load and store copy - Explicit branch predictions (based on measured branch probabilities) - Deferring match copy and interspersed it with decoding subsequent codes @@ -338,5 +320,3 @@ void inflate_fast(z_streamp strm, unsigned start) - Larger unrolled copy loops (three is about right) - Moving len -= 3 statement into middle of loop */ - -#endif /* !ASMINF */ diff --git a/lib/zlib_inflate/inflate.c b/lib/zlib_inflate/inflate.c index ee39b5eb71f7..c6d198960e0f 100644 --- a/lib/zlib_inflate/inflate.c +++ b/lib/zlib_inflate/inflate.c @@ -25,6 +25,11 @@ #define INFLATE_NEED_CHECKSUM(strm) 1 #endif +/* Reverse the bytes in a 32-bit value */ +#define REVERSE(q) \ + ((((q) >> 24) & 0xff) + (((q) >> 8) & 0xff00) + \ + (((q) & 0xff00) << 8) + (((q) & 0xff) << 24)) + int zlib_inflate_workspacesize(void) { return sizeof(struct inflate_workspace); @@ -49,7 +54,7 @@ int zlib_inflateReset(z_streamp strm) /* Initialise Window */ state->wsize = 1U << state->wbits; - state->write = 0; + state->wnext = 0; state->whave = 0; INFLATE_RESET_HOOK(strm); @@ -128,22 +133,22 @@ static void zlib_updatewindow(z_streamp strm, unsigned out) copy = out - strm->avail_out; if (copy >= state->wsize) { memcpy(state->window, strm->next_out - state->wsize, state->wsize); - state->write = 0; + state->wnext = 0; state->whave = state->wsize; } else { - dist = state->wsize - state->write; + dist = state->wsize - state->wnext; if (dist > copy) dist = copy; - memcpy(state->window + state->write, strm->next_out - copy, dist); + memcpy(state->window + state->wnext, strm->next_out - copy, dist); copy -= dist; if (copy) { memcpy(state->window, strm->next_out - copy, copy); - state->write = copy; + state->wnext = copy; state->whave = state->wsize; } else { - state->write += dist; - if (state->write == state->wsize) state->write = 0; + state->wnext += dist; + if (state->wnext == state->wsize) state->wnext = 0; if (state->whave < state->wsize) state->whave += dist; } } @@ -675,12 +680,12 @@ int zlib_inflate(z_streamp strm, int flush) copy = out - left; if (state->offset > copy) { /* copy from window */ copy = state->offset - copy; - if (copy > state->write) { - copy -= state->write; + if (copy > state->wnext) { + copy -= state->wnext; from = state->window + (state->wsize - copy); } else - from = state->window + (state->write - copy); + from = state->window + (state->wnext - copy); if (copy > state->length) copy = state->length; } else { /* copy from output */ diff --git a/lib/zlib_inflate/inflate.h b/lib/zlib_inflate/inflate.h index f79337ddf98c..6ece8efd879b 100644 --- a/lib/zlib_inflate/inflate.h +++ b/lib/zlib_inflate/inflate.h @@ -1,8 +1,5 @@ -#ifndef INFLATE_H -#define INFLATE_H - /* inflate.h -- internal inflate state definition - * Copyright (C) 1995-2004 Mark Adler + * Copyright (C) 1995-2016 Mark Adler * For conditions of distribution and use, see copyright notice in zlib.h */ @@ -11,11 +8,17 @@ subject to change. Applications should only use zlib.h. */ -#include "inftrees.h" +/* define NO_GZIP when compiling if you want to disable gzip header and + trailer decoding by inflate(). NO_GZIP would be used to avoid linking in + the crc code when it is not needed. For shared libraries, gzip decoding + should be left enabled. */ +#ifndef NO_GZIP +# define GUNZIP +#endif /* Possible inflate modes between inflate() calls */ typedef enum { - HEAD, /* i: waiting for magic header */ + HEAD = 16180, /* i: waiting for magic header */ FLAGS, /* i: waiting for method and flags (gzip) */ TIME, /* i: waiting for modification time (gzip) */ OS, /* i: waiting for extra flags and operating system (gzip) */ @@ -29,11 +32,13 @@ typedef enum { TYPE, /* i: waiting for type bits, including last-flag bit */ TYPEDO, /* i: same, but skip check to exit inflate on new block */ STORED, /* i: waiting for stored size (length and complement) */ + COPY_, /* i/o: same as COPY below, but only first time in */ COPY, /* i/o: waiting for input or output to copy stored block */ TABLE, /* i: waiting for dynamic block table lengths */ LENLENS, /* i: waiting for code length code lengths */ CODELENS, /* i: waiting for length/lit and distance code lengths */ - LEN, /* i: waiting for length/lit code */ + LEN_, /* i: same as LEN below, but only first time in */ + LEN, /* i: waiting for length/lit/eob code */ LENEXT, /* i: waiting for length extra bits */ DIST, /* i: waiting for distance code */ DISTEXT, /* i: waiting for distance extra bits */ @@ -50,19 +55,21 @@ typedef enum { /* State transitions between above modes - - (most modes can go to the BAD or MEM mode -- not shown for clarity) + (most modes can go to BAD or MEM on error -- not shown for clarity) Process header: - HEAD -> (gzip) or (zlib) - (gzip) -> FLAGS -> TIME -> OS -> EXLEN -> EXTRA -> NAME - NAME -> COMMENT -> HCRC -> TYPE + HEAD -> (gzip) or (zlib) or (raw) + (gzip) -> FLAGS -> TIME -> OS -> EXLEN -> EXTRA -> NAME -> COMMENT -> + HCRC -> TYPE (zlib) -> DICTID or TYPE DICTID -> DICT -> TYPE + (raw) -> TYPEDO Read deflate blocks: - TYPE -> STORED or TABLE or LEN or CHECK - STORED -> COPY -> TYPE - TABLE -> LENLENS -> CODELENS -> LEN - Read deflate codes: + TYPE -> TYPEDO -> STORED or TABLE or LEN_ or CHECK + STORED -> COPY_ -> COPY -> TYPE + TABLE -> LENLENS -> CODELENS -> LEN_ + LEN_ -> LEN + Read deflate codes in fixed or dynamic block: LEN -> LENEXT or LIT or TYPE LENEXT -> DIST -> DISTEXT -> MATCH -> LEN LIT -> LEN @@ -70,23 +77,26 @@ typedef enum { CHECK -> LENGTH -> DONE */ -/* state maintained between inflate() calls. Approximately 7K bytes. */ +/* State maintained between inflate() calls -- approximately 7K bytes, not + including the allocated sliding window, which is up to 32K bytes. */ struct inflate_state { + z_streamp strm; /* pointer back to this zlib stream */ inflate_mode mode; /* current inflate mode */ int last; /* true if processing last block */ - int wrap; /* bit 0 true for zlib, bit 1 true for gzip */ + int wrap; /* bit 0 true for zlib, bit 1 true for gzip, + bit 2 true to validate check value */ int havedict; /* true if dictionary provided */ int flags; /* gzip header method and flags (0 if zlib) */ unsigned dmax; /* zlib header max distance (INFLATE_STRICT) */ unsigned long check; /* protected copy of check value */ unsigned long total; /* protected copy of output count */ - /* gz_headerp head; */ /* where to save gzip header information */ + /* gz_headerp head; */ /* where to save gzip header information */ /* sliding window */ unsigned wbits; /* log base 2 of requested window size */ unsigned wsize; /* window size or zero if not using window */ unsigned whave; /* valid bytes in the window */ - unsigned write; /* window write index */ - unsigned char *window; /* allocated sliding window, if needed */ + unsigned wnext; /* window write index */ + unsigned char *window; /* allocated sliding window, if needed */ /* bit accumulator */ unsigned long hold; /* input bit accumulator */ unsigned bits; /* number of bits in "in" */ @@ -96,8 +106,8 @@ struct inflate_state { /* for table and code decoding */ unsigned extra; /* extra bits needed */ /* fixed and dynamic code tables */ - code const *lencode; /* starting table for length/literal codes */ - code const *distcode; /* starting table for distance codes */ + code const *lencode; /* starting table for length/literal codes */ + code const *distcode; /* starting table for distance codes */ unsigned lenbits; /* index bits for lencode */ unsigned distbits; /* index bits for distcode */ /* dynamic table building */ @@ -105,15 +115,11 @@ struct inflate_state { unsigned nlen; /* number of length code lengths */ unsigned ndist; /* number of distance code lengths */ unsigned have; /* number of code lengths in lens[] */ - code *next; /* next available space in codes[] */ + code *next; /* next available space in codes[] */ unsigned short lens[320]; /* temporary storage for code lengths */ unsigned short work[288]; /* work area for code table building */ code codes[ENOUGH]; /* space for code tables */ + int sane; /* if false, allow invalid distance too far */ + int back; /* bits back of last unprocessed length/lit */ + unsigned was; /* initial length of match */ }; - -/* Reverse the bytes in a 32-bit value */ -#define REVERSE(q) \ - ((((q) >> 24) & 0xff) + (((q) >> 8) & 0xff00) + \ - (((q) & 0xff00) << 8) + (((q) & 0xff) << 24)) - -#endif diff --git a/lib/zlib_inflate/inftrees.c b/lib/zlib_inflate/inftrees.c index 028943052926..a3c64216b5c7 100644 --- a/lib/zlib_inflate/inftrees.c +++ b/lib/zlib_inflate/inftrees.c @@ -1,5 +1,5 @@ /* inftrees.c -- generate Huffman trees for efficient decoding - * Copyright (C) 1995-2005 Mark Adler + * Copyright (C) 1995-2017 Mark Adler * For conditions of distribution and use, see copyright notice in zlib.h */ @@ -36,11 +36,11 @@ int zlib_inflate_table(codetype type, unsigned short *lens, unsigned codes, unsigned fill; /* index for replicating entries */ unsigned low; /* low bits for current root entry */ unsigned mask; /* mask for low root bits */ - code this; /* table entry for duplication */ - code *next; /* next available space in table */ + code here; /* table entry for duplication */ + code *next; /* next available space in table */ const unsigned short *base; /* base value table to use */ const unsigned short *extra; /* extra bits table to use */ - int end; /* use base and extra for symbol > end */ + unsigned match; /* use base and extra for symbol >= match */ unsigned short count[MAXBITS+1]; /* number of codes of each length */ unsigned short offs[MAXBITS+1]; /* offsets in table for each length */ static const unsigned short lbase[31] = { /* Length codes 257..285 base */ @@ -48,7 +48,7 @@ int zlib_inflate_table(codetype type, unsigned short *lens, unsigned codes, 35, 43, 51, 59, 67, 83, 99, 115, 131, 163, 195, 227, 258, 0, 0}; static const unsigned short lext[31] = { /* Length codes 257..285 extra */ 16, 16, 16, 16, 16, 16, 16, 16, 17, 17, 17, 17, 18, 18, 18, 18, - 19, 19, 19, 19, 20, 20, 20, 20, 21, 21, 21, 21, 16, 201, 196}; + 19, 19, 19, 19, 20, 20, 20, 20, 21, 21, 21, 21, 16, 77, 202}; static const unsigned short dbase[32] = { /* Distance codes 0..29 base */ 1, 2, 3, 4, 5, 7, 9, 13, 17, 25, 33, 49, 65, 97, 129, 193, 257, 385, 513, 769, 1025, 1537, 2049, 3073, 4097, 6145, @@ -101,15 +101,15 @@ int zlib_inflate_table(codetype type, unsigned short *lens, unsigned codes, if (count[max] != 0) break; if (root > max) root = max; if (max == 0) { /* no symbols to code at all */ - this.op = (unsigned char)64; /* invalid code marker */ - this.bits = (unsigned char)1; - this.val = (unsigned short)0; - *(*table)++ = this; /* make a table to force an error */ - *(*table)++ = this; + here.op = (unsigned char)64; /* invalid code marker */ + here.bits = (unsigned char)1; + here.val = (unsigned short)0; + *(*table)++ = here; /* make a table to force an error */ + *(*table)++ = here; *bits = 1; return 0; /* no symbols, but wait for decoding to report error */ } - for (min = 1; min < MAXBITS; min++) + for (min = 1; min < max; min++) if (count[min] != 0) break; if (root < min) root = min; @@ -152,11 +152,10 @@ int zlib_inflate_table(codetype type, unsigned short *lens, unsigned codes, entered in the tables. used keeps track of how many table entries have been allocated from the - provided *table space. It is checked when a LENS table is being made - against the space in *table, ENOUGH, minus the maximum space needed by - the worst case distance code, MAXD. This should never happen, but the - sufficiency of ENOUGH has not been proven exhaustively, hence the check. - This assumes that when type == LENS, bits == 9. + provided *table space. It is checked for LENS and DIST tables against + the constants ENOUGH_LENS and ENOUGH_DISTS to guard against changes in + the initial root table size constants. See the comments in inftrees.h + for more information. sym increments through all symbols, and the loop terminates when all codes of length max, i.e. all codes, have been processed. This @@ -168,19 +167,17 @@ int zlib_inflate_table(codetype type, unsigned short *lens, unsigned codes, switch (type) { case CODES: base = extra = work; /* dummy value--not used */ - end = 19; + match = 20; break; case LENS: base = lbase; - base -= 257; extra = lext; - extra -= 257; - end = 256; + match = 257; break; - default: /* DISTS */ + default: /* DISTS */ base = dbase; extra = dext; - end = -1; + match = 0; } /* initialize state for loop */ @@ -195,24 +192,25 @@ int zlib_inflate_table(codetype type, unsigned short *lens, unsigned codes, mask = used - 1; /* mask for comparing low */ /* check available table space */ - if (type == LENS && used >= ENOUGH - MAXD) + if ((type == LENS && used > ENOUGH_LENS) || + (type == DISTS && used > ENOUGH_DISTS)) return 1; /* process all codes and make table entries */ for (;;) { /* create table entry */ - this.bits = (unsigned char)(len - drop); - if ((int)(work[sym]) < end) { - this.op = (unsigned char)0; - this.val = work[sym]; + here.bits = (unsigned char)(len - drop); + if (work[sym] + 1U < match) { + here.op = (unsigned char)0; + here.val = work[sym]; } - else if ((int)(work[sym]) > end) { - this.op = (unsigned char)(extra[work[sym]]); - this.val = base[work[sym]]; + else if (work[sym] >= match) { + here.op = (unsigned char)(extra[work[sym] - match]); + here.val = base[work[sym] - match]; } else { - this.op = (unsigned char)(32 + 64); /* end of block */ - this.val = 0; + here.op = (unsigned char)(32 + 64); /* end of block */ + here.val = 0; } /* replicate for those indices with low len bits equal to huff */ @@ -221,7 +219,7 @@ int zlib_inflate_table(codetype type, unsigned short *lens, unsigned codes, min = fill; /* save offset to next table */ do { fill -= incr; - next[(huff >> drop) + fill] = this; + next[(huff >> drop) + fill] = here; } while (fill != 0); /* backwards increment the len-bit code huff */ @@ -263,7 +261,8 @@ int zlib_inflate_table(codetype type, unsigned short *lens, unsigned codes, /* check for enough space */ used += 1U << curr; - if (type == LENS && used >= ENOUGH - MAXD) + if ((type == LENS && used > ENOUGH_LENS) || + (type == DISTS && used > ENOUGH_DISTS)) return 1; /* point entry in root table to sub-table */ @@ -274,38 +273,14 @@ int zlib_inflate_table(codetype type, unsigned short *lens, unsigned codes, } } - /* - Fill in rest of table for incomplete codes. This loop is similar to the - loop above in incrementing huff for table indices. It is assumed that - len is equal to curr + drop, so there is no loop needed to increment - through high index bits. When the current sub-table is filled, the loop - drops back to the root table to fill in any remaining entries there. - */ - this.op = (unsigned char)64; /* invalid code marker */ - this.bits = (unsigned char)(len - drop); - this.val = (unsigned short)0; - while (huff != 0) { - /* when done with sub-table, drop back to root table */ - if (drop != 0 && (huff & mask) != low) { - drop = 0; - len = root; - next = *table; - this.bits = (unsigned char)len; - } - - /* put invalid code marker in table */ - next[huff >> drop] = this; - - /* backwards increment the len-bit code huff */ - incr = 1U << (len - 1); - while (huff & incr) - incr >>= 1; - if (incr != 0) { - huff &= incr - 1; - huff += incr; - } - else - huff = 0; + /* fill in remaining table entry if code is incomplete (guaranteed to have + at most one remaining entry, since if the code is incomplete, the + maximum code length that was allowed to get this far is one bit) */ + if (huff != 0) { + here.op = (unsigned char)64; /* invalid code marker */ + here.bits = (unsigned char)(len - drop); + here.val = (unsigned short)0; + next[huff] = here; } /* set return parameters */ diff --git a/lib/zlib_inflate/inftrees.h b/lib/zlib_inflate/inftrees.h index b70b4731ac7a..fe4307fcfbe3 100644 --- a/lib/zlib_inflate/inftrees.h +++ b/lib/zlib_inflate/inftrees.h @@ -1,8 +1,5 @@ -#ifndef INFTREES_H -#define INFTREES_H - /* inftrees.h -- header to use inftrees.c - * Copyright (C) 1995-2005 Mark Adler + * Copyright (C) 1995-2005, 2010 Mark Adler * For conditions of distribution and use, see copyright notice in zlib.h */ @@ -38,22 +35,28 @@ typedef struct { 01000000 - invalid code */ -/* Maximum size of dynamic tree. The maximum found in a long but non- - exhaustive search was 1444 code structures (852 for length/literals - and 592 for distances, the latter actually the result of an - exhaustive search). The true maximum is not known, but the value - below is more than safe. */ -#define ENOUGH 2048 -#define MAXD 592 +/* Maximum size of the dynamic table. The maximum number of code structures is + 1444, which is the sum of 852 for literal/length codes and 592 for distance + codes. These values were found by exhaustive searches using the program + examples/enough.c found in the zlib distribtution. The arguments to that + program are the number of symbols, the initial root table size, and the + maximum bit length of a code. "enough 286 9 15" for literal/length codes + returns returns 852, and "enough 30 6 15" for distance codes returns 592. + The initial root table size (9 or 6) is found in the fifth argument of the + inflate_table() calls in inflate.c and infback.c. If the root table size is + changed, then these maximum sizes would be need to be recalculated and + updated. */ +#define ENOUGH_LENS 852 +#define ENOUGH_DISTS 592 +#define ENOUGH (ENOUGH_LENS+ENOUGH_DISTS) -/* Type of code to build for inftable() */ +/* Type of code to build for inflate_table() */ typedef enum { CODES, LENS, DISTS } codetype; -extern int zlib_inflate_table (codetype type, unsigned short *lens, +int zlib_inflate_table (codetype type, unsigned short *lens, unsigned codes, code **table, unsigned *bits, unsigned short *work); -#endif -- 2.31.1.365.ga2a05a39c5