On 5/3/21 10:41 AM, Linus Torvalds wrote: > On Fri, Apr 30, 2021 at 1:46 PM Vineet Gupta <Vineet.Gupta1@xxxxxxxxxxxx> wrote: >> I've hit a mainline gcc 10.2 (also gcc 9.3) bug which triggers at -O3 >> causing wrong codegen. > So it does seem to be a gcc bug or at least mis-feature where gcc gets > the aliasing decision wrong when vectorizing the code. > > That said, the fact that gcc even tries to vectorize the code shows > how pointless it was for us to (years ago) try to make it use 16-bit > accesses in the first place. > > So can you try this patch that basically reverts some of those > kernel-specific changes to zlib's inffast.c - and does so by just > upgrading it to a newer version from a more modern zlib (which in this > case still means "from 2017", but that's the most recent version there > is). > > This is a fairly quick hack, and I really tried to keep it to just > inffast.c and inftrees.c with a few minimal fixups to inflate.c > itself. > > Most of the changes are for naming (which seems to be at least partly > for C++ namespace reasons, ie "this" -> "here"), but there's some > cleanup wrt maximum table sizes etc too. > > NOTE! I have not tested this patch in any other way than "it compiles > with allmodconfig". The diff looks reasonable to me, but that's all > I'm really going to say. > > Does this avoid the gcc -O3 problem for you? Yes it does. I built the following: 2021-05-03 b93bedcf8fa4 Update zlib inffast code to more modern zlib 2021-02-26 ea680985468f ARC: fix CONFIG_HARDENED_USERCOPY 2021-04-23 f7118f8ada1b ARC: entry: fix off-by-one error in syscall number validation 2021-04-21 1cb7eefda7ed ARC: kgdb: add 'fallthrough' to prevent a warning 2021-03-22 163630b2d95b arc: Fix typos/spellos 2021-04-11 d434405aaab7 Linux 5.12-rc7 And it seems to be DTRT | Linux version 5.12.0-rc7-00005-gb93bedcf8fa4 (vineetg@vineetg-Latitude-7400) (arc-linux-gcc.br_real (Buildroot 2021.02-6-g5e29ba7bf732) 10.2.0, GNU ld (GNU Binutils) 2.36.1) #678 PREEMPT Mon May 3 11:29:32 PDT 2021 | Memory @ 80000000 [1024M] | ... | with environment: | HOME=/ | TERM=linux | Starting syslogd: OK | Starting klogd: OK | Running sysctl: OK | $ | $ zcat /proc/config.gz | egrep "(OPTIM|COMPRESSION_GZIP)" | CONFIG_INITRAMFS_COMPRESSION_GZIP=y | # CONFIG_CC_OPTIMIZE_FOR_PERFORMANCE is not set | CONFIG_CC_OPTIMIZE_FOR_PERFORMANCE_O3=y | | $ free -m | total used free shared buff/cache available | Mem: 1012 3 978 31 31 972 | Swap: | > It would be lovely if somebody also took a look at some of the other > zlib code, like inflate.c itself. But some of that code has rather > subtle changes for things like s390 hardware support, and we have > thihngs like our fallthrough rules etc, so it gets a bit hairier. I took a quick look, but there some to be subtle state machine changes in inflate.c which I'm not comfortable mucking around with. > Which is why I did just this fairly minimal part. > > Note that the commit message has a "Not-yet-signed-off-by:" from me. > That's purely because I wanted it to be obvious that this needs a lot > more testing - I'm not comfy with this patch unless somebody takes it > upon themselves to actually test it under different loads (and > different architectures). Maybe give it some time to bake in linux-next for a 5.14 inclusion ? Thx again for jumping in and steering gcc folks to right conclusions. -Vineet