From: Siarhei Liakh <siarhei.liakh@xxxxxxxxxxxxxxxxx> SHORT DESCRIPTION: There are two places in unlz4() function where reads beyond the end of a buffer happen under certain conditions which had been observed in real life on stock Ubuntu 20.04 x86_64 with several vanilla mainline kernels, including 5.10.0. As a result of this issue, the kernel fails to decompress LZ4-compressed initramfs with following message showing up in the logs: initramfs unpacking failed: Decoding failed Note that in most cases the affected system is still able to proceed with the boot process to completion. The patch had been tested on Ubuntu 20.04 x86_64 with kernel 5.4.66 and 5.10.0. LONG DESCRIPTION: Background. Not so long ago we've noticed that some of our Ubuntu 20.04 x86_64 test systems often fail to boot newly generated initramfs image. After extensive investigation we determined that a failure required the following combination for our 5.4.66-rt38 kernel with some additional custom patches: Real x86_64 hardware or QEMU UEFI boot Ubunutu 20.04 (or 20.04.1) x86_64 CONFIG_BLK_DEV_RAM=y in .config COMPRESS=lz4 in initramfs.conf Freshly compiled and installed kernel Freshly generated and installed initramfs image In our testing, such a combination would often produce a non-bootable system. It is important to note that [un]bootability of the system was later tracked down to particular instances of initramfs images, and would follow them if they were to be switched around/transferred to other systems. What is even more important is that consecutive re-generations of initramfs images from the same source and binary materials would yield about 75% of "bad" images. Further, once the image is identified as "bad", it always stays "bad"; once one is "good" it always stays "good". Reverting CONFIG_BLK_DEV_RAM to "m" (default in Ubuntu), or changing COMPRESS to "gzip" yields a 100% bootable system. Decompressing "bad" initramfs image with "unmkinitramfs" yields *exactly* the same set of binaries, as verified by matching MD5 sums to those from "good" image. Speculation. Based on general observations, it appears that Ubuntu's userland toolchain cannot consistently generate exactly the same compressed initramfs image, likely due to some variations in timestamps between the runs. This causes variations in compressed lz4 data stream. Further, either initramfs tools or lz4 libraries appear to pad compressed lz4 output to closest 4-byte boundary. lz4 v1.9.2 that ships with Ubuntu 20.04 appears to be able to handle such padding just fine, while lz4 (supposedly v1.8.3) within Linux kernel cannot. Several reports of somewhat similar behavior had been recently circulation through different bug tracking systems and discussion forums [1-4]. I also suspect only that systems which can mount permanent root directly (or with help of modules contained in first, supposedly uncompressed, part of initramfs, or the ones with statically linked modules) can actually complete the boot when LZ4 decompression fails. This would certainly explain why most of Ubuntu systems still manage to boot even after failing to decompress the image. CONFIG_BLK_DEV_RAM=y made system unbootable in our case since unconditional presence of ramdisk block device at time of failing initramfs changed how and where the kernel is looking for /init (basically, looking at ramdisk only). The facts. Regardless of whether Ubuntu 20.04 toolchain produces a valid lz4-compressed initramfs image or not, current version of unlz4() function in kernel has two code paths which had been observed attempting to read beyond the buffer end when presented with one of the "padded"/"bad" initramfs images generated by stock Ubuntu 20.04 toolchain. Some configurations of some 5.4 kernels are known to fail to boot in such cases. This behavior also becomes evident on vanilla 5.10.0-rc3 and 5.10.0-rc4 kernels with addition of two logging statements for corresponding edge cases, even though it does not prevent system from booting in most generic configurations. Further investigation is likely warranted to confirm whether userland toolchain contains any bugs and/or whether any of these cases constitute violation of LZ4 and/or initramfs specification. cc: stable@xxxxxxxxxxxxxxx Signed-off-by: Siarhei Liakh <siarhei.liakh@xxxxxxxxxxxxxxxxx> --- References [1] https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1835660 [2] https://github.com/linuxmint/mint20-beta/issues/90 [3] https://askubuntu.com/questions/1245458/getting-the-message-0-283078-initramfs-unpacking-failed-decoding-failed-wh [4] https://forums.linuxmint.com/viewtopic.php?t=323152 Please CC: me directly on all replies. lib/decompress_unlz4.c | 29 +++++++++++++++++++++++++++++ 1 file changed, 29 insertions(+) diff --git a/lib/decompress_unlz4.c b/lib/decompress_unlz4.c index c0cfcfd486be..a016643a6dc5 100644 --- a/lib/decompress_unlz4.c +++ b/lib/decompress_unlz4.c @@ -125,6 +125,21 @@ STATIC inline int INIT unlz4(u8 *input, long in_len, continue; } + if (chunksize == 0) { + /* + * Nothing to decode... + * FIXME: this could be an error condition due + * to invalid or corrupt data. However, some + * userspace tools had been observed producing + * otherwise valid initramfs images which happen + * to hit this condition. + * TODO: need to figure out whether the latest + * LZ4 and initramfs specifications allows for + * zero-sized chunks. + * See similar message below. + */ + break; + } if (posp) *posp += 4; @@ -179,6 +194,20 @@ STATIC inline int INIT unlz4(u8 *input, long in_len, else if (size < 0) { error("data corrupted"); goto exit_2; + } else if (size < 4) { + /* + * Ignore any undersized junk/padding... + * FIXME: this could be an error condition due + * to invalid or corrupt data. However, some + * userspace tools had been observed producing + * otherwise valid initramfs images which happen + * to hit this condition. + * TODO: need to figure out whether the latest + * LZ4 and initramfs specifications allows for + * small padding at the end of the chunk. + * See similar message above. + */ + break; } inp += chunksize; } -- 2.17.1