On Thu, Jan 07, 2021 at 10:14:46PM +0000, Russell King - ARM Linux admin wrote:
> On Thu, Jan 07, 2021 at 10:48:05PM +0100, Arnd Bergmann wrote:
> > On Thu, Jan 7, 2021 at 5:27 PM Theodore Ts'o <tytso@xxxxxxx> wrote:
> > >
> > > On Thu, Jan 07, 2021 at 01:37:47PM +0000, Russell King - ARM Linux admin wrote:
> > > > > The gcc bugzilla mentions backports into gcc-linaro, but I do not see
> > > > > them in my git history.
> > > >
> > > > So, do we raise the minimum gcc version for the kernel as a whole to 5.1
> > > > or just for aarch64?
> > >
> > > Russell, Arnd, thanks so much for tracking down the root cause of the
> > > bug!
> >
> > There is one more thing that I wondered about when looking through
> > the ext4 code: Should it just call the crc32c_le() function directly
> > instead of going through the crypto layer? It seems that with Ard's
> > rework from 2018, that can just call the underlying architecture specific
> > implementation anyway.
>
> Yes, I've been wondering about that too. To me, it looks like the
> ext4 code performs a layering violation by going "under the covers"
> - there are accessor functions to set the CRC and retrieve it. ext4
> instead just makes the assumption that the CRC value is stored after
> struct shash_desc. Especially as the crypto/crc32c code references
> the value using:
>
> 	struct chksum_desc_ctx *ctx = shash_desc_ctx(desc);
>
> Not even crypto drivers are allowed to assume that desc+1 is where
> the CRC is stored.

It violates how the shash API is meant to be used in general, but there
is a test that enforces that the shash_desc_ctx for crc32c must be just
the single u32 crc value.  See alg_test_crc32c() in crypto/testmgr.c.
So it's apparently intended to work.

>
> However, struct shash_desc is already 128 bytes in size on aarch64,

Ard Biesheuvel recently sent a patch to reduce the alignment of struct
shash_desc to ARCH_SLAB_MINALIGN
(https://lkml.kernel.org/linux-crypto/20210107124128.19791-1-ardb@xxxxxxxxxx/),
since apparently most of the bloat is from alignment for DMA, which
isn't necessary.  I think that reduces the size by a lot on arm64.

> and the proper way of doing it via SHASH_DESC_ON_STACK() is overkill,
> being strangely 2 * sizeof(struct shash_desc) + 360 (which looks like
> another bug to me!)

Are you referring to the '2 * sizeof(struct shash_desc)' rather than
just 'sizeof(struct shash_desc)'?  As mentioned in the comment above
HASH_MAX_DESCSIZE, there can be a nested shash_desc due to HMAC.  So I
believe the value is correct.

> So, I agree with you wrt crc32c_le(), especially as it would be more
> efficient, and as the use of crc32c is already hard coded in the ext4
> code - not only with crypto_alloc_shash("crc32c", 0, 0) but also with
> the fixed-size structure in ext4_chksum().
>
> However, it's ultimately up to the ext4 maintainers to decide.

As I mentioned in my other response, crc32c_le() isn't a proper library
API (like some of the newer lib/crypto/ stuff) but rather just a wrapper
for the shash API, and it doesn't handle modules being dynamically
loaded/unloaded.  So switching to it may cause a performance regression.

What I'd recommend is making crc32c_le() able to call
architecture-specific implementations directly, similar to blake2s() and
chacha20() in lib/crypto/.  Then there would be no concern about when
modules get loaded, etc...

- Eric
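
For illustration, a rough, untested sketch of what the direct-call
approach being discussed could look like on the ext4 side, assuming the
crc32c() library function from <linux/crc32c.h> (crc32c_le() is defined
there as a backwards-compatibility alias for it); this is not a patch
from the thread, just a sketch of the idea:

	#include <linux/crc32c.h>

	/*
	 * Hypothetical variant of ext4_chksum(): the on-stack
	 * { struct shash_desc + 4-byte ctx } and the sbi->s_chksum_driver
	 * tfm would no longer be needed; sbi is kept only so that the
	 * callers' signature does not change.
	 */
	static inline u32 ext4_chksum(struct ext4_sb_info *sbi, u32 crc,
				      const void *address, unsigned int length)
	{
		return crc32c(crc, address, length);
	}

Whether this is acceptable hinges on the point above: today crc32c()
dispatches through the shash API, so it only picks up an accelerated
implementation if the corresponding module happens to be loaded at the
time, unlike the blake2s()/chacha20() style lib/crypto/ interfaces.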