> For more context for folks at home eating popcorn and enjoying the
> show: https://github.com/ClangBuiltLinux/linux/issues/876#issuecomment-613049480.
> And that was specifically with KASAN enabled and doesn't appear to be
> common behavior in clang otherwise (higher threshold). Why the
> heuristics change for when it seems to be more profitable to roll
> assignment of contiguous members of the same struct to the same value
> into a memset, and 2 longs seems to be the threshold for KASAN, I
> don't know. But I agree that should be fixed on the compiler side,
> which is why I haven't been pushing the kernel workaround.

Given x86 has a simple 3-instruction loop for memset that will do
1 write/clock (the max on current CPUs), I doubt it is ever worth
not inlining memset().
The only real special case is lengths < 8.

For KASAN I wonder if something is stopping it inlining memset()?
So what usually happens is that the two stores get converted to memset()
and then the memset() gets inlined back to two stores?

OTOH all this faffing for memset and memcpy is probably a waste of time.

	David
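
For anyone following along, the transformation being discussed can be sketched roughly as below. This is only an illustration: `struct pair` and both function names are made up for the example and are not taken from the kernel or from the linked issue. Whether a given compiler actually turns the first form into the second depends on its version and build options.

```c
#include <string.h>

/* Hypothetical struct: two adjacent longs, i.e. the "2 longs" case above. */
struct pair {
	long a;
	long b;
};

/* What the source code says: two stores of the same value to
 * contiguous members of the same struct. */
static void clear_with_stores(struct pair *p)
{
	p->a = 0;
	p->b = 0;
}

/* What the compiler may substitute: a single memset() over both members.
 * If that call is then not inlined, it becomes a real function call. */
static void clear_with_memset(struct pair *p)
{
	memset(p, 0, sizeof(*p));
}
```

Both functions leave the struct in the same state; the only question in this thread is which form ends up in the generated code.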