> > I haven't really followed KMSAN development but I would have expected
> > that it would, like other debugging tools, add its metadata to page_ext
> > rather than page directly.
>
> Yes, that would have been preferable. Also, I don't understand why we
> need an entire page to store whether each "bit" of a page is initialised.
> There are no CPUs which have bit-granularity stores; either you initialise
> an entire byte or not. So that metadata can shrink from 4096 bytes
> to 512.

It's not about bit-granularity stores, it's about whether individual
bits are initialized or not.

Consider the following struct:

  struct foo {
    char a:4;
    char b:4;
  } f;

If the user initializes f.a and then tries to use f.b, this is still
undefined behavior, which KMSAN is able to catch thanks to bit-to-bit
shadow, but would not be able to detect if we only stored one bit per
byte.
Another example is bit flags or bit masks, where you can set a single
bit in an int32, but that doesn't necessarily mean the rest of that
variable is initialized.

It's worth mentioning that even if we choose to shrink the shadows from
4096 to 512 bytes, there'd still be four-byte origin IDs, which are
allocated for every four bytes of program memory.
So a whole page of origins will still be required in addition to those
512 bytes of shadow.
(Origins are handy when debugging KMSAN reports, because a single uninit
value can be copied or modified multiple times before it is used in a
branch or passed to userspace.
Shrinking origins further would render them useless for e.g. 32-bit
local variables, which is quite a common use case.)
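
To make the struct foo argument and the size arithmetic concrete, here is a
rough userspace model in C. It is only a sketch and not KMSAN's actual
implementation: every name in it is hypothetical, a set shadow bit is taken
to mean "uninitialized", the page size is assumed to be 4096 bytes, f.a is
assumed to sit in the low nibble (bit-field layout is implementation-defined),
and the one-bit-per-byte scheme is assumed to mark a byte initialized on any
store to it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* All names below are hypothetical; this only models the idea. */
#define SKETCH_PAGE_SIZE      4096u /* assumed page size */
#define SKETCH_ORIGIN_GRANULE 4u    /* one origin ID per 4 bytes of memory */

/* Assumed layout of struct foo { char a:4; char b:4; } within its byte. */
#define FOO_A_MASK 0x0Fu
#define FOO_B_MASK 0xF0u

/* Bit-to-bit shadow: one shadow bit per data bit, 1 = uninitialized. */
static inline uint8_t bit_shadow_store(uint8_t shadow, uint8_t mask)
{
    return shadow & (uint8_t)~mask; /* only the written bits become clean */
}

static inline bool bit_shadow_is_uninit(uint8_t shadow, uint8_t mask)
{
    return (shadow & mask) != 0;
}

/* One shadow bit per data byte: any store marks the whole byte clean. */
static inline bool byte_shadow_store(bool poisoned, uint8_t mask)
{
    return mask ? false : poisoned;
}

static inline bool byte_shadow_is_uninit(bool poisoned, uint8_t mask)
{
    return mask != 0 && poisoned;
}

/* "f.a = ...; use(f.b);" under bit-to-bit shadow. */
static inline bool bit_shadow_flags_use_of_b(void)
{
    uint8_t shadow = 0xFF;                          /* f starts uninitialized */
    shadow = bit_shadow_store(shadow, FOO_A_MASK);  /* f.a = ... */
    return bit_shadow_is_uninit(shadow, FOO_B_MASK);/* use of f.b */
}

/* The same sequence with one shadow bit per byte. */
static inline bool byte_shadow_flags_use_of_b(void)
{
    bool poisoned = true;
    poisoned = byte_shadow_store(poisoned, FOO_A_MASK);
    return byte_shadow_is_uninit(poisoned, FOO_B_MASK);
}

/* Shadow bytes needed per page for 1..8 shadow bits per data byte. */
static inline size_t shadow_bytes_per_page(unsigned shadow_bits_per_data_byte)
{
    assert(shadow_bits_per_data_byte >= 1 && shadow_bits_per_data_byte <= 8);
    return (size_t)SKETCH_PAGE_SIZE * shadow_bits_per_data_byte / 8;
}

/* Origin bytes per page: a 4-byte ID for every 4 bytes of memory. */
static inline size_t origin_bytes_per_page(void)
{
    return (size_t)SKETCH_PAGE_SIZE / SKETCH_ORIGIN_GRANULE * sizeof(uint32_t);
}
```

In this model bit_shadow_flags_use_of_b() reports the use of f.b while
byte_shadow_flags_use_of_b() does not. shadow_bytes_per_page(8) and
shadow_bytes_per_page(1) give the 4096 and 512 figures discussed above, and
origin_bytes_per_page() stays at 4096 either way.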