On Fri, Nov 10, 2017 at 6:36 PM, Linus Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxx> wrote:
> [ Bringing in the gcc plugin people and the kernel hardening list,
> since it now is no longer even remotely looking like a nfsd, vfs or
> filesystem issue any more ]
>
> Kees, Emese,
> the whole thread is on lkml, but there's clearly something horribly
> wrong with RANDSTRUCT, and it's not new even though it looked that way
> for a while.

It wouldn't be the first issue we've seen; it's (obviously) a pretty
aggressive change to the resulting build.

> Patrick seems to trigger it with nfsd, so it might be specific to that.
>
> Alternatively, it might just be that very few people run
> RANDSTRUCT-built kernels, or just have been lucky with the seeding.

Given its potential cache-line abuse, I'm not surprised that its usage
is more limited than other features.

> Sorry for top-posting, but there's not really anything in the email
> itself to reply to, other than saying thanks to Patrick for narrowing
> it down like this.

Agreed; thanks Patrick! :)

Given that the issue is non-deterministic, I wonder if the bug is
related to some kind of missing RCU or barrier that goes unnoticed in
normal struct layouts.

> It would have been very interesting if it had actually bisected to
> something, but it seems that the real issue is just the choice of
> seeding for RANDSTRUCT.

That's where we've seen bugs in the past: some pathological ordering of
a struct uncovers a corner case. In the past it's been much more
deterministic: doesn't build, or immediately crashes on boot, etc.

I'll take a closer look at this and see if I can provide something to
narrow it down.

-Kees

>
> Linus
>
> On Fri, Nov 10, 2017 at 4:27 PM, Patrick McLean <chutzpah@xxxxxxxxxx> wrote:
>> On 2017-11-10 03:26 PM, Patrick McLean wrote:
>>> On 2017-11-10 10:42 AM, Linus Torvalds wrote:
>>>>
>>>> I really don't see anything that looks even half-way suspicious in
>>>> that 4.13.8..11 range. But as mentioned, compiler interactions can be
>>>> _really_ subtle.
>>>>
>>>> And hey, it can be a real kernel bug too, that just happens to be
>>>> exposed by RANDSTRUCT, so a bisect really would be very nice.
>>>
>>> I am working on bisecting the issue now, but I think I have some more
>>> evidence pointing to a compiler issue related to RANDSTRUCT. There are
>>> actually 3 issues that we have seen. Sometimes we get the null pointer
>>> deref in the initial message, sometimes we get the GPF, and sometimes we
>>> see an issue where the NFS clients see all files as root-owned
>>> directories. Any given kernel will always see the same issue, but after
>>> a "make mrproper" and recompile (with the same .config), the issue will
>>> often change. I suspect that all 3 of these problems are actually the
>>> same issue manifesting itself in different ways depending on what seed
>>> the RANDSTRUCT gcc plugin is using.
>>
>> Further update on this, using the same seed for RANDSTRUCT, I have
>> reproduced this issue on v4.13.0, so it does not seem to be recently
>> introduced. The older kernel apparently only worked for us because we
>> were lucky. Generally we always compile new kernels from a fresh tree,
>> so they are never using the same seed.
>>
>> In case someone wants to play with this, here are some interesting seeds
>> (in include/generated/randomize_layout_hash.h):
>>
>> Produce a NULL pointer dereference (though I am not sure what the client
>> does to produce this).
>> 5970d6494d0f4236ec57147a46e700f4f501536236d96f6f68ea223e06a258bc
>>
>> All files for nfsd4 clients appear as directories owned as root, no
>> matter the real owner (this happens for all clients we have tested):
>> 3f158cd1014800ce5eb6c1f532ac64f2357fdb9a684096557d2fbb1d281f325e
>>
>> This is the seed that was breaking motherboards (make sure you have a
>> way to flash the BIOS with this one):
>> 3e32f2d1b4a3dde9f2fd95151386cd1d5bd6167597a0b868f6273aabfc5712dd
>>
>> Finally, here is a seed that produces a kernel that does not exhibit any
>> problems we are aware of:
>> e8698c12137fcd1dcbff6d1ed97e5d766128447a27ce9f9d61e0cb8c05ad4d3b
>>
>>>>
>>>> Because in the end, compiler bugs are very rare. They are particularly
>>>> annoying when they do happen, though, so they loom big in the mind of
>>>> people who have had to chase them down.
>>>>

--
Kees Cook
Pixel Security