On Fri, Nov 10, 2017 at 03:26:27PM -0800, Patrick McLean wrote:
>
>
> On 2017-11-10 10:42 AM, Linus Torvalds wrote:
> > On Thu, Nov 9, 2017 at 5:58 PM, Patrick McLean <chutzpah@xxxxxxxxxx> wrote:
> >>
> >> Something must have changed since 4.13.8 to trigger this though.
> >
> > Arnd pointed to some commits that might be relevant for the cp210x
> > module, but those are all already in 4.13.8, so if 4.13.8 really is
> > rock solid for you, I don't think that's it.
> >
> > I really don't see anything that looks even half-way suspicious in
> > that 4.13.8..11 range. But as mentioned, compiler interactions can be
> > _really_ subtle.
> >
> > And hey, it can be a real kernel bug too, that just happens to be
> > exposed by RANDSTRUCT, so a bisect really would be very nice.
>
> I am working on bisecting the issue now, but I think I have some more
> evidence pointing to a compiler issue related to RANDSTRUCT. There are
> actually 3 issues that we have seen. Sometimes we get the null pointer
> deref in the initial message, sometimes we get the GPF, and sometimes we
> see an issue where the NFS clients see all files as root-owned
> directories.

That suggests that stat.uid is 0 and stat.mode & S_IFMT is 0040000 in
the stat structure that nfsd passed to vfs_getattr().

No idea what sort of information is useful when tracking down this kind
of bug, but you could also run wireshark and take a look at the server's
GETATTR replies to see if there's some other corruption.

--b.

> Any given kernel will always see the same issue, but after
> a "make mrproper" and recompile (with the same .config), the issue will
> often change. I suspect that all 3 of these problems are actually the
> same issue manifesting itself in different ways depending on what seed
> the RANDSTRUCT gcc plugin is using.
>
> >
> > Because in the end, compiler bugs are very rare. They are particularly
> > annoying when they do happen, though, so they loom big in the mind of
> > people who have had to chase them down.
> >
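
For reference, the "root-owned directory" symptom mentioned above (uid 0 and mode & S_IFMT equal to 0040000) could be checked from an NFS client with something like the sketch below. It is a userspace illustration only, not kernel or nfsd code; the function names and the idea of scanning a mount point are assumptions made for the example.

```python
import os
import stat


def looks_like_root_owned_dir(st_uid, st_mode):
    """True when the attributes match the reported symptom:
    owner uid 0 and a file-type field of S_IFDIR (0o040000)."""
    return st_uid == 0 and stat.S_IFMT(st_mode) == stat.S_IFDIR


def suspicious_entries(path):
    """Yield names under `path` whose attributes, as seen by this client,
    look like a root-owned directory. `path` is a hypothetical mount point."""
    for name in sorted(os.listdir(path)):
        st = os.lstat(os.path.join(path, name))
        if looks_like_root_owned_dir(st.st_uid, st.st_mode):
            yield name
```

If every entry on the export shows up in suspicious_entries(), including ones known to be regular files owned by ordinary users, that would be consistent with the corrupted attributes described above. Genuine root-owned directories match too, so the output only means something when compared against what the server's filesystem actually contains.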