Re: [nfsd4] potentially hardware breaking regression in 4.14-rc and 4.13.11

On Fri, Nov 10, 2017 at 08:13:06PM -0500, J. Bruce Fields wrote:
> On Fri, Nov 10, 2017 at 03:26:27PM -0800, Patrick McLean wrote:
> > 
> > 
> > On 2017-11-10 10:42 AM, Linus Torvalds wrote:
> > > On Thu, Nov 9, 2017 at 5:58 PM, Patrick McLean <chutzpah@xxxxxxxxxx> wrote:
> > >>
> > >> Something must have changed since 4.13.8 to trigger this though.
> > > 
> > > Arnd pointed to some commits that might be relevant for the cp210x
> > > module, but those are all already in 4.13.8, so if 4.13.8 really is
> > > rock solid for you, I don't think that's it.
> > > 
> > > I really don't see anything that looks even half-way suspicious in
> > > that 4.13.8..11 range. But as mentioned, compiler interactions can be
> > > _really_ subtle.
> > > 
> > > And hey, it can be a real kernel bug too, that just happens to be
> > > exposed by RANDSTRUCT, so a bisect really would be very nice.
> > 
> > I am working on bisecting the issue now, but I think I have some more
> > evidence pointing to a compiler issue related to RANDSTRUCT. There are
> > actually 3 issues that we have seen. Sometimes we get the null pointer
> > deref in the initial message, sometimes we get the GPF, and sometimes we
> > see an issue where the NFS clients see all files as root-owned
> > directories.
> 
> That suggests that stat.uid is 0 and stat.mode & S_IFMT is 0040000 in
> the stat structure that nfsd passed to vfs_getattr().
> 
> No idea what sort of information is useful when tracking down this kind
> of bug, but you could also run wireshark and take a look at the server's
> GETATTR replies to see if there's some other corruption.

FWIW, having looked at some of the __bugger_layout users...  Compiler bugs
aside,
	* use in struct {dentry,inode,mount,block_device} has to go - cache
use patterns at hash lookups are _not_ something to play with like that.
	* struct file_lock and struct super_block - ditto, only it's not
hash lookups that hurt here.  struct vm_area_struct, while we are at it.
	* struct group_info - Cthulhu's pus-leaking warts, what's the point of
randomizing _that_?  No, really - here's the damn thing in all its glory:
struct group_info {
        atomic_t        usage;
        int             ngroups;
        kgid_t          gid[0];
} __randomize_layout;
I really hope that plugin does *not* try to move the ->gid[] anywhere...
Which leaves us a choice between putting ->usage first or second.  Sure,
every bit helps, but... even for security theatre that looks a bit too
pathetic.
	* struct vfsmount.  Wow.  All of log2(3!) bits.  Congratulations.
At least that's better than struct path.  Oh, wait - they'd done struct path
as well...

What the hell had they been doing?  Muscarine old-fashioned way?  Looks like
a mix of pointless and truly dangerous.  And then there are compiler bugs and
the charming effect on reproducibility...


