Re: [PATCH] fs: buffer: Check to avoid NULL pointer dereference of returned buffer_head for a private page

Ron Enfield <rhmcruiser@xxxxxxxxx> · Sun, 18 Aug 2019 14:57:40 +1000

On Sun, Aug 18, 2019 at 2:31 PM Al Viro <viro@xxxxxxxxxxxxxxxxxx> wrote:
>
> On Thu, Aug 15, 2019 at 01:09:11AM +1000, Monthero Ronald wrote:
> > The patch checks for this condition of NULL pointer for the buffer_head returned from page_buffers()
> > and also a check placed within the list traversal loop for next buffer_head structs.
> >
> > crash scenario:
> > The buffer_head returned from page_buffers() is not checked in block_invalidatepage_range function.
> > The struct buffer_head* pointer returned by page_buffers(page) was 0x0, although this page had its
> > private flag PG_private bit set and was expected to have buffer_head structs attached.The NULL pointer
> > buffer_head was dereferenced in block_invalidatepage_range function at bh->b_size, where bh returned by
> > page_buffers(page) was 0x0.
> >
> > The stack frames were  truncate_inode_page() => do_invalidatepage_range() => xfs_vm_invalidatepage() =>
> >           [exception RIP: block_invalidatepage_range+132]
> >
> > The inode for truncate in this case was valid and had  proper inode.i_state = 0x20 - FREEING and had
> > a valid mapped address space to xfs. And the struct page in context of block_invalidatepage_range()
> > had its page flag PG_private set but the page.private was 0x0. So page_buffers(page) returned 0x0
> > and hence the crash.
> > This patch performs NULL pointer check for returned buffer_head. Applies to 3.16 and later kernels.
>
> ... and adds BUG_ON() for that.  The only real difference from an oops is
> that it's a bit easier to recognize.  Which may or may not be a good
> debugging strategy, but what's the point of having it in -stable?  Or
> anywhere other than the build on the boxen you are testing on...
>
> It doesn't fix the underlying bug.  It doesn't tell where the problem
> is.  It's definitely *not* a way to fix any bugs.  And while we
> are at it, the stuff in -stable ought to be backports from mainline.
>
> Can you reproduce your crashes on mainline?

No I don't have a reproducer.  This was an abrupt crash, where the
page has its private flag set but the buffer_head returned by
page_buffers( ) was 0x0. Having the BUG_ON check for the returned
buffer_head from page_buffers( ) would help to to see the BUG line
readily than to analyze the vmcore  and check the fault condition why
kernel panicked. Two cents here, but open to thoughts.
( This crash to reproduce is not trivial, nevertheless will attempt to
check if there are any mm tests or page mapping tests that can induce
this condition )