Re: btrfs regression since 4.X kernel NULL pointer dereference

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 




Am 11.09.2015 um 21:05 schrieb Jeff Mahoney:
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On 9/11/15 2:55 PM, Jeff Mahoney wrote:
On 8/25/15 5:00 AM, Christoph Hellwig wrote:
I think this is btrfs using a struct block_device that doesn't
have a valid queue pointer in it's gendisk for ->s_bdev.  And
there are some fishy looking ->s_bdev assignments in the code
which I suspect are related to it:

fs/btrfs/dev-replace.c: if (fs_info->sb->s_bdev ==
src_device->bdev) fs/btrfs/dev-replace.c: fs_info->sb->s_bdev =
tgt_device->bdev; fs/btrfs/volumes.c:     if (device->bdev ==
root->fs_info->sb->s_bdev) fs/btrfs/volumes.c:
root->fs_info->sb->s_bdev = next_device->bdev;
fs/btrfs/volumes.c: if (tgtdev->bdev == fs_info->sb->s_bdev)
fs/btrfs/volumes.c: fs_info->sb->s_bdev = next_device->bdev;

The report at https://bugzilla.kernel.org/show_bug.cgi?id=100911
tracks it down a bit further and it's bdev->bd_disk == NULL instead
of the queue in the gendisk. I don't think that the s_bdev stuff
is related, though I'd certainly love to see that bit go away.

If we're calling blk_get_backing_dev_info, that means we're
already using an inode that has blockdev_superblock and the btrfs
superblock isn't even involved.

We're getting there because btrfs_evict_inode ->
btrfs_wait_ordered_range -> btrfs_fdatawrite_range ->
filemap_fdatawrite_range gets called with inode->i_mapping.  That
mapping gets passed down through __filemap_fdatawrite_range to
wbc_attach_fdatawrite_inode where the inode passed is mapping->host
-- which will be the block device inode rather than the btrfs
device node inode.  That inode is the one ultimately checked in
inode_to_bdi.

So it looks like we're causing writeback on an unrelated block
device that was opened using a device node hosted on btrfs, which
is obviously wrong.

I don't think snapshot removal is even a requirement to trigger
this. I expect it's possible to trigger with two device nodes for
the same block device where one is getting closed and cleaned up
while the eviction of the other happens.  The device nodes wouldn't
even need to be on the same fs.

Other file systems use &inode->i_data in eviction.  Is it that
simple here?

Your patch works fine here. Did some simple tests already.

Thanks!

Stefan


Incidentally, this explanation also covers why I was unable to
reproduce it locally.  SLES systems use devtmpfs and I just bind
mounted it into my chroot environment like I normally would.  When I
cp'd /dev into the test environment, I was able to reproduce immediately
.

- -Jeff

- --
Jeff Mahoney
SUSE Labs
-----BEGIN PGP SIGNATURE-----
Version: GnuPG/MacGPG2 v2.0.19 (Darwin)

iQIcBAEBAgAGBQJV8yYAAAoJEB57S2MheeWy2BwP+QGdpsErIfyHJcx95LLrvsxu
n0kBoI4Jd5yfNxp8m+Ll3xgUdsd6rKHJV2Muq8aRdNEdzf1E0DFrRcE0d1W5UrJy
lPzrA8QxCVaLf5jFysFp0xygKbLKHGmOAv2KnAGYFw6exIjb344UnZb6aiw5Uekm
DqrTmEq+0Yb/mE04GVpWMylK6pkDOhgkOzFVZa1Pff0eKY4E61G5GtmA2kNAUP9v
CsoZ0FO1WdF2Fc9ONSPjq7FdZLKH+OmIVakHnaELa8EEM3W7NU+mxLRabznBV25e
L/KPjr+awzkhV1ieyAAww/dddE3bN5nmDOq+OgvA9WPgaRvvwne2tHVTFaxHoiHg
d8oHDLkC1/Z1MqINLi5dZNsSuIWMvRhIMV9Th5F2rdWxrBCSRvID7N+Z2HHh6mJC
Q9rgSOyYKclTam6IF7yX8lDWIqkAnoA6OxvOKRccgr3hS/u4DzVtRmWHO9RblEi+
a9dF2FCP+v+Lgdb8C5n7XUixrtF5H6BWHhmArgjmxD6iyeXOmphyGrgqmSLdY1s9
sakvLrSB9i3O27CKoup2OHyF6MOdgsaa90FZLPLt6BDrCTWAscd0LDy8MbaKgKCR
kjfSTiwNydzZfkJixH71U/1mGbuB9nqf6jrNWCQdE5f57MSCEwiFqQvaD1KK+Uug
ZW2Bz1VQxkOvGbYiJ4HV
=ic4+
-----END PGP SIGNATURE-----

--
To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html



[Index of Archives]     [Linux Ext4 Filesystem]     [Union Filesystem]     [Filesystem Testing]     [Ceph Users]     [Ecryptfs]     [AutoFS]     [Kernel Newbies]     [Share Photos]     [Security]     [Netfilter]     [Bugtraq]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux Cachefs]     [Reiser Filesystem]     [Linux RAID]     [Samba]     [Device Mapper]     [CEPH Development]
  Powered by Linux