[ adding btrfs ]

On Tue, Feb 2, 2016 at 3:19 PM, Al Viro <viro@xxxxxxxxxxxxxxxxxx> wrote:
> On Tue, Feb 02, 2016 at 04:11:42PM -0700, Ross Zwisler wrote:
>
>> However, for raw block devices and for XFS with a real-time device, the
>> value in inode->i_sb->s_bdev is not correct. With the code as it is
>> currently written, an fsync or msync to a DAX enabled raw block device will
>> cause a NULL pointer dereference kernel BUG. For this to work correctly we
>> need to ask the block device or filesystem what struct block_device is
>> appropriate for our inode.
>>
>> To that end, add a get_bdev(struct inode *) entry point to struct
>> super_operations. If this function pointer is non-NULL, this notifies DAX
>> that it needs to use it to look up the correct block_device. If
>> i_sb->get_bdev() is NULL DAX will default to inode->i_sb->s_bdev.
>
> Umm... It assumes that bdev will stay pinned for as long as inode is
> referenced, presumably? If so, that needs to be documented (and verified
> for existing fs instances). In principle, multi-disk fs might want to
> support things like "silently move the inodes backed by that disk to other
> ones"...

I assume btrfs is the only fs we have that might reassign the bdev for
a given inode on the fly? Hopefully we don't need anything stronger
than rcu_read_lock() to pin the result as valid.

At least in this case the initial user is dax-fsync where the
->get_bdev() answer should be static for the life of the inode, and
btrfs does not currently interface with dax. But yes, we need to get
the expected semantics clear.
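
To make the lookup under discussion concrete, here is a rough userspace
sketch of the fallback the patch description outlines: use ->get_bdev()
when the filesystem provides it, otherwise fall back to i_sb->s_bdev.
This is not the patch itself. The structs are stripped-down stand-ins
for the kernel types, and dax_get_bdev(), example_get_bdev() and the
i_private_bdev field are hypothetical names chosen for illustration.
It also assumes the hook is reached through i_sb->s_op, since the
description adds it to struct super_operations.

```c
/* Userspace mock-up, not kernel code: these structs only mimic the
 * few fields of the kernel types that the discussion touches. */
#include <assert.h>
#include <stddef.h>

struct block_device { int id; };
struct inode;

struct super_operations {
	/* Proposed optional hook: per-inode block_device lookup. */
	struct block_device *(*get_bdev)(struct inode *inode);
};

struct super_block {
	const struct super_operations *s_op;
	struct block_device *s_bdev;	/* default backing device */
};

struct inode {
	struct super_block *i_sb;
	struct block_device *i_private_bdev;	/* hypothetical field */
};

/* Hypothetical helper: pick the block_device DAX should use.
 * Says nothing about how long the result stays valid, which is
 * exactly the open question about pinning. */
static struct block_device *dax_get_bdev(struct inode *inode)
{
	struct super_block *sb;

	assert(inode != NULL && inode->i_sb != NULL);
	sb = inode->i_sb;

	/* A non-NULL hook means the fs wants to answer per inode. */
	if (sb->s_op && sb->s_op->get_bdev)
		return sb->s_op->get_bdev(inode);

	/* Otherwise default to the superblock's device. */
	return sb->s_bdev;
}

/* Hypothetical hook for a fs whose inodes can live on a device
 * other than s_bdev (e.g. a real-time device). */
static struct block_device *example_get_bdev(struct inode *inode)
{
	return inode->i_private_bdev;
}
```

Note the sketch deliberately leaves out any locking or reference
counting around the returned pointer; whether rcu_read_lock() or
something stronger is needed depends on the semantics we settle on.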