Re: kernel BUG at fs/xfs/xfs_aops.c:853! in kernel 4.13 rc6

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Tue, Oct 17, 2017 at 11:20:17AM +0200, Jan Kara wrote:
> The operation we are speaking about here is different. It is more along the
> lines of "release this device".  And in the current world of containers,
> mount namespaces, etc. it is not trivial for userspace to implement this
> using umount(2) as Ted points out. I believe we could do that by walking
> through all mount points of a superblock and unmounting them (and I don't
> want to get into a discussion how to efficiently implement that now but in
> principle the kernel has all the necessary information).

Yes, this is what I want.  And regardless of how efficiently or not
the kernel can implement such an operatoin, by definition it will be
more efficient than if we ahve to do it in userspace.  (And I don't
think it has to be super-efficient, since this is not a hot-path.  So
for the record, I wouldn't want to add any extra linked list
references, etc.)

> What I'm a bit concerned about is the "release device reference" part - for
> a block device to stop looking busy we have to do that however then the
> block device can go away and the filesystem isn't prepared to that - we
> reference sb->s_bdev in lots of places, we have buffer heads which are part
> of bdev page cache, and probably other indirect assumptions I forgot about
> now. One solution to this is to not just stop accessing the device but
> truly cleanup the filesystem up to a point where it is practically
> unmounted. I like this solution more but we have to be careful to block
> any access attemps high enough in VFS ideally before ever entering fs code.

Right, so first step would be to block access attempts high up in the
VFS.  The second would be to point any file descriptors at a revoked
NULL struct file, also redirect any task struct's CWD so it is as if
the directory had gotten rmdir'ed, and also munmap any mapped regions.
At that point, all of the file descriptors will be closed.  The third
step would be to do a syncfs(), which will force out any dirty pages.
And then finally, to call umount() in all of the namespaces, which
will naturally take care of any buffer or page cache references once
the ref count of the struct super goes to zero.

This all doesn't have to be a single system call.  Perhaps it would
make sense for first and second step to be one system call --- call it
revokefs(2), perhaps.  And then the last step could be another system
call --- maybe umountall(2).

> Another option would be to do something similar to what we do when the
> device just gets unplugged under our hands - we detach bdev from gendisk,
> leave it dangling and invisible. But we would still somehow have to
> convince DM that the bdev practically went away by calling
> disk->fops->release() and it all just seems fragile to me. But I wanted to
> mention this option in case the above solution proves to be too difficult.

Yeah, that's similarly as fragile as using the ext4/xfs/f2fs
shutdown/goingdown ioctl.  In order to do this right I really think we
need to get the VFS involved, so it can be a real, clean unmount, as
opposed to something where we just rip the file system away from the
bdev.

						- Ted
						
--
To unsubscribe from this list: send the line "unsubscribe linux-xfs" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html



[Index of Archives]     [XFS Filesystem Development (older mail)]     [Linux Filesystem Development]     [Linux Audio Users]     [Yosemite Trails]     [Linux Kernel]     [Linux RAID]     [Linux SCSI]


  Powered by Linux