On Fri, 30 Jul 2021 at 08:13, NeilBrown <neilb@xxxxxxx> wrote:
>
> On Fri, 30 Jul 2021, Miklos Szeredi wrote:
> > On Fri, 30 Jul 2021 at 07:28, NeilBrown <neilb@xxxxxxx> wrote:
> > >
> > > On Fri, 30 Jul 2021, Al Viro wrote:
> > > > On Wed, Jul 28, 2021 at 08:37:45AM +1000, NeilBrown wrote:
> > > > > /proc/$PID/mountinfo contains a field for the device number of the
> > > > > filesystem at each mount.
> > > > >
> > > > > This is taken from the superblock ->s_dev field, which is correct for
> > > > > every filesystem except btrfs. A btrfs filesystem can contain multiple
> > > > > subvols which each have a different device number. If (a directory
> > > > > within) one of these subvols is mounted, the device number reported in
> > > > > mountinfo will be different from the device number reported by stat().
> > > > >
> > > > > This confuses some libraries and tools such as, historically, findmnt.
> > > > > Current findmnt seems to cope with the strangeness.
> > > > >
> > > > > So instead of using ->s_dev, call vfs_getattr_nosec() and use the ->dev
> > > > > provided. As there is no STATX flag to ask for the device number, we
> > > > > pass a request mask for zero, and also ask the filesystem to avoid
> > > > > syncing with any remote service.
> > > >
> > > > Hard NAK. You are putting IO (potentially - network IO, with no upper
> > > > limit on the completion time) under namespace_sem.
> > >
> > > Why would IO be generated? The inode must already be in cache because it
> > > is mounted, and STATX_DONT_SYNC is passed. If a filesystem did IO in
> > > those circumstances, it would be broken.
> >
> > STATX_DONT_SYNC is a hint, and while some network fs do honor it, not all do.
> >
>
> That's ... unfortunate. Rather seems to spoil the whole point of having
> a flag like that. Maybe it should have been called
> "STATX_SYNC_OR_SYNC_NOT_THERE_IS_NO_GUARANTEE"

And I guess just about every filesystem would need to be fixed to
prevent starting I/O on STATX_DONT_SYNC, as block I/O could just as
well generate network traffic.

Probably much easier to fix btrfs to use some sort of subvolume structure
that the VFS knows about. I think there's been talk about that for a
long time, not sure where it got stalled.

Thanks,
Miklos
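
For anyone who wants to see the mismatch described at the top of the thread from userspace, here is a rough sketch, not part of any patch. It reads the major:minor field from each line of /proc/self/mountinfo and compares it with st_dev from stat() on the mount point. The function names are made up for illustration. It assumes the mountinfo field layout documented in proc(5), and it will also flag mount points that have been overmounted, since stat() sees only the topmost mount. Note that stat() on a dead network mount can block, which is rather the point of the objection above.

```python
import os
import re


def unescape_mountinfo(s):
    # mountinfo encodes space, tab, newline and backslash as \ooo octal escapes.
    return re.sub(r"\\([0-7]{3})", lambda m: chr(int(m.group(1), 8)), s)


def parse_mountinfo_line(line):
    """Return (mount_point, (major, minor)) for one mountinfo line."""
    fields = line.split()
    # Field 3 is "major:minor" (from ->s_dev); field 5 is the mount point.
    major, minor = (int(x) for x in fields[2].split(":"))
    return unescape_mountinfo(fields[4]), (major, minor)


def mismatched_mounts(path="/proc/self/mountinfo"):
    """List mounts whose mountinfo device differs from stat()'s st_dev."""
    mismatches = []
    with open(path) as f:
        for line in f:
            mount_point, info_dev = parse_mountinfo_line(line)
            try:
                st = os.stat(mount_point)
            except OSError:
                continue  # not reachable from this process; skip it
            stat_dev = (os.major(st.st_dev), os.minor(st.st_dev))
            if stat_dev != info_dev:
                mismatches.append((mount_point, info_dev, stat_dev))
    return mismatches
```

On a system with a mounted btrfs subvolume, mismatched_mounts() would be expected to return an entry for that mount point. On systems without btrfs it should normally return only overmounted paths, if any.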