On Fri, Mar 08, 2024 at 09:41:38AM -0800, Boris Burkov wrote: > On Fri, Mar 08, 2024 at 08:15:07AM +0530, Anand Jain wrote: > > Boris managed to create a device capable of changing its maj:min without > > altering its device path. > > > > Only multi-devices can be scanned. A device that gets scanned and remains > > in the Btrfs kernel cache might end up with an incorrect maj:min. > > > > Despite the tempfsid feature patch did not introduce this bug, it could > > lead to issues if the above multi-device is converted to a single device > > with a stale maj:min. Subsequently, attempting to mount the same device > > with the correct maj:min might mistake it for another device with the same > > fsid, potentially resulting in wrongly auto-enabling the tempfsid feature. > > > > To address this, this patch validates the device's maj:min at the time of > > device open and updates it if it has changed since the last scan. > > You and Dave have convinced me that it is important to fix this in the > kernel. I still have a hope of simplifying this further, while we are > here and have the code kicking around in our heads. > I don't want to get stuck on this forever, so feel free to add Reviewed-by: Boris Burkov <boris@xxxxxx> However, I would still love to get rid of device->devt if possible. It seems like it might be needed for that other grub bug you fixed. Though perhaps not, since we do skip stale devices in much of the logic. Anyway, let's move forward with this! Thanks for hacking on it with me. > > > > CC: stable@xxxxxxxxxxxxxxx # 6.7+ > > Fixes: a5b8a5f9f835 ("btrfs: support cloned-device mount capability") > > Reported-by: Boris Burkov <boris@xxxxxx> > > Co-developed-by: Boris Burkov <boris@xxxxxx> > > Signed-off-by: Anand Jain <anand.jain@xxxxxxxxxx> > > --- > > v2: > > Drop using lookup_bdev() instead, get it from device->bdev->bd_dev. > > > > v1: > > https://lore.kernel.org/linux-btrfs/752b8526be21d984e0ee58c7f66d312664ff5ac5.1709256891.git.anand.jain@xxxxxxxxxx/ > > > > fs/btrfs/volumes.c | 10 ++++++++++ > > 1 file changed, 10 insertions(+) > > > > diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c > > index e49935a54da0..c318640b4472 100644 > > --- a/fs/btrfs/volumes.c > > +++ b/fs/btrfs/volumes.c > > @@ -692,6 +692,16 @@ static int btrfs_open_one_device(struct btrfs_fs_devices *fs_devices, > > device->bdev = bdev_handle->bdev; > > clear_bit(BTRFS_DEV_STATE_IN_FS_METADATA, &device->dev_state); > > > > + if (device->devt != device->bdev->bd_dev) { > > + btrfs_warn(NULL, > > + "device %s maj:min changed from %d:%d to %d:%d", > > + device->name->str, MAJOR(device->devt), > > + MINOR(device->devt), MAJOR(device->bdev->bd_dev), > > + MINOR(device->bdev->bd_dev)); > > + > > + device->devt = device->bdev->bd_dev; > > + } > > + > > If we are permanently maintaining an invariant that device->devt == > device->bdev->bd_dev, do we even need device->devt? As far as I can > tell, all the logic that uses device->devt assumes that the device is > not stale, both in the temp_fsid found_by_devt lookup and in the "device > changed name" check. If so, we could just always use > device->bdev->bd_dev and eliminate this confusion/source of bugs > entirely. > > > fs_devices->open_devices++; > > if (test_bit(BTRFS_DEV_STATE_WRITEABLE, &device->dev_state) && > > device->devid != BTRFS_DEV_REPLACE_DEVID) { > > -- > > 2.38.1 > >