On Mon, Feb 05, 2024 at 06:43:39PM +0100, David Sterba wrote: > There are reports that since version 6.7 update-grub fails to find the > device of the root on systems without initrd and on a single device. > > This looks like the device name changed in the output of > /proc/self/mountinfo: > > 6.5-rc5 working > > 18 1 0:16 / / rw,noatime - btrfs /dev/sda8 ... > > 6.7 not working: > > 17 1 0:15 / / rw,noatime - btrfs /dev/root ... > > and "update-grub" shows this error: > > /usr/sbin/grub-probe: error: cannot find a device for / (is /dev mounted?) > > This looks like it's related to the device name, but grub-probe > recognizes the "/dev/root" path and tries to find the underlying device. > However there's a special case for some filesystems, for btrfs in > particular. > > The generic root device detection heuristic is not done and it all > relies on reading the device infos by a btrfs specific ioctl. This ioctl > returns the device name as it was saved at the time of device scan (in > this case it's /dev/root). > > The change in 6.7 for temp_fsid to allow several single device > filesystem to exist with the same fsid (and transparently generate a new > UUID at mount time) was to skip caching/registering such devices. > > This also skipped mounted device. One step of scanning is to check if > the device name hasn't changed, and if yes then update the cached value. > > This broke the grub-probe as it always read the device /dev/root and > couldn't find it in the system. A temporary workaround is to create a > symlink but this does not survive reboot. > > The right fix is to allow updating the device path of a mounted > filesystem even if this is a single device one. This does not affect the > temp_fsid feature, the UUID of the mounted filesystem remains the same > and the matching is based on device major:minor which is unique per > mounted filesystem. > > As the main part of device scanning and list update is done in > device_list_add() that handles all corner cases and locking, it is > extended to take a parameter that tells it to do everything as before, > except adding a new device entry. > > This covers the path when the device (that exists for all mounted > devices) name changes, updating /dev/root to /dev/sdx. Any other single > device with filesystem is skipped. > > Note that if a system is booted and initial mount is done on the > /dev/root device, this will be the cached name of the device. Only after > the command "btrfs device rescan" it will change as it triggers the > rename. > > The fix was verified by users whose systems were affected. > > CC: stable@xxxxxxxxxxxxxxx # 6.7+ > Fixes: bc27d6f0aa0e ("btrfs: scan but don't register device on single device filesystem") > Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=218353 > Link: https://lore.kernel.org/lkml/CAKLYgeJ1tUuqLcsquwuFqjDXPSJpEiokrWK2gisPKDZLs8Y2TQ@xxxxxxxxxxxxxx/ > Signed-off-by: David Sterba <dsterba@xxxxxxxx> Reviewed-by: Boris Burkov <boris@xxxxxx> > --- > fs/btrfs/volumes.c | 30 ++++++++++++++---------------- > 1 file changed, 14 insertions(+), 16 deletions(-) > > diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c > index 474ab7ed65ea..f2c2f7ca5c3d 100644 > --- a/fs/btrfs/volumes.c > +++ b/fs/btrfs/volumes.c > @@ -738,6 +738,7 @@ static noinline struct btrfs_device *device_list_add(const char *path, > bool same_fsid_diff_dev = false; > bool has_metadata_uuid = (btrfs_super_incompat_flags(disk_super) & > BTRFS_FEATURE_INCOMPAT_METADATA_UUID); > + bool can_create_new = *new_device_added; It took me quite a while to figure out all the intended logic with the now in/out parameter. I think it's probably too cute? Why not just add another parameter "can_create_new_device"? I think it feels kind of weird on the caller side too, to set "new_device_added" to true, but then still rely on it to actually get set to true. Once I got past this, the logic made sense, so definitely don't block yourself on this nit. > > if (btrfs_super_flags(disk_super) & BTRFS_SUPER_FLAG_CHANGING_FSID_V2) { > btrfs_err(NULL, > @@ -753,6 +754,7 @@ static noinline struct btrfs_device *device_list_add(const char *path, > return ERR_PTR(error); > } > > + *new_device_added = false; > fs_devices = find_fsid_by_device(disk_super, path_devt, &same_fsid_diff_dev); > > if (!fs_devices) { > @@ -804,6 +806,15 @@ static noinline struct btrfs_device *device_list_add(const char *path, > return ERR_PTR(-EBUSY); > } > > + if (!can_create_new) { > + pr_info( > + "BTRFS: device fsid %pU devid %llu transid %llu %s skip registration scanned by %s (%d)\n", > + disk_super->fsid, devid, found_transid, path, > + current->comm, task_pid_nr(current)); > + mutex_unlock(&fs_devices->device_list_mutex); > + return NULL; > + } > + > nofs_flag = memalloc_nofs_save(); > device = btrfs_alloc_device(NULL, &devid, > disk_super->dev_item.uuid, path); > @@ -1355,27 +1366,14 @@ struct btrfs_device *btrfs_scan_one_device(const char *path, blk_mode_t flags, > goto error_bdev_put; > } > > - if (!mount_arg_dev && btrfs_super_num_devices(disk_super) == 1 && > - !(btrfs_super_flags(disk_super) & BTRFS_SUPER_FLAG_SEEDING)) { > - dev_t devt; > - > - ret = lookup_bdev(path, &devt); > - if (ret) > - btrfs_warn(NULL, "lookup bdev failed for path %s: %d", > - path, ret); > - else > - btrfs_free_stale_devices(devt, NULL); > - > - pr_debug("BTRFS: skip registering single non-seed device %s\n", path); > - device = NULL; > - goto free_disk_super; > - } > + if (mount_arg_dev || btrfs_super_num_devices(disk_super) != 1 || > + (btrfs_super_flags(disk_super) & BTRFS_SUPER_FLAG_SEEDING)) > + new_device_added = true; This is the line I was referring to in the comment above, fwiw. > > device = device_list_add(path, disk_super, &new_device_added); > if (!IS_ERR(device) && new_device_added) > btrfs_free_stale_devices(device->devt, device); > > -free_disk_super: > btrfs_release_disk_super(disk_super); > > error_bdev_put: > > -- > 2.42.1 >