On 09/18/2013 01:35 AM, Guangliang Zhao wrote: > When running the following commands: > [root@ceph0 mnt]# blockdev --setro /dev/rbd1 > [root@ceph0 mnt]# blockdev --getro /dev/rbd1 > 0 > > The block setro didn't take effect, it is because > the rbd doesn't support ioctl of block driver. Nicely done. I have a couple small comments below, and one simple change that needs to be made. I also point out another issue about the use of the spinlock to protect against read-only state changes; I'd like Josh to weigh in on how he thinks it might best be handled. Provided you make the simple change, and Josh has no problem with the read-only state thing: Reviewed-by: Alex Elder <elder@xxxxxxxxxx> The required change is something Josh mentioned. In rbd_request_fn(), the variable read_only is defined and assigned at the top of the function. That line needs to be inside the while loop, to ensure that the most up-to-date value is used (in case it gets changed between requests). > The results with this patch: > root@ceph-client01:~# rbd map img > root@ceph-client01:~# blockdev --getro /dev/rbd1 > 0 > root@ceph-client01:~# cat /sys/block/rbd1/ro > 0 > root@ceph-client01:~# blockdev --setro /dev/rbd1 > root@ceph-client01:~# blockdev --getro /dev/rbd1 > 1 > root@ceph-client01:~# cat /sys/block/rbd1/ro > 1 > root@ceph-client01:~# dd if=/dev/zero of=/dev/rbd1 count=1 > dd: opening `/dev/rbd1': Read-only file system` > root@ceph-client01:~# blockdev --setrw /dev/rbd1 > root@ceph-client01:~# blockdev --getro /dev/rbd1 > 0 > root@ceph-client01:~# cat /sys/block/rbd1/ro > 0 > root@ceph-client01:~# dd if=/dev/zero of=/dev/rbd1 count=1 > 1+0 records in > 1+0 records out > 512 bytes (512 B) copied, 0.14989 s, 3.4 kB/s I don't necessarily want to see all this testing output in the commit description (a summary would suffice). As Josh suggested, some sort of test script to validate all of this would be much appreciated. It can be very simple: - map rbd device - run "blockdev --getro" on it, and verify it reports 0 - run "blockdev --setro" and verify it is now read-only ... and so on. Tests we suggested, which I expect will pass but I don't see evidence of it in your description above: - verify changing the state (setro or setrw) fails on a device that's already open (e.g. mounted) (EBUSY) - verify setro on a snapshot that's mounted (EBUSY) - verify setrw on a non-snapshot image that is mapped read-only (EROFS) - verify setrw on a snapshot fails (EROFS) > This resolves: > http://tracker.ceph.com/issues/6265 > > Signed-off-by: Guangliang Zhao <guangliang@xxxxxxxxxxxxxxx> > --- > drivers/block/rbd.c | 59 +++++++++++++++++++++++++++++++++++++++++++++++++++ > 1 file changed, 59 insertions(+) > > diff --git a/drivers/block/rbd.c b/drivers/block/rbd.c > index 2f00778..fea826d 100644 > --- a/drivers/block/rbd.c > +++ b/drivers/block/rbd.c > @@ -508,10 +508,69 @@ static void rbd_release(struct gendisk *disk, fmode_t mode) > put_device(&rbd_dev->dev); > } > > +static int rbd_ioctl_set_ro(struct rbd_device *rbd_dev, unsigned long arg) > +{ > + int val; > + bool ro; > + > + if (get_user(val, (int __user *)(arg))) > + return -EFAULT; > + > + ro = val ? true : false; > + /* Snapshot doesn't allow to write*/ > + if (rbd_dev->spec->snap_id != CEPH_NOSNAP && !ro) > + return -EROFS; > + > + if (rbd_dev->mapping.read_only != ro) { > + rbd_dev->mapping.read_only = ro; > + set_disk_ro(rbd_dev->disk, ro ? 1 : 0); This is interesting. Josh mentioned that the set_device_ro() call you had here previously was not needed, because the generic ioctl code handled it. But I believe calling set_disk_ro() (which triggers a udev event) is the right thing to do, even though rbd doesn't support partitions. In fact, it might be more appropriate for the generic code to call that using bdev->bd_disk. Hmmm. (I don't have time to look into this any further right now, maybe if someone thinks it's worth pursuing they can do so.) In the mean time this does more than set_device_ro(), and the redundancy is harmless. HOWEVER I believe that calling this while holding the spinlock will cause locking problems (or may just be a bad idea). I don't know for sure; let lockdep be your guide. > + } > + > + return 0; > +} > + > +static int rbd_ioctl(struct block_device *bdev, fmode_t mode, > + unsigned int cmd, unsigned long arg) > +{ > + struct rbd_device *rbd_dev = bdev->bd_disk->private_data; > + int ret = 0; > + > + spin_lock_irq(&rbd_dev->lock); > + /* prevent others open this device */ I think indicating something more like "make sure we hold the only reference to this device" would make it clear why you're checking against 1 rather than 0. > + if (rbd_dev->open_count > 1) { > + ret = -EBUSY; > + goto out; > + } You are adding a new ability to change a fundamental state for an image. You're using the spinlock now to protect the read-only state, where previously it was only used to protect the open count and REMOVING state/flag. This may mean you should check the read_only flag in rbd_open() inside the spinlock there. And you may need to call the set_device_ro() there under protection of that lock. As I mentioned above, holding the lock could also lead to a problem when calling outside code (set_disk_ro()). If that's the case you may need to devise a different (possibly additional) way to protect this. Josh, please offer your insights. > + > + switch (cmd) { > + case BLKROSET: > + ret = rbd_ioctl_set_ro(rbd_dev, arg); > + break; > + default: > + ret = -ENOTTY; > + } > + > +out: > + spin_unlock_irq(&rbd_dev->lock); > + return ret; > +} > + > +#ifdef CONFIG_COMPAT > +static int rbd_compat_ioctl(struct block_device *bdev, fmode_t mode, > + unsigned int cmd, unsigned long arg) > +{ > + return rbd_ioctl(bdev, mode, cmd, arg); > +} > +#endif /* CONFIG_COMPAT */ > + > static const struct block_device_operations rbd_bd_ops = { > .owner = THIS_MODULE, > .open = rbd_open, > .release = rbd_release, > + .ioctl = rbd_ioctl, > +#ifdef CONFIG_COMPAT > + .compat_ioctl = rbd_compat_ioctl, > +#endif > }; > > /* > -- To unsubscribe from this list: send the line "unsubscribe ceph-devel" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html