On Wed, Apr 12 2017, Shaohua Li wrote:

> On Thu, Apr 13, 2017 at 08:53:48AM +1000, Neil Brown wrote:
>>
>> 1/ If an array has any read-only devices when it is started,
>>    the array itself must be read-only
>> 2/ A read-only device cannot be added to an array after it is
>>    started.
>> 3/ Setting an array to read-write should not succeed
>>    if any member devices are read-only
>
> Didn't get these. We call md_import_device() first to open under layer disk. We
> always use FMOD_READ|FMOD_WRITE to open the disk. So if the disk is ro,
> md_import_device should fail, we don't add the disk to the array. Why would we
> have such issues?
>

Because life isn't always as simple as we might like it to be. :-(

md_import_device() calls lock_rdev() which calls blkdev_get_by_dev().
blkdev_get_by_dev() doesn't pay much attention to the mode, nor does
blkdev_get() which it calls.

The main place where FMODE_WRITE could be rejected on a read-only device
is in the device's 'open()' function.
A few open functions do check for read-only, but it isn't at all
consistent. scsi/sd.c does, block/loop.c doesn't, nor does nvme.
Most drivers seem to ignore the mode.

blkdev_get_by_path() has

	if ((mode & FMODE_WRITE) && bdev_read_only(bdev)) {
		blkdev_put(bdev, mode);
		return ERR_PTR(-EACCES);
	}

so when you open a device by path name you always get this check, but
not when you open a device by device-number like md does.

It is worth having a look at
Commit: e51900f7d38c ("block: revert block_dev read-only check")
from 2011. The bdev_read_only() check was in blkdev_get() for a while,
but it was moved out because doing that broke md and dm and others.

So at present, callers of blkdev_get_by_dev() need to do their own
bdev_read_only() tests before writing. We could discuss where in md.c
is the best place to put them, but unless you want to take on a largish
project to 'fix' (or audit) all callers of blkdev_get_by_dev(), they
need to go in md somewhere.

Thanks,
NeilBrown
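
To make the difference between the two open paths concrete, here is a
small userspace model of the behaviour described above. It is only a
sketch, not kernel code: every model_* name, the MODEL_FMODE_* flags and
the struct are hypothetical, and the -EROFS returned by the "sd-like"
driver is an assumption of the model. Only the by-path check mirrors the
blkdev_get_by_path() snippet quoted in the message.

```c
/* Userspace model only: none of these types or functions exist in the kernel. */
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

#define MODEL_FMODE_READ  0x1u
#define MODEL_FMODE_WRITE 0x2u

struct model_bdev {
	bool read_only;          /* what bdev_read_only() would report */
	bool driver_checks_mode; /* true: sd-like driver, false: loop-like driver */
	int openers;             /* number of successful, not yet released opens */
};

/* Driver open(): only some drivers reject a writable open of a read-only device. */
static int model_driver_open(struct model_bdev *bdev, unsigned int mode)
{
	assert(bdev != NULL);
	if (bdev->driver_checks_mode &&
	    (mode & MODEL_FMODE_WRITE) && bdev->read_only)
		return -EROFS; /* assumed error code for the sd-like case */
	bdev->openers++;
	return 0;
}

static void model_put(struct model_bdev *bdev)
{
	bdev->openers--;
}

/* Open by device number: the mode reaches the driver, no generic ro check. */
static int model_open_by_dev(struct model_bdev *bdev, unsigned int mode)
{
	return model_driver_open(bdev, mode);
}

/* Open by path: same open, plus the generic check from the quoted snippet. */
static int model_open_by_path(struct model_bdev *bdev, unsigned int mode)
{
	int err = model_open_by_dev(bdev, mode);

	if (err)
		return err;
	if ((mode & MODEL_FMODE_WRITE) && bdev->read_only) {
		model_put(bdev);
		return -EACCES;
	}
	return 0;
}

/*
 * What an md-like caller has to do: open by device number, then test for
 * read-only itself and record the result so the array can stay read-only.
 */
static int model_md_import(struct model_bdev *bdev, bool *member_is_ro)
{
	int err = model_open_by_dev(bdev, MODEL_FMODE_READ | MODEL_FMODE_WRITE);

	if (err)
		return err;
	*member_is_ro = bdev->read_only;
	return 0;
}
```

In this model a read-only loop-like device opens successfully read-write
by device number, which is the case md hits, while the same open by path
is refused. What md should then do with the recorded flag (refuse the
device, or keep the array read-only) is the open question from points
1/ to 3/ and is not decided by the sketch.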