On Mon, Oct 14, 2019 at 11:39:43AM -0500, Eric Sandeen wrote: > On 10/14/19 10:50 AM, Darrick J. Wong wrote: > > From: Darrick J. Wong <darrick.wong@xxxxxxxxxx> > > > > Currently, if the loop device receives a WRITE_ZEROES request, it asks > > the underlying filesystem to punch out the range. This behavior is > > correct if unmapping is allowed. However, a NOUNMAP request means that > > the caller doesn't want us to free the storage backing the range, so > > punching out the range is incorrect behavior. > > > > To satisfy a NOUNMAP | WRITE_ZEROES request, loop should ask the > > underlying filesystem to FALLOC_FL_ZERO_RANGE, which is (according to > > the fallocate documentation) required to ensure that the entire range is > > backed by real storage, which suffices for our purposes. > > > > Fixes: 19372e2769179dd ("loop: implement REQ_OP_WRITE_ZEROES") > > Signed-off-by: Darrick J. Wong <darrick.wong@xxxxxxxxxx> > > --- > > v3: refactor into a single fallocate function > > v2: reorganize a little according to hch feedback > > --- > > drivers/block/loop.c | 26 ++++++++++++++++++-------- > > 1 file changed, 18 insertions(+), 8 deletions(-) > > > > diff --git a/drivers/block/loop.c b/drivers/block/loop.c > > index f6f77eaa7217..ef6e251857c8 100644 > > --- a/drivers/block/loop.c > > +++ b/drivers/block/loop.c > > @@ -417,18 +417,20 @@ static int lo_read_transfer(struct loop_device *lo, struct request *rq, > > return ret; > > } > > -static int lo_discard(struct loop_device *lo, struct request *rq, loff_t pos) > > +static int lo_fallocate(struct loop_device *lo, struct request *rq, loff_t pos, > > + int mode) > > { > > /* > > - * We use punch hole to reclaim the free space used by the > > - * image a.k.a. discard. However we do not support discard if > > - * encryption is enabled, because it may give an attacker > > - * useful information. > > + * We use fallocate to manipulate the space mappings used by the image > > + * a.k.a. discard/zerorange. However we do not support this if > > + * encryption is enabled, because it may give an attacker useful > > + * information. > > */ > > struct file *file = lo->lo_backing_file; > > - int mode = FALLOC_FL_PUNCH_HOLE | FALLOC_FL_KEEP_SIZE; > > int ret; > > + mode |= FALLOC_FL_KEEP_SIZE; > > + > > if ((!file->f_op->fallocate) || lo->lo_encrypt_key_size) { > > ret = -EOPNOTSUPP; > > goto out; > > @@ -596,9 +598,17 @@ static int do_req_filebacked(struct loop_device *lo, struct request *rq) > > switch (req_op(rq)) { > > case REQ_OP_FLUSH: > > return lo_req_flush(lo, rq); > > - case REQ_OP_DISCARD: > > case REQ_OP_WRITE_ZEROES: > > - return lo_discard(lo, rq, pos); > cxz ÿbvVBV Yes. > > + case REQ_OP_DISCARD: > > + return lo_fallocate(lo, rq, pos, FALLOC_FL_PUNCH_HOLE); > > I get lost in the twisty passages. What happens if the filesystem hosting the > backing file doesn't support fallocate, and REQ_OP_DISCARD / REQ_OP_WRITE_ZEROES > returns EOPNOTSUPP - discard is advisory, is it ok to fail REQ_OP_WRITE_ZEROES? > Does something at another layer fall back to writing zeros? If the REQ_OP_WRITE_ZEROES request was initiated by blkdev_issue_zeroout and we send back an error code, blkdev_issue_zeroout will fall back to writing zeroes if BLKDEV_ZERO_NOFALLBACK wasn't set its caller. Note that calling FALLOC_FL_ZERO_RANGE on a block device will generate a REQ_OP_WRITE_ZEROES | REQ_OP_NOUNMAP request, which means that it will try fallocate zeroing and fall back to writing zeroes. --D > > -Eric > > > case REQ_OP_WRITE: > > if (lo->transfer) > > return lo_write_transfer(lo, rq, pos); > >