Re: [PATCH v3] loop: fix no-unmap write-zeroes request behavior

Eric Sandeen <sandeen@xxxxxxxxxxx> · Mon, 14 Oct 2019 11:39:43 -0500

On 10/14/19 10:50 AM, Darrick J. Wong wrote:
From: Darrick J. Wong <darrick.wong@xxxxxxxxxx>

Currently, if the loop device receives a WRITE_ZEROES request, it asks
the underlying filesystem to punch out the range.  This behavior is
correct if unmapping is allowed.  However, a NOUNMAP request means that
the caller doesn't want us to free the storage backing the range, so
punching out the range is incorrect behavior.

To satisfy a NOUNMAP | WRITE_ZEROES request, loop should ask the
underlying filesystem to FALLOC_FL_ZERO_RANGE, which is (according to
the fallocate documentation) required to ensure that the entire range is
backed by real storage, which suffices for our purposes.

Fixes: 19372e2769179dd ("loop: implement REQ_OP_WRITE_ZEROES")
Signed-off-by: Darrick J. Wong <darrick.wong@xxxxxxxxxx>
---
v3: refactor into a single fallocate function
v2: reorganize a little according to hch feedback
---
  drivers/block/loop.c |   26 ++++++++++++++++++--------
  1 file changed, 18 insertions(+), 8 deletions(-)

diff --git a/drivers/block/loop.c b/drivers/block/loop.c
index f6f77eaa7217..ef6e251857c8 100644
--- a/drivers/block/loop.c
+++ b/drivers/block/loop.c
@@ -417,18 +417,20 @@ static int lo_read_transfer(struct loop_device *lo, struct request *rq,
  	return ret;
  }
  
-static int lo_discard(struct loop_device *lo, struct request *rq, loff_t pos)
+static int lo_fallocate(struct loop_device *lo, struct request *rq, loff_t pos,
+			int mode)
  {
  	/*
-	 * We use punch hole to reclaim the free space used by the
-	 * image a.k.a. discard. However we do not support discard if
-	 * encryption is enabled, because it may give an attacker
-	 * useful information.
+	 * We use fallocate to manipulate the space mappings used by the image
+	 * a.k.a. discard/zerorange. However we do not support this if
+	 * encryption is enabled, because it may give an attacker useful
+	 * information.
  	 */
  	struct file *file = lo->lo_backing_file;
-	int mode = FALLOC_FL_PUNCH_HOLE | FALLOC_FL_KEEP_SIZE;
  	int ret;
  
+	mode |= FALLOC_FL_KEEP_SIZE;
+
  	if ((!file->f_op->fallocate) || lo->lo_encrypt_key_size) {
  		ret = -EOPNOTSUPP;
  		goto out;
@@ -596,9 +598,17 @@ static int do_req_filebacked(struct loop_device *lo, struct request *rq)
  	switch (req_op(rq)) {
  	case REQ_OP_FLUSH:
  		return lo_req_flush(lo, rq);
-	case REQ_OP_DISCARD:
  	case REQ_OP_WRITE_ZEROES:
-		return lo_discard(lo, rq, pos);
cxz ÿbvVBV
+	case REQ_OP_DISCARD:
+		return lo_fallocate(lo, rq, pos, FALLOC_FL_PUNCH_HOLE);

I get lost in the twisty passages.  What happens if the filesystem hosting the
backing file doesn't support fallocate, and REQ_OP_DISCARD / REQ_OP_WRITE_ZEROES
returns EOPNOTSUPP - discard is advisory, is it ok to fail REQ_OP_WRITE_ZEROES?
Does something at another layer fall back to writing zeros?

-Eric

  	case REQ_OP_WRITE:
  		if (lo->transfer)
  			return lo_write_transfer(lo, rq, pos);