Re: Re: [PATCH 1/2] rbd: RBD_DEV_FLAG_THICK rbd_dev_flags bit

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Fri, Mar 23, 2018 at 10:31 AM, Ilya Dryomov <idryomov@xxxxxxxxx> wrote:
> On Thu, Mar 22, 2018 at 12:57 PM, 亀井仁志 / KAMEI,HITOSHI
> <hitoshi.kamei.xm@xxxxxxxxxxx> wrote:
>> Hi Yang,
>>
>>> I am not sure is this the best way for this case, what about adding an option in "rbd map -o thick rbd/test"?
>>
>> I will add such option to the rbd map command to manipulate image settings. So, the end-user
>> do not change the settings directly via sysfs file.
>>
>>>       @@ -4011,6 +4012,15 @@ static void rbd_queue_workfn(struct work_struct *work)
>>>                       goto err;
>>>               }
>>>
>>>       +       /* Ignore/skip discard requests for thick-provision image */
>>>
>>> Just ignore? or return -EOPNOTSUPP?
>>
>> Thanks, I think -EOPNOTSUPP is better because user programs cannot know
>> the result of requested operation when the kernel rbd driver ignores
>> discard request. The result of requested operation when the kernel rbd driver
>> ignores discard requests, which probably misleads the user programs.
>>
>>> In addition, we should not ignore the REQ_OP_WRITE_ZEROES.
>>
>> Relating to the above, the return code of REQ_OP_WRITE_ZEROS request
>> is also -EOPNOTSUPP instead of ignoring. I think the result of
>> -EOPNOTSUPP is also better for this request because the kernel
>> rbd driver can expect that user programs write zero data by itself.
>
> REQ_OP_WRITE_ZEROES should continue to work, we just need to make
> sure it never issues truncates or deletes and instead writes zeroes
> explicitly.
>
> I think we should be explicit about the fact that discard is not
> supported instead of accepting the discard request and failing it in
> rbd_queue_workfn().  Attached patch is what I have in mind.

Now with the patch.

Thanks,

                Ilya
From 1db405366371cb12e5182815c06f6f491af4b63c Mon Sep 17 00:00:00 2001
From: Ilya Dryomov <idryomov@xxxxxxxxx>
Date: Fri, 23 Mar 2018 09:14:47 +0100
Subject: [PATCH] rbd: notrim map option

Add an option to turn off discard and write zeroes offload support to
avoid deprovisioning a fully provisioned image.  When enabled, discard
requests will fail with -EOPNOTSUPP, write zeroes requests will fall
back to manually zeroing.

Signed-off-by: Ilya Dryomov <idryomov@xxxxxxxxx>
---
 drivers/block/rbd.c | 19 ++++++++++++++-----
 1 file changed, 14 insertions(+), 5 deletions(-)

diff --git a/drivers/block/rbd.c b/drivers/block/rbd.c
index 1a90e5dd7470..dbe440a6531e 100644
--- a/drivers/block/rbd.c
+++ b/drivers/block/rbd.c
@@ -733,6 +733,7 @@ enum {
 	Opt_read_write,
 	Opt_lock_on_read,
 	Opt_exclusive,
+	Opt_notrim,
 	Opt_err
 };
 
@@ -746,6 +747,7 @@ static match_table_t rbd_opts_tokens = {
 	{Opt_read_write, "rw"},		/* Alternate spelling */
 	{Opt_lock_on_read, "lock_on_read"},
 	{Opt_exclusive, "exclusive"},
+	{Opt_notrim, "notrim"},
 	{Opt_err, NULL}
 };
 
@@ -754,12 +756,14 @@ struct rbd_options {
 	bool	read_only;
 	bool	lock_on_read;
 	bool	exclusive;
+	bool	trim;
 };
 
 #define RBD_QUEUE_DEPTH_DEFAULT	BLKDEV_MAX_RQ
 #define RBD_READ_ONLY_DEFAULT	false
 #define RBD_LOCK_ON_READ_DEFAULT false
 #define RBD_EXCLUSIVE_DEFAULT	false
+#define RBD_TRIM_DEFAULT	true
 
 static int parse_rbd_opts_token(char *c, void *private)
 {
@@ -801,6 +805,9 @@ static int parse_rbd_opts_token(char *c, void *private)
 	case Opt_exclusive:
 		rbd_opts->exclusive = true;
 		break;
+	case Opt_notrim:
+		rbd_opts->trim = false;
+		break;
 	default:
 		/* libceph prints "bad option" msg */
 		return -EINVAL;
@@ -3940,11 +3947,12 @@ static int rbd_init_disk(struct rbd_device *rbd_dev)
 	blk_queue_io_min(q, objset_bytes);
 	blk_queue_io_opt(q, objset_bytes);
 
-	/* enable the discard support */
-	queue_flag_set_unlocked(QUEUE_FLAG_DISCARD, q);
-	q->limits.discard_granularity = objset_bytes;
-	blk_queue_max_discard_sectors(q, objset_bytes >> SECTOR_SHIFT);
-	blk_queue_max_write_zeroes_sectors(q, objset_bytes >> SECTOR_SHIFT);
+	if (rbd_dev->opts->trim) {
+		queue_flag_set_unlocked(QUEUE_FLAG_DISCARD, q);
+		q->limits.discard_granularity = objset_bytes;
+		blk_queue_max_discard_sectors(q, objset_bytes >> SECTOR_SHIFT);
+		blk_queue_max_write_zeroes_sectors(q, objset_bytes >> SECTOR_SHIFT);
+	}
 
 	if (!ceph_test_opt(rbd_dev->rbd_client->client, NOCRC))
 		q->backing_dev_info->capabilities |= BDI_CAP_STABLE_WRITES;
@@ -5170,6 +5178,7 @@ static int rbd_add_parse_args(const char *buf,
 	rbd_opts->queue_depth = RBD_QUEUE_DEPTH_DEFAULT;
 	rbd_opts->lock_on_read = RBD_LOCK_ON_READ_DEFAULT;
 	rbd_opts->exclusive = RBD_EXCLUSIVE_DEFAULT;
+	rbd_opts->trim = RBD_TRIM_DEFAULT;
 
 	copts = ceph_parse_options(options, mon_addrs,
 					mon_addrs + mon_addrs_size - 1,
-- 
2.4.3


[Index of Archives]     [CEPH Users]     [Ceph Large]     [Information on CEPH]     [Linux BTRFS]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]
  Powered by Linux