Re: [PATCH V3 3/4] md/raid10: improve raid10 discard request

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Hi Song


On 08/20/2020 02:36 AM, Song Liu wrote:
On Thu, Aug 13, 2020 at 1:15 AM Xiao Ni <xni@xxxxxxxxxx> wrote:
[...]
Reviewed-by: Coly Li <colyli@xxxxxxx>
Reviewed-by: Guoqing Jiang <guoqing.jiang@xxxxxxxxxxxxxxx>
Signed-off-by: Xiao Ni <xni@xxxxxxxxxx>

Please add "---" before "v1:..." so that this part is ignored during git am.
Ok

v1:
Coly helps to review these patches and give some suggestions:
One bug is found. If discard bio is across one stripe but bio size is bigger
than one stripe size. After spliting, the bio will be NULL. In this version,
it checks whether discard bio size is bigger than 2*stripe_size.
In raid10_end_discard_request, it's better to check R10BIO_Uptodate is set
or not. It can avoid write memory to improve performance.
Add more comments for calculating addresses.

v2:
Fix error by checkpatch.pl
Fix one bug for offset layout. v1 calculates wrongly split size
Add more comments to explain how the discard range of each component disk
is decided.
---
  drivers/md/raid10.c | 287 +++++++++++++++++++++++++++++++++++++++++++++++++++-
  1 file changed, 286 insertions(+), 1 deletion(-)

diff --git a/drivers/md/raid10.c b/drivers/md/raid10.c
index cef3cb8..5431e1b 100644
--- a/drivers/md/raid10.c
+++ b/drivers/md/raid10.c
@@ -1526,6 +1526,287 @@ static void __make_request(struct mddev *mddev, struct bio *bio, int sectors)
                 raid10_write_request(mddev, bio, r10_bio);
  }

+static void wait_blocked_dev(struct mddev *mddev, int cnt)
+{
+       int i;
+       struct r10conf *conf = mddev->private;
+       struct md_rdev *blocked_rdev = NULL;
+
+retry_wait:
+       rcu_read_lock();
+       for (i = 0; i < cnt; i++) {
+               struct md_rdev *rdev = rcu_dereference(conf->mirrors[i].rdev);
+               struct md_rdev *rrdev = rcu_dereference(
+                       conf->mirrors[i].replacement);
+               if (rdev == rrdev)
+                       rrdev = NULL;
+               if (rdev && unlikely(test_bit(Blocked, &rdev->flags))) {
+                       atomic_inc(&rdev->nr_pending);
+                       blocked_rdev = rdev;
+                       break;
+               }
+               if (rrdev && unlikely(test_bit(Blocked, &rrdev->flags))) {
+                       atomic_inc(&rrdev->nr_pending);
+                       blocked_rdev = rrdev;
+                       break;
+               }
+       }
+       rcu_read_unlock();
+
+       if (unlikely(blocked_rdev)) {
+               /* Have to wait for this device to get unblocked, then retry */
+               allow_barrier(conf);
+               raid10_log(conf->mddev, "%s wait rdev %d blocked",
+                               __func__, blocked_rdev->raid_disk);
+               md_wait_for_blocked_rdev(blocked_rdev, mddev);
+               wait_barrier(conf);
+               goto retry_wait;
We need to clear blocked_rdev before this goto, or put retry_wait label
before "blocked_rdev = NULL;". I guess this path is not tested...
I did test for this patch with mkfs with/without resync and wrote some files to device. And ran fstrim after writing some files. The patch worked well during the test.
For blocked device situation, I didn't test. I'll add this test.

We are duplicating a lot of logic from raid10_write_request() here. Can we
try to pull the common logic into a helper function?
I'll do this.

[...]

+static void raid10_end_discard_request(struct bio *bio)
+{
+       struct r10bio *r10_bio = bio->bi_private;
+       struct r10conf *conf = r10_bio->mddev->private;
+       struct md_rdev *rdev = NULL;
+       int dev;
+       int slot, repl;
+
+       /*
+        * We don't care the return value of discard bio
+        */
+       if (!test_bit(R10BIO_Uptodate, &r10_bio->state))
+               set_bit(R10BIO_Uptodate, &r10_bio->state);
+
+       dev = find_bio_disk(conf, r10_bio, bio, &slot, &repl);
+       if (repl)
+               rdev = conf->mirrors[dev].replacement;
+       if (!rdev) {
+               smp_rmb();
+               repl = 0;
+               rdev = conf->mirrors[dev].rdev;
+       }
+
+       if (atomic_dec_and_test(&r10_bio->remaining)) {
+               md_write_end(r10_bio->mddev);
+               raid_end_bio_io(r10_bio);
+       }
+
+       rdev_dec_pending(rdev, conf->mddev);
+}
+
+/* There are some limitations to handle discard bio
+ * 1st, the discard size is bigger than stripe_size*2.
+ * 2st, if the discard bio spans reshape progress, we use the old way to
+ * handle discard bio
+ */
+static bool raid10_handle_discard(struct mddev *mddev, struct bio *bio)
+{
+       struct r10conf *conf = mddev->private;
+       struct geom geo = conf->geo;
Do we really need a full copy of conf->geo?
It doesn't need a full copy. Thanks for pointing about this.

+       struct r10bio *r10_bio;
+
+       int disk;
+       sector_t chunk;
+       int stripe_size, stripe_mask;
+
+       sector_t bio_start, bio_end;
+       sector_t first_stripe_index, last_stripe_index;
+       sector_t start_disk_offset;
+       unsigned int start_disk_index;
+       sector_t end_disk_offset;
+       unsigned int end_disk_index;
+
+       wait_barrier(conf);
+
+       if (conf->reshape_progress != MaxSector &&
+           ((bio->bi_iter.bi_sector >= conf->reshape_progress) !=
+            conf->mddev->reshape_backwards))
+               geo = conf->prev;
+
+       stripe_size = (1<<geo.chunk_shift) * geo.raid_disks;
This could be raid_disks << chunk_shift
ok

+       stripe_mask = stripe_size - 1;
Does this work when raid_disks is not power of 2?
In test I used 5 disks to create the raid10 too, it worked well. Could you explain what
you worried in detail?

Regards
Xiao




[Index of Archives]     [Linux RAID Wiki]     [ATA RAID]     [Linux SCSI Target Infrastructure]     [Linux Block]     [Linux IDE]     [Linux SCSI]     [Linux Hams]     [Device Mapper]     [Device Mapper Cryptographics]     [Kernel]     [Linux Admin]     [Linux Net]     [GFS]     [RPM]     [git]     [Yosemite Forum]


  Powered by Linux