On 7/15/20 10:49 AM, Damien Le Moal wrote:
> On 2020/07/15 17:18, Hannes Reinecke wrote:
>> When a new buffer zone is allocated in dmz_handle_buffered_write()
>> we should update the 'atime' to inform reclaim that this zone has
>> been accessed.
>> Otherwise we end up with the pathological case where the first write
>> allocates a new buffer zone, but the next write will start reclaim
>> before processing the bio. As the atime is not set reclaim declares
>> the system idle and reclaims the zone. Then the write will be processed
>> and re-allocate the very same zone again; this repeats for every
>> consecutive write, making for a _very_ slow mkfs.
>>
>> Signed-off-by: Hannes Reinecke <hare@xxxxxxx>
>> ---
>>  drivers/md/dm-zoned-target.c | 3 +++
>>  1 file changed, 3 insertions(+)
>>
>> diff --git a/drivers/md/dm-zoned-target.c b/drivers/md/dm-zoned-target.c
>> index cf915009c306..b32d37bef14f 100644
>> --- a/drivers/md/dm-zoned-target.c
>> +++ b/drivers/md/dm-zoned-target.c
>> @@ -297,6 +297,9 @@ static int dmz_handle_buffered_write(struct dmz_target *dmz,
>>          if (dmz_is_readonly(bzone))
>>                  return -EROFS;
>>
>> +        /* Tell reclaim we're doing some work here */
>> +        dmz_reclaim_bio_acc(bzone->dev->reclaim);
>> +
>>          /* Submit write */
>>          ret = dmz_submit_bio(dmz, bzone, bio, chunk_block, nr_blocks);
>>          if (ret)
>
> This is without a cache device, right ? Otherwise, since the cache device has no
> reclaim, it would not make much sense.
>
> In fact, I think that the atime timestamp being attached to each device reclaim
> structure is a problem. We do not need that since we always trigger reclaim for
> all drives. We only want to see if the target is busy or not, so atime should be
> attached to struct dmz_metadata.
>
> That will simplify things since we will not need to care about which zone/which
> device is being accessed to track activity. We can just have:
>
> dmz_reclaim_bio_acc(dmz->metadata);
>
> Thoughts ?
>

Well, I might be off the mark with this patch, but I did run into the
mentioned pathological behaviour: there was exactly _one_ zone cached,
all I/O was going into that zone, and reclaim seemed to be busy with
that very zone.
The latter is actually conjecture, as I did _not_ get any messages from
reclaim on that device. I've seen idle messages from reclaim on the
other devices, but reclaim on that one device was suspiciously silent.
And I/O went through, but _dead_ slow. All writes, mind (that was
during mkfs time), so I gathered it might be due to the atime accounting
not being done correctly.

Cheers,

Hannes
-- 
Dr. Hannes Reinecke            Kernel Storage Architect
hare@xxxxxxx                              +49 911 74053 688
SUSE Software Solutions Germany GmbH, Maxfeldstr. 5, 90409 Nürnberg
HRB 36809 (AG Nürnberg), GF: Felix Imendörffer

--
dm-devel mailing list
dm-devel@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/dm-devel
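
For anyone following along, the loop described in the patch (allocate a
buffer zone, reclaim sees an idle target and takes the zone back, the
next write allocates it again) can be reproduced with a small userspace
model. This is only a sketch under stated assumptions: struct model,
IDLE_PERIOD, reclaim_run(), buffered_write() and simulate() are
hypothetical names and not dm-zoned code. The model further assumes that
reclaim always gets to run before the next bio is processed, and that it
takes the buffer zone back as soon as the target looks idle.

```c
#include <stdbool.h>

/* Hypothetical: ticks without an access before the target counts as idle. */
#define IDLE_PERIOD 10UL

/* Hypothetical userspace model, not the dm-zoned data structures. */
struct model {
	unsigned long now;    /* simulated clock, one tick per write */
	unsigned long atime;  /* last access time as seen by reclaim */
	bool have_bzone;      /* is a buffer zone currently mapped? */
	unsigned int allocs;  /* number of buffer zone allocations */
};

/* Record activity so that reclaim does not consider the target idle. */
static void reclaim_bio_acc(struct model *m)
{
	m->atime = m->now;
}

/* The target is idle if nothing touched atime for IDLE_PERIOD ticks. */
static bool target_idle(const struct model *m)
{
	return m->now - m->atime >= IDLE_PERIOD;
}

/* Assumption: an idle target gets its buffer zone reclaimed at once. */
static void reclaim_run(struct model *m)
{
	if (m->have_bzone && target_idle(m))
		m->have_bzone = false;
}

static void buffered_write(struct model *m, bool update_atime)
{
	/* Assumption: reclaim runs before the bio is processed. */
	reclaim_run(m);

	/* No buffer zone (first write, or it was reclaimed): allocate. */
	if (!m->have_bzone) {
		m->have_bzone = true;
		m->allocs++;
	}

	/* The proposed change: tell reclaim that work is being done. */
	if (update_atime)
		reclaim_bio_acc(m);

	m->now++;
}

/* Issue nr_writes back-to-back writes; return the allocation count. */
unsigned int simulate(unsigned int nr_writes, bool update_atime)
{
	/* Start with a stale atime, i.e. the target looks idle. */
	struct model m = { .now = 10 * IDLE_PERIOD };
	unsigned int i;

	for (i = 0; i < nr_writes; i++)
		buffered_write(&m, update_atime);

	return m.allocs;
}
```

Under these assumptions simulate(8, false) ends up with one buffer zone
allocation per write, whereas simulate(8, true) allocates the zone once
and keeps it. Whether the real reclaim path behaves this way is exactly
the part that is still conjecture.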