Re: [PATCH 2/2] nvme-multipath: don't block on blk_queue_enter of the underlying device

On 3/23/21 8:31 AM, Sagi Grimberg wrote:

>> Actually, I had been playing around with marking the entire bio as 'NOWAIT'; that would avoid the tag stall, too:
>>
>> @@ -313,7 +316,7 @@ blk_qc_t nvme_ns_head_submit_bio(struct bio *bio)
>>          ns = nvme_find_path(head);
>>          if (likely(ns)) {
>>                  bio_set_dev(bio, ns->disk->part0);
>> -               bio->bi_opf |= REQ_NVME_MPATH;
>> +               bio->bi_opf |= REQ_NVME_MPATH | REQ_NOWAIT;
>>                  trace_block_bio_remap(bio, disk_devt(ns->head->disk),
>>                                        bio->bi_iter.bi_sector);
>>                  ret = submit_bio_noacct(bio);
>>
>> My only worry here is that we might incur spurious failures under high load; but then this is not necessarily a bad thing.
>
> What? making spurious failures is not ok under any load. what fs will
> take into account that you may have run out of tags?

Well, it's not actually a spurious failure but rather a spurious failover: we're still in a multipath scenario, so bios will be re-routed to other paths, or queued if all paths are out of tags.
Hence the OS would not see any difference in behaviour.
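
To illustrate the failover argument, here is a minimal user-space sketch of the idea, not kernel code. Everything in it (struct toy_path, toy_try_get_tag(), toy_submit(), the TOY_* results) is a hypothetical name made up for this example; it only models "try the next path instead of sleeping for a tag, and requeue if no path has one".

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model only: none of these types or functions exist in the kernel. */
struct toy_path {
	int free_tags;	/* tags still available on this path */
	bool live;	/* false once the path has been removed */
};

enum toy_result {
	TOY_SUBMITTED,	/* some path accepted the I/O */
	TOY_REQUEUED,	/* no path had a tag; I/O is parked for a later retry */
};

/* Non-blocking tag allocation, standing in for what REQ_NOWAIT asks for. */
static bool toy_try_get_tag(struct toy_path *p)
{
	if (!p->live || p->free_tags == 0)
		return false;	/* a blocking allocation would have slept here */
	p->free_tags--;
	return true;
}

/*
 * Try every path once, starting at index 'first'. On success the index of
 * the path that took the I/O is stored in *used. If all paths are out of
 * tags the caller is told to requeue, so nothing is reported as failed.
 */
static enum toy_result toy_submit(struct toy_path *paths, size_t npaths,
				  size_t first, size_t *used)
{
	assert(npaths > 0 && first < npaths);

	for (size_t i = 0; i < npaths; i++) {
		size_t idx = (first + i) % npaths;

		if (toy_try_get_tag(&paths[idx])) {
			*used = idx;
			return TOY_SUBMITTED;
		}
	}
	return TOY_REQUEUED;
}
```

With two paths where the first has no free tags, toy_submit() lands on the second path; once that one is exhausted as well it returns TOY_REQUEUED rather than an error, which is the behaviour described above.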

But in the end, we abandoned this attempt, as the crash we've been seeing was in bio_endio (due to bi_bdev still pointing to the removed path device):

[ 6552.155251]  bio_endio+0x74/0x120
[ 6552.155260]  nvme_ns_head_submit_bio+0x36f/0x3e0 [nvme_core]
[ 6552.155271]  submit_bio_noacct+0x175/0x490
[ 6552.155284]  ? nvme_requeue_work+0x5a/0x70 [nvme_core]
[ 6552.155290]  nvme_requeue_work+0x5a/0x70 [nvme_core]
[ 6552.155296]  process_one_work+0x1f4/0x3e0
[ 6552.155299]  worker_thread+0x2d/0x3e0
[ 6552.155302]  ? process_one_work+0x3e0/0x3e0
[ 6552.155305]  kthread+0x10d/0x130
[ 6552.155307]  ? kthread_park+0xa0/0xa0
[ 6552.155311]  ret_from_fork+0x35/0x40

So we're not blocked on blk_queue_enter(), and it's a crash, not a deadlock. Blocking on blk_queue_enter() certainly plays a part here,
but it seems not to be the full picture.

Cheers,

Hannes
--
Dr. Hannes Reinecke                Kernel Storage Architect
hare@xxxxxxx                              +49 911 74053 688
SUSE Software Solutions GmbH, Maxfeldstr. 5, 90409 Nürnberg
HRB 36809 (AG Nürnberg), Geschäftsführer: Felix Imendörffer


