Hi everyone,
on my laptop, I am experiencing occasional hangs of applications during
fsync(), sometimes lasting up to 30 seconds. I'm using a BTRFS volume
that spans two partitions on the same SSD (one of them used to contain
a Windows installation, but I removed it and added the partition to the
BTRFS volume instead). Also, the problem only occurs when an I/O
scheduler (mq-deadline) is in use. I'm running kernel version 4.20.3.
From what I understand so far, what happens is that a sync request
fails in the SCSI/ATA layer, in ata_std_qc_defer(), because it is a
"Non-NCQ command" and cannot be queued together with other commands.
This propagates up into blk_mq_dispatch_rq_list(), where the call
    ret = q->mq_ops->queue_rq(hctx, &bd);
returns BLK_STS_DEV_RESOURCE. Later in blk_mq_dispatch_rq_list(), there
is the piece of code
    needs_restart = blk_mq_sched_needs_restart(hctx);
    if (!needs_restart ||
        (no_tag && list_empty_careful(&hctx->dispatch_wait.entry)))
            blk_mq_run_hw_queue(hctx, true);
    else if (needs_restart && (ret == BLK_STS_RESOURCE))
            blk_mq_delay_run_hw_queue(hctx, BLK_MQ_RESOURCE_DELAY);
which restarts the queue after a delay if BLK_STS_RESOURCE was returned,
but somehow not for BLK_STS_DEV_RESOURCE. Instead, nothing happens and
fsync() seems to hang until some other process wants to do I/O.
So if I do
-    else if (needs_restart && (ret == BLK_STS_RESOURCE))
+    else if (needs_restart && (ret == BLK_STS_RESOURCE ||
+                               ret == BLK_STS_DEV_RESOURCE))
it fixes my problem. But was there a reason why BLK_STS_DEV_RESOURCE was
treated differently than BLK_STS_RESOURCE here?
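To make the difference concrete, the two versions of the condition can
be modelled in plain userspace C. This is only a sketch: the enum
values and the names decide_original() and decide_patched() are made up
for illustration and are not kernel symbols, and the three booleans
stand in for blk_mq_sched_needs_restart(hctx), no_tag and
list_empty_careful(&hctx->dispatch_wait.entry).

```c
#include <stdbool.h>

/* Stand-ins for the kernel's blk_status_t values (hypothetical names,
 * arbitrary numeric values). */
enum model_status {
    MODEL_STS_OK,
    MODEL_STS_RESOURCE,
    MODEL_STS_DEV_RESOURCE
};

/* What the end of the dispatch function does with the hardware queue. */
enum model_action {
    MODEL_NO_RERUN,     /* neither branch taken: nobody reruns the queue */
    MODEL_RUN_NOW,      /* models blk_mq_run_hw_queue(hctx, true) */
    MODEL_RUN_DELAYED   /* models blk_mq_delay_run_hw_queue(...) */
};

/* The condition as quoted from 4.20.3. */
static enum model_action decide_original(bool needs_restart, bool no_tag,
                                         bool wait_entry_empty,
                                         enum model_status ret)
{
    if (!needs_restart || (no_tag && wait_entry_empty))
        return MODEL_RUN_NOW;
    else if (needs_restart && (ret == MODEL_STS_RESOURCE))
        return MODEL_RUN_DELAYED;
    return MODEL_NO_RERUN;
}

/* The same condition with the proposed change applied. */
static enum model_action decide_patched(bool needs_restart, bool no_tag,
                                        bool wait_entry_empty,
                                        enum model_status ret)
{
    if (!needs_restart || (no_tag && wait_entry_empty))
        return MODEL_RUN_NOW;
    else if (needs_restart && (ret == MODEL_STS_RESOURCE ||
                               ret == MODEL_STS_DEV_RESOURCE))
        return MODEL_RUN_DELAYED;
    return MODEL_NO_RERUN;
}
```

With needs_restart set and ret equal to the DEV_RESOURCE stand-in,
decide_original() returns MODEL_NO_RERUN while decide_patched() returns
MODEL_RUN_DELAYED, which matches the hang I see and the effect of the
change above.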
In any case, it seems wrong to me that ret is used here at all, as it
just contains the return value of the last request in the list, and
whether we rerun the queue should probably not only depend on the last
request?
Can any of the experts tell me whether this makes sense, or whether I
got something completely wrong?
Best,
Florian