On 2/14/18 8:39 AM, Paolo Valente wrote:
>
>
>> On 14 Feb 2018, at 16:19, Jens Axboe <axboe@xxxxxxxxx> wrote:
>>
>> On 2/14/18 1:56 AM, Paolo Valente wrote:
>>>
>>>
>>>> On 14 Feb 2018, at 08:15, Mike Galbraith <efault@xxxxxx> wrote:
>>>>
>>>> On Wed, 2018-02-14 at 08:04 +0100, Mike Galbraith wrote:
>>>>>
>>>>> And _of course_, roughly two minutes later, IO stalled.
>>>>
>>>> P.S.
>>>>
>>>> crash> bt 19117
>>>> PID: 19117 TASK: ffff8803d2dcd280 CPU: 7 COMMAND: "kworker/7:2"
>>>> #0 [ffff8803f7207bb8] __schedule at ffffffff81595e18
>>>> #1 [ffff8803f7207c40] schedule at ffffffff81596422
>>>> #2 [ffff8803f7207c50] io_schedule at ffffffff8108a832
>>>> #3 [ffff8803f7207c60] blk_mq_get_tag at ffffffff8129cd1e
>>>> #4 [ffff8803f7207cc0] blk_mq_get_request at ffffffff812987cc
>>>> #5 [ffff8803f7207d00] blk_mq_alloc_request at ffffffff81298a9a
>>>> #6 [ffff8803f7207d38] blk_get_request_flags at ffffffff8128e674
>>>> #7 [ffff8803f7207d60] scsi_execute at ffffffffa0025b58 [scsi_mod]
>>>> #8 [ffff8803f7207d98] scsi_test_unit_ready at ffffffffa002611c [scsi_mod]
>>>> #9 [ffff8803f7207df8] sd_check_events at ffffffffa0212747 [sd_mod]
>>>> #10 [ffff8803f7207e20] disk_check_events at ffffffff812a0f85
>>>> #11 [ffff8803f7207e78] process_one_work at ffffffff81079867
>>>> #12 [ffff8803f7207eb8] worker_thread at ffffffff8107a127
>>>> #13 [ffff8803f7207f10] kthread at ffffffff8107ef48
>>>> #14 [ffff8803f7207f50] ret_from_fork at ffffffff816001a5
>>>> crash>
>>>
>>> This evidently has to do with tag pressure. I've looked for a way to
>>> easily reduce the number of tags online, so as to put your system in
>>> the bad spot deterministically. But to no avail. Does anyone know a
>>> way to do it?
>>
>> The key here might be that it's not a regular file system request,
>> which I'm sure bfq probably handles differently. So it's possible
>> that you are slowly leaking those tags, and we end up in this
>> miserable situation after a while.
>>
>
> Could you elaborate more on this? My mental model of bfq hooks in
> this respect is that they do only side operations, which AFAIK cannot
> block the putting of a tag. IOW, tag getting and putting is done
> outside bfq, regardless of what bfq does with I/O requests. Is there
> a flaw in this?
>
> In any case, is there any flag or the like in requests passed to
> bfq that I could make bfq check, to raise some warning?

I'm completely guessing, and I don't know if this trace is always what
Mike sees when things hang. It just seems suspect that we end up with
a "special" request here, since I'm sure the regular file system
requests outnumber them greatly. That raises my suspicion that the
type is related.

But no, there should be no special handling on the freeing side; my
guess was that BFQ ends them a bit differently.

Mike, when you see a hang like that, would it be possible for you to
dump the contents of /sys/kernel/debug/block/<dev in question>/* for us
to inspect? That will tell us a lot about the internal state at that
time.

-- 
Jens Axboe
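
For collecting the debugfs dump requested above in one pass, something
along these lines might do. It is only a sketch: it assumes debugfs is
mounted at /sys/kernel/debug and that it runs as root, and both the
helper name dump_tree and the device name "sdX" are placeholders made
up for illustration, not anything from the kernel tree.

```python
#!/usr/bin/env python3
"""Print every file under a block device's debugfs directory."""
import os
import sys


def dump_tree(root):
    """Return (path, contents) pairs for every file found under root.

    Files that cannot be opened or read are recorded with the error
    text instead of being skipped, so gaps in the dump stay visible.
    """
    entries = []
    for dirpath, dirnames, filenames in os.walk(root):
        dirnames.sort()  # sort in place for a stable traversal order
        for name in sorted(filenames):
            path = os.path.join(dirpath, name)
            try:
                with open(path, "r", errors="replace") as f:
                    text = f.read()
            except OSError as exc:
                text = "<unreadable: %s>" % exc
            entries.append((path, text))
    return entries


if __name__ == "__main__":
    # "sdX" is a placeholder; pass the real device name as an argument.
    dev = sys.argv[1] if len(sys.argv) > 1 else "sdX"
    root = os.path.join("/sys/kernel/debug/block", dev)
    for path, text in dump_tree(root):
        print("==> %s <==" % path)
        print(text.rstrip("\n"))
```

Run it with the device name as the only argument while the hang is in
progress and redirect the output to a file that can be attached to a
reply. Each file is prefixed with its full path so nested directories
stay distinguishable.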