Hi Bart,
On 2023/06/27 01:29, Jaco Kroon wrote:
Hi Bart,
On 2023/06/26 18:42, Bart Van Assche wrote:
On 6/26/23 01:30, Jaco Kroon wrote:
Please find attached updated ps and dmesg output too; diskinfo wasn't
regenerated, but that doesn't generally change.
Not sure how this all works; as far as I can see, the only disk with
pending (queued) activity is sdw? Yet a large number of processes are
blocking on I/O, and yet again the stack traces in dmesg point at
__schedule. For a change we do not have lvcreate in the process list!
This time that particular script has got an fsck in uninterruptible
wait ...
Hi Jaco,
I see pending commands for five different SCSI disks:
$ zgrep /busy: block.gz
./sdh/hctx0/busy:00000000affe2ba0 {.op=WRITE, .cmd_flags=SYNC|FUA,
.rq_flags=FLUSH_SEQ|MQ_INFLIGHT|DONTPREP|IO_STAT|ELV,
.state=in_flight, .tag=2055, .internal_tag=214, .cmd=Write(16) 8a 08
00 00 00 01 d1 c0 b7 88 00 00 00 08 00 00, .retries=0, .result = 0x0,
.flags=TAGGED|INITIALIZED|LAST, .timeout=180.000, allocated 0.000 s ago}
./sda/hctx0/busy:00000000987bb7c7 {.op=WRITE, .cmd_flags=SYNC|FUA,
.rq_flags=FLUSH_SEQ|MQ_INFLIGHT|DONTPREP|IO_STAT|ELV,
.state=in_flight, .tag=2050, .internal_tag=167, .cmd=Write(16) 8a 08
00 00 00 01 d1 c0 b7 88 00 00 00 08 00 00, .retries=0, .result = 0x0,
.flags=TAGGED|INITIALIZED|LAST, .timeout=180.000, allocated 0.010 s ago}
./sdw/hctx0/busy:00000000aec61b17 {.op=READ, .cmd_flags=META|PRIO,
.rq_flags=STARTED|MQ_INFLIGHT|DONTPREP|ELVPRIV|IO_STAT|ELV,
.state=in_flight, .tag=2056, .internal_tag=8, .cmd=Read(16) 88 00 00
00 00 00 00 1c 01 a8 00 00 00 08 00 00, .retries=0, .result = 0x0,
.flags=TAGGED|INITIALIZED|LAST, .timeout=180.000, allocated 0.000 s ago}
./sdw/hctx0/busy:0000000087e9a58e {.op=WRITE, .cmd_flags=SYNC|FUA,
.rq_flags=FLUSH_SEQ|MQ_INFLIGHT|DONTPREP|IO_STAT|ELV,
.state=in_flight, .tag=2058, .internal_tag=102, .cmd=Write(16) 8a 08
00 00 00 01 d1 c0 b7 88 00 00 00 08 00 00, .retries=0, .result = 0x0,
.flags=TAGGED|INITIALIZED|LAST, .timeout=180.000, allocated 0.000 s ago}
./sdaf/hctx0/busy:00000000d8751601 {.op=WRITE, .cmd_flags=SYNC|FUA,
.rq_flags=FLUSH_SEQ|MQ_INFLIGHT|DONTPREP|IO_STAT|ELV,
.state=in_flight, .tag=2057, .internal_tag=51, .cmd=Write(16) 8a 08
00 00 00 01 d1 c0 b7 88 00 00 00 08 00 00, .retries=0, .result = 0x0,
.flags=TAGGED|INITIALIZED|LAST, .timeout=180.000, allocated 0.010 s ago}
All requests have the flag "ELV". So my follow-up questions are:
* Which I/O scheduler has been configured? If it is BFQ, please check
whether mq-deadline or "none" works better.
crowsnest [00:34:58] /sys/class/block/sda/device/block/sda/queue # cat
scheduler
none [mq-deadline] kyber bfq
crowsnest [00:35:31] /sys/class/block # for i in
*/device/block/sd*/queue/scheduler; do echo none > $i; done
crowsnest [00:35:45] /sys/class/block #
So let's see if that perhaps relates. Neither CFQ nor BFQ has ever
given me anywhere near the performance of deadline, so that's our
default go-to.
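As an aside, the echo loop above doesn't survive a reboot; a small helper along these lines could re-apply the scheduler at boot. This is only a sketch: the sysfs root is passed as a parameter (so the helper can be exercised against a fake directory tree), and the `sd*/queue/scheduler` layout under /sys/class/block is assumed.

```shell
#!/bin/sh
# Sketch: set the I/O scheduler for every sd* device under a
# sysfs-like root. On a real system, pass /sys/class/block and
# run as root.
set_scheduler() {
    root=$1
    sched=$2
    for f in "$root"/sd*/queue/scheduler; do
        # Skip non-writable entries (e.g. the glob didn't match).
        [ -w "$f" ] && echo "$sched" > "$f"
    done
}

# Example: set_scheduler /sys/class/block none
```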
crowsnest [15:03:34] ~ # uptime
15:07:52 up 15 days, 4:47, 3 users, load average: 10.26, 9.88, 9.68
"how long is a piece of string?" comes to mind whether looking to decide
if that's sufficiently long to call it success? Normally died after
about 7 days.
So a *suspected* mq-deadline bug?
And there are various hints that newer firmware exists ... but Broadcom
is not making it easy to find the download, nor is Supermicro's website
of much help ... I will try again during more sane hours.
https://www.broadcom.com/support/download-search?pg=Storage+Adapters,+Controllers,+and+ICs&pf=Storage+Adapters,+Controllers,+and+ICs&pn=SAS3008+I/O+Controller&pa=Firmware&po=&dk=&pl=&l=false
Supermicro has responded with appropriate upgrade options, which we've
not applied yet; I'll make time for that over the coming weekend, or
perhaps I should wait a bit longer to allow more time for a similar
lockup under the none scheduler?
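P.S. For completeness, the busy-request listing you quoted can be regenerated on a live machine straight from the block debugfs. A sketch only: it assumes debugfs is mounted at the usual /sys/kernel/debug and must run as root; the root directory is a parameter here so the helper can be tried against a fake tree.

```shell
#!/bin/sh
# Sketch: print all in-flight (busy) requests per hardware queue.
# $1: block debugfs root, normally /sys/kernel/debug/block.
list_busy() {
    # grep -H prefixes each request line with its file path, which
    # mirrors the per-device "busy:" output quoted earlier.
    grep -H . "$1"/*/hctx*/busy 2>/dev/null
}

# Example: list_busy /sys/kernel/debug/block
```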
Kind Regards,
Jaco