Re: LVM kernel lockup scenario during lvcreate

Bart Van Assche <bvanassche@xxxxxxx> · Mon, 26 Jun 2023 09:42:08 -0700

On 6/26/23 01:30, Jaco Kroon wrote:
Please find attached updated ps and dmesg too; diskinfo wasn't regenerated, but it doesn't generally change.

Not sure how this all works. According to what I can see, the only disk with pending (queued) activity is sdw? Yet a large number of processes are blocking on I/O, and yet again the stack traces in dmesg point at __schedule. For a change, we do not have lvcreate in the process list! This time that particular script has an fsck in uninterruptible wait ...

Hi Jaco,

I see pending commands for five different SCSI disks:

$ zgrep /busy: block.gz
./sdh/hctx0/busy:00000000affe2ba0 {.op=WRITE, .cmd_flags=SYNC|FUA, .rq_flags=FLUSH_SEQ|MQ_INFLIGHT|DONTPREP|IO_STAT|ELV, .state=in_flight, .tag=2055, .internal_tag=214, .cmd=Write(16) 8a 08 00 00 00 01 d1 c0 b7 88 00 00 00 08 00 00, .retries=0, .result = 0x0, 
.flags=TAGGED|INITIALIZED|LAST, .timeout=180.000, allocated 0.000 s ago}
./sda/hctx0/busy:00000000987bb7c7 {.op=WRITE, .cmd_flags=SYNC|FUA, .rq_flags=FLUSH_SEQ|MQ_INFLIGHT|DONTPREP|IO_STAT|ELV, .state=in_flight, .tag=2050, .internal_tag=167, .cmd=Write(16) 8a 08 00 00 00 01 d1 c0 b7 88 00 00 00 08 00 00, .retries=0, .result = 0x0, 
.flags=TAGGED|INITIALIZED|LAST, .timeout=180.000, allocated 0.010 s ago}
./sdw/hctx0/busy:00000000aec61b17 {.op=READ, .cmd_flags=META|PRIO, .rq_flags=STARTED|MQ_INFLIGHT|DONTPREP|ELVPRIV|IO_STAT|ELV, .state=in_flight, .tag=2056, .internal_tag=8, .cmd=Read(16) 88 00 00 00 00 00 00 1c 01 a8 00 00 00 08 00 00, .retries=0, .result = 0x0, 
.flags=TAGGED|INITIALIZED|LAST, .timeout=180.000, allocated 0.000 s ago}
./sdw/hctx0/busy:0000000087e9a58e {.op=WRITE, .cmd_flags=SYNC|FUA, .rq_flags=FLUSH_SEQ|MQ_INFLIGHT|DONTPREP|IO_STAT|ELV, .state=in_flight, .tag=2058, .internal_tag=102, .cmd=Write(16) 8a 08 00 00 00 01 d1 c0 b7 88 00 00 00 08 00 00, .retries=0, .result = 0x0, 
.flags=TAGGED|INITIALIZED|LAST, .timeout=180.000, allocated 0.000 s ago}
./sdaf/hctx0/busy:00000000d8751601 {.op=WRITE, .cmd_flags=SYNC|FUA, .rq_flags=FLUSH_SEQ|MQ_INFLIGHT|DONTPREP|IO_STAT|ELV, .state=in_flight, .tag=2057, .internal_tag=51, .cmd=Write(16) 8a 08 00 00 00 01 d1 c0 b7 88 00 00 00 08 00 00, .retries=0, .result = 0x0, 
.flags=TAGGED|INITIALIZED|LAST, .timeout=180.000, allocated 0.010 s ago}
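
A listing like the one above can be condensed into a per-disk count of in-flight requests. The following is only a minimal sketch: the helper name count_busy_per_disk is made up for illustration, and it assumes every request line starts with "./<disk>/hctxN/busy:" as in the output shown here.

```shell
# Count in-flight requests per disk from "zgrep /busy:" style output on stdin.
# Assumption: each request line starts with "./<disk>/hctxN/busy:", so the
# disk name is the second "/"-separated field. Wrapped continuation lines
# (".flags=...") do not contain "/busy:" and are dropped by grep.
count_busy_per_disk() {
    grep '/busy:' |
        awk -F/ '{ n[$2]++ } END { for (d in n) print d, n[d] }' |
        sort
}

# Usage (block.gz is the debugfs dump used above):
#   zgrep /busy: block.gz | count_busy_per_disk
```

For the five disks listed above this would report one request each for sda, sdaf and sdh and two for sdw.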

All requests have the flag "ELV", which indicates that an I/O scheduler is attached to these request queues. So my follow-up questions are:
* Which I/O scheduler has been configured? If it is BFQ, please check
  whether mq-deadline or "none" works better.
* Have any of the cgroup I/O controllers been activated?
* Are the disks directly connected to the motherboard of the server or are
  they controlled by an HBA? If so, which HBA? There are multiple lines in
  dmesg that start with "mpt3sas". Is the firmware of this HBA up-to-date?
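
One possible way to collect the answers to the first two questions is sketched below. It relies on the standard sysfs file /sys/block/<dev>/queue/scheduler, in which the active scheduler is shown in square brackets. The helper name active_scheduler is hypothetical, and the cgroup check assumes cgroup v2 mounted at /sys/fs/cgroup; a cgroup v1 setup would need a different check.

```shell
# Extract the active scheduler from the contents of a queue/scheduler file.
# The kernel marks the active one with brackets, e.g. "none [mq-deadline] bfq".
active_scheduler() {
    sed -n 's/.*\[\([^]]*\)\].*/\1/p'
}

# Report the active scheduler for every sd* disk that exposes the file.
for f in /sys/block/sd*/queue/scheduler; do
    [ -r "$f" ] || continue
    dev=${f#/sys/block/}
    dev=${dev%%/*}
    printf '%s: %s\n' "$dev" "$(active_scheduler < "$f")"
done

# To switch a single disk (as root; sdX is a placeholder):
#   echo mq-deadline > /sys/block/sdX/queue/scheduler

# cgroup v2 only (assumption): controllers enabled for children of the root
# cgroup. An "io" entry means the I/O controller is active.
if [ -r /sys/fs/cgroup/cgroup.subtree_control ]; then
    cat /sys/fs/cgroup/cgroup.subtree_control
fi

# The mpt3sas lines mentioned above can be pulled out with:
#   dmesg | grep -i mpt3sas
```

A scheduler change made this way is not persistent across reboots, so it is suitable for a quick comparison between BFQ, mq-deadline and "none".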

Thanks,

Bart.