Hi Bart,
Just a follow-up on this.
It seems we've now had an occurrence of this even with the "none"
scheduler. Unfortunately I could not get to the host quickly enough to
confirm ongoing IO, although based on the activity LEDs there were
disks with IO in flight. I believe the disk controller drives these
LEDs, but I'm not sure of the pattern used to switch them on/off, and
this could vary from controller to controller (i.e. do they go off only
once the host has confirmed receipt of the data, or once the data has
been sent to the host?).
This does seem to support your theory of a controller firmware issue.
It definitely happens more often with mq-deadline than with none.
On the other hand, we're definitely seeing the same thing on another
host using an AHCI controller, which hints that it's not a firmware
issue, as does the fact that the frequency depends on the scheduler
(it happens much less often with none).
Regardless, I will make a plan to apply the firmware updates on the
RAID controller over the weekend, just to eliminate that, and will then
revert to mq-deadline. Assuming this does NOT fix it, how would I go
about determining whether this is a controller firmware issue or a
Linux kernel issue?
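In case it helps with the next occurrence, my rough plan (untested, and
assuming debugfs is mounted at /sys/kernel/debug, CONFIG_BLK_DEBUG_FS is
enabled, and sdX below is a placeholder for the affected device) is to
dump the blk-mq debugfs state while the hang is in progress:

# Dump all blk-mq debugfs attributes for the affected device. Requests
# that stay listed under hctx*/busy or hctx*/tags without ever completing
# would suggest the controller/firmware is sitting on them, while requests
# stuck in hctx*/sched_tags or hctx*/dispatch would point back at the
# block layer / scheduler side.
(cd /sys/kernel/debug/block/sdX && find . -type f -exec grep -aH . {} \;)

My understanding is that requests stuck on the hardware side versus the
scheduler side should be distinguishable from that output, but please
correct me if I have that wrong.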
Come to think of it (it may or may not be related), we've long since
switched off dmeventd, as running dmeventd causes this to happen on all
hosts the moment any form of snapshot is involved. With dmeventd
combined with "heavy" use of the lv commands we could pretty much
guarantee some level of lockup within a couple of days.
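When one of those lockups is in progress, I assume we could at least
capture where the tasks are stuck via the hung-task detector and sysrq
(assuming kernel.sysrq and the hung-task detector are enabled; the 120
second value below is just an example):

# Warn about tasks stuck in uninterruptible sleep for more than 120 seconds.
echo 120 > /proc/sys/kernel/hung_task_timeout_secs
# Dump the backtraces of all currently blocked tasks to the kernel log.
echo w > /proc/sysrq-trigger
dmesg | tail -n 100

That should at least show whether things are stuck in the device-mapper/LVM
path or further down in the block layer.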
Kind regards,
Jaco
On 2023/07/13 17:07, Jaco Kroon wrote:
Hi Bart,
I'm not familiar at all with fio, so I'm hoping this was OK.
On 2023/07/12 15:43, Bart Van Assche wrote:
On 7/12/23 03:12, Jaco Kroon wrote:
Ideas/Suggestions?
How about manually increasing the workload, e.g. by using fio to
randomly read 4 KiB fragments with a high queue depth?
Bart.
[global]
# Use binary (1024-based) units and report sizes in bytes rather than bits.
kb_base=1024
unit_base=8
# Keep re-running the workload for the full two-hour runtime.
loops=10000
runtime=7200
time_based=1
# One test file per job under /home/fio.
directory=/home/fio
nrfiles=1
size=4194304
# 512 concurrent io_uring jobs, each with a queue depth of 256;
# fsync each file once it has been laid out.
iodepth=256
ioengine=io_uring
numjobs=512
create_fsync=1
[reader]
crowsnest [17:01:35] ~ # fio --alloc-size=$(( 32 * 1024 )) fio.ini
Load average went up to 1200+, IO was consistently 1 GB/s of read
throughput, and IOPS were anywhere between 100k and 500k, mostly around
the 150k region.
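One thing I'm not sure about: since the job file above doesn't set rw=
or bs=, I believe fio defaults to sequential reads of 4 KiB blocks, so
this may not quite be the random-read pattern you described. If I
understood the suggestion correctly, the job section would need
something like the following (randread and bs are standard fio options;
I have not re-run with these yet):

[reader]
# Random 4 KiB reads instead of the default sequential reads.
rw=randread
bs=4k

If that's what you had in mind I'll use it for the next run.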
I'm guessing the next step would be to restore mq-deadline as the
scheduler and re-run?
Unfortunately I neglected to capture the output; I will do the next run
with --output if needed. I can definitely initiate another run around
6:00am GMT in the morning.
Kind Regards,
Jaco