I am experimenting with the use of dm-integrity underneath dm-raid, to get
around the problem where, if a RAID 1 or RAID 5 array is inconsistent, you
may not know which copy is the good one. I have found a reproducible hard
lockup involving XFS, RAID 5, and dm-integrity.

My test array consists of three spinning HDDs (each 2 decimal terabytes),
each with dm-integrity laid directly onto the disk (no partition table),
using SHA-256 checksums. On top of this is an MD-RAID array (raid5), and on
top of *that* is an ordinary XFS filesystem. Extracting a large tar archive
(970 GB) into the filesystem causes a hard lockup -- the entire system
becomes unresponsive -- after some tens of gigabytes have been extracted.
I have reproduced the lockup with kernel versions 6.6.21 and 6.9.3.

No error messages make it to the console, but with 6.9.3 I was able to
recover almost all of a lockdep report from pstore. I don't fully understand
lockdep reports, but it *looks* like it might be a livelock rather than a
deadlock, with all available kworker threads so bogged down with dm-integrity
chores that an XFS log flush is blocked long enough to hit the hung-task
timeout.

Attached are:

- what I have of the lockdep report (kernel 6.9.3; only a couple of lines at
  the very top are missing)
- the kernel .config (6.9.3, lockdep enabled)
- dmesg up to the start of userspace (6.6.21, lockdep not enabled)
- details of the test array configuration

Please let me know if there is any more information you need. I am happy to
test patches. I'm not subscribed to either dm-devel or linux-xfs.

zw

p.s. Incidentally, why doesn't the dm-integrity superblock record the
checksum algorithm in use?
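To illustrate what I mean: since the superblock records the tag size but not
the algorithm, the algorithm has to be re-supplied at every activation, and
(as far as I can tell) a wrong value only shows up later as per-sector
checksum mismatches. A minimal sketch -- the device and mapping names match
my setup, but this is illustrative, not a transcript of my session:

  # sha256 must be given again here; the superblock only records
  # integrity_tag_size 32, so nothing on disk names the algorithm
  integritysetup open /dev/sda int_sda --integrity sha256

  # the algorithm lives only in the live device-mapper table
  dmsetup table int_sda | grep -o 'internal_hash:[a-z0-9-]*'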
# xfs_info /dev/md/sdr5p1
meta-data=/dev/md/sdr5p1         isize=512    agcount=32, agsize=30283904 blks
         =                       sectsz=4096  attr=2, projid32bit=1
         =                       crc=1        finobt=1, sparse=1, rmapbt=1
         =                       reflink=1    bigtime=1 inobtcount=1 nrext64=1
data     =                       bsize=4096   blocks=969084928, imaxpct=5
         =                       sunit=128    swidth=256 blks
naming   =version 2              bsize=4096   ascii-ci=0, ftype=1
log      =internal log           bsize=4096   blocks=473186, version=2
         =                       sectsz=4096  sunit=1 blks, lazy-count=1
realtime =none                   extsz=4096   blocks=0, rtextents=0

# gdisk -l /dev/md/sdr5
GPT fdisk (gdisk) version 1.0.9

Partition table scan:
  MBR: protective
  BSD: not present
  APM: not present
  GPT: present

Found valid GPT with protective MBR; using GPT.
Disk /dev/md/sdr5: 969086976 sectors, 3.6 TiB
Sector size (logical/physical): 4096/4096 bytes
Disk identifier (GUID): 28F28613-3AAD-46F9-AABA-9CC7E7EFFC3D
Partition table holds up to 128 entries
Main partition table begins at sector 2 and ends at sector 5
First usable sector is 6, last usable sector is 969086970
Partitions will be aligned on 256-sector boundaries
Total free space is 501 sectors (2.0 MiB)

Number  Start (sector)    End (sector)  Size       Code  Name
   1             256       969086719    3.6 TiB    8300  test_array

# mdadm --detail /dev/md/sdr5
/dev/md/sdr5:
           Version : 1.2
     Creation Time : Fri May 10 21:13:54 2024
        Raid Level : raid5
        Array Size : 3876347904 (3.61 TiB 3.97 TB)
     Used Dev Size : 1938173952 (1848.39 GiB 1984.69 GB)
      Raid Devices : 3
     Total Devices : 3
       Persistence : Superblock is persistent

     Intent Bitmap : Internal

       Update Time : Wed Jun  5 14:09:18 2024
             State : clean
    Active Devices : 3
   Working Devices : 3
    Failed Devices : 0
     Spare Devices : 0

            Layout : left-symmetric
        Chunk Size : 512K

Consistency Policy : bitmap

              Name : moxana:sdr5  (local to host moxana)
              UUID : 395e1626:1483f9f8:39d6e78c:af21beb1
            Events : 13781

    Number   Major   Minor   RaidDevice State
       0     253        3        0      active sync   /dev/dm-3
       1     253        4        1      active sync   /dev/dm-4
       3     253        5        2      active sync   /dev/dm-5

# for d in sda sdb sdc; do integritysetup dump /dev/$d; done
Info for integrity device /dev/sda.
superblock_version 5
log2_interleave_sectors 15
integrity_tag_size 32
journal_sections 496
provided_data_sectors 3876612136
sector_size 4096
log2_blocks_per_bitmap 12
flags fix_padding fix_hmac
Info for integrity device /dev/sdb.
superblock_version 5
log2_interleave_sectors 15
integrity_tag_size 32
journal_sections 496
provided_data_sectors 3876612136
sector_size 4096
log2_blocks_per_bitmap 12
flags fix_padding fix_hmac
Info for integrity device /dev/sdc.
superblock_version 5
log2_interleave_sectors 15
integrity_tag_size 32
journal_sections 496
provided_data_sectors 3876612136
sector_size 4096
log2_blocks_per_bitmap 12
flags fix_padding fix_hmac
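For completeness, an equivalent stack can be assembled roughly like this.
This is a sketch inferred from the configuration above, not a verbatim record
of the commands I ran:

  # dm-integrity directly on each whole disk, sha256 => 32-byte tags
  for d in sda sdb sdc; do
      integritysetup format /dev/$d --integrity sha256 --sector-size 4096 --batch-mode
      integritysetup open /dev/$d int_$d --integrity sha256
  done

  # raid5 across the three integrity mappings, 512K chunk
  mdadm --create /dev/md/sdr5 --level=5 --raid-devices=3 --chunk=512 \
        /dev/mapper/int_sda /dev/mapper/int_sdb /dev/mapper/int_sdc

  # one GPT partition, then a stock XFS filesystem
  sgdisk -n 1:0:0 -t 1:8300 -c 1:test_array /dev/md/sdr5
  mkfs.xfs /dev/md/sdr5p1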
[ 2213.559141] Not tainted 6.9.3-gentoo-lockdep #2
[ 2213.559146] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 2213.559149] task:kworker/25:3 state:D stack:0 pid:13498 tgid:13498 ppid:2 flags:0x00004000
[ 2213.559160] Workqueue: xfs-sync/md126p1 xfs_log_worker
[ 2213.559169] Call Trace:
[ 2213.559172]  <TASK>
[ 2213.559177]  __schedule+0x49a/0x1900
[ 2213.559183]  ? find_held_lock+0x32/0x90
[ 2213.559190]  ? srso_return_thunk+0x5/0x5f
[ 2213.559198]  schedule+0x31/0x130
[ 2213.559204]  schedule_timeout+0x1cd/0x1e0
[ 2213.559212]  __wait_for_common+0xbc/0x1e0
[ 2213.559218]  ? usleep_range_state+0xc0/0xc0
[ 2213.559226]  __flush_workqueue+0x15f/0x470
[ 2213.559235]  ? __wait_for_common+0x4d/0x1e0
[ 2213.559242]  xlog_cil_push_now.isra.0+0x59/0xa0
[ 2213.559249]  xlog_cil_force_seq+0x7d/0x290
[ 2213.559257]  xfs_log_force+0x86/0x2d0
[ 2213.559263]  xfs_log_worker+0x36/0xd0
[ 2213.559270]  process_one_work+0x210/0x640
[ 2213.559279]  worker_thread+0x1c7/0x3c0
[ 2213.559287]  ? wq_sysfs_prep_attrs+0xa0/0xa0
[ 2213.559294]  kthread+0xd2/0x100
[ 2213.559301]  ? kthread_complete_and_exit+0x20/0x20
[ 2213.559309]  ret_from_fork+0x2b/0x40
[ 2213.559317]  ? kthread_complete_and_exit+0x20/0x20
[ 2213.559324]  ret_from_fork_asm+0x11/0x20
[ 2213.559332]  </TASK>
[ 2213.559361] Showing all locks held in the system:
[ 2213.559390] 2 locks held by kworker/u131:0/208:
[ 2213.559395]  #0: ffff9aa10ffe2d58 ((wq_completion)xfs-cil/md126p1){+.+.}-{0:0}, at: process_one_work+0x3cc/0x640
[ 2213.559421]  #1: ffffb848c08dbe58 ((work_completion)(&ctx->push_work)){+.+.}-{0:0}, at: process_one_work+0x1ca/0x640
[ 2213.559446] 3 locks held by kworker/u130:13/223:
[ 2213.559451]  #0: ffff9aa7cc1f8158 ((wq_completion)writeback){+.+.}-{0:0}, at: process_one_work+0x3cc/0x640
[ 2213.559474]  #1: ffffb848c0953e58 ((work_completion)(&(&wb->dwork)->work)){+.+.}-{0:0}, at: process_one_work+0x1ca/0x640
[ 2213.559497]  #2: ffff9aa0c25400e8 (&type->s_umount_key#32){++++}-{3:3}, at: super_trylock_shared+0x11/0x50
[ 2213.559522] 1 lock held by khungtaskd/230:
[ 2213.559526]  #0: ffffffff89ec2e20 (rcu_read_lock){....}-{1:2}, at: debug_show_all_locks+0x2c/0x1d0
[ 2213.559557] 1 lock held by usb-storage/414:
[ 2213.559561]  #0: ffff9aa0cb15ace8 (&us_interface_key[i]){+.+.}-{3:3}, at: usb_stor_control_thread+0x43/0x2d0
[ 2213.559591] 1 lock held by in:imklog/1997:
[ 2213.559595]  #0: ffff9aa0db2258d8 (&f->f_pos_lock){+.+.}-{3:3}, at: __fdget_pos+0x84/0xd0
[ 2213.559620] 2 locks held by kworker/u131:3/3226:
[ 2213.559624]  #0: ffff9aa10ffe2d58 ((wq_completion)xfs-cil/md126p1){+.+.}-{0:0}, at: process_one_work+0x3cc/0x640
[ 2213.559664]  #1: ffffb848c47a7e58 ((work_completion)(&ctx->push_work)){+.+.}-{0:0}, at: process_one_work+0x1ca/0x640
[ 2213.559706] 2 locks held by tar/5845:
[ 2213.559710]  #0: ffff9aa0c2540420 (sb_writers#6){.+.+}-{0:0}, at: ksys_write+0x6c/0xf0
[ 2213.559732]  #1: ffff9aa0e16c3f58 (&sb->s_type->i_mutex_key#8){++++}-{3:3}, at: xfs_ilock+0x144/0x180
[ 2213.559789] 2 locks held by kworker/14:28/6524:
[ 2213.559793]  #0: ffff9aa0da45e758 ((wq_completion)dm-integrity-writer#5){+.+.}-{0:0}, at: process_one_work+0x3cc/0x640
[ 2213.559815]  #1: ffffb848c64e7e58 ((work_completion)(&ic->writer_work)){+.+.}-{0:0}, at: process_one_work+0x1ca/0x640
[ 2213.559882] 2 locks held by kworker/12:45/8171:
[ 2213.559886]  #0: ffff9aa0da420358 ((wq_completion)dm-integrity-offload#5){+.+.}-{0:0}, at: process_one_work+0x3cc/0x640
[ 2213.559908]  #1: ffffb848d6583e58 ((work_completion)(&dio->work)){+.+.}-{0:0}, at: process_one_work+0x1ca/0x640
[ 2213.559949] 2 locks held by kworker/12:81/8479:
[ 2213.559953]  #0: ffff9aa0da420358 ((wq_completion)dm-integrity-offload#5){+.+.}-{0:0}, at: process_one_work+0x3cc/0x640
[ 2213.559979]  #1: ffffb848d6ea3e58 ((work_completion)(&dio->work)){+.+.}-{0:0}, at: process_one_work+0x1ca/0x640
[ 2213.560006] 2 locks held by kworker/12:98/8496:
[ 2213.560010]  #0: ffff9aa0da420358 ((wq_completion)dm-integrity-offload#5){+.+.}-{0:0}, at: process_one_work+0x3cc/0x640
[ 2213.560036]  #1: ffffb848d6f2be58 ((work_completion)(&dio->work)){+.+.}-{0:0}, at: process_one_work+0x1ca/0x640
[ 2213.560062] 2 locks held by kworker/12:101/8499:
[ 2213.560067]  #0: ffff9aa0da420358 ((wq_completion)dm-integrity-offload#5){+.+.}-{0:0}, at: process_one_work+0x3cc/0x640
[ 2213.560093]  #1: ffffb848d6f43e58 ((work_completion)(&dio->work)){+.+.}-{0:0}, at: process_one_work+0x1ca/0x640
[ 2213.560118] 2 locks held by kworker/12:110/8508:
[ 2213.560122]  #0: ffff9aa0da420358 ((wq_completion)dm-integrity-offload#5){+.+.}-{0:0}, at: process_one_work+0x3cc/0x640
[ 2213.560149]  #1: ffffb848d6f8be58 ((work_completion)(&dio->work)){+.+.}-{0:0}, at: process_one_work+0x1ca/0x640
[ 2213.560175] 2 locks held by kworker/12:111/8509:
[ 2213.560180]  #0: ffff9aa0da420358 ((wq_completion)dm-integrity-offload#5){+.+.}-{0:0}, at: process_one_work+0x3cc/0x640
[ 2213.560206]  #1: ffffb848d6f93e58 ((work_completion)(&dio->work)){+.+.}-{0:0}, at: process_one_work+0x1ca/0x640
[ 2213.560230] 2 locks held by kworker/12:112/8510:
[ 2213.560235]  #0: ffff9aa0da420358 ((wq_completion)dm-integrity-offload#5){+.+.}-{0:0}, at: process_one_work+0x3cc/0x640
[ 2213.560261]  #1: ffffb848d6f9be58 ((work_completion)(&dio->work)){+.+.}-{0:0}, at: process_one_work+0x1ca/0x640
[ 2213.560307] 2 locks held by kworker/u131:5/9163:
[ 2213.560312]  #0: ffff9aa10ffe2d58 ((wq_completion)xfs-cil/md126p1){+.+.}-{0:0}, at: process_one_work+0x3cc/0x640
[ 2213.560335]  #1: ffffb848d8803e58 ((work_completion)(&ctx->push_work)){+.+.}-{0:0}, at: process_one_work+0x1ca/0x640
[ 2213.560359] 2 locks held by kworker/u131:6/9166:
[ 2213.560364]  #0: ffff9aa10ffe2d58 ((wq_completion)xfs-cil/md126p1){+.+.}-{0:0}, at: process_one_work+0x3cc/0x640
[ 2213.560387]  #1: ffffb848c44c7e58 ((work_completion)(&ctx->push_work)){+.+.}-{0:0}, at: process_one_work+0x1ca/0x640
[ 2213.560429] 2 locks held by kworker/30:236/9664:
[ 2213.560433]  #0: ffff9aa0e43c0b58 ((wq_completion)dm-integrity-writer#3){+.+.}-{0:0}, at: process_one_work+0x3cc/0x640
[ 2213.560455]  #1: ffffb848da42be58 ((work_completion)(&ic->writer_work)){+.+.}-{0:0}, at: process_one_work+0x1ca/0x640
[ 2213.560540] 2 locks held by kworker/12:128/11574:
[ 2213.560544]  #0: ffff9aa0da420358 ((wq_completion)dm-integrity-offload#5){+.+.}-{0:0}, at: process_one_work+0x3cc/0x640
[ 2213.560564]  #1: ffffb848de48be58 ((work_completion)(&dio->work)){+.+.}-{0:0}, at: process_one_work+0x1ca/0x640
[ 2213.648428] 2 locks held by kworker/12:175/11621:
[ 2213.648431]  #0: ffff9aa0da420358 ((wq_completion)dm-integrity-offload#5){+.+.}-{0:0}, at: process_one_work+0x3cc/0x640
[ 2213.648443]  #1: ffffb848de603e58 ((work_completion)(&dio->work)){+.+.}-{0:0}, at: process_one_work+0x1ca/0x640
[ 2213.651134] 2 locks held by kworker/12:177/11623:
[ 2213.651136]  #0: ffff9aa0da420358 ((wq_completion)dm-integrity-offload#5){+.+.}-{0:0}, at: process_one_work+0x3cc/0x640
[ 2213.651147]  #1: ffffb848c4c47e58 ((work_completion)(&dio->work)){+.+.}-{0:0}, at: process_one_work+0x1ca/0x640
[ 2213.651158] 2 locks held by kworker/12:179/11625:
[ 2213.651159]  #0: ffff9aa0da420358 ((wq_completion)dm-integrity-offload#5){+.+.}-{0:0}, at: process_one_work+0x3cc/0x640
[ 2213.651170]  #1: ffffb848de613e58 ((work_completion)(&dio->work)){+.+.}-{0:0}, at: process_one_work+0x1ca/0x640
[ 2213.651181] 2 locks held by kworker/12:180/11626:
[ 2213.651183]  #0: ffff9aa0da420358 ((wq_completion)dm-integrity-offload#5){+.+.}-{0:0}, at: process_one_work+0x3cc/0x640
[ 2213.651193]  #1: ffffb848de61be58 ((work_completion)(&dio->work)){+.+.}-{0:0}, at: process_one_work+0x1ca/0x640
[ 2213.651205] 2 locks held by kworker/12:182/11628:
[ 2213.651206]  #0: ffff9aa0da420358 ((wq_completion)dm-integrity-offload#5){+.+.}-{0:0}, at: process_one_work+0x3cc/0x640
[ 2213.651217]  #1: ffffb848de62be58 ((work_completion)(&dio->work)){+.+.}-{0:0}, at: process_one_work+0x1ca/0x640
[ 2213.651228] 2 locks held by kworker/12:184/11630:
[ 2213.651230]  #0: ffff9aa0da420358 ((wq_completion)dm-integrity-offload#5){+.+.}-{0:0}, at: process_one_work+0x3cc/0x640
[ 2213.651240]  #1: ffffb848d4793e58 ((work_completion)(&dio->work)){+.+.}-{0:0}, at: process_one_work+0x1ca/0x640
[ 2213.651257] 2 locks held by kworker/12:236/11682:
[ 2213.651259]  #0: ffff9aa0da420358 ((wq_completion)dm-integrity-offload#5){+.+.}-{0:0}, at: process_one_work+0x3cc/0x640
[ 2213.651270]  #1: ffffb848de7cbe58 ((work_completion)(&dio->work)){+.+.}-{0:0}, at: process_one_work+0x1ca/0x640
[ 2213.651280] 2 locks held by kworker/12:239/11685:
[ 2213.651282]  #0: ffff9aa0da420358 ((wq_completion)dm-integrity-offload#5){+.+.}-{0:0}, at: process_one_work+0x3cc/0x640
[ 2213.651293]  #1: ffffb848de7e3e58 ((work_completion)(&dio->work)){+.+.}-{0:0}, at: process_one_work+0x1ca/0x640
[ 2213.651341] 2 locks held by kworker/25:121/12751:
[ 2213.651343]  #0: ffff9aa0c8122f58 ((wq_completion)dm-integrity-writer#4){+.+.}-{0:0}, at: process_one_work+0x3cc/0x640
[ 2213.651353]  #1: ffffb848e0c13e58 ((work_completion)(&ic->writer_work)){+.+.}-{0:0}, at: process_one_work+0x1ca/0x640
[ 2213.651425] 2 locks held by kworker/25:3/13498:
[ 2213.651426]  #0: ffff9aa0c7bfe758 ((wq_completion)xfs-sync/md126p1){+.+.}-{0:0}, at: process_one_work+0x3cc/0x640
[ 2213.651436]  #1: ffffb848e259be58 ((work_completion)(&(&log->l_work)->work)){+.+.}-{0:0}, at: process_one_work+0x1ca/0x640
[ 2213.651465] =============================================
[ 2213.651467] Kernel panic - not syncing: hung_task: blocked tasks
[ 2213.652654] Kernel Offset: 0x7000000 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff)
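For reference, the report above was pieced back together from pstore after
the reboot, roughly as follows. This is a sketch; the exact record file names
depend on which pstore backend is configured:

  # pstore is normally mounted automatically; if not:
  mount -t pstore pstore /sys/fs/pstore

  # the panic-time console log is split across numbered record files
  ls /sys/fs/pstore

  # reassemble the pieces (chunk order may need fixing up by hand)
  cat /sys/fs/pstore/dmesg-* > lockdep-report.txt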
Attachments:
- dmesg.txt.gz (application/gzip)
- kconfig-6.9.x-lockdep.gz (application/gzip)