Reproducible system lockup, extracting files into XFS on dm-raid5 on dm-integrity on HDD

I am experimenting with the use of dm-integrity underneath dm-raid,
to get around the problem where, if a RAID 1 or RAID 5 array is
inconsistent, you may not know which copy is the good one.  I have found
a reproducible hard lockup involving XFS, RAID 5 and dm-integrity.

My test array consists of three spinning HDDs (each 2 decimal
terabytes), each with dm-integrity laid directly onto the disk
(no partition table), using SHA-256 checksums.  On top of this is
an MD-RAID array (raid5), and on top of *that* is an ordinary XFS
filesystem.
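
For concreteness, here is roughly how such a stack can be put together.
The device names and flags below are illustrative and may not match my
exact invocations; the real parameters are in the integritysetup dumps
and mdadm output at the end of this mail.

  # standalone SHA-256 tags laid directly on each whole disk, 4K sectors
  integritysetup format /dev/sdX --integrity sha256 --sector-size 4096
  integritysetup open /dev/sdX integ-sdX --integrity sha256

  # MD RAID 5 across the three integrity targets
  mdadm --create /dev/md/sdr5 --level=5 --raid-devices=3 \
        /dev/mapper/integ-sda /dev/mapper/integ-sdb /dev/mapper/integ-sdc

  # one GPT partition spanning the array, then plain XFS
  sgdisk -n 1:0:0 -t 1:8300 -c 1:test_array /dev/md/sdr5
  mkfs.xfs /dev/md/sdr5p1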

Extracting a large tar archive (970 G) into the filesystem causes a hard
lockup -- the entire system becomes unresponsive -- after some tens of
gigabytes have been extracted.  I have reproduced the lockup using
kernel versions 6.6.21 and 6.9.3.  No error messages make it to the
console, but with 6.9.3 I was able to extract almost all of a lockdep
report from pstore.  I don't fully understand lockdep reports, but it
*looks* like it might be a livelock rather than a deadlock, with all
available kworker threads so bogged down with dm-integrity chores that
an XFS log flush is blocked long enough to hit the hung task timeout.
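
The reproduction itself is nothing more elaborate than a single
streaming extract into the freshly made filesystem, along these lines
(paths illustrative):

  mount /dev/md/sdr5p1 /mnt/test
  tar -C /mnt/test -xf /path/to/archive.tar   # ~970 G; locks up after a few tens of GB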

Attached are:

- what I have of the lockdep report (kernel 6.9.3; only a couple
  of lines at the very top are missing)
- kernel .config (6.9.3, lockdep enabled)
- dmesg up till userspace starts (6.6.21, lockdep not enabled)
- details of the test array configuration

Please advise if there is any more information you need.  I am happy to
test patches.  I'm not subscribed to either dm-devel or linux-xfs.

zw

p.s. Incidentally, why doesn't the dm-integrity superblock record the
checksum algorithm in use?
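
As far as I can tell, only the tag size is recorded, so the algorithm
has to be supplied again on every open, along the lines of this
illustrative command:

  integritysetup open /dev/sda integ-sda --integrity sha256
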
# xfs_info /dev/md/sdr5p1
meta-data=/dev/md/sdr5p1         isize=512    agcount=32, agsize=30283904 blks
         =                       sectsz=4096  attr=2, projid32bit=1
         =                       crc=1        finobt=1, sparse=1, rmapbt=1
         =                       reflink=1    bigtime=1 inobtcount=1 nrext64=1
data     =                       bsize=4096   blocks=969084928, imaxpct=5
         =                       sunit=128    swidth=256 blks
naming   =version 2              bsize=4096   ascii-ci=0, ftype=1
log      =internal log           bsize=4096   blocks=473186, version=2
         =                       sectsz=4096  sunit=1 blks, lazy-count=1
realtime =none                   extsz=4096   blocks=0, rtextents=0

# gdisk -l /dev/md/sdr5
GPT fdisk (gdisk) version 1.0.9

Partition table scan:
  MBR: protective
  BSD: not present
  APM: not present
  GPT: present

Found valid GPT with protective MBR; using GPT.
Disk /dev/md/sdr5: 969086976 sectors, 3.6 TiB
Sector size (logical/physical): 4096/4096 bytes
Disk identifier (GUID): 28F28613-3AAD-46F9-AABA-9CC7E7EFFC3D
Partition table holds up to 128 entries
Main partition table begins at sector 2 and ends at sector 5
First usable sector is 6, last usable sector is 969086970
Partitions will be aligned on 256-sector boundaries
Total free space is 501 sectors (2.0 MiB)

Number  Start (sector)    End (sector)  Size       Code  Name
   1             256       969086719   3.6 TiB     8300  test_array

# mdadm --detail /dev/md/sdr5
/dev/md/sdr5:
           Version : 1.2
     Creation Time : Fri May 10 21:13:54 2024
        Raid Level : raid5
        Array Size : 3876347904 (3.61 TiB 3.97 TB)
     Used Dev Size : 1938173952 (1848.39 GiB 1984.69 GB)
      Raid Devices : 3
     Total Devices : 3
       Persistence : Superblock is persistent

     Intent Bitmap : Internal

       Update Time : Wed Jun  5 14:09:18 2024
             State : clean
    Active Devices : 3
   Working Devices : 3
    Failed Devices : 0
     Spare Devices : 0

            Layout : left-symmetric
        Chunk Size : 512K

Consistency Policy : bitmap

              Name : moxana:sdr5  (local to host moxana)
              UUID : 395e1626:1483f9f8:39d6e78c:af21beb1
            Events : 13781

    Number   Major   Minor   RaidDevice State
       0     253        3        0      active sync   /dev/dm-3
       1     253        4        1      active sync   /dev/dm-4
       3     253        5        2      active sync   /dev/dm-5

# for d in sda sdb sdc; do integritysetup dump /dev/$d; done
Info for integrity device /dev/sda.
superblock_version 5
log2_interleave_sectors 15
integrity_tag_size 32
journal_sections 496
provided_data_sectors 3876612136
sector_size 4096
log2_blocks_per_bitmap 12
flags fix_padding fix_hmac

Info for integrity device /dev/sdb.
superblock_version 5
log2_interleave_sectors 15
integrity_tag_size 32
journal_sections 496
provided_data_sectors 3876612136
sector_size 4096
log2_blocks_per_bitmap 12
flags fix_padding fix_hmac

Info for integrity device /dev/sdc.
superblock_version 5
log2_interleave_sectors 15
integrity_tag_size 32
journal_sections 496
provided_data_sectors 3876612136
sector_size 4096
log2_blocks_per_bitmap 12
flags fix_padding fix_hmac

[ 2213.559141]       Not tainted 6.9.3-gentoo-lockdep #2
[ 2213.559146] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 2213.559149] task:kworker/25:3    state:D stack:0     pid:13498 tgid:13498 ppid:2      flags:0x00004000
[ 2213.559160] Workqueue: xfs-sync/md126p1 xfs_log_worker
[ 2213.559169] Call Trace:
[ 2213.559172]  <TASK>
[ 2213.559177]  __schedule+0x49a/0x1900
[ 2213.559183]  ? find_held_lock+0x32/0x90
[ 2213.559190]  ? srso_return_thunk+0x5/0x5f
[ 2213.559198]  schedule+0x31/0x130
[ 2213.559204]  schedule_timeout+0x1cd/0x1e0
[ 2213.559212]  __wait_for_common+0xbc/0x1e0
[ 2213.559218]  ? usleep_range_state+0xc0/0xc0
[ 2213.559226]  __flush_workqueue+0x15f/0x470
[ 2213.559235]  ? __wait_for_common+0x4d/0x1e0
[ 2213.559242]  xlog_cil_push_now.isra.0+0x59/0xa0
[ 2213.559249]  xlog_cil_force_seq+0x7d/0x290
[ 2213.559257]  xfs_log_force+0x86/0x2d0
[ 2213.559263]  xfs_log_worker+0x36/0xd0
[ 2213.559270]  process_one_work+0x210/0x640
[ 2213.559279]  worker_thread+0x1c7/0x3c0
[ 2213.559287]  ? wq_sysfs_prep_attrs+0xa0/0xa0
[ 2213.559294]  kthread+0xd2/0x100
[ 2213.559301]  ? kthread_complete_and_exit+0x20/0x20
[ 2213.559309]  ret_from_fork+0x2b/0x40
[ 2213.559317]  ? kthread_complete_and_exit+0x20/0x20
[ 2213.559324]  ret_from_fork_asm+0x11/0x20
[ 2213.559332]  </TASK>
[ 2213.559361] Showing all locks held in the system:
[ 2213.559390] 2 locks held by kworker/u131:0/208:
[ 2213.559395]  #0: ffff9aa10ffe2d58 ((wq_completion)xfs-cil/md126p1){+.+.}-{0:0}, at: process_one_work+0x3cc/0x640
[ 2213.559421]  #1: ffffb848c08dbe58 ((work_completion)(&ctx->push_work)){+.+.}-{0:0}, at: process_one_work+0x1ca/0x640
[ 2213.559446] 3 locks held by kworker/u130:13/223:
[ 2213.559451]  #0: ffff9aa7cc1f8158 ((wq_completion)writeback){+.+.}-{0:0}, at: process_one_work+0x3cc/0x640
[ 2213.559474]  #1: ffffb848c0953e58 ((work_completion)(&(&wb->dwork)->work)){+.+.}-{0:0}, at: process_one_work+0x1ca/0x640
[ 2213.559497]  #2: ffff9aa0c25400e8 (&type->s_umount_key#32){++++}-{3:3}, at: super_trylock_shared+0x11/0x50
[ 2213.559522] 1 lock held by khungtaskd/230:
[ 2213.559526]  #0: ffffffff89ec2e20 (rcu_read_lock){....}-{1:2}, at: debug_show_all_locks+0x2c/0x1d0
[ 2213.559557] 1 lock held by usb-storage/414:
[ 2213.559561]  #0: ffff9aa0cb15ace8 (&us_interface_key[i]){+.+.}-{3:3}, at: usb_stor_control_thread+0x43/0x2d0
[ 2213.559591] 1 lock held by in:imklog/1997:
[ 2213.559595]  #0: ffff9aa0db2258d8 (&f->f_pos_lock){+.+.}-{3:3}, at: __fdget_pos+0x84/0xd0
[ 2213.559620] 2 locks held by kworker/u131:3/3226:
[ 2213.559624]  #0: ffff9aa10ffe2d58 ((wq_completion)xfs-cil/md126p1){+.+.}-{0:0}, at: process_one_work+0x3cc/0x640
[ 2213.559664]  #1: ffffb848c47a7e58 ((work_completion)(&ctx->push_work)){+.+.}-{0:0}, at: process_one_work+0x1ca/0x640
[ 2213.559706] 2 locks held by tar/5845:
[ 2213.559710]  #0: ffff9aa0c2540420 (sb_writers#6){.+.+}-{0:0}, at: ksys_write+0x6c/0xf0
[ 2213.559732]  #1: ffff9aa0e16c3f58 (&sb->s_type->i_mutex_key#8){++++}-{3:3}, at: xfs_ilock+0x144/0x180
[ 2213.559789] 2 locks held by kworker/14:28/6524:
[ 2213.559793]  #0: ffff9aa0da45e758 ((wq_completion)dm-integrity-writer#5){+.+.}-{0:0}, at: process_one_work+0x3cc/0x640
[ 2213.559815]  #1: ffffb848c64e7e58 ((work_completion)(&ic->writer_work)){+.+.}-{0:0}, at: process_one_work+0x1ca/0x640
[ 2213.559882] 2 locks held by kworker/12:45/8171:
[ 2213.559886]  #0: ffff9aa0da420358 ((wq_completion)dm-integrity-offload#5){+.+.}-{0:0}, at: process_one_work+0x3cc/0x640
[ 2213.559908]  #1: ffffb848d6583e58 ((work_completion)(&dio->work)){+.+.}-{0:0}, at: process_one_work+0x1ca/0x640
[ 2213.559949] 2 locks held by kworker/12:81/8479:
[ 2213.559953]  #0: ffff9aa0da420358 ((wq_completion)dm-integrity-offload#5){+.+.}-{0:0}, at: process_one_work+0x3cc/0x640
[ 2213.559979]  #1: ffffb848d6ea3e58 ((work_completion)(&dio->work)){+.+.}-{0:0}, at: process_one_work+0x1ca/0x640
[ 2213.560006] 2 locks held by kworker/12:98/8496:
[ 2213.560010]  #0: ffff9aa0da420358 ((wq_completion)dm-integrity-offload#5){+.+.}-{0:0}, at: process_one_work+0x3cc/0x640
[ 2213.560036]  #1: ffffb848d6f2be58 ((work_completion)(&dio->work)){+.+.}-{0:0}, at: process_one_work+0x1ca/0x640
[ 2213.560062] 2 locks held by kworker/12:101/8499:
[ 2213.560067]  #0: ffff9aa0da420358 ((wq_completion)dm-integrity-offload#5){+.+.}-{0:0}, at: process_one_work+0x3cc/0x640
[ 2213.560093]  #1: ffffb848d6f43e58 ((work_completion)(&dio->work)){+.+.}-{0:0}, at: process_one_work+0x1ca/0x640
[ 2213.560118] 2 locks held by kworker/12:110/8508:
[ 2213.560122]  #0: ffff9aa0da420358 ((wq_completion)dm-integrity-offload#5){+.+.}-{0:0}, at: process_one_work+0x3cc/0x640
[ 2213.560149]  #1: ffffb848d6f8be58 ((work_completion)(&dio->work)){+.+.}-{0:0}, at: process_one_work+0x1ca/0x640
[ 2213.560175] 2 locks held by kworker/12:111/8509:
[ 2213.560180]  #0: ffff9aa0da420358 ((wq_completion)dm-integrity-offload#5){+.+.}-{0:0}, at: process_one_work+0x3cc/0x640
[ 2213.560206]  #1: ffffb848d6f93e58 ((work_completion)(&dio->work)){+.+.}-{0:0}, at: process_one_work+0x1ca/0x640
[ 2213.560230] 2 locks held by kworker/12:112/8510:
[ 2213.560235]  #0: ffff9aa0da420358 ((wq_completion)dm-integrity-offload#5){+.+.}-{0:0}, at: process_one_work+0x3cc/0x640
[ 2213.560261]  #1: ffffb848d6f9be58 ((work_completion)(&dio->work)){+.+.}-{0:0}, at: process_one_work+0x1ca/0x640
[ 2213.560307] 2 locks held by kworker/u131:5/9163:
[ 2213.560312]  #0: ffff9aa10ffe2d58 ((wq_completion)xfs-cil/md126p1){+.+.}-{0:0}, at: process_one_work+0x3cc/0x640
[ 2213.560335]  #1: ffffb848d8803e58 ((work_completion)(&ctx->push_work)){+.+.}-{0:0}, at: process_one_work+0x1ca/0x640
[ 2213.560359] 2 locks held by kworker/u131:6/9166:
[ 2213.560364]  #0: ffff9aa10ffe2d58 ((wq_completion)xfs-cil/md126p1){+.+.}-{0:0}, at: process_one_work+0x3cc/0x640
[ 2213.560387]  #1: ffffb848c44c7e58 ((work_completion)(&ctx->push_work)){+.+.}-{0:0}, at: process_one_work+0x1ca/0x640
[ 2213.560429] 2 locks held by kworker/30:236/9664:
[ 2213.560433]  #0: ffff9aa0e43c0b58 ((wq_completion)dm-integrity-writer#3){+.+.}-{0:0}, at: process_one_work+0x3cc/0x640
[ 2213.560455]  #1: ffffb848da42be58 ((work_completion)(&ic->writer_work)){+.+.}-{0:0}, at: process_one_work+0x1ca/0x640
[ 2213.560540] 2 locks held by kworker/12:128/11574:
[ 2213.560544]  #0: ffff9aa0da420358 ((wq_completion)dm-integrity-offload#5){+.+.}-{0:0}, at: process_one_work+0x3cc/0x640
[ 2213.560564]  #1: ffffb848de48be58 ((work_completion)(&dio->work)){+.+.}-{0:0}, at: process_one_work+0x1ca/0x640
[ 2213.648428] 2 locks held by kworker/12:175/11621:
[ 2213.648431]  #0: ffff9aa0da420358 ((wq_completion)dm-integrity-offload#5){+.+.}-{0:0}, at: process_one_work+0x3cc/0x640
[ 2213.648443]  #1: ffffb848de603e58 ((work_completion)(&dio->work)){+.+.}-{0:0}, at: process_one_work+0x1ca/0x640
[ 2213.651134] 2 locks held by kworker/12:177/11623:
[ 2213.651136]  #0: ffff9aa0da420358 ((wq_completion)dm-integrity-offload#5){+.+.}-{0:0}, at: process_one_work+0x3cc/0x640
[ 2213.651147]  #1: ffffb848c4c47e58 ((work_completion)(&dio->work)){+.+.}-{0:0}, at: process_one_work+0x1ca/0x640
[ 2213.651158] 2 locks held by kworker/12:179/11625:
[ 2213.651159]  #0: ffff9aa0da420358 ((wq_completion)dm-integrity-offload#5){+.+.}-{0:0}, at: process_one_work+0x3cc/0x640
[ 2213.651170]  #1: ffffb848de613e58 ((work_completion)(&dio->work)){+.+.}-{0:0}, at: process_one_work+0x1ca/0x640
[ 2213.651181] 2 locks held by kworker/12:180/11626:
[ 2213.651183]  #0: ffff9aa0da420358 ((wq_completion)dm-integrity-offload#5){+.+.}-{0:0}, at: process_one_work+0x3cc/0x640
[ 2213.651193]  #1: ffffb848de61be58 ((work_completion)(&dio->work)){+.+.}-{0:0}, at: process_one_work+0x1ca/0x640
[ 2213.651205] 2 locks held by kworker/12:182/11628:
[ 2213.651206]  #0: ffff9aa0da420358 ((wq_completion)dm-integrity-offload#5){+.+.}-{0:0}, at: process_one_work+0x3cc/0x640
[ 2213.651217]  #1: ffffb848de62be58 ((work_completion)(&dio->work)){+.+.}-{0:0}, at: process_one_work+0x1ca/0x640
[ 2213.651228] 2 locks held by kworker/12:184/11630:
[ 2213.651230]  #0: ffff9aa0da420358 ((wq_completion)dm-integrity-offload#5){+.+.}-{0:0}, at: process_one_work+0x3cc/0x640
[ 2213.651240]  #1: ffffb848d4793e58 ((work_completion)(&dio->work)){+.+.}-{0:0}, at: process_one_work+0x1ca/0x640
[ 2213.651257] 2 locks held by kworker/12:236/11682:
[ 2213.651259]  #0: ffff9aa0da420358 ((wq_completion)dm-integrity-offload#5){+.+.}-{0:0}, at: process_one_work+0x3cc/0x640
[ 2213.651270]  #1: ffffb848de7cbe58 ((work_completion)(&dio->work)){+.+.}-{0:0}, at: process_one_work+0x1ca/0x640
[ 2213.651280] 2 locks held by kworker/12:239/11685:
[ 2213.651282]  #0: ffff9aa0da420358 ((wq_completion)dm-integrity-offload#5){+.+.}-{0:0}, at: process_one_work+0x3cc/0x640
[ 2213.651293]  #1: ffffb848de7e3e58 ((work_completion)(&dio->work)){+.+.}-{0:0}, at: process_one_work+0x1ca/0x640
[ 2213.651341] 2 locks held by kworker/25:121/12751:
[ 2213.651343]  #0: ffff9aa0c8122f58 ((wq_completion)dm-integrity-writer#4){+.+.}-{0:0}, at: process_one_work+0x3cc/0x640
[ 2213.651353]  #1: ffffb848e0c13e58 ((work_completion)(&ic->writer_work)){+.+.}-{0:0}, at: process_one_work+0x1ca/0x640
[ 2213.651425] 2 locks held by kworker/25:3/13498:
[ 2213.651426]  #0: ffff9aa0c7bfe758 ((wq_completion)xfs-sync/md126p1){+.+.}-{0:0}, at: process_one_work+0x3cc/0x640
[ 2213.651436]  #1: ffffb848e259be58 ((work_completion)(&(&log->l_work)->work)){+.+.}-{0:0}, at: process_one_work+0x1ca/0x640
[ 2213.651465] =============================================
[ 2213.651467] Kernel panic - not syncing: hung_task: blocked tasks
[ 2213.652654] Kernel Offset: 0x7000000 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff)

Attachment: dmesg.txt.gz
Description: application/gzip

Attachment: kconfig-6.9.x-lockdep.gz
Description: application/gzip

