Re: [PATCH] iomap: Address soft lockup in iomap_finish_ioend()

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 12/30/21 2:25 PM, Trond Myklebust wrote:
> On Thu, 2021-12-30 at 13:24 -0800, Jens Axboe wrote:
>> On 12/30/21 11:35 AM, trondmy@xxxxxxxxxx wrote:
>>> From: Trond Myklebust <trond.myklebust@xxxxxxxxxxxxxxx>
>>>
>>> We're observing the following stack trace using various kernels
>>> when
>>> running in the Azure cloud.
>>>
>>>  watchdog: BUG: soft lockup - CPU#12 stuck for 23s!
>>> [kworker/12:1:3106]
>>>  Modules linked in: raid0 ipt_MASQUERADE nf_conntrack_netlink
>>> xt_addrtype nft_chain_nat nf_nat br_netfilter bridge stp llc ext4
>>> mbcache jbd2 overlay xt_conntrack nf_conntrack nf_defrag_ipv6
>>> nf_defrag_ipv4 nft_counter rpcrdma rdma_ucm xt_owner ib_srpt
>>> nft_compat intel_rapl_msr ib_isert intel_rapl_common nf_tables
>>> iscsi_target_mod isst_if_mbox_msr isst_if_common nfnetlink
>>> target_core_mod nfit ib_iser libnvdimm libiscsi
>>> scsi_transport_iscsi ib_umad kvm_intel ib_ipoib rdma_cm iw_cm vfat
>>> ib_cm fat kvm irqbypass crct10dif_pclmul crc32_pclmul mlx5_ib
>>> ghash_clmulni_intel rapl ib_uverbs ib_core i2c_piix4 pcspkr
>>> hyperv_fb hv_balloon hv_utils joydev nfsd auth_rpcgss nfs_acl lockd
>>> grace sunrpc ip_tables xfs libcrc32c mlx5_core mlxfw tls pci_hyperv
>>> pci_hyperv_intf sd_mod t10_pi sg ata_generic hv_storvsc hv_netvsc
>>> scsi_transport_fc hyperv_keyboard hid_hyperv ata_piix libata
>>> crc32c_intel hv_vmbus serio_raw fuse
>>>  CPU: 12 PID: 3106 Comm: kworker/12:1 Not tainted 4.18.0-
>>> 305.10.2.el8_4.x86_64 #1
>>>  Hardware name: Microsoft Corporation Virtual Machine/Virtual
>>> Machine, BIOS 090008  12/07/2018
>>>  Workqueue: xfs-conv/md127 xfs_end_io [xfs]
>>>  RIP: 0010:_raw_spin_unlock_irqrestore+0x11/0x20
>>>  Code: 7c ff 48 29 e8 4c 39 e0 76 cf 80 0b 08 eb 8c 90 90 90 90 90
>>> 90 90 90 90 90 0f 1f 44 00 00 e8 e6 db 7e ff 66 90 48 89 f7 57 9d
>>> <0f> 1f 44 00 00 c3 66 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 8b 07
>>>  RSP: 0018:ffffac51d26dfd18 EFLAGS: 00000202 ORIG_RAX:
>>> ffffffffffffff12
>>>  RAX: 0000000000000001 RBX: ffffffff980085a0 RCX: dead000000000200
>>>  RDX: ffffac51d3893c40 RSI: 0000000000000202 RDI: 0000000000000202
>>>  RBP: 0000000000000202 R08: ffffac51d3893c40 R09: 0000000000000000
>>>  R10: 00000000000000b9 R11: 00000000000004b3 R12: 0000000000000a20
>>>  R13: ffffd228f3e5a200 R14: ffff963cf7f58d10 R15: ffffd228f3e5a200
>>>  FS:  0000000000000000(0000) GS:ffff9625bfb00000(0000)
>>> knlGS:0000000000000000
>>>  CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>>>  CR2: 00007f5035487500 CR3: 0000000432810004 CR4: 00000000003706e0
>>>  DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
>>>  DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
>>>  Call Trace:
>>>   wake_up_page_bit+0x8a/0x110
>>>   iomap_finish_ioend+0xd7/0x1c0
>>>   iomap_finish_ioends+0x7f/0xb0
>>>   xfs_end_ioend+0x6b/0x100 [xfs]
>>>   ? xfs_setfilesize_ioend+0x60/0x60 [xfs]
>>>   xfs_end_io+0xb9/0xe0 [xfs]
>>>   process_one_work+0x1a7/0x360
>>>   worker_thread+0x1fa/0x390
>>>   ? create_worker+0x1a0/0x1a0
>>>   kthread+0x116/0x130
>>>   ? kthread_flush_work_fn+0x10/0x10
>>>   ret_from_fork+0x35/0x40
>>>
>>> Jens suggested adding a latency-reducing cond_resched() to the loop
>>> in
>>> iomap_finish_ioends().
>>
>> The patch doesn't add it there though, I was suggesting:
>>
>> diff --git a/fs/iomap/buffered-io.c b/fs/iomap/buffered-io.c
>> index 71a36ae120ee..4ad2436a936a 100644
>> --- a/fs/iomap/buffered-io.c
>> +++ b/fs/iomap/buffered-io.c
>> @@ -1078,6 +1078,7 @@ iomap_finish_ioends(struct iomap_ioend *ioend,
>> int error)
>>                 ioend = list_first_entry(&tmp, struct iomap_ioend,
>> io_list);
>>                 list_del_init(&ioend->io_list);
>>                 iomap_finish_ioend(ioend, error);
>> +               cond_resched();
>>         }
>>  }
>>  EXPORT_SYMBOL_GPL(iomap_finish_ioends);
>>
>> as I don't think you need it once-per-vec. But not sure if you tested
>> that variant or not...
>>
> 
> Yes, we did test that variant, but were still seeing the soft lockups
> on Azure, hence why I moved it into the inner loop.

Gotcha - but maybe just outside the vec loop then, after the bio_put()?
Once per vec seems excessive, each vec shouldn't take long, but I guess
the ioend inlines can be long?

-- 
Jens Axboe




[Index of Archives]     [XFS Filesystem Development (older mail)]     [Linux Filesystem Development]     [Linux Audio Users]     [Yosemite Trails]     [Linux Kernel]     [Linux RAID]     [Linux SCSI]


  Powered by Linux