On 09/11/2019 23:38, Dave Chinner wrote:
> On Fri, Nov 08, 2019 at 07:01:15AM +0100, Arkadiusz Miśkiewicz wrote:
>>
>> Hello.
>>
>> I have two servers:
>>
>> backup4 - one with an Adaptec ASR8885Q (that's the one which breaks so
>> often, but this time the Adaptec works)
>>
>> backup3 - the other with software RAID only
>>
>> Both are now running 5.3.8 kernels and both end up like the log below.
>> It takes up to ~1 day to reproduce.
>>
>> Here are more complete logs and kernel configs:
>>
>> https://ixion.pld-linux.org/~arekm/p2/xfs/
>>
>> Any ideas what is happening and what can I do to help debug the problem?
>>
>>> Nov 8 00:55:19 backup4 kernel: WARNING: CPU: 5 PID: 25802 at fs/xfs/libxfs/xfs_bmap.c:4530 xfs_bmapi_convert_delalloc+0x434/0x4a0 [xfs]
>>> Nov 8 00:55:19 backup4 kernel: Modules linked in: nfsd auth_rpcgss nfs_acl lockd grace sunrpc sch_sfq nfnetlink_log nfnetlink xt_NFLOG xt_comment xt_tcpudp xt_conntrack nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 iptable_filter ip_tables x_tables bpfilter xfs mlx4_ib ib_uverbs ib_core mlx4_en ses enclosure scsi_transport_sas joydev input_leds hid_generic usbhid hid coretemp intel_rapl_msr intel_rapl_common sb_edac x86_pkg_temp_thermal intel_powerclamp kvm_intel kvm iTCO_wdt iTCO_vendor_support mxm_wmi ipmi_ssif irqbypass crct10dif_pclmul crc32_pclmul ghash_clmulni_intel aesni_intel aes_x86_64 crypto_simd cryptd glue_helper xhci_pci mlx4_core intel_cstate xhci_hcd intel_uncore ehci_pci ehci_hcd igb intel_rapl_perf pcspkr ipmi_si usbcore i2c_i801 mei_me ioatdma acpi_power_meter aacraid i2c_algo_bit ipmi_devintf mei dca lpc_ich i2c_core evdev ipmi_msghandler wmi hwmon acpi_pad button sch_fq_codel ext4 libcrc32c crc32c_generic crc32c_intel crc16 mbcache jbd2 sd_mod raid1 md_mod ahci libahci libata
>>> Nov 8 00:55:19 backup4 kernel: scsi_mod
>>> Nov 8 00:55:19 backup4 kernel: CPU: 5 PID: 25802 Comm: kworker/u65:3 Tainted: G T 5.3.8-1 #1
>>> Nov 8 00:55:19 backup4 kernel: Hardware name: Supermicro X10DRi/X10DRi, BIOS 3.0a 02/06/2018
>>> Nov 8 00:55:19 backup4 kernel: Workqueue: writeback wb_workfn (flush-8:48)
>>> Nov 8 00:55:19 backup4 kernel: RIP: 0010:xfs_bmapi_convert_delalloc+0x434/0x4a0 [xfs]
>>> Nov 8 00:55:19 backup4 kernel: Code: b7 c0 83 c0 01 e9 9c fd ff ff 41 80 bc 24 e9 00 00 00 03 49 8d 44 24 48 74 d9 c7 84 24 c0 00 00 00 01 00 00 00 e9 94 fd ff ff <0f> 0b 41 be e4 ff ff ff 48 8d 7c 24 30 44 89 f2 44 89 ee e8 84 6f
>>> Nov 8 00:55:19 backup4 kernel: RSP: 0018:ffffac6346a5b838 EFLAGS: 00010246
>>> Nov 8 00:55:19 backup4 kernel: RAX: 0000000000000000 RBX: ffff95c8f972a000 RCX: 0000000000000022
>>> Nov 8 00:55:19 backup4 kernel: RDX: 0000000000001fda RSI: ffffffffffffffff RDI: ffff95c8f93b0600
>>> Nov 8 00:55:19 backup4 kernel: RBP: ffffac6346a5b938 R08: 0000000000000000 R09: 0000000000000001
>>> Nov 8 00:55:19 backup4 kernel: R10: ffff95c33e6f41d8 R11: 0000000000000026 R12: ffff95be37512a80
>>> Nov 8 00:55:19 backup4 kernel: R13: 0000000000000000 R14: 0000000000000000 R15: ffff95be37512ac8
>>> Nov 8 00:55:19 backup4 kernel: FS: 0000000000000000(0000) GS:ffff95c8ff940000(0000) knlGS:0000000000000000
>>> Nov 8 00:55:19 backup4 kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>>> Nov 8 00:55:19 backup4 kernel: CR2: 00007ff938389000 CR3: 000000048720a001 CR4: 00000000003606e0
>>> Nov 8 00:55:19 backup4 kernel: DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
>>> Nov 8 00:55:19 backup4 kernel: DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
>>> Nov 8 00:55:19 backup4 kernel: Call Trace:
>>> Nov 8 00:55:19 backup4 kernel: xfs_map_blocks+0x18e/0x420 [xfs]
>>> Nov 8 00:55:19 backup4 kernel: xfs_do_writepage+0x11c/0x440 [xfs]
>>> Nov 8 00:55:19 backup4 kernel: write_cache_pages+0x185/0x430
>>> Nov 8 00:55:19 backup4 kernel: ? xfs_vm_writepages+0x90/0x90 [xfs]
>>> Nov 8 00:55:19 backup4 kernel: xfs_vm_writepages+0x5e/0x90 [xfs]
>>> Nov 8 00:55:19 backup4 kernel: do_writepages+0x1c/0x60
>>> Nov 8 00:55:19 backup4 kernel: __writeback_single_inode+0x41/0x360
>>> Nov 8 00:55:19 backup4 kernel: writeback_sb_inodes+0x20c/0x490
>>> Nov 8 00:55:19 backup4 kernel: wb_writeback+0x12a/0x320
>>> Nov 8 00:55:19 backup4 kernel: wb_workfn+0xdd/0x4a0
>>> Nov 8 00:55:19 backup4 kernel: process_one_work+0x1d5/0x370
>>> Nov 8 00:55:19 backup4 kernel: worker_thread+0x4d/0x3d0
>>> Nov 8 00:55:19 backup4 kernel: kthread+0xfb/0x140
>>> Nov 8 00:55:19 backup4 kernel: ? process_one_work+0x370/0x370
>>> Nov 8 00:55:19 backup4 kernel: ? kthread_park+0x80/0x80
>>> Nov 8 00:55:19 backup4 kernel: ret_from_fork+0x35/0x40
>>> Nov 8 00:55:19 backup4 kernel: ---[ end trace a08de0c8c2851936 ]---
>>> Nov 8 00:55:19 backup4 kernel: XFS (sdd1): page discard on page 00000000e56e0621, inode 0x7a17f4206, offset 10354688.
>
> This goes with what you reported on #xfs:
>
> [9/11/19 08:04] <arekm> [ 54.777715] Filesystem "sdd1": reserve blocks depleted! Consider increasing reserve pool size.
>
> And:
>
> [9/11/19 08:16] <arekm> ah, yes [ 54.777720] XFS (sdd1): Per-AG reservation for AG 38 failed. Filesystem may run out of space

A note: I got these messages only once, on backup4. I didn't find such
messages in older logs (on either machine).

> which implies that we are regularly under-estimating delayed
> allocation metadata reservations, or a large delalloc extent is
> being split repeatedly on writeback. e.g. severe file fragmentation
> due to operating at/near ENOSPC resulting in a 100MB delalloc extent
> being split into individual single block allocations.
>
> Eventually this will deplete the reserve pool and trigger ENOSPC,
> which will eventually result in a dirty allocation transaction being
> cancelled and shutdown occurring.

Did anything important change in that area between 4.20 (where I had no
problems) and 5.1 (where both machines first hit the assert)? I see
xfs_bmapi_convert_delalloc-related changes in the git log from back then.

I've now been back on 4.19.82 on both machines for 4 days (on 5.3.x,
1 day was enough to end up with an fs shutdown), and there are no assert
hits.

[resend due to bad vger postmasters]

-- 
Arkadiusz Miśkiewicz, arekm / ( maven.pl | pld-linux.org )
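
PS. In case it helps with watching this: a minimal sketch below for
monitoring the reserve pool Dave mentions, assuming the xfsprogs
development headers are installed and using /mnt as a stand-in for the
real mount point. It queries the pool via the XFS_IOC_GET_RESBLKS ioctl
(the pool the "reserve blocks depleted!" message refers to); the
companion XFS_IOC_SET_RESBLKS ioctl can grow it.

/* resblks-check.c - query the XFS reserve block pool.
 * Build (an assumption, adjust to taste):
 *   gcc -o resblks-check resblks-check.c
 */
#include <stdio.h>
#include <fcntl.h>
#include <unistd.h>
#include <sys/ioctl.h>
#include <xfs/xfs.h>	/* xfsprogs header providing XFS_IOC_GET_RESBLKS */

int main(int argc, char **argv)
{
	/* Any fd on the filesystem will do; /mnt is an example path. */
	const char *path = argc > 1 ? argv[1] : "/mnt";
	struct xfs_fsop_resblks res;
	int fd = open(path, O_RDONLY);

	if (fd < 0) {
		perror("open");
		return 1;
	}
	if (ioctl(fd, XFS_IOC_GET_RESBLKS, &res) < 0) {
		perror("XFS_IOC_GET_RESBLKS");
		close(fd);
		return 1;
	}
	/* resblks is the configured pool size in filesystem blocks;
	 * resblks_avail is what remains. resblks_avail draining to
	 * zero matches the "reserve blocks depleted!" warning. */
	printf("resblks: %llu, resblks_avail: %llu\n",
	       (unsigned long long)res.resblks,
	       (unsigned long long)res.resblks_avail);
	close(fd);
	return 0;
}

(xfs_io -x -c "resblks" on the mount point should report the same
numbers, if I remember the expert commands right.)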