Re: WARNING: CPU: 5 PID: 25802 at fs/xfs/libxfs/xfs_bmap.c:4530 xfs_bmapi_convert_delalloc+0x434/0x4a0 [xfs]

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 27/11/2019 16:43, Christoph Hellwig wrote:
> On Fri, Nov 08, 2019 at 08:36:31AM +0100, Arkadiusz Miśkiewicz wrote:
>>> The WARN_ON means that conversion of delalloc blocks failed to find
>>> free space.  Something that should not be possible due to the delalloc
>>> reservations.  What as the last kernel where you did not see something
>>> like this?
>>>
>>
>> 4.20.13 looks ok
>> 5.1.15 bad (that warn_on triggered)
>>
>> on both machines according to my old logs
> 
> Sorry for dropping the ball.  We have some pretty significant changes
> between those version as 5.3 has a pretty big rewrite of the high level
> writeback code, and this happens in code called from that. 

Hm, 5.3 but I saw this on 5.1.15, too. See below. (or did you mean 5.1
was with big changes?)

For now I'm on 4.19.XX for over 2 weeks without this issue, on both servers.

> Oct 29 19:04:47 backup3 kernel: [1573435.718506][ T6332] WARNING: CPU: 2 PID: 6332 at fs/xfs/libxfs/xfs_bmap.c:4513 xfs_bmapi_convert_delalloc+0x44d/0x4b0 [xfs]
> Oct 29 19:04:47 backup3 kernel: [1573435.718507][ T6332] Modules linked in: nfsd auth_rpcgss nfs_acl lockd grace sunrpc sch_sfq bonding nf_log_ipv4 nf_log_common xt_LOG xt_comment xt_tcpudp xt_conntrack nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 iptable_filter ip_tables x_tables bpfilter ext4 crc16 mbcache jbd2 input_leds joydev hid_generic usbhid hid coretemp intel_rapl x86_pkg_temp_thermal intel_powerclamp kvm_intel raid456 kvm async_raid6_recov irqbypass async_memcpy igb async_pq async_xor async_tx xor cmdlinepart intel_spi_platform intel_spi spi_nor mtd iTCO_wdt iTCO_vendor_support evdev crct10dif_pclmul crc32_pclmul ghash_clmulni_intel raid1 aesni_intel aes_x86_64 crypto_simd cryptd raid6_pq glue_helper hwmon dca intel_cstate pcspkr i2c_algo_bit intel_uncore i2c_i801 intel_pch_thermal intel_rapl_perf lpc_ich ipmi_ssif i2c_core xhci_pci xhci_hcd ses enclosure ehci_pci ehci_hcd usbcore ipmi_si ipmi_devintf ipmi_msghandler fan thermal button video ie31200_edac pcc_cpufreq sch_fq_codel xfs libcrc32c crc32c_generic
> Oct 29 19:04:47 backup3 kernel: [1573435.718530][ T6332]  crc32c_intel sd_mod mpt3sas raid_class scsi_transport_sas raid10 md_mod ahci libahci libata scsi_mod
> Oct 29 19:04:47 backup3 kernel: [1573435.718536][ T6332] CPU: 2 PID: 6332 Comm: kworker/u8:9 Tainted: G                T 5.1.15-1 #1
> Oct 29 19:04:47 backup3 kernel: [1573435.718537][ T6332] Hardware name: Supermicro X10SL7-F/X10SL7-F, BIOS 2.00 04/24/2014
> Oct 29 19:04:47 backup3 kernel: [1573435.718540][ T6332] Workqueue: writeback wb_workfn (flush-9:5)
> Oct 29 19:04:47 backup3 kernel: [1573435.718555][ T6332] RIP: 0010:xfs_bmapi_convert_delalloc+0x44d/0x4b0 [xfs]
> Oct 29 19:04:47 backup3 kernel: [1573435.718556][ T6332] Code: 8b 43 38 48 89 44 24 10 e9 3e fc ff ff 80 bb e1 00 00 00 03 48 8d 43 48 74 ce c7 84 24 c8 00 00 00 01 00 00 00 e9 85 fd ff ff <0f> 0b 41 bf e4 ff ff ff 4c 89 e7 44 89 fa 44 89 f6 e8 2d 71 ff ff
> Oct 29 19:04:47 backup3 kernel: [1573435.718556][ T6332] RSP: 0018:ffffb8eb4384b858 EFLAGS: 00010246
> Oct 29 19:04:47 backup3 kernel: [1573435.718557][ T6332] RAX: 0000000000000000 RBX: ffff8d81fa9261c0 RCX: 0000000000000000
> Oct 29 19:04:47 backup3 kernel: [1573435.718558][ T6332] RDX: 0000000000001f8e RSI: ffffffffffffffff RDI: ffff8d83e8d66600
> Oct 29 19:04:47 backup3 kernel: [1573435.718558][ T6332] RBP: ffffb8eb4384b960 R08: 0000000000000000 R09: 000000000001bd06
> Oct 29 19:04:47 backup3 kernel: [1573435.718558][ T6332] R10: ffff8d7f5fafaaa8 R11: 0000000000000001 R12: ffffb8eb4384b890
> Oct 29 19:04:47 backup3 kernel: [1573435.718559][ T6332] R13: ffff8d83dfe8e000 R14: 0000000000000000 R15: 0000000000000000
> Oct 29 19:04:47 backup3 kernel: [1573435.718560][ T6332] FS:  0000000000000000(0000) GS:ffff8d841fd00000(0000) knlGS:0000000000000000
> Oct 29 19:04:47 backup3 kernel: [1573435.718560][ T6332] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> Oct 29 19:04:47 backup3 kernel: [1573435.718561][ T6332] CR2: 00000000021b3fc0 CR3: 000000061020e005 CR4: 00000000001606e0
> Oct 29 19:04:47 backup3 kernel: [1573435.718561][ T6332] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> Oct 29 19:04:47 backup3 kernel: [1573435.718562][ T6332] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> Oct 29 19:04:47 backup3 kernel: [1573435.718562][ T6332] Call Trace:
> Oct 29 19:04:47 backup3 kernel: [1573435.718582][ T6332]  xfs_map_blocks+0x1ae/0x450 [xfs]
> Oct 29 19:04:47 backup3 kernel: [1573435.718599][ T6332]  xfs_do_writepage+0x114/0x420 [xfs]
> Oct 29 19:04:47 backup3 kernel: [1573435.718603][ T6332]  write_cache_pages+0x1bd/0x460
> Oct 29 19:04:47 backup3 kernel: [1573435.718620][ T6332]  ? xfs_vm_writepages+0x90/0x90 [xfs]
> Oct 29 19:04:47 backup3 kernel: [1573435.718637][ T6332]  xfs_vm_writepages+0x5b/0x90 [xfs]
> Oct 29 19:04:47 backup3 kernel: [1573435.718639][ T6332]  do_writepages+0x1c/0x60
> Oct 29 19:04:47 backup3 kernel: [1573435.718641][ T6332]  __writeback_single_inode+0x41/0x340
> Oct 29 19:04:47 backup3 kernel: [1573435.718644][ T6332]  ? __wake_up_common_lock+0x88/0xc0
> Oct 29 19:04:47 backup3 kernel: [1573435.718645][ T6332]  writeback_sb_inodes+0x1dd/0x460
> Oct 29 19:04:47 backup3 kernel: [1573435.718646][ T6332]  wb_writeback+0x12a/0x310
> Oct 29 19:04:47 backup3 kernel: [1573435.718648][ T6332]  wb_workfn+0xde/0x460
> Oct 29 19:04:47 backup3 kernel: [1573435.718650][ T6332]  process_one_work+0x1fb/0x3f0
> Oct 29 19:04:47 backup3 kernel: [1573435.718651][ T6332]  worker_thread+0x246/0x3d0
> Oct 29 19:04:47 backup3 kernel: [1573435.718652][ T6332]  ? process_one_work+0x3f0/0x3f0
> Oct 29 19:04:47 backup3 kernel: [1573435.718654][ T6332]  kthread+0x113/0x130
> Oct 29 19:04:47 backup3 kernel: [1573435.718655][ T6332]  ? kthread_park+0x80/0x80
> Oct 29 19:04:47 backup3 kernel: [1573435.718658][ T6332]  ret_from_fork+0x35/0x40
> Oct 29 19:04:47 backup3 kernel: [1573435.718659][ T6332] ---[ end trace 37cbd88cae263afa ]---
> Oct 29 19:04:47 backup3 kernel: [1573435.718663][ T6332] XFS (md5): page discard on page 000000007bcb59fa, inode 0x101f3041f, offset 234819584.
> Oct 29 19:04:47 backup3 kernel: [1573435.718685][ T6332] XFS (md5): page discard on page 00000000b1b6fee1, inode 0x101f3041f, offset 234823680.
> Oct 29 19:04:47 backup3 kernel: [1573435.718703][ T6332] XFS (md5): page discard on page 00000000f7525550, inode 0x101f3041f, offset 234827776.


> But it
> doesn't actually change (or should change) how we use the reservations,
> which leaves me a little puzzled.  If I gave you an instrumented kernel
> with new trace points, could you provide traces based on that?

Probably it will be easier to just bisect and I plan to do that after
backup catches up with missing data (and it takes days).


-- 
Arkadiusz Miśkiewicz, arekm / ( maven.pl | pld-linux.org )




[Index of Archives]     [XFS Filesystem Development (older mail)]     [Linux Filesystem Development]     [Linux Audio Users]     [Yosemite Trails]     [Linux Kernel]     [Linux RAID]     [Linux SCSI]


  Powered by Linux