Re: cephfs kernel 5.10.78 client crashes

Thank you very much for the hints. I did not expect this to happen with 512GB of memory, but I suppose it was bound to happen eventually.

I have tuned the parameters as you suggested. The default min_free_kbytes of 90MB is rather low, and with 5s writeback the dirty data can quickly reach a few 100GB with many streams.
I also tried elrepo 5.15.5, but without tuning that machine hung as well.

Will report how it goes.

Thanks,
Andrej

On 29/11/2021 19:52, Jeff Layton wrote:
On Fri, 2021-11-26 at 09:11 +0100, Andrej Filipcic wrote:
Hi,

we are doing some extensive stress testing of cephfs client throughput.
Ceph is 16.2.6, and we have seen no issues on the ceph side. The client
specs:
- kernel 5.10.78
- RHEL8.4
- 512GB memory, dual 32-core cpu
- bonded dual 100Gb Mellanox ConnectX-5 card with OFED drivers
- cephfs mount options: relatime, acl, nowsync
- tcp tuning with 256MB max window, bbr congestion control

The client can handle 3.5GB/s sustained writes of several parallel
(10-20) large streams (1-10GB) to EC 16+3 pools, but after several
hours, kernel panics appeared (1st log), and one mds hung (2nd log).
That happened on 3 clients we were testing. Such crashes appear only on
heavily loaded clients, while on moderate load (~1GB/s) it almost never
happens. The stress test calls fdatasync after each file write.
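
For readers who want to reproduce the access pattern, here is a minimal sketch of one writer stream: write a file in fixed-size chunks, then call fdatasync before closing. The actual test program is not shown in this thread, so `write_stream`, the file name, and the sizes are all hypothetical. The 8192-byte chunk only mirrors the write size that appears in RDX (0x2000) of the blocked write() tasks in the second log.

```python
import os
import tempfile


def write_stream(path, total_bytes, chunk_bytes=8192):
    """Write total_bytes of zeros to path, then fdatasync before closing.

    Hypothetical stand-in for one stream of the stress test.
    """
    chunk = bytes(chunk_bytes)
    written = 0
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
    try:
        while written < total_bytes:
            # os.write may write fewer bytes than requested, so loop on the count.
            written += os.write(fd, chunk[: total_bytes - written])
        # Flush file data (not necessarily all metadata) to stable storage.
        os.fdatasync(fd)
    finally:
        os.close(fd)
    return written


# Small local demonstration; a real run would target a cephfs mount
# and use far larger files (1-10GB per stream in the report above).
with tempfile.TemporaryDirectory() as tmpdir:
    demo_path = os.path.join(tmpdir, "stream.bin")
    print(write_stream(demo_path, 1_000_000), "bytes written and synced")
```

Running 10-20 of these in parallel against the mount would approximate the described load. Note that `os.fdatasync` is available on Linux but not on every platform.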

Any ideas what is wrong here? ceph kernel bug or some client
misconfiguration?

Best regards,
Andrej

The panic:

2021-11-25 22:53:33 [ 8335.433436] P2P Transfer - : page allocation
failure: order:4, mode:0x40c40(GFP_NOFS|__GFP_COMP),
nodemask=(null),cpuset=/,mems_allowed=0-1
2021-11-25 22:53:33 [ 8335.445969] CPU: 17 PID: 142288 Comm: P2P
Transfer -  Tainted: G           O      5.10.78-2.el8.x86_64 #1
Hand-built kernel, I take it? You may want to try the latest mainline
kernels, but it probably won't make a big difference here.

2021-11-25 22:53:33 [ 8335.455532] Hardware name: BULL
R282-Z90-00/MZ92-FS0-00, BIOS R16 07/10/2020
2021-11-25 22:53:33 [ 8335.462577] Call Trace:
2021-11-25 22:53:33 [ 8335.465044]  dump_stack+0x6d/0x88
2021-11-25 22:53:33 [ 8335.468358]  warn_alloc.cold.125+0x7b/0xdd
2021-11-25 22:53:33 [ 8335.472449]  ? _cond_resched+0x15/0x30
2021-11-25 22:53:33 [ 8335.476203]  ?
__alloc_pages_direct_compact+0x12f/0x140
2021-11-25 22:53:33 [ 8335.481428]
   __alloc_pages_slowpath.constprop.115+0xbcd/0xc00
2021-11-25 22:53:33 [ 8335.487185]  ? send_request+0x833/0xb20 [libceph]
2021-11-25 22:53:33 [ 8335.491889]  __alloc_pages_nodemask+0x2cc/0x300
2021-11-25 22:53:33 [ 8335.496420]  kmalloc_order+0x24/0xf0
2021-11-25 22:53:33 [ 8335.500001]  kmalloc_order_trace+0x19/0x80
2021-11-25 22:53:33 [ 8335.504109]  ceph_writepages_start+0x80e/0x1400
[ceph]
2021-11-25 22:53:33 [ 8335.509251]  do_writepages+0x41/0xd0
2021-11-25 22:53:33 [ 8335.512827]  ? __ip_queue_xmit+0x15c/0x3e0
2021-11-25 22:53:33 [ 8335.516929]  __filemap_fdatawrite_range+0xc7/0x100
2021-11-25 22:53:33 [ 8335.521720]  file_write_and_wait_range+0x5e/0xb0
2021-11-25 22:53:33 [ 8335.526345]  ceph_fsync+0x4c/0x470 [ceph]
2021-11-25 22:53:33 [ 8335.530352]  do_fsync+0x38/0x70
2021-11-25 22:53:33 [ 8335.533497]  __x64_sys_fdatasync+0x13/0x20
2021-11-25 22:53:33 [ 8335.537599]  do_syscall_64+0x33/0x40
2021-11-25 22:53:33 [ 8335.541177]
   entry_SYSCALL_64_after_hwframe+0x44/0xa9
2021-11-25 22:53:33 [ 8335.546229] RIP: 0033:0x7fcfb725e55f
2021-11-25 22:53:33 [ 8335.549810] Code: 00 0f 05 48 3d 00 f0 ff ff 77
40 c3 0f 1f 80 00 00 00 00 53 89 fb 48 83 ec 10 e8 4c c4 f8 ff 89 df 89
c2 b8 4b 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 2b 89 d7 89 44 24 0c e8
8e c4 f8 ff 8b 44 24

Ouch. Failed allocation while trying to write back memory to handle an
fsync.

It needed memory to write back the pages, and the only way to get it was
by writing back pages. This is sort of the writeback nightmare scenario,
as it usually leads to a hard lockup on the box once nothing can get
memory, and you can't recover from it.
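
To make the failure concrete: order:4 means a request for 2^4 = 16 physically contiguous 4 kB pages, i.e. 64 kB. The sketch below parses one of the per-zone free-list lines from the Mem-Info dump in this log (the Node 0 Normal line, re-joined from its wrapped form) and lists the free blocks of at least that size. The helper names are made up for illustration, and the 4 kB base page size is an assumption that holds for x86_64.

```python
import re

PAGE_KB = 4  # assumption: 4 kB base pages, as on x86_64


def parse_buddy_line(line):
    """Return {block_size_kb: (count, flags)} from a 'Node N Zone: ...' line."""
    blocks = {}
    # Each entry looks like '551149*4kB (UME)' or '0*32kB' (no flags).
    for count, size_kb, flags in re.findall(r"(\d+)\*(\d+)kB(?: \((\w+)\))?", line):
        blocks[int(size_kb)] = (int(count), flags)
    return blocks


def free_blocks_at_or_above(blocks, order):
    """Keep only non-empty free lists whose block size is >= 2**order pages."""
    min_kb = PAGE_KB << order
    return {size: entry for size, entry in blocks.items()
            if size >= min_kb and entry[0] > 0}


# Node 0 Normal line copied from the log in this message.
line = ("Node 0 Normal: 551149*4kB (UME) 104631*8kB (UME) 11220*16kB (UME) "
        "0*32kB 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 1*2048kB (H) "
        "0*4096kB = 3223212kB")

blocks = parse_buddy_line(line)
total_kb = sum(size * count for size, (count, _flags) in blocks.items())
usable = free_blocks_at_or_above(blocks, order=4)

print("total free kB:", total_kb)
print("free blocks >= 64 kB:", usable)
```

For this zone the only free block of 64 kB or more is a single 2048 kB block tagged (H). That tag appears to mark the high-atomic reserve, which an ordinary GFP_NOFS request would not normally be allowed to take, so despite roughly 3 GB free in the zone there was effectively nothing of the required size.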

The usual method to deal with this is to prevent it from happening in
the first place by tuning writeback behavior. You may want to play with
the /proc/sys/vm/dirty_* tunables:

     https://www.kernel.org/doc/html/latest/admin-guide/sysctl/vm.html

In particular, you want to aim to start writeback of dirty data in the
background sooner, so (e.g.) lowering the dirty_background_ratio to 5
and dirty_ratio to 10 may help. Also consider setting the
dirty_writeback_centisecs to a lower value to make pages eligible for
writeout sooner.
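
As a rough illustration of what those ratios mean on a 512GB box, here is a sketch that estimates the dirty thresholds and renders a sysctl fragment. Assumptions: the ratios are applied to total RAM as an upper bound (the kernel applies them to available memory, which is smaller); the "default" values of 10 and 20 are the upstream kernel defaults, and a distribution or tuned profile may set something else; the writeback interval of 100 centiseconds is only an example value, not a figure given in this thread.

```python
GIB = 1024 ** 3


def dirty_threshold_gib(available_bytes, ratio_percent):
    """Upper-bound estimate of a vm.dirty_*_ratio threshold, in GiB."""
    return available_bytes * ratio_percent / 100 / GIB


def render_sysctl(settings):
    """Render settings as 'key = value' lines, one per setting."""
    return "".join(f"{key} = {value}\n" for key, value in settings.items())


# Ratios 5 and 10 are the ones suggested above.
suggested = {
    "vm.dirty_background_ratio": 5,
    "vm.dirty_ratio": 10,
    # Assumption: 1 s flusher wakeup instead of the 5 s default (500).
    "vm.dirty_writeback_centisecs": 100,
}

ram_bytes = 512 * GIB  # client memory size from the original report

for label, background, hard in (
    ("upstream defaults", 10, 20),
    ("suggested", 5, 10),
):
    print(
        f"{label}: background writeback starts near "
        f"{dirty_threshold_gib(ram_bytes, background):.1f} GiB, "
        f"writers are throttled near "
        f"{dirty_threshold_gib(ram_bytes, hard):.1f} GiB"
    )

print(render_sysctl(suggested), end="")
```

The rendered lines could be placed in a file under /etc/sysctl.d/ and loaded with `sysctl --system`, or the values can be written directly to the corresponding /proc/sys/vm/ files for a quick test.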

Long term, it would be nice to get as much allocation out of the
writeback path as we can, but unfortunately that's not a simple thing to
change after the fact.

2021-11-25 22:53:33 [ 8335.568554] RSP: 002b:00007fce05ddb790 EFLAGS:
00000293 ORIG_RAX: 000000000000004b
2021-11-25 22:53:33 [ 8335.576120] RAX: ffffffffffffffda RBX:
0000000000000c4c RCX: 00007fcfb725e55f
2021-11-25 22:53:33 [ 8335.583251] RDX: 0000000000000000 RSI:
00007fce05ddb7d0 RDI: 0000000000000c4c
2021-11-25 22:53:33 [ 8335.590375] RBP: 00007fce05ddb7c0 R08:
0000000451655248 R09: 0000000451655160
2021-11-25 22:53:33 [ 8335.597499] R10: 0000000000001a7a R11:
0000000000000293 R12: 0000000000000000
2021-11-25 22:53:33 [ 8335.604623] R13: 0000000800145668 R14:
00007fce05ddb800 R15: 00007fcda0019000
2021-11-25 22:53:33 [ 8335.611758] Mem-Info:
2021-11-25 22:53:33 [ 8335.614064] active_anon:993 inactive_anon:9552068
isolated_anon:0
2021-11-25 22:53:33 [ 8335.614064]  active_file:1717317
inactive_file:114691936 isolated_file:0
2021-11-25 22:53:33 [ 8335.614064]  unevictable:0 dirty:10350536
writeback:545812
2021-11-25 22:53:33 [ 8335.614064]  slab_reclaimable:2036011
slab_unreclaimable:365944
2021-11-25 22:53:33 [ 8335.614064]  mapped:21846 shmem:2368
pagetables:26525 bounce:0
2021-11-25 22:53:33 [ 8335.614064]  free:1568016 free_pcp:5646 free_cma:0
2021-11-25 22:53:33 [ 8335.648764] Node 0 active_anon:3416kB
inactive_anon:24035756kB active_file:4089036kB inactive_file:226502860kB
unevictable:0kB isolated(anon):0kB isolated(file):0kB mapped:79528kB
dirty:4849284kB writeback:25168kB shmem:8932kB shmem_thp: 0kB
shmem_pmdmapped: 0kB anon_thp: 20529152kB writeback_tmp:0kB
kernel_stack:40800kB all_unreclaimable? no
2021-11-25 22:53:33 [ 8335.679048] Node 1 active_anon:556kB
inactive_anon:14172516kB active_file:2780232kB inactive_file:232247192kB
unevictable:0kB isolated(anon):0kB isolated(file):0kB mapped:7856kB
dirty:36318296kB writeback:2365276kB shmem:540kB shmem_thp: 0kB
shmem_pmdmapped: 0kB anon_thp: 10524672kB writeback_tmp:0kB
kernel_stack:30048kB all_unreclaimable? no
2021-11-25 22:53:33 [ 8335.709344] Node 0 DMA free:11264kB min:0kB
low:12kB high:24kB reserved_highatomic:0KB active_anon:0kB
inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB
writepending:0kB present:15996kB managed:15360kB mlocked:0kB
pagetables:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB
2021-11-25 22:53:33 [ 8335.735551] lowmem_reserve[]: 0 2104 257776
257776 257776
2021-11-25 22:53:33 [ 8335.740968] Node 0 DMA32 free:1023336kB min:368kB
low:2488kB high:4608kB reserved_highatomic:0KB active_anon:0kB
inactive_anon:1104112kB active_file:456kB inactive_file:40kB
unevictable:0kB writepending:0kB present:2221440kB managed:2154840kB
mlocked:0kB pagetables:4kB bounce:0kB free_pcp:0kB local_pcp:0kB
free_cma:0kB
2021-11-25 22:53:33 [ 8335.769173] lowmem_reserve[]: 0 0 255672 255672
255672
2021-11-25 22:53:33 [ 8335.774330] Node 0 Normal free:3252448kB
min:45564kB low:307364kB high:569164kB reserved_highatomic:2048KB
active_anon:3416kB inactive_anon:22931620kB active_file:4088580kB
inactive_file:226540064kB unevictable:0kB writepending:4872056kB
present:266061824kB managed:261808384kB mlocked:0kB pagetables:61080kB
bounce:0kB free_pcp:18444kB local_pcp:1376kB free_cma:0kB
2021-11-25 22:53:33 [ 8335.806515] lowmem_reserve[]: 0 0 0 0 0
2021-11-25 22:53:33 [ 8335.810369] Node 1 Normal free:2297488kB
min:107424kB low:371624kB high:635824kB reserved_highatomic:2048KB
active_anon:556kB inactive_anon:14172532kB active_file:2780284kB
inactive_file:231750564kB unevictable:0kB writepending:38331120kB
present:268433408kB managed:264200892kB mlocked:0kB pagetables:45020kB
bounce:0kB free_pcp:15540kB local_pcp:0kB free_cma:0kB
2021-11-25 22:53:33 [ 8335.842381] lowmem_reserve[]: 0 0 0 0 0
2021-11-25 22:53:33 [ 8335.846225] Node 0 DMA: 0*4kB 0*8kB 0*16kB 0*32kB
0*64kB 0*128kB 0*256kB 0*512kB 1*1024kB (U) 1*2048kB (M) 2*4096kB (M) =
11264kB
2021-11-25 22:53:33 [ 8335.857874] Node 0 DMA32: 54*4kB (UM) 58*8kB
(UME) 60*16kB (UME) 121*32kB (UME) 107*64kB (UME) 79*128kB (UME)
54*256kB (ME) 56*512kB (UME) 50*1024kB (ME) 69*2048kB (UM) 187*4096kB
(UM) = 1023432kB
2021-11-25 22:53:33 [ 8335.875329] Node 0 Normal: 551149*4kB (UME)
104631*8kB (UME) 11220*16kB (UME) 0*32kB 0*64kB 0*128kB 0*256kB 0*512kB
0*1024kB 1*2048kB (H) 0*4096kB = 3223212kB
2021-11-25 22:53:33 [ 8335.889493] Node 1 Normal: 11*4kB (M) 53961*8kB
(UME) 109253*16kB (UME) 1*32kB (H) 1*64kB (H) 1*128kB (H) 1*256kB (H)
1*512kB (H) 1*1024kB (H) 0*2048kB 0*4096kB = 2181796kB
So here it shows that there is
2021-11-25 22:53:34 [ 8335.904877] Node 0 hugepages_total=0
hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
2021-11-25 22:53:34 [ 8335.913577] Node 0 hugepages_total=0
hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
2021-11-25 22:53:34 [ 8335.922013] Node 1 hugepages_total=0
hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
2021-11-25 22:53:34 [ 8335.930711] Node 1 hugepages_total=0
hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
2021-11-25 22:53:34 [ 8335.939140] 116162432 total pagecache pages
2021-11-25 22:53:34 [ 8335.943330] 36 pages in swap cache
2021-11-25 22:53:34 [ 8335.946445] general protection fault, probably
for non-canonical address 0x473ea8b1095ffa30: 0000 [#1] SMP NOPTI
2021-11-25 22:53:34 [ 8335.946737] Swap cache stats: add 533, delete
497, find 46/164
2021-11-25 22:53:34 [ 8335.956894] CPU: 102 PID: 94937 Comm:
kworker/102:5 Tainted: G           O      5.10.78-2.el8.x86_64 #1
2021-11-25 22:53:34 [ 8335.956896] Hardware name: BULL
R282-Z90-00/MZ92-FS0-00, BIOS R16 07/10/2020
2021-11-25 22:53:34 [ 8335.956911] Workqueue: ceph-msgr ceph_con_workfn
[libceph]
2021-11-25 22:53:34 [ 8335.962750] Free swap  = 4182012kB
2021-11-25 22:53:34 [ 8335.972190] RIP:
0010:writepages_finish+0x224/0x450 [ceph]
2021-11-25 22:53:34 [ 8335.972195] Code: 54 24 0c 85 d2 74 68 4c 89 ff
e8 e7 b2 56 f5 48 3b 1c 24 74 75 49 8b 45 08 4c 8b 3c 18 48 83 c3 08 4d
85 ff 0f 84 f8 00 00 00 <49> 8b 57 08 48 8d 42 ff 83 e2 01 49 0f 44 c7
48 8b 00 a8 04 0f 84
2021-11-25 22:53:34 [ 8335.979244] Total swap = 4189180kB
2021-11-25 22:53:34 [ 8335.984715] RSP: 0018:ffffb88718217cc0 EFLAGS:
00010206
2021-11-25 22:53:34 [ 8335.984717] RAX: ffff988ee8e18000 RBX:
0000000000008008 RCX: 0000000000031489
2021-11-25 22:53:34 [ 8335.984719] RDX: ffffe09cb0ce9b87 RSI:
0000000000000000 RDI: ffffe09cb0ce9bc0
2021-11-25 22:53:34 [ 8335.984720] RBP: ffff988e061f2710 R08:
0000000000031468 R09: 0000000000000020
2021-11-25 22:53:34 [ 8335.984720] R10: 00000000000000d7 R11:
0000000000000001 R12: ffff98bc2d211430
2021-11-25 22:53:34 [ 8335.984721] R13: ffff988ea9769ad8 R14:
ffff988e061f26c0 R15: 473ea8b1095ffa30
2021-11-25 22:53:34 [ 8335.984722] FS:  0000000000000000(0000)
GS:ffff98ccee580000(0000) knlGS:0000000000000000
2021-11-25 22:53:34 [ 8335.984723] CS:  0010 DS: 0000 ES: 0000 CR0:
0000000080050033
2021-11-25 22:53:34 [ 8335.984727] CR2: 00007fcf5006aac8 CR3:
000000016ad60000 CR4: 0000000000350ee0
2021-11-25 22:53:34 [ 8335.988134] 134183167 pages RAM
2021-11-25 22:53:34 [ 8335.993607] Call Trace:
2021-11-25 22:53:34 [ 8335.993620]  __complete_request+0x22/0x70 [libceph]
2021-11-25 22:53:34 [ 8335.993630]  dispatch+0x15e/0xb40 [libceph]
2021-11-25 22:53:34 [ 8336.012367] 0 pages HighMem/MovableOnly
2021-11-25 22:53:34 [ 8336.015766]  ?
ceph_x_check_message_signature+0x54/0xc0 [libceph]
2021-11-25 22:53:34 [ 8336.015774]  ? read_partial_message+0x214/0x770
[libceph]
2021-11-25 22:53:34 [ 8336.020998] 2138298 pages reserved
2021-11-25 22:53:34 [ 8336.028131]  try_read+0x77a/0x1190 [libceph]
2021-11-25 22:53:34 [ 8336.028139]  ceph_con_workfn+0x10f/0x610 [libceph]
2021-11-25 22:53:34 [ 8336.028145]  ? __schedule+0x299/0x7a0
2021-11-25 22:53:34 [ 8336.035276] 0 pages hwpoisoned
2021-11-25 22:53:34 [ 8336.042401]  process_one_work+0x1fb/0x390
2021-11-25 22:53:34 [ 8336.042403]  worker_thread+0x2d/0x3e0
2021-11-25 22:53:34 [ 8336.042405]  ? process_one_work+0x390/0x390
2021-11-25 22:53:34 [ 8336.042406]  kthread+0x116/0x130
2021-11-25 22:53:34 [ 8336.042408]  ? kthread_park+0x80/0x80
2021-11-25 22:53:34 [ 8336.042414]  ret_from_fork+0x22/0x30
2021-11-25 22:53:34 [ 8336.149055] Modules linked in: binfmt_misc ceph
xfs rbd libceph dns_resolver 8021q garp mrp stp llc bonding
nft_reject_inet nf_reject_ipv4 sch_fq nf_reject_ipv6 nft_reject rfkill
nft_limit rdma_ucm(O) rdma_cm(O) iw_cm(O) nft_ct nf_conntrack
nf_defrag_ipv6 nf_defrag_ipv4 nf_tables ib_ipoib(O) libcrc32c ib_cm(O)
nfnetlink ib_umad(O) sunrpc vfat fat ipmi_ssif amd64_edac_mod
edac_mce_amd amd_energy kvm_amd kvm irqbypass
crct10dif_pclmul crc32_pclmul ghash_clmulni_intel rapl pcspkr ast
drm_vram_helper drm_ttm_helper ttm drm_kms_helper drm syscopyarea
sysfillrect sp5100_tco sysimgblt acpi_ipmi fb_sys_fops ccp i2c_piix4
k10temp ipmi_si ipmi_devintf ipmi_msghandler acpi_cpufreq tcp_bbr
ip_tables ext4 mbcache jbd2 mlx5_ib(O) ib_uverbs(O) ib_core(O)
raid1 sd_mod t10_pi sg crc32c_intel mlx5_core(O) mlxfw(O)
pci_hyperv_intf tls ahci igb libahci psample mlxdevm(O) i2c_algo_bit
auxiliary(O) libata dca mlx_compat(O) pinctrl_amd
2021-11-25 22:53:34 [ 8336.229967] general protection fault, probably
for non-canonical address 0x790fa7461edf3e0a: 0000 [#2] SMP NOPTI
2021-11-25 22:53:34 [ 8336.229999] ---[ end trace aa094bc887e83e0e ]---
2021-11-25 22:53:34 [ 8336.240134] CPU: 60 PID: 110782 Comm:
kworker/60:5 Tainted: G      D    O      5.10.78-2.el8.x86_64 #1
2021-11-25 22:53:34 [ 8336.240136] Hardware name: BULL
R282-Z90-00/MZ92-FS0-00, BIOS R16 07/10/2020
2021-11-25 22:53:34 [ 8336.240200] Workqueue: ceph-msgr ceph_con_workfn
[libceph]
2021-11-25 22:53:34 [ 8336.240207] RIP: 0010:ceph_tcp_sendpage+0x5/0x80
[libceph]
2021-11-25 22:53:35 [ 8336.240209] Code: 00 00 00 e8 ed 89 8a f5 eb e2
0f 0b 0f 0b e8 e2 f7 ff ff be 02 00 00 00 e8 d8 89 8a f5 eb cd 66 0f 1f
44 00 00 0f 1f 44 00 00 <4c> 8b 4e 08 41 81 c8 40 40 00 00 49 8d 41 ff
41 83 e1 01 48 0f 44
2021-11-25 22:53:35 [ 8336.240210] RSP: 0018:ffffb886d94b7db0 EFLAGS:
00010206
2021-11-25 22:53:35 [ 8336.240211] RAX: 0000000000008000 RBX:
ffff988521b89268 RCX: 0000000000000d2c
2021-11-25 22:53:35 [ 8336.240212] RDX: 00000000000002d4 RSI:
790fa7461edf3e0a RDI: ffff984eb581d7c0
2021-11-25 22:53:35 [ 8336.240212] RBP: 0000000000028000 R08:
0000000000028000 R09: ffff9850980fdd0c
2021-11-25 22:53:35 [ 8336.240213] R10: 0000000000000000 R11:
0000000000000000 R12: ffff988521b892e8
2021-11-25 22:53:35 [ 8336.240213] R13: 790fa7461edf3e0a R14:
ffff988faa62b830 R15: 0000000000000000
2021-11-25 22:53:35 [ 8336.240214] FS:  0000000000000000(0000)
GS:ffff98ccee300000(0000) knlGS:0000000000000000
2021-11-25 22:53:35 [ 8336.240215] CS:  0010 DS: 0000 ES: 0000 CR0:
0000000080050033
2021-11-25 22:53:35 [ 8336.240215] CR2: 00007efb141b6000 CR3:
000000016ad60000 CR4: 0000000000350ee0
2021-11-25 22:53:35 [ 8336.240216] Call Trace:
2021-11-25 22:53:35 [ 8336.240225]  try_write+0x129/0xb90 [libceph]
2021-11-25 22:53:35 [ 8336.240232]  ? try_read+0x36c/0x1190 [libceph]
2021-11-25 22:53:35 [ 8336.240237]  ? set_next_entity+0xa6/0x1d0
2021-11-25 22:53:35 [ 8336.240243]  ceph_con_workfn+0x32d/0x610 [libceph]
2021-11-25 22:53:35 [ 8336.240247]  ? __schedule+0x299/0x7a0
2021-11-25 22:53:35 [ 8336.240249]  process_one_work+0x1fb/0x390
2021-11-25 22:53:35 [ 8336.240253]  worker_thread+0x2d/0x3e0
2021-11-25 22:53:35 [ 8336.325025] RIP:
0010:writepages_finish+0x224/0x450 [ceph]
2021-11-25 22:53:35 [ 8336.329880]  ? process_one_work+0x390/0x390
2021-11-25 22:53:35 [ 8336.329882]  kthread+0x116/0x130
2021-11-25 22:53:35 [ 8336.329884]  ? kthread_park+0x80/0x80
2021-11-25 22:53:35 [ 8336.329888]  ret_from_fork+0x22/0x30
2021-11-25 22:53:35 [ 8336.329890] Modules linked in: binfmt_misc ceph
xfs rbd libceph dns_resolver 8021q garp mrp stp llc bonding
nft_reject_inet nf_reject_ipv4 sch_fq nf_reject_ipv6 nft_reject rfkill
nft_limit rdma_ucm(O) rdma_cm(O) iw_cm(O) nft_ct nf_conntrack
nf_defrag_ipv6 nf_defrag_ipv4 nf_tables ib_ipoib(O) libcrc32c ib_cm(O)
nfnetlink ib_umad(O) sunrpc vfat fat ipmi_ssif amd64_edac_mod
edac_mce_amd amd_energy kvm_amd kvm irqbypass
crct10dif_pclmul crc32_pclmul ghash_clmulni_intel rapl pcspkr ast
drm_vram_helper drm_ttm_helper ttm drm_kms_helper drm syscopyarea
sysfillrect sp5100_tco sysimgblt acpi_ipmi fb_sys_fops ccp i2c_piix4
k10temp ipmi_si ipmi_devintf ipmi_msghandler acpi_cpufreq tcp_bbr
ip_tables ext4 mbcache jbd2 mlx5_ib(O) ib_uverbs(O) ib_core(O)
raid1 sd_mod t10_pi sg crc32c_intel mlx5_core(O) mlxfw(O)
pci_hyperv_intf
2021-11-25 22:53:35 [ 8336.337031] Code: 54 24 0c 85 d2 74 68 4c 89 ff
e8 e7 b2 56 f5 48 3b 1c 24 74 75 49 8b 45 08 4c 8b 3c 18 48 83 c3 08 4d
85 ff 0f 84 f8 00 00 00 <49> 8b 57 08 48 8d 42 ff 83 e2 01 49 0f 44 c7
48 8b 00 a8 04 0f 84
2021-11-25 22:53:35 [ 8336.342469]  tls ahci igb libahci psample
mlxdevm(O) i2c_algo_bit auxiliary(O) libata dca mlx_compat(O) pinctrl_amd
2021-11-25 22:53:35 [ 8336.342647] ---[ end trace aa094bc887e83e0f ]---
2021-11-25 22:53:35 [ 8336.348078] RSP: 0018:ffffb88718217cc0 EFLAGS:
00010206
2021-11-25 22:53:35 [ 8336.348080] RAX: ffff988ee8e18000 RBX:
0000000000008008 RCX: 0000000000031489
2021-11-25 22:53:35 [ 8336.348081] RDX: ffffe09cb0ce9b87 RSI:
0000000000000000 RDI: ffffe09cb0ce9bc0
2021-11-25 22:53:35 [ 8336.348082] RBP: ffff988e061f2710 R08:
0000000000031468 R09: 0000000000000020
2021-11-25 22:53:35 [ 8336.348083] R10: 00000000000000d7 R11:
0000000000000001 R12: ffff98bc2d211430
2021-11-25 22:53:35 [ 8336.348083] R13: ffff988ea9769ad8 R14:
ffff988e061f26c0 R15: 473ea8b1095ffa30
2021-11-25 22:53:35 [ 8336.348084] FS:  0000000000000000(0000)
GS:ffff98ccee580000(0000) knlGS:0000000000000000
2021-11-25 22:53:35 [ 8336.348085] CS:  0010 DS: 0000 ES: 0000 CR0:
0000000080050033
2021-11-25 22:53:35 [ 8336.348086] CR2: 00007fcf5006aac8 CR3:
000000016ad60000 CR4: 0000000000350ee0
2021-11-25 22:53:35 [ 8336.348087] Kernel panic - not syncing: Fatal
exception
2021-11-25 22:53:35 [ 8336.349344] Kernel Offset: 0x35200000 from
0xffffffff81000000 (relocation range:
0xffffffff80000000-0xffffffffbfffffff)
2021-11-25 22:53:35 [ 8337.778774] ---[ end Kernel panic - not syncing:
Fatal exception ]---


The hung mds:

2021-11-25 21:20:43 [  205.144673] libceph: client80562712 fsid
31450363-7461-457f-be77-68dc1740f718
2021-11-25 22:12:39 [ 3321.421663] INFO: task node_exporter:4007 blocked
for more than 122 seconds.
2021-11-25 22:12:39 [ 3321.428713]       Tainted: G           O
       5.10.78-2.el8.x86_64 #1
2021-11-25 22:12:39 [ 3321.435152] "echo 0 >
/proc/sys/kernel/hung_task_timeout_secs" disables this message.
2021-11-25 22:12:39 [ 3321.442976] task:node_exporter   state:D stack:
     0 pid: 4007 ppid:     1 flags:0x00004080
2021-11-25 22:12:39 [ 3321.451326] Call Trace:
2021-11-25 22:12:39 [ 3321.453784]  __schedule+0x291/0x7a0
2021-11-25 22:12:39 [ 3321.457277]  schedule+0x3c/0xa0
2021-11-25 22:12:39 [ 3321.460422]  rwsem_down_read_slowpath+0x2f6/0x4a0
2021-11-25 22:12:39 [ 3321.465196]  ceph_quota_update_statfs+0x28/0x140
[ceph]
2021-11-25 22:12:39 [ 3321.470434]  ceph_statfs+0x13f/0x160 [ceph]
2021-11-25 22:12:39 [ 3321.474625]  statfs_by_dentry+0x67/0x90
2021-11-25 22:12:39 [ 3321.478468]  vfs_statfs+0x16/0xd0
2021-11-25 22:12:39 [ 3321.481786]  user_statfs+0x54/0xa0
2021-11-25 22:12:39 [ 3321.485198]  __do_sys_statfs+0x20/0x50
2021-11-25 22:12:39 [ 3321.488955]  do_syscall_64+0x33/0x40
2021-11-25 22:12:39 [ 3321.492695]
   entry_SYSCALL_64_after_hwframe+0x44/0xa9
2021-11-25 22:12:39 [ 3321.497749] RIP: 0033:0x4b25fb
2021-11-25 22:12:39 [ 3321.500811] RSP: 002b:000000c0014df418 EFLAGS:
00000212 ORIG_RAX: 0000000000000089
2021-11-25 22:12:39 [ 3321.508380] RAX: ffffffffffffffda RBX:
000000c000043000 RCX: 00000000004b25fb
2021-11-25 22:12:39 [ 3321.515517] RDX: 0000000000000000 RSI:
000000c0014df570 RDI: 000000c000cae350
2021-11-25 22:12:39 [ 3321.522648] RBP: 000000c0014df478 R08:
0000000000ab8601 R09: 0000000000000001
2021-11-25 22:12:39 [ 3321.529781] R10: 000000c000cae350 R11:
0000000000000212 R12: 0000000000000036
2021-11-25 22:12:39 [ 3321.536915] R13: 0000000000000035 R14:
0000000000000200 R15: 0000000000000055
2021-11-25 22:12:39 [ 3321.544105] INFO: task System-1:5514 blocked for
more than 123 seconds.
2021-11-25 22:12:39 [ 3321.550723]       Tainted: G           O
       5.10.78-2.el8.x86_64 #1
2021-11-25 22:12:39 [ 3321.557160] "echo 0 >
/proc/sys/kernel/hung_task_timeout_secs" disables this message.
2021-11-25 22:12:39 [ 3321.564986] task:System-1        state:D stack:
     0 pid: 5514 ppid:  4294 flags:0x00004080
2021-11-25 22:12:39 [ 3321.573334] Call Trace:
2021-11-25 22:12:39 [ 3321.575784]  __schedule+0x291/0x7a0
2021-11-25 22:12:39 [ 3321.579276]  schedule+0x3c/0xa0
2021-11-25 22:12:39 [ 3321.582425]  rwsem_down_read_slowpath+0x2f6/0x4a0
2021-11-25 22:12:39 [ 3321.587135]  ceph_quota_update_statfs+0x28/0x140
[ceph]
2021-11-25 22:12:39 [ 3321.592369]  ceph_statfs+0x13f/0x160 [ceph]
2021-11-25 22:12:39 [ 3321.596735]  statfs_by_dentry+0x67/0x90
2021-11-25 22:12:39 [ 3321.600573]  vfs_statfs+0x16/0xd0
2021-11-25 22:12:39 [ 3321.603892]  user_statfs+0x54/0xa0
2021-11-25 22:12:39 [ 3321.607299]  __do_sys_statfs+0x20/0x50
2021-11-25 22:12:39 [ 3321.611051]  do_syscall_64+0x33/0x40
2021-11-25 22:12:39 [ 3321.614628]
   entry_SYSCALL_64_after_hwframe+0x44/0xa9
2021-11-25 22:12:39 [ 3321.619683] RIP: 0033:0x7f8e8026bfdb
2021-11-25 22:12:39 [ 3321.623260] RSP: 002b:00007f8c9b1f36d8 EFLAGS:
00000246 ORIG_RAX: 0000000000000089
2021-11-25 22:12:39 [ 3321.630827] RAX: ffffffffffffffda RBX:
00007f8c680100f0 RCX: 00007f8e8026bfdb
2021-11-25 22:12:39 [ 3321.637964] RDX: 00007f8c680100f0 RSI:
00007f8c9b1f36e0 RDI: 00007f8c680100f0
2021-11-25 22:12:39 [ 3321.645100] RBP: 00007f8c9b1f36e0 R08:
0000000000000000 R09: 000000008b184681
2021-11-25 22:12:39 [ 3321.652233] R10: 00007f8e693f4ca5 R11:
0000000000000246 R12: 00007f8c9b1f3780
2021-11-25 22:12:39 [ 3321.659368] R13: 00007f8c680100f0 R14:
00007f8c9b1f3828 R15: 00007f8c6400b800
2021-11-25 22:12:39 [ 3321.666518] INFO: task healthcheck-sch:7738
blocked for more than 123 seconds.
2021-11-25 22:12:39 [ 3321.673738]       Tainted: G           O
       5.10.78-2.el8.x86_64 #1
2021-11-25 22:12:39 [ 3321.680173] "echo 0 >
/proc/sys/kernel/hung_task_timeout_secs" disables this message.
2021-11-25 22:12:39 [ 3321.687999] task:healthcheck-sch state:D stack:
     0 pid: 7738 ppid:  4294 flags:0x00000080
2021-11-25 22:12:39 [ 3321.696346] Call Trace:
2021-11-25 22:12:39 [ 3321.698802]  __schedule+0x291/0x7a0
2021-11-25 22:12:39 [ 3321.702295]  schedule+0x3c/0xa0
2021-11-25 22:12:39 [ 3321.705438]  rwsem_down_write_slowpath+0x2f0/0x4a0
2021-11-25 22:12:39 [ 3321.710235]  ? handle_mm_fault+0xb1f/0xb70
2021-11-25 22:12:39 [ 3321.714332]  do_unlinkat+0x140/0x300
2021-11-25 22:12:39 [ 3321.717912]  do_syscall_64+0x33/0x40
2021-11-25 22:12:39 [ 3321.721489]
   entry_SYSCALL_64_after_hwframe+0x44/0xa9
2021-11-25 22:12:39 [ 3321.726539] RIP: 0033:0x7f8e8026e16b
2021-11-25 22:12:39 [ 3321.730119] RSP: 002b:00007f8c98c6b0f8 EFLAGS:
00000206 ORIG_RAX: 0000000000000057
2021-11-25 22:12:39 [ 3321.737687] RAX: ffffffffffffffda RBX:
00007f8c70dca348 RCX: 00007f8e8026e16b
2021-11-25 22:12:39 [ 3321.744818] RDX: 00007f8c28002ec0 RSI:
00007f8c98c6b188 RDI: 00007f8c28002ec0
2021-11-25 22:12:39 [ 3321.751951] RBP: 00007f8c98c6b110 R08:
0000000081e598f9 R09: 000000008b005a8c
2021-11-25 22:12:39 [ 3321.759096] R10: 00007f8e608a16e1 R11:
0000000000000206 R12: 0000000000000000
2021-11-25 22:12:39 [ 3321.766229] R13: 00007f8e21d7fa08 R14:
00007f8c98c6b1a0 R15: 00007f8c70dca000
2021-11-25 22:12:39 [ 3321.773369] INFO: task P2P Transfer - :53802
blocked for more than 123 seconds.
2021-11-25 22:12:39 [ 3321.780679]       Tainted: G           O
       5.10.78-2.el8.x86_64 #1
2021-11-25 22:12:39 [ 3321.787121] "echo 0 >
/proc/sys/kernel/hung_task_timeout_secs" disables this message.
2021-11-25 22:12:39 [ 3321.794948] task:P2P Transfer -  state:D stack:
     0 pid:53802 ppid:  4294 flags:0x00000082
2021-11-25 22:12:39 [ 3321.803294] Call Trace:
2021-11-25 22:12:39 [ 3321.805747]  __schedule+0x291/0x7a0
2021-11-25 22:12:39 [ 3321.809241]  schedule+0x3c/0xa0
2021-11-25 22:12:39 [ 3321.812383]  rwsem_down_read_slowpath+0x2f6/0x4a0
2021-11-25 22:12:39 [ 3321.817098]  check_quota_exceeded+0x64/0x220 [ceph]
2021-11-25 22:12:39 [ 3321.821986]  ceph_write_iter+0x1bf/0xc90 [ceph]
2021-11-25 22:12:39 [ 3321.826523]  ? tcp_recvmsg+0x63e/0xb80
2021-11-25 22:12:39 [ 3321.830279]  ? inet6_recvmsg+0x5e/0x110
2021-11-25 22:12:39 [ 3321.834123]  ? sock_recvmsg+0x1c/0x70
2021-11-25 22:12:39 [ 3321.837789]  ? new_sync_write+0x11f/0x1b0
2021-11-25 22:12:39 [ 3321.841801]  new_sync_write+0x11f/0x1b0
2021-11-25 22:12:39 [ 3321.845644]  vfs_write+0x1bd/0x270
2021-11-25 22:12:39 [ 3321.849046]  ksys_write+0x59/0xd0
2021-11-25 22:12:39 [ 3321.852365]  do_syscall_64+0x33/0x40
2021-11-25 22:12:39 [ 3321.855945]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
2021-11-25 22:12:39 [ 3321.860997] RIP: 0033:0x7f8e8096a847
2021-11-25 22:12:39 [ 3321.864578] RSP: 002b:00007f8a05c9c5c0 EFLAGS:
00000293 ORIG_RAX: 0000000000000001
2021-11-25 22:12:39 [ 3321.872145] RAX: ffffffffffffffda RBX:
0000000000000c4e RCX: 00007f8e8096a847
2021-11-25 22:12:39 [ 3321.879275] RDX: 0000000000002000 RSI:
00007f8904001bf0 RDI: 0000000000000c4e
2021-11-25 22:12:39 [ 3321.886410] RBP: 00007f8904001bf0 R08:
0000000000000000 R09: 000000043389c7e0
2021-11-25 22:12:39 [ 3321.893542] R10: 00000000000012b8 R11:
0000000000000293 R12: 0000000000002000
2021-11-25 22:12:39 [ 3321.900674] R13: 00007f8904001bf0 R14:
00007f8a05c9c650 R15: 00007f88e4015800
2021-11-25 22:12:39 [ 3321.907809] INFO: task vega_izum_si_03:57727
blocked for more than 123 seconds.
2021-11-25 22:12:39 [ 3321.915116]       Tainted: G           O
       5.10.78-2.el8.x86_64 #1
2021-11-25 22:12:39 [ 3321.921553] "echo 0 >
/proc/sys/kernel/hung_task_timeout_secs" disables this message.
2021-11-25 22:12:39 [ 3321.929376] task:vega_izum_si_03 state:D stack:
     0 pid:57727 ppid:  4294 flags:0x00000080
2021-11-25 22:12:39 [ 3321.937725] Call Trace:
2021-11-25 22:12:39 [ 3321.940176]  __schedule+0x291/0x7a0
2021-11-25 22:12:39 [ 3321.943675]  ? __touch_cap+0x1f/0xd0 [ceph]
2021-11-25 22:12:39 [ 3321.947865]  schedule+0x3c/0xa0
2021-11-25 22:12:39 [ 3321.951009]  rwsem_down_read_slowpath+0x2f6/0x4a0
2021-11-25 22:12:39 [ 3321.955716]  ? lookup_fast+0xae/0x150
2021-11-25 22:12:39 [ 3321.959390]  walk_component+0x129/0x1b0
2021-11-25 22:12:39 [ 3321.963231]  ? path_init+0x2ef/0x360
2021-11-25 22:12:39 [ 3321.966808]  path_lookupat.isra.42+0x67/0x140
2021-11-25 22:12:39 [ 3321.971171]  ? futex_wait+0x19a/0x230
2021-11-25 22:12:39 [ 3321.974834]  filename_lookup.part.56+0xa0/0x170
2021-11-25 22:12:39 [ 3321.979371]  ? __check_object_size+0x162/0x180
2021-11-25 22:12:39 [ 3321.983816]  ? strncpy_from_user+0x46/0x1e0
2021-11-25 22:12:39 [ 3321.988000]  vfs_statx+0x72/0x110
2021-11-25 22:12:39 [ 3321.991320]  __do_sys_newstat+0x39/0x70
2021-11-25 22:12:39 [ 3321.995160]  ?
syscall_trace_enter.isra.19+0x123/0x190
2021-11-25 22:12:39 [ 3322.000308]  do_syscall_64+0x33/0x40
2021-11-25 22:12:39 [ 3322.003887]
   entry_SYSCALL_64_after_hwframe+0x44/0xa9
2021-11-25 22:12:39 [ 3322.008940] RIP: 0033:0x7f8e8026ba79
2021-11-25 22:12:39 [ 3322.012519] RSP: 002b:00007f89268e4048 EFLAGS:
00000246 ORIG_RAX: 0000000000000004
2021-11-25 22:12:39 [ 3322.020084] RAX: ffffffffffffffda RBX:
00007f89268e4050 RCX: 00007f8e8026ba79
2021-11-25 22:12:39 [ 3322.027222] RDX: 00007f89268e4050 RSI:
00007f89268e4050 RDI: 00007f88d80011e0
2021-11-25 22:12:39 [ 3322.034349] RBP: 00007f89268e4100 R08:
0000000000000000 R09: 000000008bad28f1
2021-11-25 22:12:39 [ 3322.041483] R10: 00007f8e687103a5 R11:
0000000000000246 R12: 00007f88d80011e0
2021-11-25 22:12:39 [ 3322.048614] R13: 00007f8c5409bb48 R14:
00007f89268e4118 R15: 00007f8c5409b800
2021-11-25 22:12:39 [ 3322.055760] INFO: task P2P Transfer - :58084
blocked for more than 123 seconds.
2021-11-25 22:12:39 [ 3322.063072]       Tainted: G           O
       5.10.78-2.el8.x86_64 #1
2021-11-25 22:12:39 [ 3322.069511] "echo 0 >
/proc/sys/kernel/hung_task_timeout_secs" disables this message.
2021-11-25 22:12:39 [ 3322.077335] task:P2P Transfer -  state:D stack:
     0 pid:58084 ppid:  4294 flags:0x00000082
2021-11-25 22:12:40 [ 3322.085681] Call Trace:
2021-11-25 22:12:40 [ 3322.088135]  __schedule+0x291/0x7a0
2021-11-25 22:12:40 [ 3322.091628]  schedule+0x3c/0xa0
2021-11-25 22:12:40 [ 3322.094773]  rwsem_down_read_slowpath+0x2f6/0x4a0
2021-11-25 22:12:40 [ 3322.099487]  check_quota_exceeded+0x64/0x220 [ceph]
2021-11-25 22:12:40 [ 3322.104372]  ceph_write_iter+0x1bf/0xc90 [ceph]
2021-11-25 22:12:40 [ 3322.108911]  ? tcp_recvmsg+0x63e/0xb80
2021-11-25 22:12:40 [ 3322.112661]  ? inet6_recvmsg+0x5e/0x110
2021-11-25 22:12:40 [ 3322.116501]  ? sock_recvmsg+0x1c/0x70
2021-11-25 22:12:40 [ 3322.120168]  ? new_sync_write+0x11f/0x1b0
2021-11-25 22:12:40 [ 3322.124181]  new_sync_write+0x11f/0x1b0
2021-11-25 22:12:40 [ 3322.128019]  vfs_write+0x1bd/0x270
2021-11-25 22:12:40 [ 3322.131427]  ksys_write+0x59/0xd0
2021-11-25 22:12:40 [ 3322.134743]  do_syscall_64+0x33/0x40
2021-11-25 22:12:40 [ 3322.138325]
   entry_SYSCALL_64_after_hwframe+0x44/0xa9
2021-11-25 22:12:40 [ 3322.143377] RIP: 0033:0x7f8e8096a847
2021-11-25 22:12:40 [ 3322.146954] RSP: 002b:00007f8a072b05c0 EFLAGS:
00000293 ORIG_RAX: 0000000000000001
2021-11-25 22:12:40 [ 3322.154523] RAX: ffffffffffffffda RBX:
0000000000000c48 RCX: 00007f8e8096a847
2021-11-25 22:12:40 [ 3322.161657] RDX: 0000000000002000 RSI:
00007f89a48e91f0 RDI: 0000000000000c48
2021-11-25 22:12:40 [ 3322.168787] RBP: 00007f89a48e91f0 R08:
0000000000000000 R09: 000000043389c888
2021-11-25 22:12:40 [ 3322.175920] R10: 00000000000012b8 R11:
0000000000000293 R12: 0000000000002000
2021-11-25 22:12:40 [ 3322.183052] R13: 00007f89a48e91f0 R14:
00007f8a072b0650 R15: 00007f88d8003000
2021-11-25 22:12:40 [ 3322.190187] INFO: task P2P Transfer - :58302
blocked for more than 123 seconds.
2021-11-25 22:12:40 [ 3322.197495]       Tainted: G           O
       5.10.78-2.el8.x86_64 #1
2021-11-25 22:12:40 [ 3322.203931] "echo 0 >
/proc/sys/kernel/hung_task_timeout_secs" disables this message.
2021-11-25 22:12:40 [ 3322.211757] task:P2P Transfer -  state:D stack:
     0 pid:58302 ppid:  4294 flags:0x00000080
2021-11-25 22:12:40 [ 3322.220105] Call Trace:
2021-11-25 22:12:40 [ 3322.222558]  __schedule+0x291/0x7a0
2021-11-25 22:12:40 [ 3322.226057]  schedule+0x3c/0xa0
2021-11-25 22:12:40 [ 3322.229203]  rwsem_down_read_slowpath+0x2f6/0x4a0
2021-11-25 22:12:40 [ 3322.233918]  check_quota_exceeded+0x64/0x220 [ceph]
2021-11-25 22:12:40 [ 3322.238802]  ceph_write_iter+0x1bf/0xc90 [ceph]
2021-11-25 22:12:40 [ 3322.243341]  ? tcp_recvmsg+0x63e/0xb80
2021-11-25 22:12:40 [ 3322.247091]  ? inet6_recvmsg+0x5e/0x110
2021-11-25 22:12:40 [ 3322.250932]  ? new_sync_write+0x11f/0x1b0
2021-11-25 22:12:40 [ 3322.254944]  new_sync_write+0x11f/0x1b0
2021-11-25 22:12:40 [ 3322.258785]  vfs_write+0x1bd/0x270
2021-11-25 22:12:40 [ 3322.262188]  ksys_write+0x59/0xd0
2021-11-25 22:12:40 [ 3322.265510]  do_syscall_64+0x33/0x40
2021-11-25 22:12:40 [ 3322.269087]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
2021-11-25 22:12:40 [ 3322.274154] RIP: 0033:0x7f8e8096a847
2021-11-25 22:12:40 [ 3322.277728] RSP: 002b:00007f8925bd75c0 EFLAGS: 00000293 ORIG_RAX: 0000000000000001
2021-11-25 22:12:40 [ 3322.285296] RAX: ffffffffffffffda RBX: 0000000000000c44 RCX: 00007f8e8096a847
2021-11-25 22:12:40 [ 3322.292427] RDX: 0000000000002000 RSI: 00007f88ac001af0 RDI: 0000000000000c44
2021-11-25 22:12:40 [ 3322.299561] RBP: 00007f88ac001af0 R08: 0000000000000000 R09: 000000043385aac8
2021-11-25 22:12:40 [ 3322.306691] R10: 00000000000012b8 R11: 0000000000000293 R12: 0000000000002000
2021-11-25 22:12:40 [ 3322.313823] R13: 00007f88ac001af0 R14: 00007f8925bd7650 R15: 00007f88d8007000
2021-11-25 22:12:40 [ 3322.320958] INFO: task P2P Transfer - :58781 blocked for more than 123 seconds.
2021-11-25 22:12:40 [ 3322.328266]       Tainted: G           O      5.10.78-2.el8.x86_64 #1
2021-11-25 22:12:40 [ 3322.334704] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
2021-11-25 22:12:40 [ 3322.342528] task:P2P Transfer -  state:D stack:    0 pid:58781 ppid:  4294 flags:0x00000080
2021-11-25 22:12:40 [ 3322.350876] Call Trace:
2021-11-25 22:12:40 [ 3322.353327]  __schedule+0x291/0x7a0
2021-11-25 22:12:40 [ 3322.356820]  schedule+0x3c/0xa0
2021-11-25 22:12:40 [ 3322.359967]  rwsem_down_read_slowpath+0x2f6/0x4a0
2021-11-25 22:12:40 [ 3322.364677]  check_quota_exceeded+0x64/0x220 [ceph]
2021-11-25 22:12:40 [ 3322.369569]  ceph_write_iter+0x1bf/0xc90 [ceph]
2021-11-25 22:12:40 [ 3322.374103]  ? tcp_recvmsg+0x63e/0xb80
2021-11-25 22:12:40 [ 3322.377854]  ? inet6_recvmsg+0x5e/0x110
2021-11-25 22:12:40 [ 3322.381695]  ? sock_recvmsg+0x1c/0x70
2021-11-25 22:12:40 [ 3322.385361]  ? new_sync_write+0x11f/0x1b0
2021-11-25 22:12:40 [ 3322.389370]  new_sync_write+0x11f/0x1b0
2021-11-25 22:12:40 [ 3322.393211]  vfs_write+0x1bd/0x270
2021-11-25 22:12:40 [ 3322.396617]  ksys_write+0x59/0xd0
2021-11-25 22:12:40 [ 3322.399937]  do_syscall_64+0x33/0x40
2021-11-25 22:12:40 [ 3322.403517]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
2021-11-25 22:12:40 [ 3322.408567] RIP: 0033:0x7f8e8096a847
2021-11-25 22:12:40 [ 3322.412194] RSP: 002b:00007f8a071af5c0 EFLAGS: 00000293 ORIG_RAX: 0000000000000001
2021-11-25 22:12:40 [ 3322.419769] RAX: ffffffffffffffda RBX: 0000000000000c4c RCX: 00007f8e8096a847
2021-11-25 22:12:40 [ 3322.426902] RDX: 0000000000002000 RSI: 00007f89e8303f20 RDI: 0000000000000c4c
2021-11-25 22:12:40 [ 3322.434030] RBP: 00007f89e8303f20 R08: 0000000000000000 R09: 000000045dae1508
2021-11-25 22:12:40 [ 3322.441164] R10: 00000000000012b8 R11: 0000000000000293 R12: 0000000000002000
2021-11-25 22:12:40 [ 3322.448298] R13: 00007f89e8303f20 R14: 00007f8a071af650 R15: 00007f88d800b000
2021-11-25 22:12:40 [ 3322.455446] INFO: task P2P Transfer - :58844 blocked for more than 123 seconds.
2021-11-25 22:12:40 [ 3322.462755]       Tainted: G           O      5.10.78-2.el8.x86_64 #1
2021-11-25 22:12:40 [ 3322.469195] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
2021-11-25 22:12:40 [ 3322.477021] task:P2P Transfer -  state:D stack:    0 pid:58844 ppid:  4294 flags:0x00000080
2021-11-25 22:12:40 [ 3322.485366] Call Trace:
2021-11-25 22:12:40 [ 3322.487823]  __schedule+0x291/0x7a0
2021-11-25 22:12:40 [ 3322.491321]  schedule+0x3c/0xa0
2021-11-25 22:12:40 [ 3322.494469]  rwsem_down_write_slowpath+0x2f0/0x4a0
2021-11-25 22:12:40 [ 3322.499261]  path_openat+0x279/0x1050
2021-11-25 22:12:40 [ 3322.502930]  ? task_numa_fault+0x74c/0xae0
2021-11-25 22:12:40 [ 3322.507036]  do_filp_open+0x93/0x100
2021-11-25 22:12:40 [ 3322.510615]  ? handle_mm_fault+0xb1f/0xb70
2021-11-25 22:12:40 [ 3322.514714]  ? __check_object_size+0x162/0x180
2021-11-25 22:12:40 [ 3322.519159]  do_sys_openat2+0x21e/0x2d0
2021-11-25 22:12:40 [ 3322.523001]  do_sys_open+0x4b/0x80
2021-11-25 22:12:40 [ 3322.526415]  do_syscall_64+0x33/0x40
2021-11-25 22:12:40 [ 3322.530003]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
2021-11-25 22:12:40 [ 3322.535052] RIP: 0033:0x7f8e8096b0d6
2021-11-25 22:12:40 [ 3322.538632] RSP: 002b:00007f8a070ae640 EFLAGS: 00000293 ORIG_RAX: 0000000000000101
2021-11-25 22:12:40 [ 3322.546201] RAX: ffffffffffffffda RBX: 000000045964b830 RCX: 00007f8e8096b0d6
2021-11-25 22:12:40 [ 3322.553342] RDX: 00000000000000c1 RSI: 00007f89d807bb40 RDI: 00000000ffffff9c
2021-11-25 22:12:40 [ 3322.560473] RBP: 00007f8a070ae6e0 R08: 0000000000000000 R09: 000000008b2c96e6
2021-11-25 22:12:40 [ 3322.567606] R10: 00000000000001b6 R11: 0000000000000293 R12: 00000000000001b6
2021-11-25 22:12:40 [ 3322.574739] R13: 00000000000000c1 R14: 00007f89d807bb40 R15: 00007f88d8008b48
2021-11-25 22:12:40 [ 3322.581885] INFO: task vega_izum_si_03:58847 blocked for more than 124 seconds.
2021-11-25 22:12:40 [ 3322.589193]       Tainted: G           O      5.10.78-2.el8.x86_64 #1
2021-11-25 22:12:40 [ 3322.595634] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
2021-11-25 22:12:40 [ 3322.603458] task:vega_izum_si_03 state:D stack:    0 pid:58847 ppid:  4294 flags:0x00000080
2021-11-25 22:12:40 [ 3322.611806] Call Trace:
2021-11-25 22:12:40 [ 3322.614262]  __schedule+0x291/0x7a0
2021-11-25 22:12:40 [ 3322.617768]  ? __touch_cap+0x1f/0xd0 [ceph]
2021-11-25 22:12:40 [ 3322.621957]  schedule+0x3c/0xa0
2021-11-25 22:12:40 [ 3322.625099]  rwsem_down_read_slowpath+0x2f6/0x4a0
2021-11-25 22:12:40 [ 3322.629806]  ? lookup_fast+0xae/0x150
2021-11-25 22:12:40 [ 3322.633472]  walk_component+0x129/0x1b0
2021-11-25 22:12:40 [ 3322.637315]  ? path_init+0x2ef/0x360
2021-11-25 22:12:40 [ 3322.640902]  path_lookupat.isra.42+0x67/0x140
2021-11-25 22:12:40 [ 3322.645258]  filename_lookup.part.56+0xa0/0x170
2021-11-25 22:12:40 [ 3322.649793]  ? __check_object_size+0x162/0x180
2021-11-25 22:12:40 [ 3322.654238]  ? strncpy_from_user+0x46/0x1e0
2021-11-25 22:12:40 [ 3322.658422]  vfs_statx+0x72/0x110
2021-11-25 22:12:40 [ 3322.661740]  __do_sys_newstat+0x39/0x70
2021-11-25 22:12:40 [ 3322.665584]  ? syscall_trace_enter.isra.19+0x123/0x190
2021-11-25 22:12:40 [ 3322.670722]  do_syscall_64+0x33/0x40
2021-11-25 22:12:40 [ 3322.674304]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
2021-11-25 22:12:40 [ 3322.679375] RIP: 0033:0x7f8e8026ba79
2021-11-25 22:12:40 [ 3322.682949] RSP: 002b:00007f8a05d9d048 EFLAGS: 00000246 ORIG_RAX: 0000000000000004
2021-11-25 22:12:40 [ 3322.690517] RAX: ffffffffffffffda RBX: 00007f8a05d9d050 RCX: 00007f8e8026ba79
2021-11-25 22:12:40 [ 3322.697650] RDX: 00007f8a05d9d050 RSI: 00007f8a05d9d050 RDI: 00007f8900018220
2021-11-25 22:12:40 [ 3322.704783] RBP: 00007f8a05d9d100 R08: 0000000000000000 R09: 0000000459723280
2021-11-25 22:12:40 [ 3322.711917] R10: 00007f8e687103a5 R11: 0000000000000246 R12: 00007f8900018220
2021-11-25 22:12:40 [ 3322.719045] R13: 00007f8c540c7b48 R14: 00007f8a05d9d118 R15: 00007f8c540c7800
2021-11-25 22:13:46 [ 3388.045080] ceph: mds0 hung


--
_____________________________________________________________
     prof. dr. Andrej Filipcic,   E-mail:Andrej.Filipcic@xxxxxx
     Department of Experimental High Energy Physics - F9
     Jozef Stefan Institute, Jamova 39, P.o.Box 3000
     SI-1001 Ljubljana, Slovenia
     Tel.: +386-1-477-3674    Fax: +386-1-477-3166
-------------------------------------------------------------
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx


--
_____________________________________________________________
   prof. dr. Andrej Filipcic,   E-mail: Andrej.Filipcic@xxxxxx
   Department of Experimental High Energy Physics - F9
   Jozef Stefan Institute, Jamova 39, P.o.Box 3000
   SI-1001 Ljubljana, Slovenia
   Tel.: +386-1-477-3674    Fax: +386-1-425-7074
-------------------------------------------------------------






