Hi Andrej,
The upstream kernel already has a commit that addresses this warning:
commit 0078ea3b0566e3da09ae8e1e4fbfd708702f2876
Author: Jeff Layton <jlayton@xxxxxxxxxx>
Date: Tue Nov 9 09:54:49 2021 -0500
ceph: don't check for quotas on MDS stray dirs
玮文 胡 reported seeing the WARN_RATELIMIT pop when writing to an
inode that had been transplanted into the stray dir. The client was
trying to look up the quotarealm info from the parent and that tripped
the warning.
Change the ceph_vino_is_reserved helper to not throw a warning for
MDS stray directories (0x100 - 0x1ff), only for reserved dirs that
are not in that range.
Also, fix ceph_has_realms_with_quotas to return false when encountering
a reserved inode.
URL: https://tracker.ceph.com/issues/53180
Reported-by: Hu Weiwen <sehuww@xxxxxxxxxxxxxxxx>
Signed-off-by: Jeff Layton <jlayton@xxxxxxxxxx>
Reviewed-by: Luis Henriques <lhenriques@xxxxxxx>
Reviewed-by: Xiubo Li <xiubli@xxxxxxxxxx>
Signed-off-by: Ilya Dryomov <idryomov@xxxxxxxxx>
It's not a bug, just a warning; you can safely ignore it.
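For illustration, the check described in the commit message can be modeled roughly as below. This is a userspace Python sketch, not the kernel code: the function names and constant names are hypothetical, the lower bound of the reserved range (0x100) and its upper bound (0x1000) are assumptions made for the example, and only the stray-dir range 0x100 - 0x1ff is taken from the commit message.

```python
# Userspace model of the reserved-inode check described in the commit message.
# Names are hypothetical; this is not the kernel implementation.

STRAY_DIR_FIRST = 0x100  # from the commit message: first MDS stray dir inode
STRAY_DIR_LAST = 0x1FF   # from the commit message: last MDS stray dir inode
RESERVED_FIRST = 0x100   # ASSUMPTION: start of the reserved inode range
RESERVED_END = 0x1000    # ASSUMPTION: first inode number past the reserved range


def is_reserved(ino):
    """True if the inode number falls in the (assumed) reserved range."""
    return RESERVED_FIRST <= ino < RESERVED_END


def should_warn(ino):
    """True only for reserved inodes outside the MDS stray dir range."""
    in_stray_range = STRAY_DIR_FIRST <= ino <= STRAY_DIR_LAST
    return is_reserved(ino) and not in_stray_range


if __name__ == "__main__":
    for ino in (0x1, 0x101, 0x200, 0x1000):
        print(hex(ino), "reserved:", is_reserved(ino), "warn:", should_warn(ino))
```

For the inode number in the quoted log (0x101), `is_reserved` returns True and `should_warn` returns False in this model, which matches the behaviour the commit describes: the inode is still treated as reserved, but no warning is emitted.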
Thanks.
On 8/16/22 7:39 PM, Andrej Filipcic wrote:
Hi,
We experienced massive node failures when a user who had exceeded their
CephFS quota submitted many jobs to a Slurm cluster. Home is on CephFS.
The nodes keep working for some time, but they eventually freeze due to
too many stuck CPUs.
Is this a kernel Ceph client bug? We are running kernel 5.10.123; the Ceph
cluster is 16.2.9.
Best regards,
Andrej
2022-08-15T20:08:01+02:00 cn0539 kernel: ------------[ cut here ]------------
2022-08-15T20:08:01+02:00 cn0539 kernel: Attempt to access reserved
inode number 0x101
2022-08-15T20:08:01+02:00 cn0539 kernel: WARNING: CPU: 172 PID:
4185848 at fs/ceph/super.h:547 __lookup_inode+0x161/0x180 [ceph]
2022-08-15T20:08:14+02:00 cn0539 kernel: Modules linked in: squashfs
loop overlay fuse ceph libceph mgc(O) lustre(O) lmv(O) mdc(O) fid(O)
lov(O) fld(O) osc(O) ko2iblnd(O) ptlrpc(O) obdclass(O) lnet(O)
libcfs(O) rpcsec_gss_krb5 auth_rpcgss nfsv4 dns_resolver nfs lockd
grace nfs_ssc fscache rfkill ipmi_ssif nft_limit amd64_edac_mod
edac_mce_amd amd_energy nft_ct kvm_amd nf_conntrack
nf_defrag_ipv6 kvm nf_defrag_ipv4 irqbypass crct10dif_pclmul
crc32_pclmul ghash_clmulni_intel rapl pcspkr nf_tables libcrc32c
nfnetlink sp5100_tco ccp acpi_ipmi k10temp i2c_piix4 ipmi_si
rdma_ucm(O) rdma_cm(O) iw_cm(O) acpi_cpufreq ib_ipoib(O) ib_cm(O)
ib_umad(O) sunrpc vfat fat ext4 mbcache jbd2 mlx5_ib(O) ib_uverbs(O)
ib_core(O) mlx5_core(O) mlxfw(O) pci_hyperv_intf crc32c_intel
tls ahci nvme psample igb libahci mlxdevm(O) auxiliary(O) nvme_core
i2c_algo_bit libata t10_pi dca mlx_compat(O) pinctrl_amd xpmem(O)
ipmi_devintf ipmi_msghandler
2022-08-15T20:08:14+02:00 cn0539 kernel: CPU: 172 PID: 4185848 Comm:
slurm_script Tainted: G W O 5.10.123-2.el8.x86_64 #1
2022-08-15T20:08:16+02:00 cn0539 kernel: Hardware name: To be filled
by O.E.M. To be filled by O.E.M./CER, BIOS BIOS_RME090.22.37.001
10/05/2021
2022-08-15T20:08:17+02:00 cn0539 kernel: RIP:
0010:__lookup_inode+0x161/0x180 [ceph]
2022-08-15T20:08:18+02:00 cn0539 kernel: Code: dd 48 85 db 0f 85 27 ff
ff ff 45 85 e4 0f 89 5d ff ff ff 49 63 ec e9 16 ff ff ff 48 89 de 48
c7 c7 58 bb 40 c1 e8 1e 21 d8 d0 <0f> 0b e9 3f ff ff ff e8 53 3d 01 00
eb c6 be 03 00 00 00 e8 97 a2
2022-08-15T20:08:21+02:00 cn0539 kernel: RSP: 0018:ffffb6d8de33fc18
EFLAGS: 00010286
2022-08-15T20:08:22+02:00 cn0539 kernel: RAX: 0000000000000000 RBX:
0000000000000101 RCX: 0000000000000027
2022-08-15T20:08:23+02:00 cn0539 kernel: RDX: 0000000000000027 RSI:
ffff95f2afd207e0 RDI: ffff95f2afd207e8
2022-08-15T20:08:24+02:00 cn0539 kernel: RBP: ffff965345e568a0 R08:
0000000000000000 R09: c0000000fffeffff
2022-08-15T20:08:25+02:00 cn0539 kernel: R10: 0000000000000001 R11:
ffffb6d8de33fa20 R12: ffff959e55081aa8
2022-08-15T20:08:27+02:00 cn0539 kernel: R13: ffff965345e568a8 R14:
ffff9593ea333e00 R15: ffff959e55081a80
2022-08-15T20:08:28+02:00 cn0539 kernel: FS: 00007fbf7c8ba740(0000)
GS:ffff95f2afd00000(0000) knlGS:0000000000000000
2022-08-15T20:08:29+02:00 cn0539 kernel: CS: 0010 DS: 0000 ES: 0000
CR0: 0000000080050033
2022-08-15T20:08:30+02:00 cn0539 kernel: CR2: 0000564324b8a588 CR3:
0000004d51150000 CR4: 0000000000150ee0
2022-08-15T20:08:31+02:00 cn0539 kernel: Call Trace:
2022-08-15T20:08:31+02:00 cn0539 kernel: ? __do_request+0x3f0/0x450
[ceph]
2022-08-15T20:08:32+02:00 cn0539 kernel: ceph_lookup_inode+0xa/0x30
[ceph]
2022-08-15T20:08:34+02:00 cn0539 kernel:
lookup_quotarealm_inode.isra.9+0x188/0x210 [ceph]
2022-08-15T20:08:34+02:00 cn0539 kernel:
check_quota_exceeded+0x1bc/0x220 [ceph]
2022-08-15T20:08:34+02:00 cn0539 kernel: ceph_write_iter+0x1bf/0xc90
[ceph]
2022-08-15T20:08:35+02:00 cn0539 kernel: ? path_openat+0x666/0x1050
2022-08-15T20:08:36+02:00 cn0539 kernel: ? __touch_cap+0x1f/0xd0 [ceph]
2022-08-15T20:08:36+02:00 cn0539 kernel: ?
ptep_set_access_flags+0x23/0x30
2022-08-15T20:08:37+02:00 cn0539 kernel: ? wp_page_reuse+0x5f/0x70
2022-08-15T20:08:38+02:00 cn0539 kernel: ? new_sync_write+0x11f/0x1b0
2022-08-15T20:08:38+02:00 cn0539 kernel: new_sync_write+0x11f/0x1b0
2022-08-15T20:08:39+02:00 cn0539 kernel: vfs_write+0x1bd/0x270
2022-08-15T20:08:40+02:00 cn0539 kernel: ksys_write+0x59/0xd0
2022-08-15T20:08:40+02:00 cn0539 kernel: do_syscall_64+0x33/0x40
2022-08-15T20:08:41+02:00 cn0539 kernel:
entry_SYSCALL_64_after_hwframe+0x44/0xa9
2022-08-15T20:08:41+02:00 cn0539 kernel: RIP: 0033:0x7fbf7bfc65a8
2022-08-15T20:08:42+02:00 cn0539 kernel: Code: 89 02 48 c7 c0 ff ff ff
ff eb b3 0f 1f 80 00 00 00 00 f3 0f 1e fa 48 8d 05 f5 3f 2a 00 8b 00
85 c0 75 17 b8 01 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 58 c3 0f 1f 80
00 00 00 00 41 54 49 89 d4 55
2022-08-15T20:08:45+02:00 cn0539 kernel: RSP: 002b:00007ffcc4ad6dd8
EFLAGS: 00000246 ORIG_RAX: 0000000000000001
2022-08-15T20:08:46+02:00 cn0539 kernel: RAX: ffffffffffffffda RBX:
0000000000000417 RCX: 00007fbf7bfc65a8
2022-08-15T20:08:47+02:00 cn0539 kernel: RDX: 0000000000000417 RSI:
0000564324baa470 RDI: 0000000000000004
2022-08-15T20:08:48+02:00 cn0539 kernel: RBP: 0000564324baa470 R08:
0000000000000008 R09: 00224b5341545f52
2022-08-15T20:08:49+02:00 cn0539 kernel: R10: 0000000000000025 R11:
0000000000000246 R12: 0000564324b9cf50
2022-08-15T20:08:51+02:00 cn0539 kernel: R13: 0000000000000000 R14:
0000564324ba6200 R15: 0000564324b9cf50
2022-08-15T20:08:52+02:00 cn0539 kernel: ---[ end trace a655820d09b78154 ]---
2022-08-15T20:09:58+02:00 cn0539 kernel: mlx5_core 0000:61:00.0:
mlx5_cmd_out_err:800:(pid 4155261): MAD_IFC(0x50d) op_mod(0x0) failed,
status bad packet (discarded)(0x30), syndrome (0xea9eb5), err(-22)
2022-08-15T20:09:58+02:00 cn0539 kernel: mlx5_core 0000:61:00.0:
mlx5_cmd_out_err:800:(pid 4155261): MAD_IFC(0x50d) op_mod(0x0) failed,
status bad packet (discarded)(0x30), syndrome (0xea9eb5), err(-22)
2022-08-15T20:10:12+02:00 cn0539 kernel: ------------[ cut here ]------------
2022-08-15T20:10:12+02:00 cn0539 kernel: Attempt to access reserved
inode number 0x101
2022-08-15T20:10:12+02:00 cn0539 kernel: WARNING: CPU: 78 PID: 14675
at fs/ceph/super.h:547 __lookup_inode+0x161/0x180 [ceph]
2022-08-15T20:10:26+02:00 cn0539 kernel: Modules linked in: squashfs
loop overlay fuse ceph libceph mgc(O) lustre(O) lmv(O) mdc(O) fid(O)
lov(O) fld(O) osc(O) ko2iblnd(O) ptlrpc(O) obdclass(O) lnet(O)
libcfs(O) rpcsec_gss_krb5 auth_rpcgss nfsv4 dns_resolver nfs lockd
grace nfs_ssc fscache rfkill ipmi_ssif nft_limit amd64_edac_mod
edac_mce_amd amd_energy nft_ct kvm_amd nf_conntrack
nf_defrag_ipv6 kvm nf_defrag_ipv4 irqbypass crct10dif_pclmul
crc32_pclmul ghash_clmulni_intel rapl pcspkr nf_tables libcrc32c
nfnetlink sp5100_tco ccp acpi_ipmi k10temp i2c_piix4 ipmi_si
rdma_ucm(O) rdma_cm(O) iw_cm(O) acpi_cpufreq ib_ipoib(O) ib_cm(O)
ib_umad(O) sunrpc vfat fat ext4 mbcache jbd2 mlx5_ib(O) ib_uverbs(O)
ib_core(O) mlx5_core(O) mlxfw(O) pci_hyperv_intf crc32c_intel
tls ahci nvme psample igb libahci mlxdevm(O) auxiliary(O) nvme_core
i2c_algo_bit libata t10_pi dca mlx_compat(O) pinctrl_amd xpmem(O)
ipmi_devintf ipmi_msghandler
2022-08-15T20:10:26+02:00 cn0539 kernel: CPU: 78 PID: 14675 Comm:
slurm_script Tainted: G W O 5.10.123-2.el8.x86_64 #1
2022-08-15T20:10:27+02:00 cn0539 kernel: Hardware name: To be filled
by O.E.M. To be filled by O.E.M./CER, BIOS BIOS_RME090.22.37.001
10/05/2021
2022-08-15T20:10:29+02:00 cn0539 kernel: RIP:
0010:__lookup_inode+0x161/0x180 [ceph]
2022-08-15T20:10:30+02:00 cn0539 kernel: Code: dd 48 85 db 0f 85 27 ff
ff ff 45 85 e4 0f 89 5d ff ff ff 49 63 ec e9 16 ff ff ff 48 89 de 48
c7 c7 58 bb 40 c1 e8 1e 21 d8 d0 <0f> 0b e9 3f ff ff ff e8 53 3d 01 00
eb c6 be 03 00 00 00 e8 97 a2
2022-08-15T20:10:33+02:00 cn0539 kernel: RSP: 0018:ffffb6d8d2ab7c18
EFLAGS: 00010286
2022-08-15T20:10:33+02:00 cn0539 kernel: RAX: 0000000000000000 RBX:
0000000000000101 RCX: 0000000000000027
2022-08-15T20:10:35+02:00 cn0539 kernel: RDX: 0000000000000027 RSI:
ffff9632af9a07e0 RDI: ffff9632af9a07e8
2022-08-15T20:10:36+02:00 cn0539 kernel: RBP: ffff965345e568a0 R08:
0000000000000000 R09: c0000000fffeffff
2022-08-15T20:10:37+02:00 cn0539 kernel: R10: 0000000000000001 R11:
ffffb6d8d2ab7a20 R12: ffff959e55081aa8
2022-08-15T20:10:38+02:00 cn0539 kernel: R13: ffff965345e568a8 R14:
ffff9593f4994600 R15: ffff959e55081a80
2022-08-15T20:10:39+02:00 cn0539 kernel: FS: 00007f660e249740(0000)
GS:ffff9632af980000(0000) knlGS:0000000000000000
2022-08-15T20:10:40+02:00 cn0539 kernel: CS: 0010 DS: 0000 ES: 0000
CR0: 0000000080050033
2022-08-15T20:10:41+02:00 cn0539 kernel: CR2: 000055d6b3db5588 CR3:
0000008a75ce8000 CR4: 0000000000150ee0
2022-08-15T20:10:42+02:00 cn0539 kernel: Call Trace:
2022-08-15T20:10:43+02:00 cn0539 kernel: ? __do_request+0x3f0/0x450
[ceph]
2022-08-15T20:10:43+02:00 cn0539 kernel: ceph_lookup_inode+0xa/0x30
[ceph]
2022-08-15T20:10:44+02:00 cn0539 kernel:
lookup_quotarealm_inode.isra.9+0x188/0x210 [ceph]
2022-08-15T20:10:45+02:00 cn0539 kernel:
check_quota_exceeded+0x1bc/0x220 [ceph]
2022-08-15T20:10:46+02:00 cn0539 kernel: ceph_write_iter+0x1bf/0xc90
[ceph]
2022-08-15T20:10:47+02:00 cn0539 kernel: ? path_openat+0x666/0x1050
2022-08-15T20:10:47+02:00 cn0539 kernel: ? __do_request+0x3f0/0x450
[ceph]
2022-08-15T20:10:48+02:00 cn0539 kernel: ?
__ceph_put_cap_refs+0x30/0x380 [ceph]
2022-08-15T20:10:49+02:00 cn0539 kernel: ?
ptep_set_access_flags+0x23/0x30
2022-08-15T20:10:49+02:00 cn0539 kernel: ? wp_page_reuse+0x5f/0x70
2022-08-15T20:10:50+02:00 cn0539 kernel: ? new_sync_write+0x11f/0x1b0
2022-08-15T20:10:51+02:00 cn0539 kernel: new_sync_write+0x11f/0x1b0
2022-08-15T20:10:51+02:00 cn0539 kernel: vfs_write+0x1bd/0x270
2022-08-15T20:10:52+02:00 cn0539 kernel: ksys_write+0x59/0xd0
2022-08-15T20:10:52+02:00 cn0539 kernel: do_syscall_64+0x33/0x40
2022-08-15T20:10:53+02:00 cn0539 kernel:
entry_SYSCALL_64_after_hwframe+0x44/0xa9
2022-08-15T20:10:54+02:00 cn0539 kernel: RIP: 0033:0x7f660d9555a8
2022-08-15T20:10:54+02:00 cn0539 kernel: Code: 89 02 48 c7 c0 ff ff ff
ff eb b3 0f 1f 80 00 00 00 00 f3 0f 1e fa 48 8d 05 f5 3f 2a 00 8b 00
85 c0 75 17 b8 01 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 58 c3 0f 1f 80
00 00 00 00 41 54 49 89 d4 55
2022-08-15T20:10:57+02:00 cn0539 kernel: RSP: 002b:00007ffe2286c368
EFLAGS: 00000246 ORIG_RAX: 0000000000000001
2022-08-15T20:10:58+02:00 cn0539 kernel: RAX: ffffffffffffffda RBX:
0000000000000417 RCX: 00007f660d9555a8
2022-08-15T20:10:59+02:00 cn0539 kernel: RDX: 0000000000000417 RSI:
000055d6b3dd5470 RDI: 0000000000000004
2022-08-15T20:11:01+02:00 cn0539 kernel: RBP: 000055d6b3dd5470 R08:
0000000000000008 R09: 00224b5341545f52
2022-08-15T20:11:02+02:00 cn0539 kernel: R10: 0000000000000025 R11:
0000000000000246 R12: 000055d6b3dc7f50
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx