Hello everyone,
We had an incident in which a client rebooted after experiencing
some issues with a Ceph cluster.
The other clients that consume RBD images from the same Ceph cluster
logged a libceph-related warning at the time of the reboot.
The warning looks like this:
Dec 10 21:29:52 xxxx kernel: [5830277.680860] WARNING: CPU: 15 PID: 8113
at net/ceph/osd_client.c:490 request_reinit+0x141/0x180 [libceph]
Dec 10 21:29:52 xxxx kernel: [5830277.691032] Modules linked in:
nfnetlink_queue bluetooth ocfs2 quota_tree binfmt_misc tcp_diag
inet_diag veth ip_set ip6table_filter ip6_tables xt_nat xt_tcpudp
xt_multiport xt_conntrack xt_addrtype iptable_nat nf_conntrack_ipv4
nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack rbd libceph ocfs2_dlmfs
ocfs2_stack_o2cb ocfs2_dlm ocfs2_nodemanager ocfs2_stackglue bonding
softdog iptable_filter nfnetlink_log nfnetlink intel_rapl sb_edac
edac_core x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel kvm
irqbypass crct10dif_pclmul crc32_pclmul ipmi_ssif ghash_clmulni_intel
pcbc ast ttm aesni_intel aes_x86_64 drm_kms_helper crypto_simd
glue_helper snd_pcm cryptd drm snd_timer snd fb_sys_fops intel_cstate
syscopyarea input_leds soundcore joydev sysfillrect intel_rapl_perf
sysimgblt mei_me pcspkr mei ioatdma
Dec 10 21:29:52 xxxx kernel: [5830277.765547] lpc_ich shpchp wmi
ipmi_si ipmi_devintf ipmi_msghandler nfit acpi_pad acpi_power_meter
mac_hid vhost_net vhost macvtap macvlan ib_iser rdma_cm iw_cm ib_cm
ib_core configfs iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi
sunrpc ip_tables x_tables autofs4 raid10 raid456 async_raid6_recov
async_memcpy async_pq async_xor async_tx xor raid6_pq libcrc32c raid0
multipath linear raid1 hid_generic usbkbd usbmouse usbhid hid igb ixgbe
i2c_algo_bit dca ptp ahci pps_core i2c_i801 libahci mdio fjes [last
unloaded: quota_tree]
Dec 10 21:29:52 xxxx kernel: [5830277.816275] CPU: 15 PID: 8113 Comm:
kworker/15:0 Tainted: G W 4.10.17-1-pve #1
Dec 10 21:29:52 xxxx kernel: [5830277.825564] Hardware name: Supermicro
SYS-1028U-TR4T+/X10DRU-i+, BIOS 2.0c 04/21/2017
Dec 10 21:29:52 xxxx kernel: [5830277.834272] Workqueue: events
handle_timeout [libceph]
Dec 10 21:29:52 xxxx kernel: [5830277.840307] Call Trace:
Dec 10 21:29:52 xxxx kernel: [5830277.843620] dump_stack+0x63/0x81
Dec 10 21:29:52 xxxx kernel: [5830277.847846] __warn+0xcb/0xf0
Dec 10 21:29:52 xxxx kernel: [5830277.851758] warn_slowpath_null+0x1d/0x20
Dec 10 21:29:52 xxxx kernel: [5830277.856798] request_reinit+0x141/0x180
[libceph]
Dec 10 21:29:52 xxxx kernel: [5830277.862403] handle_timeout+0x307/0x5b0
[libceph]
Dec 10 21:29:52 xxxx kernel: [5830277.868116] process_one_work+0x1fc/0x4b0
Dec 10 21:29:52 xxxx kernel: [5830277.873069] worker_thread+0x4b/0x500
Dec 10 21:29:52 xxxx kernel: [5830277.877561] kthread+0x109/0x140
Dec 10 21:29:52 xxxx kernel: [5830277.881720] ?
process_one_work+0x4b0/0x4b0
Dec 10 21:29:52 xxxx kernel: [5830277.886851] ?
kthread_create_on_node+0x60/0x60
Dec 10 21:29:52 xxxx kernel: [5830277.892323] ret_from_fork+0x2c/0x40
Dec 10 21:29:52 xxxx kernel: [5830277.896939] ---[ end trace
afd30825d5ecd451 ]---
I wonder if this is a bug in KRBD.
We are running Ceph 10.2.5 on the clients, but our Ceph cluster is
running 10.2.9.
Please let me know if you need more information about our environment.
Kind regards,
--
Fernando Cid O.
Ingeniero de Operaciones
AltaVoz S.A.
http://www.altavoz.net
Viña del Mar, Valparaiso:
2 Poniente 355 of 53
+56 32 276 8060
Santiago:
San Pío X 2460, oficina 304, Providencia
+56 2 2585 4264