On 19/09/2022 23:32, j.rasakunasingam@xxxxxxxxxxxx wrote:
Hi, we have a Ceph cluster running with 3 controller nodes and 6 storage nodes. We use iscsi/tcmu-runner (16.2.9) to connect VMware to Ceph. We are facing an issue where we lose the connection to the iscsi gateways, so the ESXi hosts connected to them no longer work properly. After restarting the servers it works again, but later the tcmu-runner docker container restarts itself again. The only thing I could find from tcmu-runner is this:

2022-09-12 12:33:24.186 81 [ERROR] tcmur_cmdproc_thread:864: ppoll received unexpected revent: 0x19
2022-09-12 12:33:24.192 81 [ERROR] tcmur_cmdproc_thread:864: ppoll received unexpected revent: 0x19
2022-09-12 12:33:24.198 81 [ERROR] tcmur_cmdproc_thread:864: ppoll received unexpected revent: 0x19
2022-09-12 12:33:24.201 81 [ERROR] tcmur_cmdproc_thread:864: ppoll received unexpected revent: 0x19
2022-09-12 12:33:24.205 81 [ERROR] tcmur_cmdproc_thread:864: ppoll received unexpected revent: 0x19
2022-09-12 12:33:24.208 81 [ERROR] tcmur_cmdproc_thread:864: ppoll received unexpected revent: 0x19
2022-09-12 12:33:24.211 81 [ERROR] tcmur_cmdproc_thread:864: ppoll received unexpected revent: 0x19
2022-09-12 12:33:24.279 81 [ERROR] tcmu_cfgfs_set_str:294: Kernel does not support configfs file /sys/kernel/config/target/core/user_3/pool_ag.image_ag/action/block_dev.
2022-09-12 12:33:24.279 81 [ERROR] tcmu_cfgfs_set_str:294: Kernel does not support configfs file /sys/kernel/config/target/core/user_3/pool_ag.image_ag/action/block_dev.
I think your kernel is a little old. Without the 'block_dev' configfs file we cannot make sure that in-flight IOs have finished before closing the device, and that is what causes all of the other errors above and below.
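If it helps, here is a minimal sketch (not part of the original thread) for checking whether the running kernel already exposes that file. The glob pattern simply mirrors the path from the log, so adjust it to your configfs layout; configfs must be mounted and TCMU-backed devices configured for it to find anything:

import glob

# Look for the 'block_dev' action file of any tcmu-backed (user_*) backstore.
pattern = "/sys/kernel/config/target/core/user_*/*/action/block_dev"
matches = glob.glob(pattern)

if matches:
    for path in matches:
        print("supported:", path)
else:
    print("no block_dev action file found; the kernel likely predates "
          "commit 892782caf19a ('tcmu: allow userspace to reset ring')")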
Please update your kernel to one that includes:

commit 892782caf19a97ccc95df51b3bb659ecacff986a
Author: Mike Christie <mchristi@xxxxxxxxxx>
Date:   Tue Dec 19 04:03:58 2017 -0600

    tcmu: allow userspace to reset ring

    This patch adds 2 tcmu attrs to block/unblock a device and reset the
    ring buffer. They are used when the userspace daemon has crashed or
    been forced to shut down while IO is executing. On restart, the
    daemon can block the device so new IO is not sent to userspace while
    it puts the ring in a clean state.

    Notes:
    The reset ring operation is specific to tcmu, but the block one could
    be generic. I kept it tcmu specific, because it requires some extra
    locking/state checks in the main IO path and since other backend
    modules did not need this functionality I thought only tcmu should
    take the perf hit.

    Signed-off-by: Mike Christie <mchristi@xxxxxxxxxx>
    Signed-off-by: Nicholas Bellinger <nab@xxxxxxxxxxxxxxx>
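For illustration only, the block/reset/unblock sequence that the commit message describes would look roughly like the sketch below. tcmu-runner performs this itself on restart; the action/reset_ring file name and the "1"/"0" values are assumptions based on the commit text and the block_dev path in the log, and the device path is just the example from the log above:

import os

# Hypothetical example device, taken from the path shown in the log above.
DEV = "/sys/kernel/config/target/core/user_3/pool_ag.image_ag"

def write_action(attr: str, value: str) -> None:
    # Write one of the tcmu 'action' configfs attributes (requires root).
    with open(os.path.join(DEV, "action", attr), "w") as f:
        f.write(value)

write_action("block_dev", "1")   # block the device so no new IO reaches userspace
write_action("reset_ring", "1")  # put the command ring back into a clean state
write_action("block_dev", "0")   # unblock the device and resume normal IO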
2022-09-12 12:33:24.754 81 [ERROR] tcmur_cmdproc_thread:864: ppoll received unexpected revent: 0x19
2022-09-12 12:33:24.965 81 [ERROR] tcmur_cmdproc_thread:864: ppoll received unexpected revent: 0x19
2022-09-16 10:07:42.829 81 [ERROR] tcmu_rbd_service_status_update:140 rbd/pool_scutum.image_scutum: Could not update service status. (Err -107)
2022-09-16 10:07:42.829 81 [ERROR] __tcmu_report_event:173 rbd/pool_scutum.image_scutum: Could not report events. Error -107.
2022-09-16 10:08:51.126 81 [ERROR] tcmu_rbd_service_status_update:140 rbd/pool_scutum.image_scutum: Could not update service status. (Err -107)
2022-09-16 10:08:51.126 81 [ERROR] __tcmu_report_event:173 rbd/pool_scutum.image_scutum: Could not report events. Error -107.
2022-09-16 10:13:05.313 81 [ERROR] tcmu_rbd_service_status_update:140 rbd/pool_scutum.image_scutum: Could not update service status. (Err -107)
2022-09-16 10:13:05.313 81 [ERROR] __tcmu_report_event:173 rbd/pool_scutum.image_scutum: Could not report events. Error -107.
2022-09-16 10:20:23.229 81 [ERROR] tcmu_rbd_service_status_update:140 rbd/pool_scutum.image_scutum: Could not update service status. (Err -107)
2022-09-16 10:20:23.229 81 [ERROR] __tcmu_report_event:173 rbd/pool_scutum.image_scutum: Could not report events. Error -107.
2022-09-16 10:27:21.421 81 [ERROR] tcmu_rbd_service_status_update:140 rbd/pool_scutum.image_scutum: Could not update service status. (Err -107)
2022-09-16 10:27:21.422 81 [ERROR] __tcmu_report_event:173 rbd/pool_scutum.image_scutum: Could not report events. Error -107.
2022-09-16 10:28:02.669 81 [ERROR] tcmu_acquire_dev_lock:432 rbd/pool_desktop.image_desktop: Could not reopen device while taking lock. Err -16.
2022-09-16 10:29:47.049 81 [ERROR] tcmu_rbd_service_status_update:140 rbd/pool_ag.image_ag: Could not update service status. (Err -107)
2022-09-16 10:29:47.049 81 [ERROR] __tcmu_report_event:173 rbd/pool_ag.image_ag: Could not report events. Error -107.
2022-09-16 10:31:26.025 81 [ERROR] tcmu_rbd_service_status_update:140 rbd/pool_desktop.image_desktop: Could not update service status. (Err -107)
2022-09-16 10:31:26.025 81 [ERROR] __tcmu_report_event:173 rbd/pool_desktop.image_desktop: Could not report events. Error -107.
2022-09-16 10:48:23.553 81 [ERROR] tcmu_acquire_dev_lock:432 rbd/pool_desktop.image_desktop: Could not reopen device while taking lock. Err -16.
2022-09-16 10:49:58.223 81 [ERROR] tcmu_acquire_dev_lock:432 rbd/pool_desktop.image_desktop: Could not reopen device while taking lock. Err -16.
2022-09-16 10:54:06.798 81 [ERROR] tcmu_acquire_dev_lock:432 rbd/pool_desktop.image_desktop: Could not reopen device while taking lock. Err -16.
2022-09-16 10:56:48.497 81 [ERROR] tcmu_acquire_dev_lock:432 rbd/pool_desktop.image_desktop: Could not reopen device while taking lock. Err -16.
2022-09-16 10:59:48.393 81 [ERROR] tcmu_acquire_dev_lock:432 rbd/pool_desktop.image_desktop: Could not reopen device while taking lock. Err -16.
2022-09-16 11:01:28.993 81 [ERROR] tcmu_acquire_dev_lock:432 rbd/pool_desktop.image_desktop: Could not reopen device while taking lock. Err -16.
2022-09-16 11:09:42.599 81 [ERROR] tcmu_rbd_service_status_update:140 rbd/pool_desktop.image_desktop: Could not update service status. (Err -107)
2022-09-16 11:09:42.600 81 [ERROR] __tcmu_report_event:173 rbd/pool_desktop.image_desktop: Could not report events. Error -107.

Thanks in advance.
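For reference, the negative error codes in the log above are plain Linux errno values; a quick way to decode them (not from the original mail):

import errno
import os

for err in (-107, -16):
    code = -err
    print(f"Err {err}: {errno.errorcode.get(code, '?')} ({os.strerror(code)})")
# Err -107: ENOTCONN (Transport endpoint is not connected)
# Err -16: EBUSY (Device or resource busy)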
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx