Hi Nigel,

On Fri, 1 Dec 2017 13:32:43 +0000, nigel davies wrote:

> Ceph version 10.2.5
>
> I have had a Ceph cluster running for a few months, with iSCSI servers that
> are linked to Ceph by RBD.
>
> All of a sudden I am starting to see the ESXi server lose the iSCSI
> datastore (disk space goes to 0 B), and I can only fix this by rebooting the
> iSCSI server.
>
> When checking syslogs on the iSCSI server I get loads of errors like
>
> SENDING TMR_TASK_DOES_NOT_EXIST for ref_tag: XXXX
> like 100+ lines
>
> I looked at the logs and can't see anything reporting hung I/O or an OSD
> going out and coming back in.
>
> Does anyone have any suggestions on what's going on?

The TMR_TASK_DOES_NOT_EXIST error indicates that your initiator (ESXi) is
attempting to abort outstanding I/Os. ESXi is pretty latency sensitive, so
I'd guess that the abort-task requests are being sent by the initiator after
tripping a local I/O timeout. Your vmkernel logs should shed a bit more
light on this.

Cheers, David
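
To line the target-side messages up against the vmkernel logs, it may help to see when the aborts arrive. The following is only a minimal sketch: it counts the TMR_TASK_DOES_NOT_EXIST lines per minute, and it assumes traditional syslog timestamps of the form "Dec  1 13:32:43" at the start of each line. The function name aborts_per_minute is made up for illustration.

```python
import re
from collections import Counter

# Message text as it appears in the syslog excerpt quoted in this thread.
ABORT_RE = re.compile(r"TMR_TASK_DOES_NOT_EXIST")
# Assumption: traditional syslog timestamps such as "Dec  1 13:32:43".
# The capture group keeps everything up to the minute and drops the seconds.
STAMP_RE = re.compile(r"^(\w{3}\s+\d{1,2} \d{2}:\d{2}):\d{2}")


def aborts_per_minute(lines):
    """Count abort-task response messages per minute (hypothetical helper)."""
    counts = Counter()
    for line in lines:
        if not ABORT_RE.search(line):
            continue
        stamp = STAMP_RE.match(line)
        # Lines whose timestamp does not match the assumed format are
        # grouped under "unknown" rather than dropped.
        counts[stamp.group(1) if stamp else "unknown"] += 1
    return counts
```

Feeding it the open syslog file from the iSCSI server (for example with open("/var/log/syslog") as log, where the path depends on your distribution) gives a per-minute tally. Bursts of aborts can then be compared with timestamps in the ESXi vmkernel log, keeping any clock or timezone offset between the two hosts in mind.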