I have a problem: after an OSD host lost its connection to the
cluster (sync/back-end) network for many hours while the public network
stayed online, a test VM using RBD can't overwrite its files. I can
create a new file inside it just fine, but overwriting an existing file
makes the process hang.
The VM's disk is on an erasure-coded data pool with a replicated pool in
front of it. EC overwrites are enabled on the pool.
The cluster consists of 5 hosts with 4 OSDs each, plus separate hosts
for compute. The public and cluster networks are separate.
In this case, the AOC cable to the cluster network went link-down on one
host; it had to be replaced, and the host was rebooted. Recovery took
about a week to complete. The host was half-down for about 12 hours.
I have some other VMs as well with images in the same pool (4 in total),
and they seem to work fine; it is just this one that can't overwrite.
Could something be wrong with just this image?