Oh, sorry, I forgot to mention that all OSDs use bluestore, so the xfs mount options don't have any influence.
VMs have cache="none" by default; I've also tried "writethrough". No difference.
And aren't these rbd cache options enabled by default?
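(For what it's worth, one way to double-check the effective values is to ask a running librbd client over an admin socket - a rough sketch only, assuming the socket is enabled in the client-side ceph.conf; the socket path and pid below are just examples:

    # client-side ceph.conf, so each librbd client exposes an admin socket:
    [client]
    admin socket = /var/run/ceph/$cluster-$type.$id.$pid.$cctid.asok

    # then query the socket of the running qemu process:
    ceph --admin-daemon /var/run/ceph/ceph-client.admin.12345.94275182.asok config show | grep rbd_cache
)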
2017-11-07 18:45 GMT+05:00 Peter Maloney <peter.maloney@xxxxxxxxxxxxxxxxxxxx>:
I see nobarrier in there... try without that (unless that's just the bluestore xfs... then it probably won't change anything). And are the OSDs using bluestore?
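If it is the OSD data mounts, that would just mean this (a sketch of the same line from your config with nobarrier dropped):

    osd mount options xfs = rw,noexec,nodev,noatime,nodiratime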
And what cache options did you set in the VM config? It's dangerous to set writeback without also setting this in the client-side ceph.conf:
rbd cache writethrough until flush = true
rbd_cache = true
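For example, a minimal client-side section could look like this (just a sketch; the pool/image name in the qemu line below is made up):

    [client]
    rbd cache = true
    rbd cache writethrough until flush = true

    # then writeback in the VM definition is safer, e.g. with qemu:
    -drive file=rbd:rbd/vm-disk-1:conf=/etc/ceph/ceph.conf,format=raw,cache=writeback,if=virtio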
On 11/07/17 14:36, Дробышевский, Владимир wrote:
Hello!
I've got a weird situation with rbd drive image reliability. I found that after a hard reset, a VM with a ceph rbd drive from my new cluster becomes corrupted. I discovered this accidentally during HA tests of my new cloud cluster: after a host reset, the VM was not able to boot again because of virtual drive errors. The same thing happens if you just kill the qemu process (as would happen at host crash time).
First of all I thought it was a guest OS problem. But then I tried RouterOS (Linux based), Linux, and FreeBSD - all of them showed the same behavior. Then I blamed the OpenNebula installation. For the test's sake I installed the latest Proxmox (5.1-36) on another server.

The first subtest: I created a VM in OpenNebula from a predefined image, shut it down, then created a Proxmox VM and pointed it to the image created from OpenNebula. The second subtest: I made a clean install from ISO from the Proxmox console, having previously created the VM and drive image from Proxmox (of course, on the same ceph pool).

Both results: unbootable VMs.
Finally, I made a clean install to a fresh VM with a local LVM-backed drive image. And - guess what? - it survived the qemu process kill.

This is the first situation of this kind in my practice, so I would like to ask for guidance. I believe it is a cache problem of some kind, but I haven't faced it with earlier releases.
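(The kill test itself is roughly this - process and device names are just examples:

    # on the host: simulate a crash by killing the VM's qemu process
    kill -9 $(pidof qemu-system-x86_64)
    # then boot the VM again and check the guest filesystem, e.g.:
    fsck -f /dev/vda1
)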
Some cluster details:
It's a small test cluster with 4 nodes, each has:
2x CPU E5-2665, 128GB RAM
1 OSD with Samsung sm863 1.92TB drive
IB connection with IPoIB on QDR IB network

OS: Ubuntu 16.04 with 4.10 kernel
ceph: luminous 12.2.1

Client (kvm host) OSes:
1. Ubuntu 16.04 (the same hosts as ceph cluster)
2. Debian 9.1 in case of Proxmox
ceph.conf:
[global]
fsid = 6a8ffc55-fa2e-48dc-a71c-647e1fff749b
public_network = 10.103.0.0/16
cluster_network = 10.104.0.0/16
mon_initial_members = e001n01, e001n02, e001n03
mon_host = 10.103.0.1,10.103.0.2,10.103.0.3
rbd default format = 2
auth_cluster_required = cephx
auth_service_required = cephx
auth_client_required = cephx
osd mount options = rw,noexec,nodev,noatime,nodiratime,nobarrier
osd mount options xfs = rw,noexec,nodev,noatime,nodiratime,nobarrier
osd_mkfs_type = xfs
bluestore fsck on mount = true
debug_lockdep = 0/0
debug_context = 0/0
debug_crush = 0/0
debug_buffer = 0/0
debug_timer = 0/0
debug_filer = 0/0
debug_objecter = 0/0
debug_rados = 0/0
debug_rbd = 0/0
debug_journaler = 0/0
debug_objectcacher = 0/0
debug_client = 0/0
debug_osd = 0/0
debug_optracker = 0/0
debug_objclass = 0/0
debug_filestore = 0/0
debug_journal = 0/0
debug_ms = 0/0
debug_monc = 0/0
debug_tp = 0/0
debug_auth = 0/0
debug_finisher = 0/0
debug_heartbeatmap = 0/0
debug_perfcounter = 0/0
debug_asok = 0/0
debug_throttle = 0/0
debug_mon = 0/0
debug_paxos = 0/0
debug_rgw = 0/0
[osd]
osd op threads = 4
osd disk threads = 2
osd max backfills = 1
osd recovery threads = 1
osd recovery max active = 1
--
Best regards,
Vladimir
--
--------------------------------------------
Peter Maloney
Brockmann Consult
Max-Planck-Str. 2
21502 Geesthacht
Germany
Tel: +49 4152 889 300
Fax: +49 4152 889 333
E-mail: peter.maloney@brockmann-consult.de
Internet: http://www.brockmann-consult.de
--------------------------------------------
Best regards,
Vladimir
_______________________________________________ ceph-users mailing list ceph-users@xxxxxxxxxxxxxx http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com