Oh, sorry, I forgot to mention that all OSDs use bluestore, so the xfs mount options don't have any influence.
VMs have cache="none" by default; I've also tried "writethrough". No difference.
And aren't these rbd cache options enabled by default?
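(For what it's worth, one way to double-check the effective values is to ask a running librbd client over an admin socket - a rough sketch only, assuming the socket is enabled in the client-side ceph.conf; the socket path and pid below are just examples:

    # client-side ceph.conf, so each librbd client exposes an admin socket:
    [client]
    admin socket = /var/run/ceph/$cluster-$type.$id.$pid.$cctid.asok

    # then query the socket of the running qemu process:
    ceph --admin-daemon /var/run/ceph/ceph-client.admin.12345.94275182.asok config show | grep rbd_cache
)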
2017-11-07 18:45 GMT+05:00 Peter Maloney <peter.maloney@xxxxxxxxxxxxxxxxxxxx>:
I see nobarrier in there... try without that (unless that's just the bluestore xfs... then it probably won't change anything). And are the OSDs using bluestore?
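If it is the OSD data mounts, that would just mean this (a sketch of the same line from your config with nobarrier dropped):

    osd mount options xfs = rw,noexec,nodev,noatime,nodiratime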
And what cache options did you set in the VM config? It's dangerous to set writeback without also setting this in the client-side ceph.conf:
rbd cache writethrough until flush = true
rbd_cache = true
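For example, a minimal client-side section could look like this (just a sketch; the pool/image name in the qemu line below is made up):

    [client]
    rbd cache = true
    rbd cache writethrough until flush = true

    # then writeback in the VM definition is safer, e.g. with qemu:
    -drive file=rbd:rbd/vm-disk-1:conf=/etc/ceph/ceph.conf,format=raw,cache=writeback,if=virtio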
On 11/07/17 14:36, Дробышевский, Владимир wrote:
Hello!
I've got a weird situation with rbd drive image reliability. I found that after a hard reset, a VM with a ceph rbd drive from my new cluster becomes corrupted. I discovered this accidentally during HA tests of my new cloud cluster: after a host reset, the VM was not able to boot again because of virtual drive errors. The same thing happens if you just kill the qemu process (as would happen at host crash time).
First of all I thought it was a guest OS problem. But then I tried RouterOS (Linux based), Linux, and FreeBSD - all of them showed the same behavior. Then I blamed the OpenNebula installation. For the test's sake I installed the latest Proxmox (5.1-36) on another server.

The first subtest: I created a VM in OpenNebula from a predefined image, shut it down, then created a Proxmox VM and pointed it to the image created from OpenNebula. The second subtest: I made a clean install from ISO from the Proxmox console, having previously created the VM and drive image from Proxmox (of course, on the same ceph pool).

Both results: unbootable VMs.
Finally, I made a clean install to a fresh VM with a local LVM-backed drive image. And - guess what? - it survived the qemu process kill.

This is the first situation of this kind in my practice, so I would like to ask for guidance. I believe it is a cache problem of some kind, but I haven't faced it with earlier releases.
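(The kill test itself is roughly this - process and device names are just examples:

    # on the host: simulate a crash by killing the VM's qemu process
    kill -9 $(pidof qemu-system-x86_64)
    # then boot the VM again and check the guest filesystem, e.g.:
    fsck -f /dev/vda1
)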
Some cluster details:
It's a small test cluster with 4 nodes, each has:
2x CPU E5-2665, 128GB RAM
1 OSD with Samsung sm863 1.92TB drive
IB connection with IPoIB on QDR IB network

OS: Ubuntu 16.04 with 4.10 kernel
ceph: luminous 12.2.1

Client (kvm host) OSes:
1. Ubuntu 16.04 (the same hosts as ceph cluster)
2. Debian 9.1 in case of Proxmox
ceph.conf:
[global]
fsid = 6a8ffc55-fa2e-48dc-a71c-647e1fff749b
public_network = 10.103.0.0/16
cluster_network = 10.104.0.0/16
mon_initial_members = e001n01, e001n02, e001n03
mon_host = 10.103.0.1,10.103.0.2,10.103.0.3
rbd default format = 2
auth_cluster_required = cephx
auth_service_required = cephx
auth_client_required = cephx
osd mount options = rw,noexec,nodev,noatime,nodiratime,nobarrier
osd mount options xfs = rw,noexec,nodev,noatime,nodiratime,nobarrier
osd_mkfs_type = xfs
bluestore fsck on mount = true
debug_lockdep = 0/0
debug_context = 0/0
debug_crush = 0/0
debug_buffer = 0/0
debug_timer = 0/0
debug_filer = 0/0
debug_objecter = 0/0
debug_rados = 0/0
debug_rbd = 0/0
debug_journaler = 0/0
debug_objectcacher = 0/0
debug_client = 0/0
debug_osd = 0/0
debug_optracker = 0/0
debug_objclass = 0/0
debug_filestore = 0/0
debug_journal = 0/0
debug_ms = 0/0
debug_monc = 0/0
debug_tp = 0/0
debug_auth = 0/0
debug_finisher = 0/0
debug_heartbeatmap = 0/0
debug_perfcounter = 0/0
debug_asok = 0/0
debug_throttle = 0/0
debug_mon = 0/0
debug_paxos = 0/0
debug_rgw = 0/0
[osd]
osd op threads = 4
osd disk threads = 2
osd max backfills = 1
osd recovery threads = 1
osd recovery max active = 1
--
Best regards,
Vladimir
--
--------------------------------------------
Peter Maloney
Brockmann Consult
Max-Planck-Str. 2
21502 Geesthacht
Germany
Tel: +49 4152 889 300
Fax: +49 4152 889 333
E-mail: peter.maloney@brockmann-consult.de
Internet: http://www.brockmann-consult.de
--------------------------------------------
Best regards,
Vladimir
_______________________________________________ ceph-users mailing list ceph-users@xxxxxxxxxxxxxx http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com