Losing data in healthy cluster

Help, my Ceph cluster is losing data slowly over time.  I keep finding files
that are the same length as they should be, but all the content has been
lost & replaced by nulls.

Here is an example (I still have the original file in a backup):

[root@blotter docker]# ls -lart /backup/space/docker/ceph-monitor/ceph-w-monitor.py /space/docker/ceph-monitor/ceph-w-monitor.py
-rwxrwxrwx 1 root root 7237 Mar 12 07:34 /backup/space/docker/ceph-monitor/ceph-w-monitor.py
-rwxrwxrwx 1 root root 7237 Mar 12 07:34 /space/docker/ceph-monitor/ceph-w-monitor.py

[root@blotter docker]# sum /backup/space/docker/ceph-monitor/ceph-w-monitor.py
19803     8

[root@blotter docker]# sum /space/docker/ceph-monitor/ceph-w-monitor.py
00000     8
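
To get an idea of how widespread this is, I'm planning to scan the whole
CephFS mount for other files that have gone all-null.  A rough, untested
sketch, assuming the filesystem is mounted at /space:

# list regular, non-empty files under /space whose contents are nothing but NUL bytes
find /space -type f -size +0c -print0 |
while IFS= read -r -d '' f; do
    # stripping the NUL bytes from an all-null file leaves nothing behind
    if [ -z "$(tr -d '\0' < "$f" | head -c 1)" ]; then
        echo "all nulls: $f"
    fi
done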



If I had to _guess_, I would blame a recent change to the writeback cache
tier.  I turned it off and flushed it last weekend... about the same time I
started noticing this data loss.

I disabled it using instructions from here:
http://docs.ceph.com/docs/master/rados/operations/cache-tiering/

Basically, I just set the cache mode to "forward" and then flushed it:

ceph osd tier cache-mode ssd_cache forward
rados -p ssd_cache cache-flush-evict-all
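
For what it's worth, the "ceph df" output at the bottom of this mail shows
ssd_cache at 0 objects, so the flush does appear to have drained the pool.
A more direct check (if I'm reading the tools right) would be something like:

rados -p ssd_cache ls | head
rados df | grep ssd_cache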

After that I tried to remove the overlay from the backing pool, but that
failed (and still fails) with:

$ ceph osd tier remove-overlay cephfs_data
Error EBUSY: pool 'cephfs_data' is in use by CephFS via its tier

At that point I figured that, since I had set the cache-mode to "forward", it
would be safe to just leave things as they were until I had time to debug
further.
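
For reference, the full teardown sequence I was trying to follow (as best I
understand the cache-tiering docs linked above) is roughly:

ceph osd tier cache-mode ssd_cache forward
rados -p ssd_cache cache-flush-evict-all
ceph osd tier remove-overlay cephfs_data
ceph osd tier remove cephfs_data ssd_cache

so I'm stuck at the remove-overlay step.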

I should mention that after the cluster settled down and did some scrubbing,
there was one inconsistent PG.  I ran "ceph pg repair xxx" to resolve that
and the health was good again.
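
While the inconsistency was present, something like

ceph health detail | grep inconsistent

should have shown the affected pgid; if I understand PG naming, the number
before the dot is the pool id, so 1.xx would mean cephfs_data and 4.xx would
mean ssd_cache (per the pool ids in the "ceph df" output below).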


I can do some experimenting this weekend if somebody wants to help me
through it.  Otherwise I'll probably try to put the cache-tier back into
"writeback" to see if that helps.  If not, I'll recreate the entire ceph
cluster.
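
(For the writeback switch I assume that's just

ceph osd tier cache-mode ssd_cache writeback

unless there's a reason that's unsafe with the overlay still in place.)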

Thanks,
Blade.


P.S. My cluster is made of mixed ARM and x86_64 nodes, and not all of them
are on the same Ceph version:
$ ceph version
ceph version 0.94.3 (95cefea9fd9ab740263bf8bb4796fd864d9afe2b)

# ceph version
ceph version 0.94.6 (e832001feaf8c176593e0325c8298e3f16dfb403)

etc...
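
If the per-daemon versions are relevant, I can collect them with something
like:

ceph tell osd.* version

and report back which nodes are on 0.94.3 vs 0.94.6.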


PPS:

$ ceph df
GLOBAL:
    SIZE      AVAIL     RAW USED     %RAW USED
    2456G     1492G         839G         34.16
POOLS:
    NAME                ID     USED       %USED     MAX AVAIL     OBJECTS
    rbd                 0        139G      5.66          185G       36499
    cephfs_data         1        235G      9.59          185G      102883
    cephfs_metadata     2      33642k         0          185G        5530
    ssd_cache           4           0         0          370G           0

