I'm at a loss as to what happened here. I'm testing a single-node Ceph "cluster" as a replacement for RAID and traditional filesystems: nine 4TB HDDs in a single (underpowered) server, running Luminous 12.2.5 with BlueStore OSDs. I set up CephFS on a k=6,m=2 EC pool, mounted it via FUSE, and ran an rsync from a traditional FS holding a few terabytes of data, including some huge (>1TB) files.

It was all going well until this happened: https://mrcn.st/t/cephfs_dataloss.png

At around 9:13, throughput crashed to 0 and cluster usage dropped by ~1.3TB. This apparently happened while rsync was copying a large (1.1TB) file. With the k=6,m=2 EC overhead, that file would have taken ~1.5TB of cluster storage (1.1TB x 8/6 ≈ 1.47TB), so it looks like the drop happened mid-copy, and whatever progress had been made was lost/deleted and the objects dropped. rsync didn't print any errors, though, and kept merrily going through other files, but nothing from that point onward ever made it to the cluster. Everything rsync did from then on might as well have gone to /dev/null. I didn't realize this until later, though.

At 9:26 I got an alert that the server was running out of RAM (not unexpected, with the default BlueStore cache size), so I halved the cache size and restarted all the OSDs, but I don't think this is related: by then cluster usage was already low and throughput was 0, so the problem had already occurred. The memory pressure was probably just a side effect of all the object deletion activity. It was a preemptive monitoring alert; nothing actually OOMed or crashed, and I restarted the OSDs before anything bad could happen.

At 12:41 I figured out something was wrong, straced the running rsync, and confirmed that it was definitely issuing write() syscalls, but the cluster saw nothing; top showed only rsync consuming CPU (and possibly ceph-fuse too, I forget). All the work from that point onward seemed to just vanish into thin air. I checked the rsync verbose output against the actual CephFS contents: everything prior to the 1.1TB file was there, and everything after it just wasn't (no files; some empty directories, but I think rsync pre-creates those ahead of time).

I restarted rsync and it started right back at that big file. Cluster throughput is back up, so I assume writes are making it to the cluster now. I have not restarted/remounted ceph-fuse.

I've checked syslog and all the Ceph logs but I cannot find anything relevant. The only evidence of the problem is increased RocksDB compaction activity during the 9:13-9:28 interval, when all the objects for the huge file were apparently being removed. No errors, nothing in syslog or dmesg, and nothing in the ceph-fuse log (/var/log/ceph/ceph-client.cephfs.log) either. There are no weird cronjobs that might have done something strange. It's as if the CephFS FUSE client decided to shadowban the rsync process without leaving any evidence behind.

Any ideas what might have happened here? If this happens again or turns out to be reproducible, I'll try to do some more debugging...

--
Hector Martin (hector@xxxxxxxxxxxxxx)
Public Key: https://mrcn.st/pub
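
P.S. In case it helps anyone reproduce or narrow this down, the checks I'd plan to run next time are roughly the following. This is only a sketch: the PID lookup is illustrative, and the admin socket path is a guess based on my client name.

    # Confirm rsync is really issuing write() syscalls (attach to the live process)
    strace -f -e trace=write -p $(pidof rsync)

    # See whether the cluster registers any client I/O or usage change
    ceph -s
    ceph df

    # Ask the ceph-fuse admin socket what it can report; 'help' lists the
    # commands available in this build (socket path guessed from my client name)
    ceph --admin-daemon /var/run/ceph/ceph-client.cephfs.asok help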