On Sat, Jun 16, 2018 at 12:23 PM Hector Martin <hector@xxxxxxxxxxxxxx> wrote:
>
> I'm at a loss as to what happened here.
>
> I'm testing a single-node Ceph "cluster" as a replacement for RAID and
> traditional filesystems: nine 4TB HDDs in a single (underpowered)
> server, running Luminous 12.2.5 with BlueStore OSDs.
>
> I set up CephFS on a k=6,m=2 EC pool, mounted it via FUSE, and ran an
> rsync from a traditional FS storing a few terabytes of data, including
> some huge (>1TB) files. It was all going well, but then this happened:
>
> https://mrcn.st/t/cephfs_dataloss.png
>
> At around 9:13, throughput crashed to 0 and cluster usage dropped by
> ~1.3TB. This apparently happened while rsync was copying a large
> (1.1TB) file. With EC overhead, the file would've been ~1.5TB of
> cluster storage, so it seems this happened while the file was being
> copied, and then whatever progress had been made was lost/deleted and
> the objects dropped. rsync didn't print any errors, though, and kept
> merrily going through other files, but nothing from that point onward
> ever made it to the cluster. It's as if everything rsync did might as
> well have gone to /dev/null from that point on. I didn't realize this
> until later, though.
>
> At 9:26 I got an alert that the server was running out of RAM (not
> unexpected, with the default BlueStore cache size), so I halved that
> and restarted all OSDs, but I don't think this is related. By then the
> cluster usage was already low and throughput was 0, so the problem had
> already occurred; this was probably just a side effect of all the
> object deletion activity. It was a preemptive monitoring alert:
> nothing ever actually OOMed or crashed, and I restarted the OSDs and
> fixed the problem before anything bad could happen.
>
> At 12:41 I figured out something was wrong, straced the running rsync,
> and confirmed that it was definitely issuing write() syscalls, but the
> cluster saw nothing - top showed only rsync consuming CPU (and maybe
> ceph-fuse? I forget, possibly that too). All the work from that point
> onward seemed to just vanish into thin air. I checked the rsync
> verbose output against the actual CephFS contents, and everything
> prior to the 1.1TB file was there, while everything after just wasn't
> (no files; some empty directories, but I think rsync pre-creates those
> ahead of time).
>
> I restarted rsync and it started right back at that big file. The
> cluster throughput is back up again, so I assume writes are making it
> to the cluster now. I have not restarted/remounted ceph-fuse.
>
> I've checked syslog and all the Ceph logs but I cannot find anything
> relevant. The only evidence of the problem is increased RocksDB
> compaction activity during the 9:13-9:28 interval, when all the
> objects for the huge file were apparently being removed. No errors,
> nothing in syslog or dmesg, and nothing in the ceph-fuse log
> (/var/log/ceph/ceph-client.cephfs.log) either. There are no weird
> cronjobs that might've done something strange. It's like the CephFS
> FUSE client decided to shadowban the rsync process without leaving any
> evidence behind.
>
> Any ideas what might've happened here? If this happens again / is
> reproducible, I'll try to do some more debugging...

Which kernel version are you running? And were ceph-fuse and the
ceph-osd daemons running on the same machine?
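
As a quick sanity check on the ~1.5TB figure above, here is a minimal
back-of-the-envelope sketch, assuming only the usual (k+m)/k space
overhead of an EC pool and ignoring stripe padding and allocation
overhead:

    # Rough raw-usage estimate for the 1.1TB file on a k=6,m=2 EC pool.
    # Assumes raw = logical * (k + m) / k; padding/allocation ignored.
    k, m = 6, 2
    file_tb = 1.1
    print(f"~{file_tb * (k + m) / k:.2f} TB raw")
    # prints ~1.47 TB, consistent with the ~1.5TB estimate

So the ~1.3TB usage drop is roughly what you'd expect to see if the
partially-written file's objects were all deleted at once.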
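
And if it happens again, re-checking the "writes go nowhere" symptom
on the live rsync would look something like this (the PID here is
hypothetical; substitute the real one from pgrep rsync):

    # Attach to the running rsync and watch for write() syscalls;
    # -f also follows any child processes.
    strace -f -p 12345 -e trace=write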
> --
> Hector Martin (hector@xxxxxxxxxxxxxx)
> Public Key: https://mrcn.st/pub