I'm at a loss as to what happened here. I'm testing a single-node Ceph "cluster" as a replacement for RAID and traditional filesystems: nine 4TB HDDs in a single (underpowered) server, running Luminous 12.2.5 with BlueStore OSDs. I set up CephFS on a k=6,m=2 EC pool, mounted it via FUSE, and ran an rsync from a traditional FS holding a few terabytes of data, including some huge (>1TB) files.

It was all going well until this happened: https://mrcn.st/t/cephfs_dataloss.png

At around 9:13, throughput crashed to 0 and cluster usage dropped by ~1.3TB. This apparently happened while rsync was copying a large (1.1TB) file. With the k=6,m=2 EC overhead, that file would have taken ~1.5TB of cluster storage (1.1TB x 8/6 ≈ 1.47TB), so it looks like the drop happened mid-copy, and whatever progress had been made was lost/deleted and the objects dropped. rsync didn't print any errors, though, and kept merrily going through other files, but nothing from that point onward ever made it to the cluster. Everything rsync did from then on might as well have gone to /dev/null. I didn't realize this until later, though.

At 9:26 I got an alert that the server was running out of RAM (not unexpected, with the default BlueStore cache size), so I halved the cache size and restarted all the OSDs, but I don't think this is related: by then cluster usage was already low and throughput was 0, so the problem had already occurred. The memory pressure was probably just a side effect of all the object deletion activity. It was a preemptive monitoring alert; nothing actually OOMed or crashed, and I restarted the OSDs before anything bad could happen.

At 12:41 I figured out something was wrong, straced the running rsync, and confirmed that it was definitely issuing write() syscalls, but the cluster saw nothing; top showed only rsync consuming CPU (and possibly ceph-fuse too, I forget). All the work from that point onward seemed to just vanish into thin air. I checked the rsync verbose output against the actual CephFS contents: everything prior to the 1.1TB file was there, and everything after it just wasn't (no files; some empty directories, but I think rsync pre-creates those ahead of time).

I restarted rsync and it started right back at that big file. Cluster throughput is back up, so I assume writes are making it to the cluster now. I have not restarted/remounted ceph-fuse.

I've checked syslog and all the Ceph logs but I cannot find anything relevant. The only evidence of the problem is increased RocksDB compaction activity during the 9:13-9:28 interval, when all the objects for the huge file were apparently being removed. No errors, nothing in syslog or dmesg, and nothing in the ceph-fuse log (/var/log/ceph/ceph-client.cephfs.log) either. There are no weird cronjobs that might have done something strange. It's as if the CephFS FUSE client decided to shadowban the rsync process without leaving any evidence behind.

Any ideas what might have happened here? If this happens again or turns out to be reproducible, I'll try to do some more debugging...

--
Hector Martin (hector@xxxxxxxxxxxxxx)
Public Key: https://mrcn.st/pub
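
P.S. In case it helps anyone reproduce or narrow this down, the checks I'd plan to run next time are roughly the following. This is only a sketch: the PID lookup is illustrative, and the admin socket path is a guess based on my client name.

    # Confirm rsync is really issuing write() syscalls (attach to the live process)
    strace -f -e trace=write -p $(pidof rsync)

    # See whether the cluster registers any client I/O or usage change
    ceph -s
    ceph df

    # Ask the ceph-fuse admin socket what it can report; 'help' lists the
    # commands available in this build (socket path guessed from my client name)
    ceph --admin-daemon /var/run/ceph/ceph-client.cephfs.asok help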