Created a tracker to investigate this further: https://tracker.ceph.com/issues/58376

On Wed, Jan 4, 2023 at 3:18 PM Kotresh Hiremath Ravishankar <khiremat@xxxxxxxxxx> wrote:

> Hi Mathias,
>
> I am glad that you could find out that it's a client-related issue and
> figured out a way around it. I could also reproduce the issue locally,
> i.e. a client which was initially copying the snapshot still has access
> to it even after it has been deleted from the other client. I think this
> needs further investigation. I will raise a tracker for it and share it
> here.
>
> Thanks and Regards,
> Kotresh H R
>
> On Tue, Jan 3, 2023 at 3:23 PM Kuhring, Mathias <mathias.kuhring@xxxxxxxxxxxxxx> wrote:
>
>> Trying to exclude clusters and/or clients might have gotten me on the
>> right track. It might have been a client issue or actually a snapshot
>> retention issue. As it turned out, when I tried other routes for the
>> data using a different client, the data was not available anymore,
>> since the snapshot had been trimmed.
>>
>> We got behind syncing our snapshots a while ago (due to other issues),
>> and we are now somewhere in between our weekly (16 weeks) and daily
>> (30 days) snapshots. So I assume that until we catch up with the daily
>> snapshots (<30 days), there is a general risk that snapshots disappear
>> while we are still syncing them.
>>
>> The funny/weird thing, though (and why I didn't catch on to this
>> earlier), is that the particular file (and potentially others) from
>> this trimmed snapshot was apparently still available to the client I
>> had initially used for the transfer. I'm wondering whether the client
>> somehow cached the data until the snapshot got trimmed and then just
>> kept re-trying to copy the incompletely cached data.
>>
>> Continuing with the next available snapshot, mirroring/syncing is now
>> catching up again. I expect it might happen again once we catch up to
>> the 30-day threshold, if the point of snapshot trimming falls into the
>> syncing time frame. But then I know to just cancel/skip the current
>> snapshot and continue with the next one. Syncing time is short enough
>> to get me over the hill before the next trimming.
>>
>> Note to myself: next time something similar happens, check whether
>> different clients AND different snapshots or the original data behave
>> the same.
>>
>> On 12/22/2022 4:27 PM, Kuhring, Mathias wrote:
>>
>> Dear Ceph community,
>>
>> We have two Ceph clusters of equal size, one main and one mirror, both
>> deployed with cephadm and running
>> ceph version 17.2.1 (ec95624474b1871a821a912b8c3af68f8f8e7aa1) quincy (stable).
>>
>> We are stuck copying a large file (~64G) between the CephFS file
>> systems of the two clusters.
>>
>> The source path is a snapshot (i.e. something like
>> /my/path/.snap/schedule_some-date/…), but I don't think that should
>> make any difference.
>>
>> At first, I thought I needed to adapt some rsync parameters to work
>> better with bigger files on CephFS. But when I checked by just copying
>> the file with cp, the transfer also got stuck. There is no error
>> message; the process (rsync or cp) just keeps running, but at some
>> point the file size on the target stops increasing (at almost 85%).
>>
>> Main:
>> -rw------- 1 cockpit-ws printadmin 68360698297 16. Nov 13:40 LB22_2764_dragen.bam
>>
>> Mirror:
>> -rw------- 1 root root 58099499008 22. Dez 15:54 LB22_2764_dragen.bam
>>
>> Our CephFS file size limit of 10 TB is more than generous.
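>>
>> (As a sketch of how one can check this -- the file system name "cephfs"
>> below is just a placeholder for ours -- the limit is part of the file
>> system settings and is reported in bytes:
>>
>>     ceph fs get cephfs | grep max_file_size
>>
>> It could be raised with "ceph fs set cephfs max_file_size <bytes>", but
>> at 10 TB it is clearly not what is limiting a ~64G file.)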
>> And as far as I know from clients, there are indeed files in the TB
>> range on the cluster without issues.
>>
>> I don't know whether this is the file's fault or some issue with either
>> of the CephFS file systems or clusters, and I don't know where to look
>> to troubleshoot this. Can anybody give me a tip on where to start
>> looking and how to debug this kind of issue?
>>
>> Thank you very much.
>>
>> Best wishes,
>> Mathias
>>
>> --
>> Mathias Kuhring
>>
>> Dr. rer. nat.
>> Bioinformatician
>> HPC & Core Unit Bioinformatics
>> Berlin Institute of Health at Charité (BIH)
>>
>> E-Mail: mathias.kuhring@xxxxxxxxxxxxxx
>> Mobile: +49 172 3475576

_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx
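As a rough starting point for debugging a CephFS copy that hangs like the one
described above -- a sketch only; the MDS name "mds.a" is a placeholder, and
the debugfs paths assume a kernel CephFS client with debugfs mounted:

    # On the client: requests still pending against the MDSs and OSDs
    cat /sys/kernel/debug/ceph/*/mdsc
    cat /sys/kernel/debug/ceph/*/osdc

    # On the cluster: client sessions and operations the MDS currently sees
    ceph tell mds.a session ls
    ceph tell mds.a dump_ops_in_flight

If the copy is stalled on the client side, a request will typically be left
sitting in one of these lists.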