Hello,

We recently upgraded to Quincy (17.2.7) and I can see in the ceph logs many
messages of the form:

1713256584.3135679 osd.28 (osd.28) 66398 : cluster 4 osd.28 found snap mapper error on pg 7.284 oid 7:214b503b:::100125de9b8.00000000:5c snaps in mapper: {}, oi: {5a} ...repaired
1713256584.3136106 osd.28 (osd.28) 66399 : cluster 4 osd.28 found snap mapper error on pg 7.284 oid 7:214b4f95:::1001654390d.00000000:5c snaps in mapper: {}, oi: {5a} ...repaired
1713256584.3136535 osd.28 (osd.28) 66400 : cluster 4 osd.28 found snap mapper error on pg 7.284 oid 7:214b4f3f:::1001549ed54.00000000:5c snaps in mapper: {}, oi: {5a} ...repaired
1713256584.9496887 osd.29 (osd.29) 70001 : cluster 4 osd.29 found snap mapper error on pg 7.b4 oid 7:2d089bdc:::10016105140.00000000:5c snaps in mapper: {}, oi: {5a} ...repaired
1713256590.9785151 osd.28 (osd.28) 66401 : cluster 4 osd.28 found snap mapper error on pg 7.284 oid 7:214b5179:::100128b85a0.00000cfe:5c snaps in mapper: {}, oi: {5a} ...repaired
1713256598.6286905 osd.29 (osd.29) 70002 : cluster 4 osd.29 found snap mapper error on pg 7.17c oid 7:3e877f95:::100151d8670.00000000:5c snaps in mapper: {}, oi: {5a} ...repaired
...

A cursory reading of the code involved suggests that the scrubber in Quincy
has gained the ability to detect these snap mapper errors and clean up the
snapshots that were lost under Octopus, if I understand it correctly.

Cheers,

Linkriver Technology

On Sat, 2022-06-25 at 19:36 +0000, Kári Bertilsson wrote:
> Hello
>
> I am also having this issue, after previously setting
> osd_pg_max_concurrent_snap_trims = 0 to pause snaptrim. I upgraded to
> ceph 17.2.0. I have tried restarting, repeering and deep-scrubbing all
> OSDs; so far nothing works.
>
> For one of the affected pools, `cephfs_10k`, I have tested removing ALL
> data and it is still showing 26% usage. All snapshots have been deleted
> and all PGs for the pool remain at SNAPTRIMQ_LEN = 0. All PGs are
> active+clean.
>
> The pool still shows 589k objects in use. When testing `rados get` on
> all the objects, it only works for 2,420 of them. The rest seem to be in
> some kind of limbo and cannot be read or deleted using rados.
>
> # rados -p cephfs_10k listsnaps 10010539c22.00000000
>
> 10010539c22.00000000:
> cloneid  snaps  size      overlap
> 288      288    30767656  []
>
> # rados -p cephfs_10k get 10010539c22.00000000 10010539c22.00000000
> error getting cephfs_10k/10010539c22.00000000: (2) No such file or directory
>
> # rados -p cephfs_10k rm 10010539c22.00000000
> error removing cephfs_10k>10010539c22.00000000: (2) No such file or directory
>
> Is there some way to make the snap trimmer rediscover these objects and
> remove them?
>
> On Fri, Mar 18, 2022 at 2:21 PM Linkriver Technology
> <technology@xxxxxxxxxxxxxxxxxxxxx> wrote:
> > Hello,
> >
> > If I understand my issue correctly, it is in fact unrelated to CephFS
> > itself; rather, the problem happens at a lower level (in Ceph itself).
> > In other words, it affects all kinds of snapshots, not just CephFS
> > ones. I believe my FS is healthy otherwise.
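In case it helps anyone watching the same messages: the repairs above happen
during scrubbing, so a sweep along the lines below should give the Quincy
scrubber a chance to fix up whatever is left. This is only a sketch -- the
pool name and the cluster log path are assumptions for your own cluster, and
the deep scrubs are still throttled per OSD by osd_max_scrubs, so issuing
them for every PG at once should be safe, just slow.

#!/bin/sh
# Sketch: ask every PG of the affected pool to deep-scrub so the scrubber
# can repair any remaining snap mapper entries.
POOL=cephfs_data                                  # assumed pool name

# The first column of "ceph pg ls-by-pool" is the PG id (e.g. "7.284").
for pg in $(ceph pg ls-by-pool "$POOL" | awk '/^[0-9]+\./ { print $1 }'); do
    ceph pg deep-scrub "$pg"
done

# Count the repairs reported so far in the cluster log (default location
# on a monitor host; adjust the path if your logs live elsewhere).
grep -c 'found snap mapper error' /var/log/ceph/ceph.log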
> > In any case, here is the output of the command you asked for.
> >
> > I ran it a few hours ago:
> >
> >     "num_strays": 235,
> >     "num_strays_delayed": 38,
> >     "num_strays_enqueuing": 0,
> >     "strays_created": 5414436,
> >     "strays_enqueued": 5405983,
> >     "strays_reintegrated": 17892,
> >     "strays_migrated": 0,
> >
> > And just now:
> >
> >     "num_strays": 186,
> >     "num_strays_delayed": 0,
> >     "num_strays_enqueuing": 0,
> >     "strays_created": 5540016,
> >     "strays_enqueued": 5531494,
> >     "strays_reintegrated": 18128,
> >     "strays_migrated": 0,
> >
> > Regards,
> >
> > LRT
> >
> > -----Original Message-----
> > From: Arnaud M <arnaud.meauzoone@xxxxxxxxx>
> > To: Linkriver Technology <technology@xxxxxxxxxxxxxxxxxxxxx>
> > Cc: Dan van der Ster <dvanders@xxxxxxxxx>, Ceph Users <ceph-users@xxxxxxx>
> > Subject: Re: CephFS snaptrim bug?
> > Date: Thu, 17 Mar 2022 21:48:18 +0100
> >
> > Hello Linkriver,
> >
> > I might have an issue close to yours.
> >
> > Can you tell us if your stray dirs are full?
> >
> > What does this command output for you?
> >
> > ceph tell mds.0 perf dump | grep strays
> >
> > Does the value change over time?
> >
> > All the best
> >
> > Arnaud
> >
> > On Wed, 16 Mar 2022 at 15:35, Linkriver Technology
> > <technology@xxxxxxxxxxxxxxxxxxxxx> wrote:
> > > Hi,
> > >
> > > Has anyone figured out whether those "lost" snaps are rediscoverable /
> > > trimmable? All PGs in the cluster have been deep scrubbed since my
> > > previous email and I'm not seeing any of that wasted space being
> > > recovered.
> > >
> > > Regards,
> > >
> > > LRT
> > >
> > > -----Original Message-----
> > > From: Dan van der Ster <dvanders@xxxxxxxxx>
> > > To: technology@xxxxxxxxxxxxxxxxxxxxx
> > > Cc: Ceph Users <ceph-users@xxxxxxx>, Neha Ojha <nojha@xxxxxxxxxx>
> > > Subject: Re: CephFS snaptrim bug?
> > > Date: Thu, 24 Feb 2022 09:48:04 +0100
> > >
> > > See https://tracker.ceph.com/issues/54396
> > >
> > > I don't know how to tell the OSDs to rediscover those trimmed snaps.
> > > Neha, is that possible?
> > >
> > > Cheers, Dan
> > >
> > > On Thu, Feb 24, 2022 at 9:27 AM Dan van der Ster <dvanders@xxxxxxxxx>
> > > wrote:
> > > >
> > > > Hi,
> > > >
> > > > I had a look at the code -- it looks like there's a flaw in the
> > > > logic: the snaptrim queue is cleared if
> > > > osd_pg_max_concurrent_snap_trims = 0.
> > > >
> > > > I'll open a tracker and send a PR to restrict
> > > > osd_pg_max_concurrent_snap_trims to >= 1.
> > > >
> > > > Cheers, Dan
> > > >
> > > > On Wed, Feb 23, 2022 at 9:44 PM Linkriver Technology
> > > > <technology@xxxxxxxxxxxxxxxxxxxxx> wrote:
> > > > >
> > > > > Hello,
> > > > >
> > > > > I have upgraded our Ceph cluster from Nautilus to Octopus
> > > > > (15.2.15) over the weekend. The upgrade went well as far as I can
> > > > > tell.
> > > > >
> > > > > Earlier today, noticing that our CephFS data pool was approaching
> > > > > capacity, I removed some old CephFS snapshots (taken weekly at the
> > > > > root of the filesystem), keeping only the most recent one (created
> > > > > today, 2022-02-21). As expected, a good fraction of the PGs
> > > > > transitioned from active+clean to active+clean+snaptrim or
> > > > > active+clean+snaptrim_wait. On previous occasions when I removed a
> > > > > snapshot, it took a few days for snaptrimming to complete. This
> > > > > would happen without noticeably impacting other workloads, and
> > > > > would also free up an appreciable amount of disk space.
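For anyone else who is tempted to pause snaptrim the same way: given the
flaw described above, it seems safer to keep osd_pg_max_concurrent_snap_trims
at 1 and slow trimming down with a sleep rather than stopping it outright.
A rough sketch; the sleep value is purely illustrative, not a tuned
recommendation.

# Keep at least one concurrent trim per PG -- setting this to 0 clears the
# snaptrim queue (https://tracker.ceph.com/issues/54396).
ceph config set osd osd_pg_max_concurrent_snap_trims 1

# Throttle trimming instead (seconds of sleep between trim operations).
ceph config set osd osd_snap_trim_sleep 2

# Watch the queue drain: PGs actively trimming or waiting to trim.
ceph pg ls snaptrim
ceph pg ls snaptrim_wait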
> > > > > This time around, after a few hours of snaptrimming, users
> > > > > complained of high IO latency, and indeed Ceph reported "slow
> > > > > ops" on a number of OSDs and on the active MDS. I attributed this
> > > > > to the snaptrimming and decided to reduce it by initially setting
> > > > > osd_pg_max_concurrent_snap_trims to 1, which didn't seem to help
> > > > > much, so I then set it to 0, which had the surprising effect of
> > > > > transitioning all PGs back to active+clean (is this intended?). I
> > > > > also restarted the MDS, which seemed to be struggling. IO latency
> > > > > went back to normal immediately.
> > > > >
> > > > > Outside of users' working hours, I decided to resume snaptrimming
> > > > > by setting osd_pg_max_concurrent_snap_trims back to 1. Much to my
> > > > > surprise, nothing happened. All PGs remained (and still remain at
> > > > > the time of writing) in the state active+clean, even after
> > > > > restarting some of them. This definitely seems abnormal; as I
> > > > > mentioned earlier, snaptrimming this FS previously would take on
> > > > > the order of multiple days. Moreover, if snaptrim were truly
> > > > > complete, I would expect pool usage to have dropped by an
> > > > > appreciable amount (at least a dozen terabytes), but that doesn't
> > > > > seem to be the case.
> > > > >
> > > > > A du on the CephFS root gives:
> > > > >
> > > > > # du -sh /mnt/pve/cephfs
> > > > > 31T     /mnt/pve/cephfs
> > > > >
> > > > > But:
> > > > >
> > > > > # ceph df
> > > > > <snip>
> > > > > --- POOLS ---
> > > > > POOL             ID  PGS  STORED   OBJECTS  USED     %USED  MAX AVAIL
> > > > > cephfs_data       7  512   43 TiB  190.83M  147 TiB  93.22    3.6 TiB
> > > > > cephfs_metadata   8   32   89 GiB  694.60k  266 GiB   1.32    6.4 TiB
> > > > > <snip>
> > > > >
> > > > > ceph pg dump reports a SNAPTRIMQ_LEN of 0 on all PGs.
> > > > >
> > > > > Did CephFS just leak a massive 12 TiB worth of objects...? It
> > > > > seems to me that the snaptrim operation did not complete at all.
> > > > >
> > > > > Perhaps relatedly:
> > > > >
> > > > > # ceph daemon mds.choi dump snaps
> > > > > {
> > > > >     "last_created": 93,
> > > > >     "last_destroyed": 94,
> > > > >     "snaps": [
> > > > >         {
> > > > >             "snapid": 93,
> > > > >             "ino": 1,
> > > > >             "stamp": "2022-02-21T00:00:01.245459+0800",
> > > > >             "name": "2022-02-21"
> > > > >         }
> > > > >     ]
> > > > > }
> > > > >
> > > > > How can last_destroyed > last_created? The last snapshot taken on
> > > > > this FS is indeed #93, and the removed snapshots were all created
> > > > > in previous weeks.
> > > > >
> > > > > Could someone shed some light please? Assuming that snaptrim
> > > > > didn't run to completion, how can I manually delete objects from
> > > > > now-removed snapshots? I believe this is what the Ceph
> > > > > documentation calls a "backwards scrub" - but I didn't find
> > > > > anything in the Ceph suite that can run such a scrub. This pool is
> > > > > filling up fast; I'll throw in some more OSDs for the moment to
> > > > > buy some time, but I certainly would appreciate your help!
> > > > >
> > > > > Happy to attach any logs or info you deem necessary.
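A rough way to tell "snaps still queued for trimming" apart from "snaps the
trimmer has forgotten about" is to compare what the OSDMap still carries for
the pool with what a leftover clone reports. The commands below are a sketch:
pool id 7 / cephfs_data is taken from the ceph df output above, the object
name is a placeholder, and the exact formatting of the removed_snaps_queue
output varies between releases.

# If deleted snapshots are still pending trim, the pool's OSDMap entry
# should carry a non-empty removed_snaps_queue (Octopus and later).
ceph osd dump | grep -E '^pool 7 |removed_snaps'

# Any PG of the pool that is still trimming would show up here.
ceph pg ls-by-pool cephfs_data snaptrim

# A leftover clone from an already-deleted snapshot still lists the old
# snap ids under "snaps" (object name is a placeholder).
rados -p cephfs_data listsnaps <object-name>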
> > > > > Regards,
> > > > >
> > > > > LRT
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx