Re: Removing secondary data pool from mds

Hi Frank,

We're not using snapshots.

I was able to run:
    ceph daemon mds.ceph1 dump cache /tmp/cache.txt

...and scan for the stray object to find the cap id that was accessing the object. I matched this with the entity name in:
    ceph daemon mds.ceph1 session ls

...to determine the client host. The strays went away after I rebooted the offending client.
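
The cache scan described above can be sketched as a small shell filter. This is only a sketch: it assumes that inode lines in the cache dump list client caps in the form caps={<client id>=...}, which may differ between Ceph releases, and the function name stray_cap_clients is made up for illustration.

```shell
# Print the unique client ids that hold caps on stray entries.
# Input: an MDS cache dump, given as a file argument or on stdin.
# Assumption: caps appear on the inode line as caps={<client id>=<cap string>,...}
stray_cap_clients() {
    grep -i 'stray' "$@" \
        | grep -o 'caps={[^}]*}' \
        | grep -oE '[0-9]+=' \
        | tr -d '=' \
        | sort -u
}

# Example use (file name as in the message above):
#   stray_cap_clients /tmp/cache.txt
```

Each id it prints can then be compared against the 'session ls' output to find the client host.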

With all access to the objects now cleared, I ran:

    ceph pg X.Y mark_unfound_lost delete

...on any remaining rados objects.

At this point (at long last) the pool was able to return to the 'HEALTHY' status. However, there is one remaining bit that I don't understand. 'ceph df' returns 355 objects for the pool (fs.data.archive.frames):

https://pastebin.com/vbZLhQmC

...but 'rados -p fs.data.archive.frames ls --all' returns no objects. So I'm not sure what these 355 objects were. Because of that, I haven't removed the pool from cephfs quite yet, even though I think it would be safe to do so.

--Mike


On 2/10/21 4:20 PM, Frank Schilder wrote:
Hi Michael,

out of curiosity, did the pool go away or did it put up a fight?

I don't remember exactly, it's a long time ago, but I believe stray objects on fs pools come from files that are still in snapshots but were deleted at the fs level. Such files are moved to special stray directories until the snapshot containing them is deleted as well. Not sure if this applies here though, there might be other occasions when objects go to stray.

I updated the case concerning the underlying problem, but there is not much progress there either: https://tracker.ceph.com/issues/46847#change-184710 . I had PG degradation even when using the recovery technique with before and after crush maps. I was just lucky that I lost only 1 shard per object and ordinary recovery could fix it.

Best regards,
=================
Frank Schilder
AIT Risø Campus
Bygning 109, rum S14

________________________________________
From: Michael Thomas <wart@xxxxxxxxxxx>
Sent: 21 December 2020 23:12:09
To: ceph-users@xxxxxxx
Subject:  Removing secondary data pool from mds

I have a cephfs secondary (non-root) data pool with unfound and degraded
objects that I have not been able to recover[1].  I created an
additional data pool and used 'setfattr -n ceph.dir.layout.pool' and a
very long rsync to move the files off of the degraded pool and onto the
new pool.  This has completed, and using find + 'getfattr -n
ceph.file.layout.pool', I verified that no files are using the old pool
anymore.  No ceph.dir.layout.pool attributes point to the old pool either.
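
The find + getfattr check described above could look roughly like the following. It is a sketch rather than the exact command that was run: the function name files_on_pool and the mount point in the usage line are hypothetical, and file names containing newlines are not handled.

```shell
# Print every regular file under a directory whose CephFS layout
# still points at the given data pool.
#   $1 = directory to scan, $2 = pool name
files_on_pool() {
    root=$1
    pool=$2
    find "$root" -type f | while IFS= read -r f; do
        # --only-values prints just the pool name, without the "name=" prefix
        current=$(getfattr --only-values --absolute-names \
                      -n ceph.file.layout.pool "$f" 2>/dev/null)
        if [ "$current" = "$pool" ]; then
            printf '%s\n' "$f"
        fi
    done
}

# Example use (the mount point /mnt/cephfs is an assumption):
#   files_on_pool /mnt/cephfs fs.data.archive.frames
```

Empty output means that no file under the scanned directory still references the old pool.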

However, the old pool still reports that there are objects in the old
pool, likely the same ones that were unfound/degraded from before:
https://pastebin.com/qzVA7eZr

Based on an old message from the mailing list[2], I checked the MDS for
stray objects (ceph daemon mds.ceph4 dump cache file.txt ; grep -i stray
file.txt) and found 36 stray entries in the cache:
https://pastebin.com/MHkpw3DV.  However, I'm not certain how to map
these stray cache objects to clients that may be accessing them.

'rados -p fs.data.archive.frames ls' shows 145 objects.  Looking at the
parent of each object shows 2 strays:

for obj in $(cat rados.ls.txt) ; do echo "$obj" ; rados -p fs.data.archive.frames getxattr "$obj" parent | strings ; done


[...]
10000020fa1.00000000
10000020fa1
stray6
10000020fbc.00000000
10000020fbc
stray6
[...]

...before getting stuck on one object for over 5 minutes (then I gave up):

1000005b1af.00000083
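
One way to shorten such a scan, sketched here under the assumption that object names follow the <inode>.<index> pattern seen in the listing above: CephFS keeps the 'parent' backtrace on the first object of a file (index 00000000), so it should be enough to query one object per inode. The function name first_objects is made up for illustration.

```shell
# Reduce a list of object names (<inode>.<index>) to one name per inode,
# pointing at the first object of each file.
first_objects() {
    sed 's/\..*$//' "$@" | sort -u | sed 's/$/.00000000/'
}

# Example use, with a per-object time limit so that a blocked read
# does not stall the whole loop (the 30 second limit is arbitrary):
#   for obj in $(first_objects rados.ls.txt) ; do
#       echo "$obj"
#       timeout 30 rados -p fs.data.archive.frames getxattr "$obj" parent | strings
#   done
```

The first object of a file may itself be missing or unfound, so a timeout or an error for it is still possible.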

What can I do to make sure this pool is ready to be safely deleted from
cephfs (ceph fs rm_data_pool archive fs.data.archive.frames)?

--Mike

[1]https://lists.ceph.io/hyperkitty/list/ceph-users@xxxxxxx/thread/QHFOGEKXK7VDNNSKR74BA6IIMGGIXBXA/#7YQ6SSTESM5LTFVLQK3FSYFW5FDXJ5CF

[2]http://lists.ceph.com/pipermail/ceph-users-ceph.com/2015-October/005233.html
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx
