Re: Snapshot trimming

I don't know why you keep asking the same question about snap trimming. You haven't shown any evidence that your cluster is behind on that. Have you looked into running fstrim inside your VMs?
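If it is space freed inside the guests that never shows up in the cluster, a rough guest-side check looks like this (just a sketch; it assumes the virtual disks are attached with discard/unmap enabled):

-------
# inside the VM: does the block device advertise discard support?
lsblk -D

# trim all mounted filesystems that support it
sudo fstrim -av
-------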


On Mon, Jan 29, 2018, 4:30 AM Karun Josy <karunjosy1@xxxxxxxxx> wrote:
The fast-diff feature is not enabled on the RBD images.
Could that be a reason why trimming is not happening?
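If it matters, this is roughly how I would check and enable it (a sketch only; the image name is a placeholder, and on existing images object-map/fast-diff also need exclusive-lock plus an object-map rebuild):

-------
# check which features an image has (image name is just an example)
rbd info vm/vm-disk-1 | grep features

# enable the features in dependency order
rbd feature enable vm/vm-disk-1 exclusive-lock
rbd feature enable vm/vm-disk-1 object-map fast-diff

# rebuild the object map so fast-diff data is valid for existing objects
rbd object-map rebuild vm/vm-disk-1
-------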

Karun Josy

On Sat, Jan 27, 2018 at 10:19 PM, Karun Josy <karunjosy1@xxxxxxxxx> wrote:
Hi David,

Thank you for your reply! I really appreciate it.

The images are in pool id 55. It is an erasure coded pool.

---------------
$ echo $(( $(ceph pg  55.58 query | grep snap_trimq | cut -d[ -f2 | cut -d] -f1 | tr ',' '\n' | wc -l) - 1 ))
0
$ echo $(( $(ceph pg  55.a query | grep snap_trimq | cut -d[ -f2 | cut -d] -f1 | tr ',' '\n' | wc -l) - 1 ))
0
$ echo $(( $(ceph pg  55.65 query | grep snap_trimq | cut -d[ -f2 | cut -d] -f1 | tr ',' '\n' | wc -l) - 1 ))
0
--------------

The current snap_trim_sleep value is the default:
"osd_snap_trim_sleep": "0.000000". I assume that means there is no delay between trim operations (I can't find any documentation on it).
Will changing its value initiate snap trimming, e.g.:
ceph tell osd.* injectargs '--osd_snap_trim_sleep 0.05'
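(For reference, one way to read the running value on an OSD, a sketch assuming the admin socket is available on the OSD host, with osd.0 as just an example daemon:

-------
ceph daemon osd.0 config get osd_snap_trim_sleep
-------
)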

Also, we are using an RBD user with the caps below; it is the user we use when deleting snapshots:
-------
        caps: [mon] profile rbd
        caps: [osd] profile rbd pool=ecpool, profile rbd pool=vm, profile rbd-read-only pool=templates
-------

Could that be a reason?

Also, can you let me know which logs to check while deleting snapshots, to see whether snap trimming is actually happening?
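Would something like this be the right way to look, temporarily raising the OSD debug level and grepping that OSD's log (a sketch; osd.12 and the log path are just the usual defaults)?

-------
# raise logging on one OSD temporarily
ceph tell osd.12 injectargs '--debug_osd 10'

# look for snap trim related messages in that OSD's log
grep -i trim /var/log/ceph/ceph-osd.12.log | tail

# put logging back to the default afterwards
ceph tell osd.12 injectargs '--debug_osd 1/5'
-------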
I am sorry, I feel like I am pestering you too much.
But I can see from the mailing lists that you have dealt with similar snapshot issues,
so I think you can help me figure this mess out.


Karun Josy

On Sat, Jan 27, 2018 at 7:15 PM, David Turner <drakonstein@xxxxxxxxx> wrote:

Prove* a positive


On Sat, Jan 27, 2018, 8:45 AM David Turner <drakonstein@xxxxxxxxx> wrote:

Unless you have things in your snap_trimq, your problem isn't snap trimming. That is currently how you can check snap trimming and you say you're caught up.

Are you certain that you are querying the correct pool for the images you are snapshotting? You showed that you tested four different pools; you should only need to check the pool that holds the images you are dealing with.
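If you want to double check which pool id maps to which pool name, and where a given image actually stores its data, something like this should do it (image and pool names are just examples):

-------
# list pool ids and names
ceph osd lspools

# for an image created with a separate data pool, rbd info shows it
rbd info vm/vm-disk-1 | grep -E 'data_pool|features'
-------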

You can inversely prove a positive by changing your snap trim settings to not do any cleanup and seeing whether the appropriate PGs then have anything in their queue.


On Sat, Jan 27, 2018, 12:06 AM Karun Josy <karunjosy1@xxxxxxxxx> wrote:
Are scrubbing and deep scrubbing necessary for the snaptrim operation to happen?

Karun Josy

On Fri, Jan 26, 2018 at 9:29 PM, Karun Josy <karunjosy1@xxxxxxxxx> wrote:
Thank you for your quick response!

I used the command to fetch the snap_trimq from many PGs; however, it seems they don't have anything in the queue.

For eg : 
====================
$ echo $(( $(ceph pg  55.4a query | grep snap_trimq | cut -d[ -f2 | cut -d] -f1 | tr ',' '\n' | wc -l) - 1 ))
0
$ echo $(( $(ceph pg  55.5a query | grep snap_trimq | cut -d[ -f2 | cut -d] -f1 | tr ',' '\n' | wc -l) - 1 ))
0
$ echo $(( $(ceph pg  55.88 query | grep snap_trimq | cut -d[ -f2 | cut -d] -f1 | tr ',' '\n' | wc -l) - 1 ))
0
$ echo $(( $(ceph pg  55.55 query | grep snap_trimq | cut -d[ -f2 | cut -d] -f1 | tr ',' '\n' | wc -l) - 1 ))
0
$ echo $(( $(ceph pg  54.a query | grep snap_trimq | cut -d[ -f2 | cut -d] -f1 | tr ',' '\n' | wc -l) - 1 ))
0
$ echo $(( $(ceph pg  34.1d query | grep snap_trimq | cut -d[ -f2 | cut -d] -f1 | tr ',' '\n' | wc -l) - 1 ))
0
$ echo $(( $(ceph pg  1.3f query | grep snap_trimq | cut -d[ -f2 | cut -d] -f1 | tr ',' '\n' | wc -l) - 1 ))
0
=====================


While going through the PG query output, I also noticed that these PGs have no entries in the purged_snaps section.
For eg : 
ceph pg  55.80 query
--
---
---
 {
            "peer": "83(3)",
            "pgid": "55.80s3",
            "last_update": "43360'15121927",
            "last_complete": "43345'15073146",
            "log_tail": "43335'15064480",
            "last_user_version": 15066124,
            "last_backfill": "MAX",
            "last_backfill_bitwise": 1,
            "purged_snaps": [],
            "history": {
                "epoch_created": 5950,
                "epoch_pool_created": 5950,
                "last_epoch_started": 43339,
                "last_interval_started": 43338,
                "last_epoch_clean": 43340,
                "last_interval_clean": 43338,
                "last_epoch_split": 0,
                "last_epoch_marked_full": 42032,
                "same_up_since": 43338,
                "same_interval_since": 43338,
                "same_primary_since": 43276,
                "last_scrub": "35299'13072533",
                "last_scrub_stamp": "2018-01-18 14:01:19.557972",
                "last_deep_scrub": "31372'12176860",
                "last_deep_scrub_stamp": "2018-01-15 12:21:17.025305",
                "last_clean_scrub_stamp": "2018-01-18 14:01:19.557972"
            },

Not sure if that is related.

The cluster is not open to any new clients; however, we see a steady growth in space usage every day.
In the worst case it could grow faster than we can add more space, which would be dangerous.

Any help is really appreciated. 

Karun Josy

On Fri, Jan 26, 2018 at 8:23 PM, David Turner <drakonstein@xxxxxxxxx> wrote:

"snap_trimq": "[]",

That is exactly what you're looking for to see how much a PG still has that needs to be cleaned up. I think something like this should give you the number of entries in the snap_trimq for a PG.

echo $(( $(ceph pg $pg query | grep snap_trimq | cut -d[ -f2 | cut -d] -f1 | tr ',' '\n' | wc -l) - 1 ))

Note, I'm not at a computer and I'm typing this from my phone, so it's not pretty and I know of a few ways to do it better, but it should work all the same.
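For instance, if you have jq available, something along these lines should be a little more robust (again untested from my phone, so treat it as a sketch):

-------
pg=55.77   # example PG id

# print the raw snap_trimq interval set, e.g. [] or [1~3,5~2]
ceph pg $pg query | jq -r '.snap_trimq'

# count the interval entries in it (prints 0 when the queue is empty)
ceph pg $pg query | jq -r '.snap_trimq' | tr -d '[]' | tr ',' '\n' | grep -c .
-------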

For your needs a visual inspection of several PGs should be sufficient to see if there is anything in the snap_trimq to begin with.


On Fri, Jan 26, 2018, 9:18 AM Karun Josy <karunjosy1@xxxxxxxxx> wrote:
 Hi David,

Thank you for the response. To be honest, I am afraid this is going to be an issue in our cluster.
It seems snaptrim has not been happening for some time now, maybe because we have been expanding the cluster and adding nodes for the past few weeks.

I would be really glad if you could guide me on how to overcome this.
The cluster has about 30 TB of data and 11 million objects, with about 100 disks spread across 16 nodes. The version is 12.2.2.
Searching through the mailing lists, I can see many cases where performance was affected while snaptrimming.

Can you help me figure out the following?

- How do I find the snaptrim queue of a PG?
- Can snaptrim be started on just one PG?
- How can I make sure cluster IO performance is not affected?
I read about osd_snap_trim_sleep; how can it be changed?
Is this the command: ceph tell osd.* injectargs '--osd_snap_trim_sleep 0.005'

If yes, what is the recommended value we should use?
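And if the value should persist across OSD restarts, is this the right place to set it (a sketch, assuming a plain ceph.conf based setup)?

-------
# ceph.conf on the OSD hosts
[osd]
osd snap trim sleep = 0.05
-------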

Also, which other parameters should we be concerned about? I would really appreciate any suggestions.


Below is a brief extract of a PG query:
----------------------------
ceph pg  55.77 query
{
    "state": "active+clean",
    "snap_trimq": "[]",
---
----

"pgid": "55.77s7",
            "last_update": "43353'17222404",
            "last_complete": "42773'16814984",
            "log_tail": "42763'16812644",
            "last_user_version": 16814144,
            "last_backfill": "MAX",
            "last_backfill_bitwise": 1,
            "purged_snaps": [],
            "history": {
                "epoch_created": 5950,
---
---
---


Karun Josy

On Fri, Jan 26, 2018 at 6:36 PM, David Turner <drakonstein@xxxxxxxxx> wrote:
You may find the information in this ML thread useful.  https://www.spinics.net/lists/ceph-users/msg41279.html

It talks about a couple of ways to track your snaptrim queue.

On Fri, Jan 26, 2018 at 2:09 AM Karun Josy <karunjosy1@xxxxxxxxx> wrote:
Hi,

We have set the noscrub and nodeep-scrub flags on a Ceph cluster.
When we delete snapshots, we are not seeing any change in space usage.

I understand that Ceph OSDs delete data asynchronously, so deleting a snapshot doesn't free up the disk space immediately. But we have not seen any change for some time.
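In case it helps, these are the kinds of checks we are running to watch usage and trimming state (a sketch; the exact pools and PG states may differ):

-------
# per-pool space usage
ceph df detail

# are any PGs currently in a snap trimming state?
ceph pg dump pgs_brief 2>/dev/null | grep -i snaptrim
-------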

What could be the possible reason? Any suggestions would be really helpful, as the cluster usage seems to grow each day even though snapshots are being deleted.


Karun 
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com





