Re: Ceph cluster uses substantially more disk space after rebalancing

Hi again.

It seems I've found the problem, although I don't understand the root cause.

I looked into the OSD datastore using ceph-objectstore-tool and saw that almost every object exists in two copies, like:

2#13:080008d8:::rbd_data.15.3d3e1d6b8b4567.0000000000361a96:28#
2#13:080008d8:::rbd_data.15.3d3e1d6b8b4567.0000000000361a96:head#

Even more interesting is the fact that these two copies are identical (!).

So the extra space is being consumed by unneeded snapshot copies.

rbd_data.15.3d3e1d6b8b4567 is the prefix of the biggest (14 TiB) base image we have. This image has one snapshot:

[root@sill-01 ~]# rbd info rpool_hdd/rms-201807-golden
rbd image 'rms-201807-golden':
        size 14 TiB in 3670016 objects
        order 22 (4 MiB objects)
        id: 3d3e1d6b8b4567
        data_pool: ecpool_hdd
        block_name_prefix: rbd_data.15.3d3e1d6b8b4567
        format: 2
        features: layering, exclusive-lock, object-map, fast-diff, deep-flatten, data-pool
        op_features:
        flags:
        create_timestamp: Tue Aug  7 13:00:10 2018
[root@sill-01 ~]# rbd snap ls rpool_hdd/rms-201807-golden
SNAPID NAME      SIZE TIMESTAMP
    37 initial 14 TiB Tue Aug 14 12:42:48 2018
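As a quick sanity check of the rbd info output above (plain arithmetic, nothing Ceph-specific): a 14 TiB image at order 22, i.e. 4 MiB (2^22-byte) objects, should map to exactly the reported object count:

```shell
# 14 TiB = 14 * 2^40 bytes; divided by 4 MiB (2^22 bytes) per object
echo $(( 14 * 2**40 / 2**22 ))   # 3670016, matching "size 14 TiB in 3670016 objects"
```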

The problem is that this image has NEVER been written to since it was imported into Ceph via RBD. All writes go only to its clones.

So I have 2... no, 5 questions:

1) Why is the base image's snapshot "provisioned" when the image is never written to? Could this be related to `rbd snap revert`? (i.e. does rbd snap revert simply copy all of the snapshot's data back into the image itself?)

2) If all parent snapshots really are forcefully provisioned on write: is there a way to disable this behaviour? For example, if I make the base image read-only, will its snapshots stop being "provisioned"?

3) Even if there is no way to disable it: why does Ceph create an extra copy of identical snapshot data during rebalancing?

4) What is the ":28" suffix in the RADOS object names? The snapshot id is 37, and even reading 28 as hex gives 0x28 = 40, not 37. Or does the RADOS snapshot id not have to equal the RBD snapshot id?

5) Is it safe to "unprovision" the snapshot (for example, by doing `rbd snap revert`)?
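The mismatch behind question 4 can be checked in a shell (plain base conversion, no Ceph involved):

```shell
# The ":28" suffix interpreted as hex is 40 decimal...
printf '0x28 = %d\n' 0x28    # 0x28 = 40
# ...while RBD snapshot id 37 would render as 0x25 in hex.
printf '37 = 0x%x\n' 37      # 37 = 0x25
```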
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


