Re: Forever growing data in ceph using RBD image

On Thu, 17 Jul 2014, Alphe Salas wrote:
> Hello,
> I would like to know if there is something planned to correct the
> "forever growing" effect when using an rbd image.
> My experience shows that the replicas of an rbd image are never
> discarded and never overwritten. Let's say my physical storage is about
> 30 TB and I make an image of 13 TB (half the real space, minus headroom
> for failed OSDs). My experience shows that once I have written the rbd
> image and topped out the 13 TB, I get 26 TB of real space used (replicas
> set to 2); if I then delete 8 TB of those 13 TB, the real space used is
> unchanged.
> If I write back 4 TB then ceph collapses: it is nearfull and I have to
> go buy another 30 TB and integrate it into my cluster to hold off the
> problem. But soon I still have in my ceph more useless replicas of
> "deleted" data than useful data with its replicas.
> 
> Usually when I talk to the dev team about this problem they tell me
> that the real problem is the lack of trim in XFS, but my own analysis
> shows that the real problem is ceph's internal way of handling data. It
> is ceph that never discards any replicas and never "cleans" itself to
> keep only the data in use.

You are correct that if XFS (or whatever FS you are using) does not issue 
discard/trim, then deleting data inside the fs on top of RBD won't free 
any space.  Note that you usually have to explicitly enable this via a 
mount option; most (all?) kernels still leave this off by default.
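(For reference, a minimal sketch of one way to get those discards issued,
assuming util-linux's fstrim is available, the image is mapped via the
kernel rbd driver with discard support, and the filesystem is mounted at
/mnt/rbd -- the path is just a placeholder. Running fstrim periodically
is an alternative to mounting with "-o discard".)

# Minimal sketch: ask the filesystem to issue discards for its unused
# blocks, which the rbd layer can then turn into freed objects on the OSDs.
# Assumptions: fstrim installed, fs-on-rbd mounted at /mnt/rbd (placeholder).
import subprocess

def trim(mountpoint="/mnt/rbd"):
    subprocess.run(["fstrim", "-v", mountpoint], check=True)

if __name__ == "__main__":
    trim()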

Are you taking RBD snapshots?  If not, then there will never be more than 
the rbd image size * num_replicas space used (ignoring the few % of file 
system overhead for the moment).
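(Purely illustrative arithmetic with the numbers from this thread:)

# Without snapshots, raw usage is bounded by image size * num_replicas
# (ignoring the few percent of filesystem/OSD overhead mentioned above).
TB = 10**12

def max_raw_bytes(image_bytes, num_replicas=2):
    return image_bytes * num_replicas

print(max_raw_bytes(13 * TB) / TB)  # -> 26.0, i.e. the 26 TB reported above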

If you are taking snapshots, then yes: you will see more space used until 
the snapshot is deleted, because we will keep old copies of objects around.

> If ceph were behaving properly, then with replicas set to 2 I would
> have my 13 TB rbd image, the corresponding 13 TB of replicas, and a
> fixed 26 TB of overall used space. When I "free" data in the rbd image,
> the corresponding replicas would be considered discarded by ceph, and
> when real data in the rbd image is overwritten, the corresponding
> replicas would be overwritten too with the new data. That would keep
> the overall space used fixed.

Both ceph *and* the file system on top of RBD have to be "behaving 
properly".  RBD can't free space until it is told to do so by the file 
system, and by default, most/all do not...
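
(One way to check whether discards are actually reaching the image is to
sum its allocated extents and see whether the number drops after deleting
files and trimming. A rough sketch, assuming an image named "myimage" in
pool "rbd" -- both placeholders -- and that "rbd diff" with JSON output is
available:)

# Rough sketch: sum the allocated extents of an rbd image.
import json
import subprocess

def allocated_bytes(pool="rbd", image="myimage"):
    out = subprocess.run(
        ["rbd", "diff", "%s/%s" % (pool, image), "--format", "json"],
        check=True, capture_output=True, text=True).stdout
    return sum(extent["length"] for extent in json.loads(out))

print(allocated_bytes() / 10**12, "TB allocated")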

sage
