Forever growing data in ceph using RBD image

Hello,
I would like to know whether something is planned to correct the "forever growing" effect when using RBD images. My experience shows that the replicas backing an RBD image are never discarded and never reclaimed. Let's say my physical capacity is about 30 TB and I create a 13 TB image (roughly half the real space, with some headroom of about 25% kept to tolerate failed OSDs). Once I have filled the 13 TB, I see 26 TB of real space used (replica count set to 2). If I then delete 8 TB out of those 13 TB, the real space used stays unchanged. If I write 4 TB back, Ceph collapses into the near-full state and I have to go buy another 30 TB and integrate it into my cluster just to contain the problem. And soon my cluster holds more useless replicas of "deleted" data than useful data with its replicas.
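To make the numbers concrete, here is a rough sketch (not Ceph code, just a toy model of the accounting as I observe it, assuming replication 2, a 30 TB raw pool, no discard so deletes free nothing, and rewrites that may land on previously untouched objects):

# Toy model of the behaviour I observe: deletes release nothing and
# every write to fresh blocks costs logical_size * REPLICAS of raw space.
RAW_CAPACITY_TB = 30
REPLICAS = 2

raw_used_tb = 0.0

def write_new(logical_tb):
    """Data landing on fresh blocks consumes logical_tb * REPLICAS raw."""
    global raw_used_tb
    raw_used_tb += logical_tb * REPLICAS
    print(f"wrote {logical_tb} TB -> {raw_used_tb} TB raw used of {RAW_CAPACITY_TB} TB")

def delete(logical_tb):
    """Without discard, nothing comes back to the cluster."""
    print(f"deleted {logical_tb} TB -> still {raw_used_tb} TB raw used")

write_new(13)   # fill the 13 TB image once: 26 TB raw
delete(8)       # free 8 TB inside the filesystem: still 26 TB raw
write_new(4)    # write 4 TB back: 34 TB raw wanted, more than the 30 TB I have

That last write is exactly where the cluster goes near-full in practice.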

Usually when I raise this problem with the dev team, they tell me that the real problem is the lack of trim/discard support in XFS, but my own analysis shows that the real problem is Ceph's internal way of handling data. It is Ceph that never discards any replicas and never "cleans" itself to keep only the data actually in use.

If Ceph behaved properly, then with a replica count of 2 I would have my 13 TB RBD image, the corresponding 13 TB of replicas, and a fixed 26 TB of overall used space. When I freed data in the RBD image, the corresponding replicas would be treated as discarded by Ceph, and when data in the RBD image was overwritten, the corresponding replicas would be overwritten with the new data as well. The overall space used would then stay fixed.
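For comparison, here is the same toy model with the behaviour I would expect if frees were propagated as discards and overwrites reused the existing replicas (again just a sketch, not Ceph code):

REPLICAS = 2
IMAGE_TB = 13

live_tb = 0.0   # logical data currently live inside the image

def write(logical_tb):
    global live_tb
    live_tb = min(IMAGE_TB, live_tb + logical_tb)   # overwrites reuse space
    print(f"{live_tb * REPLICAS} TB raw used (never more than {IMAGE_TB * REPLICAS} TB)")

def free(logical_tb):
    global live_tb
    live_tb = max(0.0, live_tb - logical_tb)        # discard releases the replicas
    print(f"{live_tb * REPLICAS} TB raw used")

write(13)   # 26 TB raw
free(8)     # 10 TB raw: the 16 TB of stale replicas go back to the pool
write(4)    # 18 TB raw, comfortably inside my 30 TB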

In case of the failure of 2 OSDs, the cluster would then have just enough space to re-replicate the missing data, which is currently not
the case in an environment that has reached the near-full state.
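A back-of-the-envelope check of that claim, using an assumed topology of 30 OSDs of 1 TB each (the exact layout does not matter much, only the ratios do):

# Can the surviving OSDs re-replicate the data of 2 failed OSDs?
# Assumed topology: 30 OSDs of 1 TB each, replication 2.
OSD_COUNT = 30
OSD_SIZE_TB = 1.0
FAILED = 2

def can_recover(raw_used_tb):
    per_osd = raw_used_tb / OSD_COUNT                        # data held by each OSD
    lost = per_osd * FAILED                                  # replicas to recreate
    survivors_hold = raw_used_tb - lost                      # data still present
    surviving_capacity = (OSD_COUNT - FAILED) * OSD_SIZE_TB  # 28 TB
    return survivors_hold + lost <= surviving_capacity

print(can_recover(26))   # fixed 26 TB usage: True, recovery fits
print(can_recover(29))   # near-full runaway usage: False, no room to recover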



So my question is: what is planned to correct this problem?

Best regards,

--
Alphe Salas
I.T. engineer
--



