Re: CephFS space usage

Alexander,

Thank you, but as I said to Igor: The 5.5TB of files on this filesystem are virtual machine disks. They are under constant, heavy write load. There is no way to turn this off.

On 19/03/2024 9:36 pm, Alexander E. Patrakov wrote:
Hello Thorne,

Here is one more suggestion on how to debug this. Right now, there is
uncertainty on whether there is really a disk space leak or if
something simply wrote new data during the test.

If you have at least three OSDs you can reassign, please set their
CRUSH device class to something different than before. E.g., "test".
Then, create a new pool that targets this device class and add it to
CephFS. Then, create an empty directory on CephFS and assign this pool
to it using setfattr. Finally, try reproducing the issue using only
files in this directory. This way, you will be sure that nobody else
is writing any data to the new pool.
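A sketch of those steps might look like the following. The device class ("test"), pool, rule, filesystem, directory names, and OSD ids here are all assumptions; adjust them for your cluster:

```shell
# 1. Move three spare OSDs into a dedicated device class
ceph osd crush rm-device-class osd.10 osd.11 osd.12
ceph osd crush set-device-class test osd.10 osd.11 osd.12

# 2. Create a CRUSH rule and a pool that target only that class
ceph osd crush rule create-replicated test_rule default host test
ceph osd pool create cephfs_test 32 32 replicated test_rule
ceph osd pool set cephfs_test size 3

# 3. Add the pool to the filesystem and pin an empty directory to it
ceph fs add_data_pool cephfs cephfs_test
mkdir /mnt/cephfs/leaktest
setfattr -n ceph.dir.layout.pool -v cephfs_test /mnt/cephfs/leaktest

# 4. Reproduce the test under /mnt/cephfs/leaktest and watch only
#    the new pool's counters:
ceph df detail | grep cephfs_test
```

Since nothing else targets the new pool, any growth in its bytes_used after the files are deleted would point at a genuine leak rather than concurrent writes.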

On Tue, Mar 19, 2024 at 5:40 PM Igor Fedotov <igor.fedotov@xxxxxxxx> wrote:
Hi Thorn,

Given the number of files on the CephFS volume, I presume you don't have
a severe write load against it. Is that correct?

If so, we can assume that the numbers you're sharing mostly reflect your
experiment. At peak I can see a bytes_used increase of 629,461,893,120
bytes (45978612027392 - 45349150134272). With a replication factor of 3,
this roughly matches the amount of data you wrote (200GB, I presume?).


More interesting is that after the file's removal we can still see a
delta of 419,450,880 bytes (45349569585152 - 45349150134272). Apart from
the possibility that someone else wrote additional data to CephFS during
the experiment, I can see two options to explain this:

1. File removal hadn't completed by the last probe, half an hour after
the file's removal. Did you see a stale object count when making that probe?

2. Some space is leaking. If that's the case, it could be the cause of
your issue, given that huge(?) files on CephFS are created and removed
periodically. So if we're certain that a leak really occurred (and option
1 above isn't the case), it makes sense to run more experiments that
write and remove a bunch of huge files on the volume to confirm the
space leakage.
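For reference, the two deltas quoted above can be verified directly from the probe values in Thorne's measurements. A minimal check (replication factor 3 assumed, per the thread):

```python
# bytes_used probes from Thorne's measurements of the data pool
before_create = 45_349_150_134_272  # before creating the file
after_delete  = 45_978_612_027_392  # immediately after deleting it
half_hour     = 45_349_569_585_152  # half an hour after deletion

# Peak growth, and what it implies at replication size=3:
peak = after_delete - before_create
print(f"peak delta: {peak:,} bytes")              # 629,461,893,120
print(f"user data:  {peak / 3 / 2**30:.1f} GiB")  # ~195 GiB, i.e. roughly 200 GB

# Space still unaccounted for half an hour after deletion:
print(f"residual:   {half_hour - before_create:,} bytes")  # 419,450,880
```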

On 3/18/2024 3:12 AM, Thorne Lawler wrote:
Thanks Igor,

I have tried that, and the number of objects and bytes_used took a
long time to drop, but they seem to have dropped back to almost the
original level:

   * Before creating the file:
       o 3885835 objects
       o 45349150134272 bytes_used
   * After creating the file:
       o 3931663 objects
       o 45924147249152 bytes_used
   * Immediately after deleting the file:
       o 3935995 objects
       o 45978612027392 bytes_used
   * Half an hour after deleting the file:
       o 3886013 objects
       o 45349569585152 bytes_used

Unfortunately, this is all production infrastructure, so there is
always other activity taking place.

What tools are there to visually inspect the object map and see how it
relates to the filesystem?

Not sure if there is anything like that at the CephFS level, but you can
use the rados tool to view the objects in the CephFS data pool and try to
build a mapping between them and the CephFS file list. Could be a bit
tricky though.
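A rough sketch of that mapping (pool name, mount point, and MDS rank here are assumptions). CephFS stripes each file into data-pool objects named <inode-hex>.<block-index>, so the inode number links the two sides:

```shell
# List objects in the data pool and keep the unique inode prefixes:
rados -p cephfs_data ls | cut -d. -f1 | sort -u > /tmp/pool_inodes

# A file's objects are named after its inode number, in hex:
printf '%x\n' "$(stat -c %i /mnt/cephfs/some/file)"

# Going the other way, the MDS can resolve an inode back to a path:
ceph tell mds.0 dump inode 1099511627776   # decimal inode number
```

Inodes that appear in the pool listing but cannot be resolved to any path would be candidates for leaked space.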
On 15/03/2024 7:18 pm, Igor Fedotov wrote:
ceph df detail --format json-pretty
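For anyone following along, the per-pool counters quoted in this thread can be pulled out of that JSON with jq (the pool name "cephfs_data" is an assumption):

```shell
ceph df detail --format json-pretty |
  jq '.pools[] | select(.name == "cephfs_data")
      | {objects: .stats.objects, bytes_used: .stats.bytes_used}'
```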

--
Igor Fedotov
Ceph Lead Developer

Looking for help with your Ceph cluster? Contact us at https://croit.io

croit GmbH, Freseniusstr. 31h, 81247 Munich
CEO: Martin Verges - VAT-ID: DE310638492
Com. register: Amtsgericht Munich HRB 231263
Web: https://croit.io | YouTube: https://goo.gl/PGE1Bx


--

Regards,

Thorne Lawler - Senior System Administrator
*DDNS* | ABN 76 088 607 265
First registrar certified ISO 27001-2013 Data Security Standard ITGOV40172
P +61 499 449 170

Please note: The information contained in this email message and any attached files may be confidential information, and may also be the subject of legal professional privilege. If you are not the intended recipient, any use, disclosure or copying of this email is unauthorised. If you received this email in error, please notify Discount Domain Name Services Pty Ltd on 03 9815 6868 to report this matter and delete all copies of this transmission together with any attachments.
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx



