Re: CephFS space usage


 



Hello Thorne,

Here is one more suggestion for debugging this. Right now it is unclear
whether there is really a disk space leak or whether something else
simply wrote new data during the test.

If you have at least three OSDs you can reassign, set their CRUSH
device class to something different from before, e.g. "test". Then
create a new pool that targets this device class and add it to CephFS.
Next, create an empty directory on CephFS and assign the new pool to
it using setfattr. Finally, try to reproduce the issue using only
files in that directory. This way you can be sure that nobody else is
writing data to the new pool.
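The steps above might look like this. This is only a sketch: the OSD
ids, the pool name "cephfs_test", the filesystem name "cephfs", the PG
count, and the mount point are placeholders for your environment, and a
replicated pool is assumed.

```shell
# Move three spare OSDs into a dedicated device class "test"
for osd in 10 11 12; do
  ceph osd crush rm-device-class osd.$osd        # clear the existing class first
  ceph osd crush set-device-class test osd.$osd
done

# CRUSH rule that only selects OSDs of class "test"
ceph osd crush rule create-replicated test_rule default host test

# New data pool bound to that rule, added to the filesystem
ceph osd pool create cephfs_test 32 32 replicated test_rule
ceph osd pool application enable cephfs_test cephfs
ceph fs add_data_pool cephfs cephfs_test

# Pin a fresh directory to the new pool via the layout xattr
mkdir /mnt/cephfs/leaktest
setfattr -n ceph.dir.layout.pool -v cephfs_test /mnt/cephfs/leaktest
```

Files created under that directory will then store all their data
objects in cephfs_test, so `ceph df` for that pool reflects only your
experiment.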

On Tue, Mar 19, 2024 at 5:40 PM Igor Fedotov <igor.fedotov@xxxxxxxx> wrote:
>
> Hi Thorn,
>
> given the number of files on the CephFS volume, I presume you don't
> have a severe write load against it. Is that correct?
>
> If so, we can assume that the numbers you're sharing mostly refer to
> your experiment. At peak I can see a bytes_used increase of
> 629,461,893,120 bytes (45978612027392 - 45349150134272). With a
> replica factor of 3 this roughly matches your written data (200 GB, I
> presume?).
>
>
> More interesting is that after the file's removal we can still see a
> delta of 419,450,880 bytes (= 45349569585152 - 45349150134272). I can
> see two options (apart from someone else writing additional data to
> CephFS during the experiment) to explain this:
>
> 1. File removal wasn't complete at the last probe, half an hour after
> the file's removal. Did you see a stale object count when taking that
> probe?
>
> 2. Some space is leaking. If that's the case, it could be the cause of
> your issue if huge(?) files on CephFS are created and removed
> periodically. So if we're certain that a leak really occurred (and
> option 1 above isn't the case), it makes sense to run more experiments
> writing and removing a bunch of huge files on the volume to confirm
> the leakage.
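For reference, both deltas Igor quotes can be re-derived from the probe
numbers Thorne posted below, assuming the stated replica factor of 3:

```shell
before=45349150134272   # bytes_used before creating the file
peak=45978612027392     # bytes_used immediately after deleting the file (the peak)
after=45349569585152    # bytes_used half an hour after deletion

write_delta=$((peak - before))   # raw bytes consumed at peak
residue=$((after - before))      # raw bytes still unaccounted for

echo "written: $write_delta raw, $((write_delta / 3)) logical"
echo "residue: $residue raw, $((residue / 3)) logical"
```

This gives 629,461,893,120 raw / ~209.8 GB logical for the write (close
to the 200 GB file, as Igor notes) and 419,450,880 raw / ~140 MB logical
for the residue after deletion.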
>
> On 3/18/2024 3:12 AM, Thorne Lawler wrote:
> >
> > Thanks Igor,
> >
> > I have tried that, and the number of objects and bytes_used took a
> > long time to drop, but they seem to have dropped back to almost the
> > original level:
> >
> >   * Before creating the file:
> >       o 3885835 objects
> >       o 45349150134272 bytes_used
> >   * After creating the file:
> >       o 3931663 objects
> >       o 45924147249152 bytes_used
> >   * Immediately after deleting the file:
> >       o 3935995 objects
> >       o 45978612027392 bytes_used
> >   * Half an hour after deleting the file:
> >       o 3886013 objects
> >       o 45349569585152 bytes_used
> >
> > Unfortunately, this is all production infrastructure, so there is
> > always other activity taking place.
> >
> > What tools are there to visually inspect the object map and see how it
> > relates to the filesystem?
> >
> Not sure if there is anything like that at the CephFS level, but you
> can use the rados tool to list the objects in the CephFS data pool and
> try to build a mapping between them and the CephFS file list. Could be
> a bit tricky, though.
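A sketch of that mapping. It relies on CephFS naming its data objects
`<inode-hex>.<block-index>`; the pool name "cephfs_data", the mount
point, and the example object prefix are placeholders for your setup.

```shell
# Collect the distinct inode prefixes present in the data pool
rados -p cephfs_data ls | cut -d. -f1 | sort -u > /tmp/inodes.hex

# Forward direction: from a file to its objects.
# A file's objects are named <inode-in-hex>.00000000, .00000001, ...
ino=$(stat -c %i /mnt/cephfs/some/file)
printf '%x\n' "$ino"

# Reverse direction: from an object prefix back to the owning file.
# "10000001234" is a placeholder inode prefix taken from /tmp/inodes.hex.
find /mnt/cephfs -inum $((16#10000001234)) 2>/dev/null
```

Prefixes in /tmp/inodes.hex that no `find -inum` can resolve would be
candidates for leaked objects, i.e. data no longer referenced by any
file.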
> >
> > On 15/03/2024 7:18 pm, Igor Fedotov wrote:
> >> ceph df detail --format json-pretty
> > --
> >
> > Regards,
> >
> > Thorne Lawler - Senior System Administrator
> > *DDNS* | ABN 76 088 607 265
> > First registrar certified ISO 27001-2013 Data Security Standard ITGOV40172
> > P +61 499 449 170
> >
> > _DDNS
> >
> > /_*Please note:* The information contained in this email message and
> > any attached files may be confidential information, and may also be
> > the subject of legal professional privilege. _If you are not the
> > intended recipient any use, disclosure or copying of this email is
> > unauthorised. _If you received this email in error, please notify
> > Discount Domain Name Services Pty Ltd on 03 9815 6868 to report this
> > matter and delete all copies of this transmission together with any
> > attachments. /
> >
> --
> Igor Fedotov
> Ceph Lead Developer
>
> Looking for help with your Ceph cluster? Contact us at https://croit.io
>
> croit GmbH, Freseniusstr. 31h, 81247 Munich
> CEO: Martin Verges - VAT-ID: DE310638492
> Com. register: Amtsgericht Munich HRB 231263
> Web: https://croit.io | YouTube: https://goo.gl/PGE1Bx



-- 
Alexander E. Patrakov
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx



