Re: Disks are filling up

No, this is a cephadm setup, not Rook.

Over the last few days it has still been deep scrubbing and filling up. We have to do something about it, as it now impacts our K8s cluster (very slow cephfs access) and we are running out of (allocated) disk space again.
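
For context, these are the read-only checks we keep running to watch this; a minimal sketch, nothing here is specific to our cluster:

  # overall health, current client write rate, and how many PGs are scrubbing
  ceph -s

  # per-pool STORED vs. USED, to see which pool is actually growing
  ceph df detail

  # lists warnings such as "pgs not deep-scrubbed in time" and nearfull OSDs
  ceph health detail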

Some more details, now that I have had a few more days to think about our particular setup:

* This is a setup with ESXi/vSphere virtualization. The Ceph nodes are just VMs; we don't have access to the bare servers or even direct access to the HDDs/SSDs Ceph runs on.
* The setup is "asymmetric": there are two nodes on SSDs and one on HDDs (they are all RAIDx behind hardware controllers, but we have no say in this). I labeled all OSDs as HDDs (even when VMware reported SSD).
* We looked at the OSDs' device usage and it is 100% (from the VMs' point of view) for the HDDs, versus about 20% on average for the SSD nodes (the commands we used for this are sketched below).
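
For anyone who wants to reproduce that check, this is roughly how we look at per-OSD usage and device classes; a minimal sketch, and osd.3 is only a placeholder ID:

  # %USE per OSD, together with the CRUSH tree and the device class column
  ceph osd df tree

  # relabel one OSD's device class (the old class has to be removed first)
  ceph osd crush rm-device-class osd.3
  ceph osd crush set-device-class hdd osd.3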

My suspicion is:

* Deep scrubbing means every new write goes to unallocated space, with no overwriting/deleting while the deep scrub runs. I didn't find this in the docs; maybe I missed it, or maybe it is common wisdom among the initiated.
* We write more new data per second to cephfs than can be scrubbed, so the scrub never finishes and the PGs fill up (some checks to confirm this are sketched below).
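
To see whether that second point actually holds, we compare the client write rate (the io: line in ceph -s) against the deep-scrub backlog; a rough sketch, where the grep patterns only match the usual PG state and health warning strings:

  # number of PGs currently in a deep scrub
  ceph pg dump pgs_brief | grep -c 'scrubbing+deep'

  # PGs whose deep scrub is overdue show up as a health warning
  ceph health detail | grep -i 'deep-scrub'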

We have now ordered SSDs for the HDD-only node to prevent this in the future.
Meanwhile we need to do something, so we are thinking about moving the data in cephfs to a new PG that does not need deep scrubbing at the moment (a sketch of how that could look is below). We are also thinking about moving the OSD from the physical host that only has HDDs to one with SSDs, ruining redundancy for a short while and hoping for the best.
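
If "a new PG" ends up meaning a separate CephFS data pool, this is roughly the shape it could take; only a sketch, the pool names, filesystem name, directory and PG count below are made up, and a changed layout only affects newly created files (existing files would have to be copied over):

  # create a new data pool and attach it to the filesystem
  ceph osd pool create cephfs_data_new 32
  ceph fs add_data_pool <fsname> cephfs_data_new

  # on a client mount: send new files under this directory to the new pool
  setfattr -n ceph.dir.layout.pool -v cephfs_data_new /mnt/cephfs/somedir

  # a less drastic interim option: disable deep scrubs on the old data pool only
  ceph osd pool set cephfs_data nodeep-scrub 1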

On 26.04.2023 at 02:28, A Asraoui wrote:

Omar, glad to see cephfs with Kubernetes up and running. Did you guys use Rook to deploy this?

Abdelillah
On Mon, Apr 24, 2023 at 6:56 AM Omar Siam <Omar.Siam@xxxxxxxxxx> wrote:

    Hi list,

    we created a cluster for using cephfs with a Kubernetes cluster.
    For a few weeks now the cluster has kept filling up at an alarming
    rate (100 GB per day).
    This is while the most relevant PG is deep scrubbing and was
    interrupted a few times.

    We use about 150 GB (du on the mounted filesystem) on the cephfs
    filesystem and try not to use snapshots (.snap directories "exist"
    but are empty).
    We do not understand why the PGs get bigger and bigger while cephfs
    stays about the same size (overwrites on files certainly happen).
    I suspect some snapshot mechanism. Any ideas how to debug this and
    stop it?

    Maybe we should try to speed up the deep scrubbing somehow?
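
On that last quoted question about speeding up deep scrubbing: these are the scrub throttles we would look at first; a sketch with example values and an example PG ID, not settings we have tested yet:

  # allow more than one concurrent scrub per OSD
  ceph config set osd osd_max_scrubs 2

  # let scrubs start even when the host load average is above the default threshold
  ceph config set osd osd_scrub_load_threshold 5

  # manually kick off a deep scrub of a single PG
  ceph pg deep-scrub 2.1f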

Best regards

--
Mag. Ing. Omar Siam
Austrian Center for Digital Humanities and Cultural Heritage
Österreichische Akademie der Wissenschaften | Austrian Academy of Sciences
Stellvertretende Behindertenvertrauensperson | Deputy representative for disabled persons
Bäckerstraße 13, 1010 Wien, Österreich | Vienna, Austria
T: +43 1 51581-7295
omar.siam@xxxxxxxxxx | www.oeaw.ac.at/acdh
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx



