errata: con-fs2-meta2 is the default data pool.

=================
Frank Schilder
AIT Risø Campus
Bygning 109, rum S14

________________________________________
From: Frank Schilder
Sent: 03 February 2020 10:08
To: Patrick Donnelly; Konstantin Shalygin
Cc: ceph-users
Subject: Re: ceph fs dir-layouts and sub-directory mounts

Dear Konstantin and Patrick,

thanks! I started migrating a ceph fs with a 2-pool layout (replicated metadata, EC default data) to a 3-pool layout (replicated metadata, replicated default data, EC data pool set as a dir layout at "/") and use sub-directory mounts for the data migration (a rough sketch of the commands is in the P.S. below). So far, everything works as it should.

Maybe some background info for everyone who is reading this. The reason for migrating is the change in the recommended best practices for cephfs; compare these two:

https://docs.ceph.com/docs/mimic/cephfs/createfs/#creating-pools
https://docs.ceph.com/docs/master/cephfs/createfs/#creating-pools

The 3-pool layout was never mentioned in the RH ceph course I took, nor by any of the ceph consultants we hired before deploying ceph. However, it seems really important to know about it. For a plain metadata + data pool layout, an EC default data pool seems like a bad idea most of the time, because some metadata is also written to the default data pool. I see a lot of size-0 objects that only store rados metadata:

POOLS:
    NAME            ID  USED     %USED  MAX AVAIL  OBJECTS
    con-fs2-meta1   12  256 MiB  0.02   1.1 TiB    410910
    con-fs2-meta2   13  0 B      0      355 TiB    5217644
    con-fs2-data    14  50 TiB   5.53   852 TiB    17943209

con-fs2-meta2 is the default data pool. This is probably the worst possible workload for an EC pool. On our file system I regularly saw the health warning "1 MDSs report slow metadata IOs" and was always wondering where it came from. The metadata pool is on SSDs, so this warning simply didn't make any sense. Now I know.

Having a small replicated default data pool not only resolves this issue, it also speeds up file create/delete and hard-link operations dramatically; I guess anything that modifies an inode benefits. I never tested these operations in my benchmarks, but they are important. Anything with a heavy create/modify/delete workload, for example compiling and installing packages, will profit, and so will cluster health.

Fortunately, I had an opportunity to migrate the ceph fs. For anyone starting from scratch, I would recommend the 3-pool layout right from the beginning. Never use an EC pool as the default data pool. I would even make this statement a bit stronger in the ceph documentation, from

    If erasure-coded pools are planned for the file system, it is usually better to use a replicated pool for the default data pool ...

to, for example,

    If erasure-coded pools are planned for the file system, it is strongly recommended to use a replicated pool for the default data pool ...

Best regards,

=================
Frank Schilder
AIT Risø Campus
Bygning 109, rum S14
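
P.S: For anyone setting this up from scratch, here is a rough sketch of the commands for the 3-pool layout. The pool names are ours; the PG counts, the EC profile (k=4, m=2) and the mount point /mnt/con-fs2 are just examples, adapt them to your cluster:

    # replicated pools for metadata and default data, EC pool for the bulk data
    ceph osd erasure-code-profile set con-fs2-ec k=4 m=2
    ceph osd pool create con-fs2-meta1 128
    ceph osd pool create con-fs2-meta2 128
    ceph osd pool create con-fs2-data 1024 1024 erasure con-fs2-ec
    ceph osd pool set con-fs2-data allow_ec_overwrites true   # required for cephfs data on EC

    # file system with the small replicated pool as the default data pool
    ceph fs new con-fs2 con-fs2-meta1 con-fs2-meta2
    ceph fs add_data_pool con-fs2 con-fs2-data

    # with the fs mounted at /mnt/con-fs2, a dir layout on the root
    # sends all file data to the EC pool
    setfattr -n ceph.dir.layout.pool -v con-fs2-data /mnt/con-fs2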
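
The sub-directory mounts for the data migration are ordinary cephfs mounts of a path below "/". With the kernel client it looks like this (mon address, client name and paths again just examples):

    # mount only /shares from the file system
    mount -t ceph mon1:6789:/shares /mnt/shares -o name=fs-client,secretfile=/etc/ceph/fs-client.secret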