Dear Cephalopodians,

I have to extend my question a bit - in our system with 105,000,000 objects in CephFS
(mostly stabilized now after the stress-testing...), I observe the following data
distribution for the metadata pool:

# ceph osd df | head
ID CLASS WEIGHT  REWEIGHT SIZE USE    AVAIL %USE  VAR  PGS
 0   ssd 0.21829  1.00000 223G  9927M  213G  4.34 0.79   0
 1   ssd 0.21829  1.00000 223G  9928M  213G  4.34 0.79   0
 2   ssd 0.21819  1.00000 223G 77179M  148G 33.73 6.11 128
 3   ssd 0.21819  1.00000 223G 76981M  148G 33.64 6.10 128

osd.0 - osd.3 are all exclusively meant for cephfs-metadata; currently we use 4 replicas
with failure domain OSD there.

I reinstalled and reformatted osd.0 and osd.1 about 36 hours ago. All 128 PGs in the
metadata pool are backfilling (I have increased osd-max-backfills temporarily to speed
things up for those OSDs). However, they have only managed to backfill < 10 GB in those
36 hours. I have not touched any of the other default settings concerning backfill or
recovery (but these are SSDs, so the sleeps should be 0); the exact commands I am using
to check and bump these settings are appended below the quoted message. The backfilling
does not seem to be limited by CPU, network, or disks. "ceph -s" confirms a backfill
performance of about 60-100 keys/s.

This metadata, as written before, is almost exclusively RocksDB:
    "bluefs": {
        "gift_bytes": 0,
        "reclaim_bytes": 0,
        "db_total_bytes": 84760592384,
        "db_used_bytes": 77289488384,

Is it normal that this kind of backfilling is so horrendously slow? Is there a way to
speed it up? At this rate, it will take almost two weeks for 77 GB of (meta)data.

Right now, the system is still in the testing phase, but we would of course like to be
able to add more MDSs and SSDs later without extensive backfilling periods.

Cheers,
	Oliver

On 25.02.2018 at 19:26, Oliver Freyermuth wrote:
> Dear Cephalopodians,
> 
> as part of our stress test with 100,000,000 objects (all small files) we ended up with
> the following usage on the OSDs on which the metadata pool lives:
> # ceph osd df | head
> ID CLASS WEIGHT  REWEIGHT SIZE USE    AVAIL %USE  VAR  PGS
> [...]
>  2   ssd 0.21819  1.00000 223G 79649M  145G 34.81 6.62 128
>  3   ssd 0.21819  1.00000 223G 79697M  145G 34.83 6.63 128
> 
> The cephfs-data cluster is mostly empty (5 % usage), but contains 100,000,000 small objects.
> 
> Looking with:
>   ceph daemon osd.2 perf dump
> I get:
>     "bluefs": {
>         "gift_bytes": 0,
>         "reclaim_bytes": 0,
>         "db_total_bytes": 84760592384,
>         "db_used_bytes": 78920024064,
>         "wal_total_bytes": 0,
>         "wal_used_bytes": 0,
>         "slow_total_bytes": 0,
>         "slow_used_bytes": 0,
> so it seems this is almost exclusively RocksDB usage.
> 
> Is this expected?
> Is there a recommendation on how much MDS storage is needed for a CephFS with 450 TB?
> 
> Cheers,
> 	Oliver
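P.S.: For reference, here is a sketch of what I am checking and adjusting on the two
reformatted metadata OSDs. The per-device-class sleep options are the ones I understand
recent releases provide, and the values (8 backfills / 8 recovery ops) are only what I
am experimenting with right now, not recommendations.

Verify that the recovery sleeps really are 0 for these SSD OSDs:
# ceph daemon osd.0 config get osd_recovery_sleep_ssd
# ceph daemon osd.0 config get osd_recovery_sleep_hybrid

Temporarily raise the backfill / recovery concurrency on the backfill targets
(to be reverted once backfilling has finished):
# ceph tell osd.0 injectargs '--osd-max-backfills 8 --osd-recovery-max-active 8'
# ceph tell osd.1 injectargs '--osd-max-backfills 8 --osd-recovery-max-active 8'

Watch the effective backfill rate:
# ceph -s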
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com