Re: XFS on RBD on EC painfully slow

Hi Reed,

Have you tried starting multiple rsync processes simultaneously to transfer different directories? Distributed systems like Ceph often benefit from more parallelism.
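
For example, something along these lines (a rough sketch only: /mnt/rbd and /mnt/cephfs are placeholder mount points, and -P 8 is just a starting point to tune against what the client and cluster can sustain):

    # One rsync per top-level directory, eight running at a time.
    # /mnt/rbd and /mnt/cephfs are hypothetical mount points; adjust to yours.
    cd /mnt/rbd
    find . -mindepth 1 -maxdepth 1 -type d -print0 \
        | xargs -0 -P 8 -I{} rsync -a {}/ /mnt/cephfs/{}/

With ~7000 directories of mostly small files, a single rsync spends most of its time on per-file round trips, so several streams in parallel let the cluster work on many objects at once instead of one at a time.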
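
On the queue depth idea in your message below: with krbd, queue_depth can be passed as a map option, so the image has to be remapped to change it. A sketch, assuming the filesystem can be unmounted briefly (pool/image is a placeholder for your actual image spec, and 1024 is an arbitrary example value):

    # Unmount and remap the image with a deeper queue.
    # pool/image is a hypothetical image spec; adjust to yours.
    umount /mnt/rbd
    rbd unmap /dev/rbd0
    rbd map -o queue_depth=1024 pool/image

I would not expect a large win from this alone on a small-file, metadata-heavy workload, though; parallelism at the rsync level is more likely to help.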

Weiwen Hu

> On May 28, 2021, at 03:54, Reed Dier <reed.dier@xxxxxxxxxxx> wrote:
> 
> Hoping someone may be able to help point out where my bottleneck(s) may be.
> 
> I have an 80TB kRBD image on an EC8:2 pool, with an XFS filesystem on top of that.
> This was not an ideal scenario; rather, it was a rescue mission to dump a large, aging RAID array before it was too late, so I'm working with the hand I was dealt.
> 
> To further compound the issues, the main directory structure consists of lots and lots of small files in deep directories.
> 
> My goal is to rsync (or otherwise copy) data from the RBD to CephFS, but it's unbearably slow and will take ~150 days to transfer ~35TB, which is far from ideal.
> 
>>         15.41G  79%    4.36MB/s    0:56:09 (xfr#23165, ir-chk=4061/27259)
> 
>> avg-cpu:  %user   %nice %system %iowait  %steal   %idle
>>           0.17    0.00    1.34   13.23    0.00   85.26
>> 
>> Device            r/s     rMB/s   rrqm/s  %rrqm r_await rareq-sz     w/s     wMB/s   wrqm/s  %wrqm w_await wareq-sz     d/s     dMB/s   drqm/s  %drqm d_await dareq-sz  aqu-sz  %util
>> rbd0           124.00      0.66     0.00   0.00   17.30     5.48   50.00      0.17     0.00   0.00   31.70     3.49    0.00      0.00     0.00   0.00    0.00     0.00    3.39  96.40
> 
> Above are the rsync progress and iostat (captured during the rsync) for a copy from the RBD to a local SSD, to remove any bottleneck from doubling back to CephFS.
> About 16G in 1 hour, not exactly blazing, and this is just 5 of the 7000 directories I'm looking to offload to CephFS.
> 
> Currently running 15.2.11, and the host is Ubuntu 20.04 (5.4.0-72-generic) with a single E5-2620, 64GB of memory, and a 4x10GbT bond talking to Ceph; iperf proves it out.
> EC8:2, across about 16 hosts and 240 OSDs, with 24 of those being 8TB 7.2k SAS and the other 216 being 2TB 7.2k SATA. So there are quite a few spindles in play here.
> Only 128 PGs in this pool, but it's the only RBD image in the pool. The autoscaler recommends going to 512, but I was hoping to avoid the performance overhead of the PG splits if possible, given that perf is bad enough as is.
> 
> Examining the main directory structure, it looks like there are about 7000 files per directory, roughly 60% of which are <1MiB, totaling nearly 5GiB per directory.
> 
> My fstab for this is:
>> xfs    _netdev,noatime    0    0
> 
> I tried increasing read_ahead_kb at /sys/block/rbd0/queue/read_ahead_kb from 128K to 4M to match the object/stripe size of the EC pool, but that doesn't appear to have had much of an impact.
> 
> The only other change I can think to try is increasing the queue depth in rbdmap up from 128, so that's my next bullet to fire.
> 
> Attaching xfs_info in case there are any useful nuggets:
>> meta-data=/dev/rbd0              isize=256    agcount=81, agsize=268435455 blks
>>         =                       sectsz=512   attr=2, projid32bit=0
>>         =                       crc=0        finobt=0, sparse=0, rmapbt=0
>>         =                       reflink=0
>> data     =                       bsize=4096   blocks=21483470848, imaxpct=5
>>         =                       sunit=0      swidth=0 blks
>> naming   =version 2              bsize=4096   ascii-ci=0, ftype=0
>> log      =internal log           bsize=4096   blocks=32768, version=2
>>         =                       sectsz=512   sunit=0 blks, lazy-count=0
>> realtime =none                   extsz=4096   blocks=0, rtextents=0
> 
> And rbd-info:
>> rbd image 'rbd-image-name':
>>        size 85 TiB in 22282240 objects
>>        order 22 (4 MiB objects)
>>        snapshot_count: 0
>>        id: a09cac2b772af5
>>        data_pool: rbd-ec82-pool
>>        block_name_prefix: rbd_data.29.a09cac2b772af5
>>        format: 2
>>        features: layering, exclusive-lock, object-map, fast-diff, deep-flatten, data-pool
>>        op_features:
>>        flags:
>>        create_timestamp: Mon Apr 12 18:44:38 2021
>>        access_timestamp: Mon Apr 12 18:44:38 2021
>>        modify_timestamp: Mon Apr 12 18:44:38 2021
> 
> 
> Any other ideas or hints are greatly appreciated.
> 
> Thanks,
> Reed
> 
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx



