Re: Slow Ceph: Any plans on torrent-like transfers from OSDs ?

On Sun, 9 Sep 2018 11:20:01 +0200
Alex Lupsa <alex@xxxxxxxx> wrote:

> Hi,
> Any ideas about the below?

Don't use consumer-grade SSDs for Ceph cache/block.db/bcache.
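
If in doubt about a particular SSD, a quick sanity check is a single-job O_DSYNC write test with fio, which roughly mimics the journal/WAL write pattern. A minimal sketch (the device name is illustrative, and this writes directly to the device, so only run it on a disk you can wipe):

  fio --name=journal-test --filename=/dev/sdX --direct=1 --sync=1 \
      --rw=write --bs=4k --numjobs=1 --iodepth=1 \
      --runtime=60 --time_based --group_reporting

Consumer drives without power-loss protection often collapse to a few hundred IOPS under this workload, while datacenter SSDs typically stay in the tens of thousands.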



> Thanks,
> Alex
> 
> ----------
> Hi,
> I have a really small homelab 3-node Ceph cluster on consumer hardware -
> thanks to Proxmox for making it easy to deploy.
> The problem I am having is very, very bad transfer rates, i.e. about 20 MB/s
> for both read and write on 17 OSDs with a cache layer.
> However, during recovery the speed hovers between 250 and 700 MB/s, which
> proves that the cluster IS capable of reaching well above those
> 20 MB/s in KVM.
> 
> Reading the documentation, I see that during recovery "nearly all OSDs
> participate in resilvering a new drive" - kind of a torrent of data
> incoming from multiple sources at once, causing a huge deluge.
> 
> However, I believe this does not happen during normal transfers,
> so my question is simply - are there any hidden tunables I can enable
> for this, at the implied cost of heavier network and disk usage?
> If not, will there be in the future?
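
(For reference, the recovery-time parallelism mentioned above comes from many PGs backfilling at once and is governed by per-OSD recovery settings rather than any client-side knob; normal client reads are served by the primary OSD of each PG. A sketch of adjusting those settings at runtime, with purely illustrative values:

  ceph tell 'osd.*' injectargs '--osd-max-backfills 2 --osd-recovery-max-active 4'

Raising them speeds up recovery and backfill at the expense of client I/O, but it does not change steady-state client throughput.)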
> 
> I have tried disabling cephx, upgrading the network to 10 Gbit, using
> bigger journals, giving BlueStore more cache, and disabling the debug
> logs, as advised on the list. The only thing that helped a bit was
> cache tiering, but even that only helps somewhat, as ops do not get
> promoted unless I am very adamant about keeping programs in KVM open
> for a very long time so that the reads and writes are promoted.
> To add insult to injury, once the cache gets full, the whole 3-node
> cluster grinds to a halt until I start forcefully evicting data
> from the cache... manually!
> I am therefore guessing a really bad misconfiguration on my side.
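
(If the cache tier stays, the promotion and flushing behaviour described above is tunable per cache pool. A sketch with an illustrative pool name and values - note in particular that the tiering agent will not flush or evict anything on its own unless target_max_bytes or target_max_objects is set:

  ceph osd pool set cache_pool hit_set_count 4
  ceph osd pool set cache_pool hit_set_period 1200
  ceph osd pool set cache_pool min_read_recency_for_promote 1
  ceph osd pool set cache_pool min_write_recency_for_promote 1
  ceph osd pool set cache_pool target_max_bytes 200000000000
  ceph osd pool set cache_pool cache_target_dirty_ratio 0.4
  ceph osd pool set cache_pool cache_target_full_ratio 0.8

The manual eviction mentioned above is normally done with "rados -p cache_pool cache-flush-evict-all".)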
> 
> The next step would be removing the cache layer and using those SSDs as
> bcache instead, as it seems to yield 5x the results, even though it
> does add yet another layer of complexity and extra RAM requirements.
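
(For completeness, a bare-bones bcache setup looks roughly like this - device names are illustrative and bcache-tools is assumed to be installed:

  make-bcache -C /dev/nvme0n1p1    # SSD partition becomes the cache set
  make-bcache -B /dev/sdb          # HDD becomes the backing device
  echo <cset-uuid> > /sys/block/bcache0/bcache/attach    # attach the cache set

The OSD is then created on /dev/bcache0 rather than on the raw HDD.)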
> 
> Full config details: https://pastebin.com/xUM7VF9k
> 
> rados bench -p ceph_pool 30 write
> Total time run:         30.983343
> Total writes made:      762
> Write size:             4194304
> Object size:            4194304
> Bandwidth (MB/sec):     98.3754
> Stddev Bandwidth:       20.9586
> Max bandwidth (MB/sec): 132
> Min bandwidth (MB/sec): 16
> Average IOPS:           24
> Stddev IOPS:            5
> Max IOPS:               33
> Min IOPS:               4
> Average Latency(s):     0.645017
> Stddev Latency(s):      0.326411
> Max latency(s):         2.08067
> Min latency(s):         0.0355789
> Cleaning up (deleting benchmark objects)
> Removed 762 objects
> Clean up completed and total clean up time :3.925631
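
(The write bench above deletes its objects when it finishes; to measure reads as well, the usual pattern is to keep them with --no-cleanup and then run the sequential and random read modes against the same pool:

  rados bench -p ceph_pool 30 write --no-cleanup
  rados bench -p ceph_pool 30 seq
  rados bench -p ceph_pool 30 rand
  rados -p ceph_pool cleanup

That helps show whether reads and writes hit the same bottleneck.)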
> 
> Thanks,
> Alex



-- 
Regards,
Jarosław Mociak - Nettelekom GK Sp. z o.o.

