Re: Investigating Config Error, 300x reduction in IOPs performance on RGW layer

On Thu, Jul 18, 2019 at 3:44 AM Robert LeBlanc <robert@xxxxxxxxxxxxx> wrote:
I'm pretty new to RGW, but I need to get maximum performance as well. Have you tried moving your RGW metadata pools to NVMe? Carve out a bit of NVMe space and then pin the pool to the SSD device class in CRUSH; that way the small metadata ops aren't on slow media.

No, don't do that:

1) A performance difference of 130 vs. 48k IOPS is not due to SSD vs. NVMe for metadata, unless the SSD is absolute crap.
2) The OSDs already have an NVMe DB device; it's much easier to use it directly than to partition the NVMes and run the extra partition as a normal OSD.


Assuming your NVMe disks provide a reasonable amount of DB space (30 GB per OSD), put the metadata pools on the HDD OSDs. It's better to have 48 OSDs with 4 NVMes behind them handling metadata than only 4 OSDs with SSDs.

Running mons in VMs on a gigabit network is fine for small clusters and is not a performance problem.


How are you benchmarking?
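The benchmarking question matters because IOPS depend heavily on how many requests are in flight. A minimal sketch of that relationship (Little's law: IOPS = concurrency / latency) follows. The concurrency values are assumptions, not facts from this thread: 16 is, as far as I know, the default number of concurrent operations in rados bench, and 1 stands for a hypothetical S3 client that issues one request at a time.

```python
def per_request_latency_ms(iops, concurrency):
    """Average latency per request implied by Little's law."""
    return concurrency / iops * 1000.0


def expected_iops(latency_ms, concurrency):
    """IOPS you would expect at a given latency and number of in-flight requests."""
    return concurrency / (latency_ms / 1000.0)


if __name__ == "__main__":
    # Assumed: rados bench running with 16 concurrent operations.
    rados_latency = per_request_latency_ms(48_000, concurrency=16)
    # Assumed: a single-threaded S3 client, one request in flight.
    s3_latency = per_request_latency_ms(130, concurrency=1)

    print(f"RADOS latency per op: {rados_latency:.2f} ms")
    print(f"S3 latency per request: {s3_latency:.2f} ms")
    # Same S3 latency, but with 16 requests in flight instead of 1.
    print(f"S3 IOPS at concurrency 16: {expected_iops(s3_latency, 16):.0f}")
```

Under those assumptions, 130 IOPS corresponds to roughly 7.7 ms per request, which is an unremarkable latency for a single synchronous HTTP client. Much of the gap could then come from the benchmark's concurrency rather than from the cluster configuration.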

Paul
 
----------------
Robert LeBlanc
PGP Fingerprint 79A2 9CA4 6CC4 45DD A904  C70E E654 3BB2 FA62 B9F1


On Wed, Jul 17, 2019 at 5:59 PM Ravi Patel <ravi@xxxxxxxxxxxxxx> wrote:
Hello, 

We have deployed a Ceph cluster and are trying to debug a massive drop in performance between the RADOS layer and the RGW layer.

## Cluster config
4 OSD nodes (12 drives each, NVMe journals, 1 SSD drive), 40GbE NIC
2 RGW nodes (DNS RR load balancing), 40GbE NIC
3 MON nodes, 1GbE NIC

## Pool configuration 
RGW data pool  - replicated 3x 4M stripe (HDD)
RGW metadata pool - replicated 3x (SSD) pool

## Benchmarks 
4K read performance using RADOS bench: ~48,000 IOPS
4K read performance via the S3 interface (RGW): ~130 IOPS
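To put the two numbers side by side, here is a small sketch that converts them to throughput and computes the slowdown factor. It assumes "4K" means 4096-byte objects; the function name is illustrative only.

```python
BLOCK_SIZE = 4096  # bytes; assumes "4K" means 4 KiB reads


def throughput_mib_s(iops, block_size=BLOCK_SIZE):
    """Convert an IOPS figure into MiB/s for a fixed request size."""
    return iops * block_size / (1024 * 1024)


rados_iops = 48_000
rgw_iops = 130

# How many times slower the S3 path is than raw RADOS.
slowdown = rados_iops / rgw_iops

print(f"RADOS: {throughput_mib_s(rados_iops):.1f} MiB/s")
print(f"RGW:   {throughput_mib_s(rgw_iops):.2f} MiB/s")
print(f"Slowdown: {slowdown:.0f}x")
```

The RGW figure works out to about 0.5 MiB/s, far below what even a 1 GbE link can carry, so the S3 result looks latency-bound rather than bandwidth-bound.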

We are really trying to understand how to debug this issue. None of the nodes ever exceeds 15% CPU utilization and there is plenty of RAM. The one pathological issue in our cluster is that the MON nodes are currently on VMs that are sitting behind a single 1 GbE NIC. (We are in the process of moving them, but are unsure if that will fix the issue.)

What metrics should we be looking at to debug the RGW layer? Where do we need to look?

---

Ravi Patel, PhD
Machine Learning Systems Lead


Kheiron Medical Technologies

kheironmed.com | supporting radiologists with deep learning



_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
