Performance issues on Jewel 10.2.2.

Hi,

We're having performance issues on a Jewel 10.2.2 cluster. It started with I/Os taking several seconds to be acknowledged, so we ran some benchmarks.

We can reproduce it with a rados bench on a new pool confined to a single host (an R730xd with 2 SSDs in JBOD and 12 x 4 TB NL-SAS drives, each as a single-drive RAID0 with writeback cache) and with no replication (size 1, min_size 1). We suspect this is related to the XFS filestore directory split operation, or to some other filestore operation.
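For reference, the reproduction boils down to something like the following (pool name, PG count, and bench duration here are illustrative, not our exact values; confining the pool to one host also needs a dedicated CRUSH rule, omitted here):

    ceph osd pool create benchpool 128 128
    ceph osd pool set benchpool size 1
    ceph osd pool set benchpool min_size 1
    rados bench -p benchpool 600 write --no-cleanup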

Could someone have a look at this video: https://youtu.be/JQV3VfpAjbM?vq=hd1080

The video shows:

- admin node with commands and comments (top left)
- htop (middle left)
- rados bench (bottom left)
- iostat (top right)
- growing number of directories in all PGs of that pool on osd.12 (/dev/sdd), and growing number of objects in the pool (bottom right); see the note on filestore splitting just after this list
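If it is indeed filestore splitting, this would match the growing directory counts: with the default filestore_merge_threshold of 10 and filestore_split_multiple of 2, a PG subdirectory is split once it holds more than 2 * 10 * 16 = 320 objects, and on a filling pool every PG tends to cross that threshold at about the same time, stalling I/O. The values in effect on a running OSD can be checked through the admin socket (osd.12 here, matching the video):

    ceph daemon osd.12 config get filestore_merge_threshold
    ceph daemon osd.12 config get filestore_split_multiple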

OSD debug log, perf report, and OSD parameters:

- ceph-osd.12.log (http://u2l.fr/ceph-osd-12-log-tgz): full debug logging enabled from 12:00:26 to 12:00:36. On the video at 17'26" we can see that osd.12 (/dev/sdd) is 100% busy at 12:00:26.
- test_perf_report.txt (http://u2l.fr/test-perf-report-txt): based on perf.data captured from 12:02:50 to 12:03:44.
- mom02h06_osd.12_config_show.txt (http://u2l.fr/osd-12-config-show)
- mom02h06_osd.12_config_diff.txt (http://u2l.fr/osd-12-config-diff)
- ceph-conf-osd-params.txt (http://u2l.fr/ceph-conf-osd-params)
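For anyone wanting to reproduce equivalent captures, something like the following should do (the pid placeholder and durations are illustrative):

    ceph tell osd.12 injectargs '--debug_osd 20 --debug_filestore 20'
    perf record -g -p <pid-of-osd.12> -- sleep 54
    perf report --stdio > test_perf_report.txt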

Regards,

--

Frédéric Nass

Sous-direction Infrastructures
Direction du Numérique
Université de Lorraine

Tel: +33 3 72 74 11 35



