Re: [EXTERNAL] Laggy OSDs

We had issues with slow ops on both SSD and NVMe OSDs. They were mostly fixed by raising the kernel's aio-max-nr limit from 64K to 1M, e.g. "fs.aio-max-nr=1048576", if I remember correctly.
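
To see whether a host is anywhere near that limit before changing it, the kernel exposes the current count and the ceiling as /proc/sys/fs/aio-nr and /proc/sys/fs/aio-max-nr. Below is a minimal Python sketch that reads both. The function name aio_usage and its proc_fs_dir parameter are made up for illustration and are not part of Ceph or any tool mentioned in this thread.

```python
from pathlib import Path


def aio_usage(proc_fs_dir="/proc/sys/fs"):
    """Return (in_use, limit, fraction) for kernel async I/O contexts.

    Reads the Linux files aio-nr (contexts currently allocated) and
    aio-max-nr (system-wide ceiling) from proc_fs_dir. The directory is
    a parameter only so the function can be pointed at test fixtures.
    """
    base = Path(proc_fs_dir)
    in_use = int((base / "aio-nr").read_text().strip())
    limit = int((base / "aio-max-nr").read_text().strip())
    # Guard against a zero limit so the division cannot fail.
    fraction = in_use / limit if limit else 0.0
    return in_use, limit, fraction
```

Calling aio_usage() on each OSD host and comparing the fraction across hosts should show whether the limit is plausibly involved. If it is, the value can be raised at runtime with "sysctl -w fs.aio-max-nr=1048576" and kept across reboots with a file under /etc/sysctl.d/.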

On 3/29/22, 2:13 PM, "Alex Closs" <acloss@xxxxxxxxxxxxx> wrote:

    Hey folks,

    We have a 16.2.7 cephadm cluster that's had slow ops and several (constantly changing) laggy PGs. The set of OSDs with slow ops seems to change at random, among all 6 OSD hosts in the cluster. All drives are enterprise SATA SSDs, by either Intel or Micron. We're still not ruling out a network issue, but wanted to troubleshoot from the Ceph side in case something broke there.

    ceph -s:

     health: HEALTH_WARN
             3 slow ops, oldest one blocked for 246 sec, daemons [osd.124,osd.130,osd.141,osd.152,osd.27] have slow ops.

     services:
     mon: 5 daemons, quorum ceph-osd10,ceph-mon0,ceph-mon1,ceph-osd9,ceph-osd11 (age 28h)
     mgr: ceph-mon0.sckxhj(active, since 25m), standbys: ceph-osd10.xmdwfh, ceph-mon1.iogajr
     osd: 143 osds: 143 up (since 92m), 143 in (since 2w)
     rgw: 3 daemons active (3 hosts, 1 zones)

     data:
     pools: 26 pools, 3936 pgs
     objects: 33.14M objects, 144 TiB
     usage: 338 TiB used, 162 TiB / 500 TiB avail
     pgs: 3916 active+clean
          19 active+clean+laggy
          1 active+clean+scrubbing+deep

     io:
     client: 59 MiB/s rd, 98 MiB/s wr, 1.66k op/s rd, 1.68k op/s wr

    This is actually much faster than it's been for much of the past hour. It's been as low as 50 kb/s and dozens of IOPS in both directions, where the cluster typically does 300 MB/s to a few gigs per second and ~4k IOPS.

    The cluster has been on 16.2.7 since a few days after release without issue. The only recent change was an apt upgrade and reboot on the hosts (which was last Friday and didn't show signs of problems).

    Happy to provide logs, let me know what would be useful. Thanks for reading this wall :)

    -Alex

    MIT CSAIL
    he/they
    _______________________________________________
    ceph-users mailing list -- ceph-users@xxxxxxx
    To unsubscribe send an email to ceph-users-leave@xxxxxxx
