Re: Slow requests from bluestore osds

Marc Schöchlin <ms@xxxxxxxxxx> · Thu, 6 Sep 2018 09:39:13 +0200

Hello Uwe,

as described in my mail we are running 4.13.0-39.

In conjunction with some later mails in this thread, it seems that this problem might be related to OS/microcode (Spectre) updates.
I am planning a Ceph/Ubuntu upgrade next week for various reasons; let's see what happens.
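
Before and after such an upgrade it may help to record which mitigation state a host is actually running in. The following is only a minimal sketch: it checks a kernel command line for the four boot options Uwe lists in the quoted mail below, and then prints the kernel's own per-vulnerability status. The function name mitigation_flags is made up for this example, and the sysfs directory is only present on kernels that carry the mitigation patches, so that part may print nothing.

```shell
#!/bin/sh
# Report which of the boot options quoted in this thread appear on a kernel
# command line. With no argument, the running kernel's /proc/cmdline is used.
# (mitigation_flags is a hypothetical helper name, not an existing tool.)
mitigation_flags() {
    mf_cmdline="${1:-$(cat /proc/cmdline 2>/dev/null || true)}"
    for mf_flag in noibrs noibpb nopti nospectre_v2; do
        # Pad with spaces so only whole words match.
        case " $mf_cmdline " in
            *" $mf_flag "*) echo "$mf_flag: set" ;;
            *)              echo "$mf_flag: not set" ;;
        esac
    done
}

mitigation_flags

# The kernel's own per-vulnerability status, if this kernel exposes it.
grep . /sys/devices/system/cpu/vulnerabilities/* 2>/dev/null || true
```

Saving this output next to the slow-request log makes it easier to compare runs with and without the mitigations.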

Regards Marc

On 05.09.2018 at 20:24, Uwe Sauter wrote:
> I'm also experiencing slow requests, though I cannot attribute them to scrubbing.
>
> Which kernel do you run? Would you be able to test against the same kernel with Spectre/Meltdown mitigations disabled ("noibrs noibpb nopti nospectre_v2" as boot option)?
>
>     Uwe
>
> On 05.09.18 at 19:30, Brett Chancellor wrote:
>> Marc,
>>    As with you, this problem manifests itself only when the bluestore OSD is involved in some form of deep scrub.  Anybody have any insight on what might be causing this?
>>
>> -Brett
>>
>> On Mon, Sep 3, 2018 at 4:13 AM, Marc Schöchlin <ms@xxxxxxxxxx> wrote:
>>
>>     Hi,
>>
>>     we have also been experiencing this type of behavior for some weeks on our
>>     less performance-critical HDD pools.
>>     We haven't spent much time on this problem because there are
>>     currently more important tasks, but here are a few details:
>>
>>     Running the following loop results in the following output:
>>
>>     while true; do ceph health|grep -q HEALTH_OK || (date;  ceph health detail); sleep 2; done
>>
>>     Sun Sep  2 20:59:47 CEST 2018
>>     HEALTH_WARN 4 slow requests are blocked > 32 sec
>>     REQUEST_SLOW 4 slow requests are blocked > 32 sec
>>          4 ops are blocked > 32.768 sec
>>          osd.43 has blocked requests > 32.768 sec
>>     Sun Sep  2 20:59:50 CEST 2018
>>     HEALTH_WARN 4 slow requests are blocked > 32 sec
>>     REQUEST_SLOW 4 slow requests are blocked > 32 sec
>>          4 ops are blocked > 32.768 sec
>>          osd.43 has blocked requests > 32.768 sec
>>     Sun Sep  2 20:59:52 CEST 2018
>>     HEALTH_OK
>>     Sun Sep  2 21:00:28 CEST 2018
>>     HEALTH_WARN 1 slow requests are blocked > 32 sec
>>     REQUEST_SLOW 1 slow requests are blocked > 32 sec
>>          1 ops are blocked > 32.768 sec
>>          osd.41 has blocked requests > 32.768 sec
>>     Sun Sep  2 21:00:31 CEST 2018
>>     HEALTH_WARN 7 slow requests are blocked > 32 sec
>>     REQUEST_SLOW 7 slow requests are blocked > 32 sec
>>          7 ops are blocked > 32.768 sec
>>          osds 35,41 have blocked requests > 32.768 sec
>>     Sun Sep  2 21:00:33 CEST 2018
>>     HEALTH_WARN 7 slow requests are blocked > 32 sec
>>     REQUEST_SLOW 7 slow requests are blocked > 32 sec
>>          7 ops are blocked > 32.768 sec
>>          osds 35,51 have blocked requests > 32.768 sec
>>     Sun Sep  2 21:00:35 CEST 2018
>>     HEALTH_WARN 7 slow requests are blocked > 32 sec
>>     REQUEST_SLOW 7 slow requests are blocked > 32 sec
>>          7 ops are blocked > 32.768 sec
>>          osds 35,51 have blocked requests > 32.768 sec
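
To see whether the same OSDs keep showing up, the output of the loop above can be tallied per OSD. This is a rough sketch that assumes the loop's output was redirected to a file and that the lines look exactly like the samples above ("osd.N has blocked requests" and "osds A,B have blocked requests"); the function name blocked_osd_counts and the log file name in the usage line are made up for illustration.

```shell
#!/bin/sh
# Count how often each OSD is named in the "blocked requests" lines of
# captured `ceph health detail` output (read from stdin).
# (blocked_osd_counts is a hypothetical helper name.)
blocked_osd_counts() {
    awk '
        /has blocked requests/ {
            # Single-OSD form, e.g. "osd.43 has blocked requests > 32.768 sec"
            for (i = 1; i <= NF; i++)
                if ($i ~ /^osd\.[0-9]+$/) {
                    id = $i
                    sub(/^osd\./, "", id)
                    count[id]++
                }
        }
        /have blocked requests/ {
            # Multi-OSD form, e.g. "osds 35,41 have blocked requests > 32.768 sec"
            for (i = 1; i < NF; i++)
                if ($i == "osds") {
                    n = split($(i + 1), ids, ",")
                    for (j = 1; j <= n; j++) count[ids[j]]++
                }
        }
        END { for (id in count) print "osd." id, count[id] }
    ' | sort -t. -k2,2n
}
```

Usage, assuming the loop output went to a file called slow-requests.log: `blocked_osd_counts < slow-requests.log`. Each output line is an OSD name followed by the number of samples in which it was reported.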
>>
>>     Our details:
>>
>>        * system details:
>>          * Ubuntu 16.04
>>          * Kernel 4.13.0-39
>>          * 30 * 8 TB disks (SEAGATE/ST8000NM0075)
>>          * 3 * Dell PowerEdge R730xd (firmware 2.50.50.50)
>>            * Intel(R) Xeon(R) CPU E5-2680 v4 @ 2.40GHz
>>            * 2 * 10 Gbit/s SFP+ network adapters
>>            * 192 GB RAM
>>          * Pools use replication factor 3, 2 MB object size,
>>            85% write load, 1700 write IOPS
>>            (ops mainly between 4k and 16k in size), 300 read IOPS
>>        * we have the impression that this appears during deep-scrub/scrub activity.
>>        * Ceph 12.2.5; we already experimented with the following OSD settings
>>          (our assumption was that the problem is related to rocksdb compaction):
>>          bluestore cache kv max = 2147483648
>>          bluestore cache kv ratio = 0.9
>>          bluestore cache meta ratio = 0.1
>>          bluestore cache size hdd = 10737418240
>>        * this type of problem only appears on hdd/bluestore OSDs; ssd/bluestore
>>          OSDs have never experienced it
>>        * the system is healthy: no swapping, no high load, no errors in dmesg
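
For orientation, the cache settings quoted above translate into the following sizes. This is plain arithmetic on the quoted values, not output from Ceph. Two things are assumptions of this sketch: that "kv max" acts as a cap on the ratio-derived key/value share, and that the 30 disks are spread evenly over the 3 hosts (10 HDD OSDs per host); the variable names are made up for the example.

```shell
#!/bin/sh
# Values copied from the settings quoted above (bytes). Ratios are written
# as percentages so that plain integer shell arithmetic suffices.
cache_size_hdd=10737418240
kv_max=2147483648
kv_ratio_pct=90
meta_ratio_pct=10
osds_per_host=10   # ASSUMPTION: 30 disks spread evenly over 3 hosts

gib=$((1024 * 1024 * 1024))

kv_share=$((cache_size_hdd * kv_ratio_pct / 100))
meta_share=$((cache_size_hdd * meta_ratio_pct / 100))

# ASSUMPTION: "kv max" caps the ratio-derived key/value share.
if [ "$kv_share" -gt "$kv_max" ]; then
    kv_effective=$kv_max
else
    kv_effective=$kv_share
fi

echo "cache size per HDD OSD:        $((cache_size_hdd / gib)) GiB"
echo "kv share by ratio:             $((kv_share / gib)) GiB"
echo "kv share after cap:            $((kv_effective / gib)) GiB"
echo "meta share by ratio:           $((meta_share / gib)) GiB"
echo "cache for all HDD OSDs / host: $((cache_size_hdd * osds_per_host / gib)) GiB"
```

Under these assumptions the key/value ratio asks for far more than the cap allows, and the per-host total is a large fraction of the 192 GB of RAM, which is worth keeping in mind when judging the "no swapping" observation.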
>>
>>     I attached a log excerpt of osd.35; it is probably useful for
>>     investigating the problem if someone has deeper bluestore knowledge.
>>     (slow requests appeared on Sun Sep  2 21:00:35)
>>
>>     Regards
>>     Marc
>>
>>
>>     On 02.09.2018 at 15:50, Brett Chancellor wrote:
>>     > The warnings look like this.
>>     >
>>     > 6 ops are blocked > 32.768 sec on osd.219
>>     > 1 osds have slow requests
>>     >
>>     > On Sun, Sep 2, 2018, 8:45 AM Alfredo Deza <adeza@xxxxxxxxxx> wrote:
>>     >
>>     >     On Sat, Sep 1, 2018 at 12:45 PM, Brett Chancellor <bchancellor@xxxxxxxxxxxxxx> wrote:
>>      >     > Hi Cephers,
>>      >     >   I am in the process of upgrading a cluster from Filestore to bluestore,
>>      >     > but I'm concerned about frequent warnings popping up against the new
>>      >     > bluestore devices. I'm frequently seeing messages like this; although the
>>      >     > specific osd changes, it's always one of the few hosts I've converted to
>>      >     > bluestore.
>>      >     >
>>      >     > 6 ops are blocked > 32.768 sec on osd.219
>>      >     > 1 osds have slow requests
>>      >     >
>>      >     > I'm running 12.2.4, have any of you seen similar issues? It seems as though
>>      >     > these messages pop up more frequently when one of the bluestore pgs is
>>      >     > involved in a scrub.  I'll include my bluestore creation process below, in
>>      >     > case that might cause an issue. (sdb, sdc, sdd are SATA, sde and sdf are
>>      >     > SSD)
>>      >
>>      >     Would be useful to include what those warnings say. The ceph-volume
>>      >     commands look OK to me
>>      >
>>      >     >
>>      >     >
>>      >     > ## Process used to create osds
>>      >     > sudo ceph-disk zap /dev/sdb /dev/sdc /dev/sdd /dev/sdd /dev/sde /dev/sdf
>>      >     > sudo ceph-volume lvm zap /dev/sdb
>>      >     > sudo ceph-volume lvm zap /dev/sdc
>>      >     > sudo ceph-volume lvm zap /dev/sdd
>>      >     > sudo ceph-volume lvm zap /dev/sde
>>      >     > sudo ceph-volume lvm zap /dev/sdf
>>      >     > sudo sgdisk -n 0:2048:+133GiB -t 0:FFFF -c 1:"ceph block.db sdb" /dev/sdf
>>      >     > sudo sgdisk -n 0:0:+133GiB -t 0:FFFF -c 2:"ceph block.db sdc" /dev/sdf
>>      >     > sudo sgdisk -n 0:0:+133GiB -t 0:FFFF -c 3:"ceph block.db sdd" /dev/sdf
>>      >     > sudo sgdisk -n 0:0:+133GiB -t 0:FFFF -c 4:"ceph block.db sde" /dev/sdf
>>      >     > sudo ceph-volume lvm create --bluestore --crush-device-class hdd --data /dev/sdb --block.db /dev/sdf1
>>      >     > sudo ceph-volume lvm create --bluestore --crush-device-class hdd --data /dev/sdc --block.db /dev/sdf2
>>      >     > sudo ceph-volume lvm create --bluestore --crush-device-class hdd --data /dev/sdd --block.db /dev/sdf3
>>      >     >
>>      >     >
>>      >     > _______________________________________________
>>      >     > ceph-users mailing list
>>      >     > ceph-users@xxxxxxxxxxxxxx
>>      >     > http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com