Re: Slow requests from bluestore osds

I'm also experiencing slow requests, though I cannot tie them to scrubbing.

Which kernel are you running? Would you be able to test against the same kernel with the Spectre/Meltdown mitigations disabled ("noibrs noibpb nopti nospectre_v2" as boot options)?
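
In case it helps, a minimal sketch of how the mitigations could be switched off for such a test on Ubuntu with GRUB (this assumes GRUB is the boot loader; remember to revert afterwards):

    # check which mitigations the running kernel reports (if these sysfs entries exist)
    grep . /sys/devices/system/cpu/vulnerabilities/*

    # append the options to GRUB_CMDLINE_LINUX_DEFAULT in /etc/default/grub, e.g.
    #   GRUB_CMDLINE_LINUX_DEFAULT="... noibrs noibpb nopti nospectre_v2"
    sudo update-grub && sudo reboot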

	Uwe

On 05.09.18 at 19:30, Brett Chancellor wrote:
Marc,
  As with you, this problem manifests itself only when the bluestore OSD is involved in some form of deep scrub. Does anybody have any insight into what might be causing this?
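
One way to test that correlation would be to pause scrubbing cluster-wide for a while and watch whether the warnings stop; a rough sketch using the standard OSD flags (skipped scrubs pile up, so the flags should be unset again afterwards):

    # stop new scrubs and deep scrubs from being scheduled
    ceph osd set noscrub
    ceph osd set nodeep-scrub

    # ...watch "ceph health detail" for a while...

    # re-enable scrubbing
    ceph osd unset noscrub
    ceph osd unset nodeep-scrub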

-Brett

On Mon, Sep 3, 2018 at 4:13 AM, Marc Schöchlin <ms@xxxxxxxxxx> wrote:

    Hi,

    we have also been experiencing this type of behavior for some weeks on our
    not-so-performance-critical HDD pools.
    We haven't spent much time on this problem because there are currently
    more important tasks, but here are a few details:

    Running the following loop produces the output below:

    while true; do
        ceph health | grep -q HEALTH_OK || (date; ceph health detail)
        sleep 2
    done

    Sun Sep  2 20:59:47 CEST 2018
    HEALTH_WARN 4 slow requests are blocked > 32 sec
    REQUEST_SLOW 4 slow requests are blocked > 32 sec
         4 ops are blocked > 32.768 sec
         osd.43 has blocked requests > 32.768 sec
    Sun Sep  2 20:59:50 CEST 2018
    HEALTH_WARN 4 slow requests are blocked > 32 sec
    REQUEST_SLOW 4 slow requests are blocked > 32 sec
         4 ops are blocked > 32.768 sec
         osd.43 has blocked requests > 32.768 sec
    Sun Sep  2 20:59:52 CEST 2018
    HEALTH_OK
    Sun Sep  2 21:00:28 CEST 2018
    HEALTH_WARN 1 slow requests are blocked > 32 sec
    REQUEST_SLOW 1 slow requests are blocked > 32 sec
         1 ops are blocked > 32.768 sec
         osd.41 has blocked requests > 32.768 sec
    Sun Sep  2 21:00:31 CEST 2018
    HEALTH_WARN 7 slow requests are blocked > 32 sec
    REQUEST_SLOW 7 slow requests are blocked > 32 sec
         7 ops are blocked > 32.768 sec
         osds 35,41 have blocked requests > 32.768 sec
    Sun Sep  2 21:00:33 CEST 2018
    HEALTH_WARN 7 slow requests are blocked > 32 sec
    REQUEST_SLOW 7 slow requests are blocked > 32 sec
         7 ops are blocked > 32.768 sec
         osds 35,51 have blocked requests > 32.768 sec
    Sun Sep  2 21:00:35 CEST 2018
    HEALTH_WARN 7 slow requests are blocked > 32 sec
    REQUEST_SLOW 7 slow requests are blocked > 32 sec
         7 ops are blocked > 32.768 sec
         osds 35,51 have blocked requests > 32.768 sec
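
    To cross-check whether the flagged OSDs (35, 41, 43, 51) were actually
    (deep-)scrubbing at those moments, something like the following could be
    run alongside the loop (a sketch; the log path assumes a default install):

    # list PGs that are currently scrubbing, together with their acting OSDs
    ceph pg dump pgs_brief 2>/dev/null | grep -i scrub

    # the cluster log on the monitors also records scrub start/end per PG
    grep -i 'deep-scrub' /var/log/ceph/ceph.log | tail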

    Our details:

       * System details:
          * Ubuntu 16.04
          * Kernel 4.13.0-39
          * 30 × 8 TB disks (SEAGATE/ST8000NM0075)
          * 3 × Dell PowerEdge R730xd (firmware 2.50.50.50)
            * Intel(R) Xeon(R) CPU E5-2680 v4 @ 2.40GHz
            * 2 × 10 Gbit/s SFP+ network adapters
            * 192 GB RAM
          * Pools use replication factor 3, 2 MB object size,
            85% write load, 1700 write IOPS
            (ops mainly between 4k and 16k in size), 300 read IOPS
       * We have the impression that this coincides with deep-scrub/scrub activity.
       * Ceph 12.2.5; we already played with the following OSD settings
         (our assumption was that the problem is related to RocksDB compaction;
         see the sketch after this list):
         bluestore cache kv max = 2147483648
         bluestore cache kv ratio = 0.9
         bluestore cache meta ratio = 0.1
         bluestore cache size hdd = 10737418240
       * This type of problem only appears on hdd/bluestore OSDs; ssd/bluestore
         OSDs have never experienced it.
       * The system is healthy: no swapping, no high load, no errors in dmesg.
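
    For completeness, a minimal sketch of how the cache values listed above
    could be set (assuming they belong in the [osd] section of ceph.conf and
    that the OSDs are restarted afterwards; the systemd unit name is only an
    example):

    [osd]
    bluestore cache size hdd   = 10737418240   # 10 GiB total cache per HDD OSD
    bluestore cache kv max     = 2147483648    # cap the RocksDB share at 2 GiB
    bluestore cache kv ratio   = 0.9
    bluestore cache meta ratio = 0.1

    # restart the OSDs one at a time so the new cache sizes take effect
    sudo systemctl restart ceph-osd@35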

    I have attached a log excerpt of osd.35; it is probably useful for
    investigating the problem if someone has deeper BlueStore knowledge.
    (The slow requests appeared on Sun Sep  2 21:00:35.)
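
    If useful, the admin socket of an affected OSD also shows where blocked
    requests are stuck while the warning is active (a sketch; run on the host
    that carries osd.35):

    # requests currently in flight, with per-event timestamps
    sudo ceph daemon osd.35 dump_ops_in_flight

    # recently completed ops that took the longest
    sudo ceph daemon osd.35 dump_historic_ops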

    Regards
    Marc


    On 02.09.2018 at 15:50, Brett Chancellor wrote:
    > The warnings look like this.
    >
    > 6 ops are blocked > 32.768 sec on osd.219
    > 1 osds have slow requests
    >
    > On Sun, Sep 2, 2018, 8:45 AM Alfredo Deza <adeza@xxxxxxxxxx> wrote:
    >
    >     On Sat, Sep 1, 2018 at 12:45 PM, Brett Chancellor
    >     <bchancellor@xxxxxxxxxxxxxx> wrote:
    >     > Hi Cephers,
    >     >   I am in the process of upgrading a cluster from Filestore to
    >     > bluestore, but I'm concerned about frequent warnings popping up
    >     > against the new bluestore devices. I'm frequently seeing messages
    >     > like this; although the specific osd changes, it's always one of
    >     > the few hosts I've converted to bluestore.
    >     >
    >     > 6 ops are blocked > 32.768 sec on osd.219
    >     > 1 osds have slow requests
    >     >
    >     > I'm running 12.2.4; have any of you seen similar issues? It seems
    >     > as though these messages pop up more frequently when one of the
    >     > bluestore pgs is involved in a scrub. I'll include my bluestore
    >     > creation process below, in case that might cause an issue.
    >     > (sdb, sdc, sdd are SATA, sde and sdf are SSD)
    >
    >     It would be useful to include what those warnings say. The
    >     ceph-volume commands look OK to me.
    >
    >     >
    >     > ## Process used to create osds
    >     > sudo ceph-disk zap /dev/sdb /dev/sdc /dev/sdd /dev/sdd /dev/sde /dev/sdf
    >     > sudo ceph-volume lvm zap /dev/sdb
    >     > sudo ceph-volume lvm zap /dev/sdc
    >     > sudo ceph-volume lvm zap /dev/sdd
    >     > sudo ceph-volume lvm zap /dev/sde
    >     > sudo ceph-volume lvm zap /dev/sdf
    >     > sudo sgdisk -n 0:2048:+133GiB -t 0:FFFF -c 1:"ceph block.db sdb" /dev/sdf
    >     > sudo sgdisk -n 0:0:+133GiB -t 0:FFFF -c 2:"ceph block.db sdc" /dev/sdf
    >     > sudo sgdisk -n 0:0:+133GiB -t 0:FFFF -c 3:"ceph block.db sdd" /dev/sdf
    >     > sudo sgdisk -n 0:0:+133GiB -t 0:FFFF -c 4:"ceph block.db sde" /dev/sdf
    >     > sudo ceph-volume lvm create --bluestore --crush-device-class hdd --data /dev/sdb --block.db /dev/sdf1
    >     > sudo ceph-volume lvm create --bluestore --crush-device-class hdd --data /dev/sdc --block.db /dev/sdf2
    >     > sudo ceph-volume lvm create --bluestore --crush-device-class hdd --data /dev/sdd --block.db /dev/sdf3
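
    As a side note, the resulting data/DB layout could be double-checked after
    creation with ceph-volume itself (a sketch; available in 12.2.x releases):

    # show each OSD's data device and its block.db partition as ceph-volume sees them
    sudo ceph-volume lvm list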





_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
