Re: osd slow response when formatting rbd image

Hi all,

 

Today I probably found a solution for this (unfortunately not the root cause).

The problem only occurs when using the ceph-kraken packages on my clients.

If I use ceph-jewel (which was already running on my iSCSI gateways), the problem does not appear.
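
For reference, this is roughly how the installed client versions can be compared (just a sketch; package names may vary with the repository layout, and the kernel rbd module is independent of the userspace packages):

# userspace ceph version on the client
ceph --version
rpm -qa 'ceph*' 'librbd*' 'librados*'

# kernel providing the rbd module (krbd does not change with the ceph packages)
uname -r
modinfo rbd | head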

 

Best regards,

Sven

 

 

From: Rath, Sven
Sent: Thursday, 20 April 2017 16:34
To: 'ceph-users@xxxxxxxxxxxxxx' <ceph-users@xxxxxxxxxxxxxx>
Subject: osd slow response when formatting rbd image

 

Hi all,

 

I hope you are all doing well; maybe some of you can help me with a problem I have been facing recently.

I started evaluating Ceph a couple of months ago, and I now have a very strange problem when formatting RBD images.

The problem only occurs when using RBD images directly, with the kernel rbd module loaded.

If I attach the RBD image as an iSCSI device via one of our iSCSI gateways (tgt), formatting works fine and I can afterwards use the image on any host without problems.

That is a workaround, but I would like to find out why it is not working with rbd directly for me…
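
For completeness, a tgt export of an RBD image typically looks roughly like this (only a sketch, not my exact config; the IQN is a placeholder, and it assumes tgt was built with RBD backing-store support):

# /etc/tgt/conf.d/rbd-test.conf (sketch)
<target iqn.2017-04.com.example:pool-C.test>
    driver iscsi
    bs-type rbd
    backing-store pool-C/test
</target>

# reload tgt, then discover and log in from the client with iscsiadm as usual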

 

Problem explained:

 

I create a pool and an image:

 

ceph osd pool create pool-C 250 250

rbd create test --size 3548290 --pool pool-C --image-feature layering
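
To double-check what was created (pg count, size, and that only the layering feature is enabled):

ceph osd dump | grep pool-C
rbd info pool-C/test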

 

I map the RBD image on my client (it does not matter which client):

rbd map pool-C/test --id admin --keyring /etc/ceph/ceph.client.admin.keyring
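
To verify the mapping and the resulting block device (a quick check, nothing special):

rbd showmapped
lsblk /dev/rbd/pool-C/test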

 

As soon as I start to format the image (XFS or ext4), some of my OSDs start to fail:

 

mkfs.xfs /dev/rbd/pool-C/test-part1

 

I see the following entries as soon as I start formatting.

The OSD IDs are different each time; I guess it is rather a problem with the journals.

 

EXAMPLE:

2017-04-20 13:43:24.529953 osd.1 [WRN] slow request 30.439722 seconds old, received at 2017-04-20 13:42:54.090001: osd_op(client.344964.1:8170 9.cbb68aa5 rbd_data.540dc238e1f29.0000000000001e7c [delete] snapc 0=[] ondisk+write e2002) currently started

2017-04-20 13:43:24.529984 osd.1 [WRN] slow request 30.431389 seconds old, received at 2017-04-20 13:42:54.098334: osd_op(client.344964.1:8489 9.f414989 rbd_data.540dc238e1f29.0000000000001fbb [delete] snapc 0=[] ondisk+write e2002) currently started

 

2017-04-20 13:42:50.870651 mon.0 [INF] osd.10 172.10.10.2:6804/15031 failed (forced)

2017-04-20 13:42:51.989500 mon.0 [INF] osd.11 172.10.10.2:6806/15690 failed (forced)

 

I found out that some of the journal SSDs always disappear when I start formatting, and therefore the OSDs backed by that journal disappear as well.

That is really strange to me.

Any benchmark works fine, and if the RBD image is formatted via iSCSI I can also use it afterwards without any problems.
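
Since the slow requests above are all [delete] ops, one thing that could be tried (just an idea, I have not verified that this is the cause) is formatting without the discard pass, and following the kernel log on the OSD host while the journal SSD disappears:

# format without issuing discards, to see whether the discard pass of mkfs is the trigger
mkfs.xfs -K /dev/rbd/pool-C/test-part1            # -K skips the discard
mkfs.ext4 -E nodiscard /dev/rbd/pool-C/test-part1

# on the OSD host, follow the kernel log while formatting
dmesg -wT | grep -iE 'sd[abcd]|reset|abort|i/o error'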

 

 

My environment:

 

ceph-11.2.0-0.el7.x86_64 on CentOS 7.3

 

3 Monitor Hosts

 

2 OSD Hosts:

2x Intel(R) Xeon(R) CPU E5-2630L v4 (HT on, C1-state)

60 GB Memory

1 GE Ethernet (internal)

1 GE Ethernet (external)

 

4 x SSD (Intel SSDSC2BB480G401)

8 x 1.1 TB SAS3 (XFS)

All disks are connected to an HBA (PMC Adaptec HBA 1000-8i8e), no RAID arrays.

 

[agpceph02][DEBUG ] /dev/sda :

[agpceph02][DEBUG ]  /dev/sda1 ceph journal, for /dev/sde1

[agpceph02][DEBUG ]  /dev/sda2 ceph journal, for /dev/sdf1

[agpceph02][DEBUG ] /dev/sdb :

[agpceph02][DEBUG ]  /dev/sdb1 ceph journal, for /dev/sdg1

[agpceph02][DEBUG ]  /dev/sdb2 ceph journal, for /dev/sdh1

[agpceph02][DEBUG ] /dev/sdc :

[agpceph02][DEBUG ]  /dev/sdc1 ceph journal, for /dev/sdi1

[agpceph02][DEBUG ]  /dev/sdc2 ceph journal, for /dev/sdj1

[agpceph02][DEBUG ] /dev/sdd :

[agpceph02][DEBUG ]  /dev/sdd1 ceph journal, for /dev/sdk1

[agpceph02][DEBUG ]  /dev/sdd2 ceph journal, for /dev/sdl1

[agpceph02][DEBUG ] /dev/sde :

[agpceph02][DEBUG ]  /dev/sde1 ceph data, active, cluster ceph, osd.8, journal /dev/sda1

[agpceph02][DEBUG ] /dev/sdf :

[agpceph02][DEBUG ]  /dev/sdf1 ceph data, active, cluster ceph, osd.9, journal /dev/sda2

[agpceph02][DEBUG ] /dev/sdg :

[agpceph02][DEBUG ]  /dev/sdg1 ceph data, active, cluster ceph, osd.10, journal /dev/sdb1

[agpceph02][DEBUG ] /dev/sdh :

[agpceph02][DEBUG ]  /dev/sdh1 ceph data, active, cluster ceph, osd.11, journal /dev/sdb2

[agpceph02][DEBUG ] /dev/sdi :

[agpceph02][DEBUG ]  /dev/sdi1 ceph data, active, cluster ceph, osd.12, journal /dev/sdc1

[agpceph02][DEBUG ] /dev/sdj :

[agpceph02][DEBUG ]  /dev/sdj1 ceph data, active, cluster ceph, osd.13, journal /dev/sdc2

[agpceph02][DEBUG ] /dev/sdk :

[agpceph02][DEBUG ]  /dev/sdk1 ceph data, active, cluster ceph, osd.14, journal /dev/sdd1

[agpceph02][DEBUG ] /dev/sdl :

[agpceph02][DEBUG ]  /dev/sdl1 ceph data, active, cluster ceph, osd.15, journal /dev/sdd2

 

 

[root@agpceph-admin ceph]# ceph -s

    cluster 8edd3cdc-02c3-4b60-a150-897aeb0dda14

     health HEALTH_OK

     monmap e3: 3 mons at {agpceph-mon01=172.10.10.50:6789/0,agpceph01=172.10.10.1:6789/0,agpceph02=172.10.10.2:6789/0}

            election epoch 110, quorum 0,1,2 agpceph01,agpceph02,agpceph-mon01

        mgr active: agpceph-mon01 standbys: agpceph01, agpceph02

     osdmap e2132: 16 osds: 16 up, 16 in

            flags sortbitwise,require_jewel_osds,require_kraken_osds

      pgmap v96164: 650 pgs, 3 pools, 15611 MB data, 3979 objects

            31517 MB used, 17845 GB / 17876 GB avail

                 650 active+clean

 

[root@agpceph-admin ceph]# ceph osd tree

ID WEIGHT   TYPE NAME              UP/DOWN REWEIGHT PRIMARY-AFFINITY

-1 17.45752 root default

-5  8.72876     rack Rack391

-2  8.72876         host agpceph01

0  1.09109             osd.0           up  1.00000          1.00000

1  1.09109             osd.1           up  1.00000          1.00000

2  1.09109             osd.2           up  1.00000          1.00000

3  1.09109             osd.3           up  1.00000          1.00000

4  1.09109             osd.4           up  1.00000          1.00000

5  1.09109             osd.5           up  1.00000          1.00000

6  1.09109             osd.6           up  1.00000          1.00000

7  1.09109             osd.7           up  1.00000          1.00000

-4  8.72876     rack Rack320

-3  8.72876         host agpceph02

8  1.09109             osd.8           up  1.00000          1.00000

9  1.09109             osd.9           up  1.00000          1.00000

10  1.09109             osd.10          up  1.00000          1.00000

11  1.09109             osd.11          up  1.00000          1.00000

12  1.09109             osd.12          up  1.00000          1.00000

13  1.09109             osd.13          up  1.00000          1.00000

14  1.09109             osd.14          up  1.00000          1.00000

15  1.09109             osd.15          up  1.00000          1.00000

 

[root@agpceph-admin ceph]# cat /etc/ceph/ceph.conf

[global]

fsid = 8edd3cdc-02c3-4b60-a150-897aeb0dda14

mon_initial_members = agpceph01, agpceph02, agpceph-mon01

mon_host = 172.10.10.1,172.10.10.2,172.10.10.50

auth_cluster_required = cephx

auth_service_required = cephx

auth_client_required = cephx

 

osd journal size = 81920

public network = 172.10.10.0/24

cluster network = 172.10.11.0/24

 

osd pool default size =  2

osd pool default min size = 1

osd pool default pg num = 35

osd pool default pgp num = 35

 

osd crush chooseleaf type = 3

 

log file = /var/log/ceph/cluster.log

log to syslog = true

mon_allow_pool_delete = true

mon osd allow primary affinity = true

 

[client]

rbd_cache = false
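
To confirm the running daemons actually picked up these settings (a sketch; run on one of the OSD hosts, the OSD id is just an example):

ceph daemon osd.8 config show | grep journal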

 

 

 

 

Maybe someone else has had this problem and could give me some advice?

 

Many thanks in advance and kind regards,

Sven

 

_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
