FW: High apply latency on OSD causes poor performance on VM

Hi,

 

Could you take a look at my problem?

It’s about high latency on my OSDs on HP G8 servers (ceph01, ceph02 and ceph03).

When I run a rados bench for 60 seconds, the results are surprising: after a few seconds there is no traffic, then it resumes, and so on.

In the end, the maximum latency is high and the VMs' disks freeze a lot.

 

# rados bench -p pool-test-g8 60 write

Maintaining 16 concurrent writes of 4194304 bytes for up to 60 seconds or 0 objects

Object prefix: benchmark_data_ceph02_56745

   sec Cur ops   started  finished  avg MB/s  cur MB/s  last lat   avg lat

     0       0         0         0         0         0         -         0

     1      16        82        66   263.959       264 0.0549584  0.171148

     2      16       134       118    235.97       208  0.344873  0.232103

     3      16       189       173   230.639       220  0.015583   0.24581

     4      16       248       232   231.973       236 0.0704699  0.252504

     5      16       306       290   231.974       232 0.0229872  0.258343

     6      16       371       355    236.64       260   0.27183  0.255469

     7      16       419       403    230.26       192 0.0503492  0.263304

     8      16       460       444   221.975       164 0.0157241  0.261779

     9      16       506       490   217.754       184  0.199418  0.271501

    10      16       518       502   200.778        48 0.0472324  0.269049

    11      16       518       502   182.526         0         -  0.269049

    12      16       556       540   179.981        76  0.100336  0.301616

    13      16       607       591   181.827       204  0.173912  0.346105

    14      16       655       639   182.552       192 0.0484904  0.339879

    15      16       683       667   177.848       112 0.0504184  0.349929

    16      16       746       730   182.481       252  0.276635  0.347231

    17      16       807       791   186.098       244  0.391491  0.339275

    18      16       845       829   184.203       152  0.188608  0.342021

    19      16       850       834   175.561        20  0.960175  0.342717

2015-05-28 17:09:48.397376min lat: 0.013532 max lat: 6.28387 avg lat: 0.346987

   sec Cur ops   started  finished  avg MB/s  cur MB/s  last lat   avg lat

    20      16       859       843   168.582        36 0.0182246  0.346987

    21      16       863       847   161.316        16   3.18544  0.355051

    22      16       897       881   160.165       136 0.0811037  0.371209

    23      16       901       885   153.897        16 0.0482124  0.370793

    24      16       943       927   154.484       168   0.63064  0.397204

    25      15       997       982   157.104       220 0.0933448  0.392701

    26      16      1058      1042   160.291       240  0.166463  0.385943

    27      16      1088      1072   158.798       120   1.63882  0.388568

    28      16      1125      1109   158.412       148 0.0511479   0.38419

    29      16      1155      1139   157.087       120  0.162266  0.385898

    30      16      1163      1147   152.917        32 0.0682181  0.383571

    31      16      1190      1174   151.468       108 0.0489185  0.386665

    32      16      1196      1180   147.485        24   2.95263  0.390657

    33      16      1213      1197   145.076        68 0.0467788  0.389299

    34      16      1265      1249   146.926       208 0.0153085  0.420687

    35      16      1332      1316   150.384       268 0.0157061   0.42259

    36      16      1374      1358   150.873       168  0.251626  0.417373

    37      16      1402      1386   149.822       112 0.0475302  0.413886

    38      16      1444      1428     150.3       168 0.0507577  0.421055

    39      16      1500      1484   152.189       224 0.0489163  0.416872

2015-05-28 17:10:08.399434min lat: 0.013532 max lat: 9.26596 avg lat: 0.415296

   sec Cur ops   started  finished  avg MB/s  cur MB/s  last lat   avg lat

    40      16      1530      1514   151.384       120  0.951713  0.415296

    41      16      1551      1535   149.741        84 0.0686787  0.416571

    42      16      1606      1590   151.413       220 0.0826855   0.41684

    43      16      1656      1640   152.542       200 0.0706539  0.409974

    44      16      1663      1647   149.712        28  0.046672  0.408476

    45      16      1685      1669    148.34        88 0.0989566  0.424918

    46      16      1707      1691   147.028        88 0.0490569  0.421116

    47      16      1707      1691     143.9         0         -  0.421116

    48      16      1707      1691   140.902         0         -  0.421116

    49      16      1720      1704   139.088   17.3333 0.0480335  0.428997

    50      16      1752      1736   138.866       128  0.053219    0.4416

    51      16      1786      1770   138.809       136  0.602946  0.440357

    52      16      1810      1794   137.986        96 0.0472518  0.438376

    53      16      1831      1815   136.967        84 0.0148999  0.446801

    54      16      1831      1815    134.43         0         -  0.446801

    55      16      1853      1837   133.586        44 0.0499486  0.455561

    56      16      1898      1882   134.415       180 0.0566593  0.461019

    57      16      1932      1916   134.442       136 0.0162902  0.454385

    58      16      1948      1932   133.227        64   0.62188  0.464403

    59      16      1966      1950    132.19        72  0.563613  0.472147

2015-05-28 17:10:28.401525min lat: 0.013532 max lat: 12.4828 avg lat: 0.472084

   sec Cur ops   started  finished  avg MB/s  cur MB/s  last lat   avg lat

    60      16      1983      1967    131.12        68  0.030789  0.472084

    61      16      1984      1968   129.036         4 0.0519125  0.471871

    62      16      1984      1968   126.955         0         -  0.471871

    63      16      1984      1968   124.939         0         -  0.471871

    64      14      1984      1970   123.112   2.66667   4.20878  0.476035

Total time run:         64.823355

Total writes made:      1984

Write size:             4194304

Bandwidth (MB/sec):     122.425

Stddev Bandwidth:       85.3816

Max bandwidth (MB/sec): 268

Min bandwidth (MB/sec): 0

Average Latency:        0.520956

Stddev Latency:         1.17678

Max latency:            12.4828

Min latency:            0.013532
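 

To see whether these stalls line up with the journal SSD or the data disks saturating, extended disk statistics can be watched on one of the slow hosts while the bench runs (just a sketch; which devices matter depends on the layout):

 

# iostat -x 1

 

If a device sits near 100 %util with a high await during the seconds where the bench reports 0 MB/s, that is where the writes are queuing.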

 

 

I have installed a new ceph06 box which has much better latencies, but its hardware is different (RAID card, disks, …).

All OSDs are formatted with XFS but are mounted differently:

- On a Ceph box with high latency (the stripe size of the RAID logical disk is 256 KB):

/dev/sdd1 on /var/lib/ceph/osd/ceph-4 type xfs (rw,noatime,attr2,inode64,logbsize=256k,sunit=512,swidth=512,noquota)

- On the Ceph box with low latency (the stripe size of the RAID logical disk is 128 KB):

/dev/sdc1 on /var/lib/ceph/osd/ceph-9 type xfs (rw,noatime,attr2,inode64,noquota)
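 

To compare the stripe geometry that XFS actually picked up on the two kinds of boxes, xfs_info can be run against both mount points (sketch only; paths taken from the mounts above):

 

# xfs_info /var/lib/ceph/osd/ceph-4

 

# xfs_info /var/lib/ceph/osd/ceph-9

 

The sunit/swidth values reported there show whether each filesystem was created aligned to its RAID stripe size (xfs_info reports them in filesystem blocks, while the mount options above are in 512-byte sectors).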

 

Thread on the Proxmox forum: http://forum.proxmox.com/threads/22206-Ceph-High-apply-latency-on-OSD-causes-poor-performance-on-VM

 

If you have any ideas ...

 

Thanks.

 

Franck ALLOUIS

franck.allouis@xxxxxxxx

 

 

From: Franck Allouis
Sent: Thursday, 21 May 2015 17:12
To: 'ceph-community@xxxxxxxxxxxxxx'
Subject: High apply latency on OSD causes poor performance on VM

 

Hi,

 

Since we installed our new Ceph cluster, we have frequently seen high apply latency on the OSDs (around 200 ms to 1500 ms), while the commit latency stays continuously at 0 ms!

 

According to the Ceph documentation, when you run the command "ceph osd perf", fs_commit_latency is generally higher than fs_apply_latency. For us it is the opposite.

The phenomenon has gotten worse since we upgraded Ceph (from Giant 0.87.1 to Hammer 0.94.1).

The consequence is that our Windows VMs are very slow.

Could anyone tell us whether our configuration is sound, and in which direction we should investigate?

 

# ceph osd perf

osd fs_commit_latency(ms) fs_apply_latency(ms)

  0                     0                   62

  1                     0                  193

  2                     0                   88

  3                     0                  269

  4                     0                 1055

  5                     0                  322

  6                     0                  272

  7                     0                  116

  8                     0                  653

  9                     0                    4

 10                     0                    1

 11                     0                    7

 12                     0                    4
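 

In case it helps with the diagnosis, one of the slow OSDs (osd.4 as an example) can also be inspected through its admin socket; this is only a sketch, assuming the default socket path:

 

# ceph --admin-daemon /var/run/ceph/ceph-osd.4.asok dump_historic_ops

 

# ceph --admin-daemon /var/run/ceph/ceph-osd.4.asok perf dump

 

dump_historic_ops lists the slowest recent operations with a per-step timeline, and perf dump exposes the filestore latency counters behind the apply latency shown above.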

 

Some information about our configuration:

 

- Proxmox 3.4-6

- kernel: 3.10.0-10-pve

- Ceph:

- Hammer 0.94.1

- 3 hosts with 3 OSDs of 4 TB (9 OSDs) + 1 SSD of 500 GB per host for journals

- 1 host with 4 OSDs of 300 GB (4 OSDs) + 1 SSD of 500 GB for journals

 

- OSD tree:

 

# ceph osd tree

ID WEIGHT   TYPE NAME           UP/DOWN REWEIGHT PRIMARY-AFFINITY

-1 33.83995 root default

-6 22.91995     room salle-dr

-2 10.92000         host ceph01

 0  3.64000             osd.0        up  1.00000          1.00000

 2  3.64000             osd.2        up  1.00000          1.00000

 1  3.64000             osd.1        up  1.00000          1.00000

-3 10.92000         host ceph02

 3  3.64000             osd.3        up  1.00000          1.00000

 4  3.64000             osd.4        up  1.00000          1.00000

 5  3.64000             osd.5        up  1.00000          1.00000

-5  1.07996         host ceph06

 9  0.26999             osd.9        up  1.00000          1.00000

10  0.26999             osd.10       up  1.00000          1.00000

11  0.26999             osd.11       up  1.00000          1.00000

12  0.26999             osd.12       up  1.00000          1.00000

-7 10.92000     room salle-log

-4 10.92000         host ceph03

 6  3.64000             osd.6        up  1.00000          1.00000

 7  3.64000             osd.7        up  1.00000          1.00000

 8  3.64000             osd.8        up  1.00000          1.00000

 

- ceph.conf:

 

[global]

         auth client required = cephx

         auth cluster required = cephx

         auth service required = cephx

         auth supported = cephx

         cluster network = 10.10.1.0/24

         filestore xattr use omap = true

         fsid = 2dbbec32-a464-4bc5-bb2b-983695d1d0c6

         keyring = /etc/pve/priv/$cluster.$name.keyring

         mon osd adjust heartbeat grace = true

         mon osd down out subtree limit = host

         osd disk threads = 24

         osd heartbeat grace = 10

         osd journal size = 5120

         osd max backfills = 1

         osd op threads = 24

         osd pool default min size = 1

         osd recovery max active = 1

         public network = 192.168.80.0/24

[osd]

         keyring = /var/lib/ceph/osd/ceph-$id/keyring

[mon.0]

         host = ceph01

         mon addr = 192.168.80.41:6789

[mon.1]

         host = ceph02

         mon addr = 192.168.80.42:6789

[mon.2]

         host = ceph03

         mon addr = 192.168.80.43:6789
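 

For completeness, the values the OSD daemons are actually running with can be checked through the admin socket (sketch only, osd.0 as an example):

 

# ceph --admin-daemon /var/run/ceph/ceph-osd.0.asok config show | grep -E 'osd_op_threads|osd_disk_threads'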

 

Thanks.

Best regards

 

Franck ALLOUIS

franck.allouis@xxxxxxxx

 

 

 

_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
