Hi Christian,
I’ll give you a much better dump of detail :)
Running RHEL 7.1,
ceph version 0.94.5
all ceph disks are XFS, with journals on a partition on the same disk
Disks: 6TB spinners.
Erasure-coded pool, 4+1, using the ISA-L plugin.
No scrubbing is reported in the ceph log, and the cluster isn't old enough yet to be doing any deep scrubbing.
Also, the CPU usage of the OSD daemon that controls the disk isn't spiking, which I have seen previously when scrubbing or deep scrubbing is taking place.
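For anyone wanting to double-check the "no scrubbing" claim, scrub events can be counted straight out of the cluster log. A minimal, self-contained sketch follows; the log excerpt is illustrative of Hammer-era (0.94.x) messages and is written to a temporary file here, whereas on a real mon node you would grep /var/log/ceph/ceph.log directly:

```shell
# Illustrative cluster-log excerpt (hypothetical OSD ids and PG ids);
# a real check would grep /var/log/ceph/ceph.log on a monitor node.
cat > /tmp/ceph-sample.log <<'EOF'
2015-11-30 12:00:01.000000 osd.17 10.50.21.20:6800/1234 100 : cluster [INF] 1.2a deep-scrub starts
2015-11-30 12:03:44.000000 osd.17 10.50.21.20:6800/1234 101 : cluster [INF] 1.2a deep-scrub ok
2015-11-30 12:05:02.000000 osd.30 10.50.21.21:6802/2345 57 : cluster [INF] 1.4f scrub ok
EOF

# Count lines mentioning scrub (this also catches deep-scrub);
# a zero count over the window of the stalls would rule scrubbing out.
grep -c 'scrub' /tmp/ceph-sample.log   # -> 3
```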
All disks are at 2% space utilisation as given by df.
For explicitness:
[root@au-sydney ~]# ceph -s
cluster ff900f17-7eec-4fe1-8f31-657d44b86a22
health HEALTH_OK
monmap e5: 5 mons at {au-adelaide=10.50.21.24:6789/0,au-brisbane=10.50.21.22:6789/0,au-canberra=10.50.21.23:6789/0,au-melbourne=10.50.21.21:6789/0,au-sydney=10.50.21.20:6789/0}
election epoch 274, quorum 0,1,2,3,4 au-sydney,au-melbourne,au-brisbane,au-canberra,au-adelaide
osdmap e8549: 120 osds: 120 up, 120 in
pgmap v408422: 8192 pgs, 2 pools, 7794 GB data, 5647 kobjects
9891 GB used, 644 TB / 654 TB avail
8192 active+clean
client io 68363 kB/s wr, 1249 op/s
Cheers,
Bryn
On 30 Nov 2015, at 12:57, Christian Balzer <chibi@xxxxxxx> wrote:
Hello,
On Mon, 30 Nov 2015 07:15:35 +0000 MATHIAS, Bryn (Bryn) wrote:
Hi All,
I am seeing an issue with ceph performance.
Starting from an empty cluster of 5 nodes, ~600Tb of storage.
It would be helpful to have more details (all details, in fact) than this: complete HW, OS, FS used, Ceph versions, and configuration details (journals on HDD, replication levels, etc).
While this might not seem significant to your current question, it might prove valuable as to why you're seeing performance problems and how to address them.
Monitoring disk usage in nmon, I see rolling 100% usage of a disk. "ceph -w" doesn't report any spikes in throughput, and the load generated by the application putting data in is not spiking either.
The ceph.log should give a more detailed account, but assuming your client side is indeed steady state, this could very well be explained by scrubbing, especially deep-scrubbing.
That should also be visible in the ceph.log.
Christian
Disk   Busy   Read-KB/s  Write-KB/s  Activity
sdg2     0%        0.0       537.5
sdh      2%        4.0      4439.8   RW
sdh1     2%        4.0      3972.3   RW
sdh2     0%        0.0       467.6
sdj      3%        2.0      3524.7   RW
sdj1     3%        2.0      3488.7   RW
sdj2     0%        0.0        36.0
sdk     99%     1144.9      3564.6   RRRRRRRRRRRRRWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWW>
sdk1    99%     1144.9      3254.9   RRRRRRRRRRRRRWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWW>
sdk2     0%        0.0       309.7   W
sdl      1%        4.0       955.1   R
sdl1     1%        4.0       791.3   R
sdl2     0%        0.0       163.8
Is this anything to do with the way objects are stored on the file system? I remember reading that as the number of objects grows, the files on disk are re-organised.
This issue, for obvious reasons, causes a large degradation in performance; is there a way of mitigating it? Will it go away as my cluster reaches a higher level of disk utilisation?
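On the re-organisation question: filestore does split a PG's on-disk directory into subdirectories as the object count grows. The split point is commonly described as filestore_split_multiple * abs(filestore_merge_threshold) * 16 files per directory; treat that formula and the default values below as assumptions to verify against your own config (e.g. "ceph daemon osd.N config show | grep filestore"):

```shell
# Commonly cited filestore split rule: a directory is split once it holds
# more than filestore_split_multiple * abs(filestore_merge_threshold) * 16
# files. The values below are defaults of the era, not read from a cluster.
split_multiple=2
merge_threshold=10
echo $(( split_multiple * merge_threshold * 16 ))   # -> 320 files per directory
```

If the stalls line up with directories crossing that threshold, raising the split multiple (accepting deeper splits later) is one mitigation people discuss, but that trade-off is worth testing rather than assuming.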
Kind Regards,
Bryn Mathias
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
--
Christian Balzer        Network/Systems Engineer
chibi@xxxxxxx           Global OnLine Japan/Fusion Communications
http://www.gol.com/