On 04/08/15 18:24, Jeff Epstein wrote:
> Hi, I'm having sporadic very poor performance running ceph. Right now
> mkfs, even with nodiscard, takes 30 minutes or more. These kinds of
> delays happen often but irregularly. There seems to be no common
> denominator. Clearly, however, they make it impossible to deploy ceph
> in production.
>
> I reported this problem earlier on ceph's IRC, and was told to add
> nodiscard to mkfs. That didn't help. Here is the command that I'm
> using to format an rbd:
>
> For example: mkfs.ext4 -text4 -m0 -b4096 -E nodiscard /dev/rbd1

I probably won't be able to help much, but people who know more will need at least:
- your Ceph version,
- the kernel version of the host on which you are trying to format /dev/rbd1,
- which hardware and network you are using for this cluster (CPU, RAM, HDD or SSD models, network cards, jumbo frames, ...).

> Ceph says everything is okay:
>
>     cluster e96e10d3-ad2b-467f-9fe4-ab5269b70206
>      health HEALTH_OK
>      monmap e1: 3 mons at {a=192.168.224.4:6789/0,b=192.168.232.4:6789/0,c=192.168.240.4:6789/0}, election epoch 12, quorum 0,1,2 a,b,c
>      osdmap e972: 6 osds: 6 up, 6 in
>       pgmap v4821: 4400 pgs, 44 pools, 5157 MB data, 1654 objects
>             46138 MB used, 1459 GB / 1504 GB avail
>                 4400 active+clean

There's only one thing surprising me here: you have only 6 OSDs and 1504 GB (~250 GB per OSD), yet a total of 4400 pgs. With a replication of 3 that is 2200 PG copies per OSD, which might be too much and unnecessarily increase the load on your OSDs.

Best regards,

Lionel Bouton
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
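
[Editor's note: a quick sketch of the arithmetic behind the PG remark above. The ~100 PGs-per-OSD target is the commonly cited Ceph sizing guideline, not a figure stated in this thread, and the commands below are standard ceph CLI calls, not commands run by either poster.]

    # Per-OSD PG load implied by the cluster status above:
    #   pg copies per OSD = total pg_num * replication / num_osds
    #                     = 4400 * 3 / 6 = 2200
    #
    # With the usual ~100 PGs-per-OSD guideline, 6 OSDs and size=3 would
    # suggest roughly 6 * 100 / 3 = 200 PGs total across all pools
    # (typically rounded up to a power of two, i.e. 256), far below 4400.
    #
    # Replication size and pg_num for each pool can be checked with:
    ceph osd dump | grep '^pool'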