On Wed, Apr 8, 2015 at 7:36 PM, Lionel Bouton <lionel+ceph@xxxxxxxxxxx> wrote:
> On 04/08/15 18:24, Jeff Epstein wrote:
>> Hi, I'm having sporadic very poor performance running ceph. Right now
>> mkfs, even with nodiscard, takes 30 minutes or more. These kinds of
>> delays happen often but irregularly. There seems to be no common
>> denominator. Clearly, however, they make it impossible to deploy ceph
>> in production.
>>
>> I reported this problem earlier on ceph's IRC, and was told to add
>> nodiscard to mkfs. That didn't help. Here is the command that I'm
>> using to format an rbd:
>>
>> For example: mkfs.ext4 -text4 -m0 -b4096 -E nodiscard /dev/rbd1
>
> I probably won't be able to help much, but people knowing more will need
> at least:
> - your Ceph version,
> - the kernel version of the host on which you are trying to format
>   /dev/rbd1,
> - which hardware and network you are using for this cluster (CPU, RAM,
>   HDD or SSD models, network cards, jumbo frames, ...).
>
>>
>> Ceph says everything is okay:
>>
>>     cluster e96e10d3-ad2b-467f-9fe4-ab5269b70206
>>      health HEALTH_OK
>>      monmap e1: 3 mons at
>>        {a=192.168.224.4:6789/0,b=192.168.232.4:6789/0,c=192.168.240.4:6789/0},
>>        election epoch 12, quorum 0,1,2 a,b,c
>>      osdmap e972: 6 osds: 6 up, 6 in
>>       pgmap v4821: 4400 pgs, 44 pools, 5157 MB data, 1654 objects
>>             46138 MB used, 1459 GB / 1504 GB avail
>>                 4400 active+clean

Are there any "slow request" warnings in the logs?

Assuming a 30-minute mkfs is somewhat reproducible, can you bump osd and
ms log levels and try to capture it?

Thanks,

        Ilya
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
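
For reference, a minimal sketch of how one might check for slow-request
warnings and raise the osd/ms debug levels as asked above; the log path and
debug values shown are the common defaults and may differ on this cluster:

    # check whether the cluster is currently reporting blocked/slow requests
    ceph health detail

    # search the OSD logs for slow-request warnings (default log location)
    grep 'slow request' /var/log/ceph/ceph-osd.*.log

    # raise OSD and messenger debug levels on all OSDs at runtime
    ceph tell osd.* injectargs '--debug-osd 20 --debug-ms 1'

The same levels can also be set persistently by adding "debug osd = 20" and
"debug ms = 1" to the [osd] section of ceph.conf and restarting the OSDs,
then lowered again once the slow mkfs has been captured.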