Re: format 2TB rbd device is too slow

On Thu, Aug 27, 2015 at 3:43 AM, huang jun <hjwsm1989@xxxxxxxxx> wrote:
> Hi, Ilya
>
> 2015-08-26 23:56 GMT+08:00 Ilya Dryomov <idryomov@xxxxxxxxx>:
>> On Wed, Aug 26, 2015 at 6:22 PM, Haomai Wang <haomaiwang@xxxxxxxxx> wrote:
>>> On Wed, Aug 26, 2015 at 11:16 PM, huang jun <hjwsm1989@xxxxxxxxx> wrote:
>>>> Hi all,
>>>> We created a 2TB rbd image and mapped it locally, then formatted it
>>>> as XFS with 'mkfs.xfs /dev/rbd0'. That took 318 seconds to finish,
>>>> while a local physical disk of the same size needs just 6 seconds.
>>>>
>>>
>>> I think librbd has two PRs related to this.
>>>
>>>> After debugging, we found that the rbd module goes through two steps
>>>> during formatting:
>>>> a) It sends 233093 DELETE requests to the OSDs (number_of_requests =
>>>>    2TB / 4MB); this step took almost 92 seconds.
>>>
>>> I guess this (https://github.com/ceph/ceph/pull/4221/files) may help
>>
>> It's submitting deletes for non-existent objects, not zeroing.  The
>> only thing that will really help here is the addition of rbd object map
>> support to the kernel client.  That could happen in 4.4, but 4.5 is
>> a safer bet.
>>
>>>
>>>> b) It sends 4238 messages like this: [set-alloc-hint object_size 4194304
>>>> write_size 4194304,write 0~512] to the OSDs; that took 227 seconds.
>>>
>>> I think kernel rbd also needs to use
>>> https://github.com/ceph/ceph/pull/4983/files
>>
>> set-alloc-hint may be a problem, but I think a bigger problem is the
>> size of the write.  Are all those writes 512 bytes long?
>>
> In another test formatting a 2TB rbd device, there were:
> 2 messages, each writing 131072 bytes
> 4000 messages, each writing 262144 bytes
> 112 messages, each writing 4096 bytes
> 194 messages, each writing 512 bytes

So the majority of the writes are not 512 bytes long.  I don't think
disabling set-alloc-hint (and, as of now at least, you can't disable it
anyway) would drastically change the numbers.  If you are doing mkfs
right after creating and mapping an image for the first time, you can
add the -K option to mkfs, which tells it not to try to discard.  As
for the write phase, I can't suggest anything offhand.
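
For reference, the figures quoted in this thread can be tallied with a
short script.  This is only a rough sketch: the sizes, counts and timings
are copied from the messages above, the variable names are made up for
illustration, and the per-request numbers are plain averages over elapsed
time that assume nothing about how many requests were in flight at once.

```python
# Write-size histogram from the second test quoted above: (bytes, message count).
writes = [(131072, 2), (262144, 4000), (4096, 112), (512, 194)]

total_msgs = sum(count for _, count in writes)
total_bytes = sum(size * count for size, count in writes)

# Share of each write size by message count and by bytes written.
for size, count in writes:
    print(f"{size:>7} B: {count:>5} msgs, "
          f"{100 * count / total_msgs:5.1f}% of msgs, "
          f"{100 * size * count / total_bytes:5.1f}% of bytes")

print(f"total: {total_msgs} msgs, {total_bytes / 2**20:.1f} MiB")

# Average elapsed time per request in the first test
# (233093 deletes in ~92 s, 4238 writes in 227 s).
# These are wall-clock averages, not measured per-request latencies.
delete_ms = 92 / 233093 * 1000
write_ms = 227 / 4238 * 1000
print(f"delete: {delete_ms:.3f} ms/request, write: {write_ms:.1f} ms/message")
```

The 262144-byte writes account for most of the messages and nearly all of
the bytes, while the 512-byte writes are a small fraction of either.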

Thanks,

                Ilya