On Thu, Aug 27, 2015 at 3:43 AM, huang jun <hjwsm1989@xxxxxxxxx> wrote:
> hi,llya
>
> 2015-08-26 23:56 GMT+08:00 Ilya Dryomov <idryomov@xxxxxxxxx>:
>> On Wed, Aug 26, 2015 at 6:22 PM, Haomai Wang <haomaiwang@xxxxxxxxx> wrote:
>>> On Wed, Aug 26, 2015 at 11:16 PM, huang jun <hjwsm1989@xxxxxxxxx> wrote:
>>>> hi,all
>>>> we create a 2TB rbd image, after map it to local,
>>>> then we format it to xfs with 'mkfs.xfs /dev/rbd0', it spent 318
>>>> seconds to finish, but local physical disk with the same size just
>>>> need 6 seconds.
>>>>
>>>
>>> I think librbd have two PR related to this.
>>>
>>>> After debug, we found there are two steps in rbd module during formating:
>>>> a) send 233093 DELETE requests to osds(number_of_requests = 2TB / 4MB),
>>>> this step spent almost 92 seconds.
>>>
>>> I guess this(https://github.com/ceph/ceph/pull/4221/files) may help
>>
>> It's submitting deletes for non-existent objects, not zeroing. The
>> only thing that will really help here is the addition of rbd object map
>> support to the kernel client. That could happen in 4.4, but 4.5 is
>> a safer bet.
>>
>>>
>>>> b) send 4238 messages like this: [set-alloc-hint object_size 4194304
>>>> write_size 4194304,write 0~512] to osds, that spent 227 seconds.
>>>
>>> I think kernel rbd also need to use
>>> https://github.com/ceph/ceph/pull/4983/files
>>
>> set-alloc-hint may be a problem, but I think a bigger problem is the
>> size of the write. Are all those writes 512 bytes long?
>>
> In another test to format 2TB rbd device,
> there are :
> 2 messages,each write 131072 bytes
> 4000 messages, each write 262144 bytes
> 112 messages, each write 4096 bytes
> 194 messages, each write 512 bytes

So the majority of writes is not 512 bytes long. I don't think
disabling set-alloc-hint (and, as of now at least, you can't disable it
anyway) would drastically change the numbers.
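To make the write-size breakdown concrete, here is a rough sketch that
totals the message counts quoted above. The histogram values are copied
from the second test in this thread; the function and variable names are
only illustrative and are not part of any Ceph tooling.

```python
# Write-size histogram reported in the thread: {bytes per write: message count}
write_histogram = {
    131072: 2,
    262144: 4000,
    4096: 112,
    512: 194,
}


def summarize(histogram):
    """Return (total messages, total bytes, fraction of 512-byte messages)."""
    total_messages = sum(histogram.values())
    total_bytes = sum(size * count for size, count in histogram.items())
    small_fraction = histogram.get(512, 0) / total_messages
    return total_messages, total_bytes, small_fraction


total_messages, total_bytes, small_fraction = summarize(write_histogram)
print(f"messages: {total_messages}")
print(f"bytes written: {total_bytes} ({total_bytes / 2**20:.1f} MiB)")
print(f"512-byte share of messages: {small_fraction:.1%}")
```

By this count the 512-byte writes are a small share of the messages and
a negligible share of the bytes, with the 262144-byte writes dominating
both.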

If you are doing mkfs right after creating and mapping an image for the
first time, you can add the -K option to mkfs, which will tell it not to
try to discard. As for the write phase, I can't suggest anything
offhand.

Thanks,

                Ilya
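A minimal sketch of the -K suggestion, assuming the device is /dev/rbd0
as in the original report. The helper name build_mkfs_command is
hypothetical; it only assembles and prints the command line and does not
run mkfs. Skipping the discard is meant for a freshly created image that
has never been written to.

```python
import shlex


def build_mkfs_command(device, skip_discard=True):
    """Assemble an mkfs.xfs command line; -K tells mkfs.xfs not to discard."""
    cmd = ["mkfs.xfs"]
    if skip_discard:
        cmd.append("-K")
    cmd.append(device)
    return cmd


# /dev/rbd0 is the mapped device from the report; adjust for your setup.
cmd = build_mkfs_command("/dev/rbd0")
print(shlex.join(cmd))
```

Running the printed command by hand (as root) would be the actual
formatting step; the sketch deliberately stops short of executing it.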