Hi Ilya,

> On 28 Nov 2014, at 17:56, Ilya Dryomov <ilya.dryomov@xxxxxxxxxxx> wrote:
>
> On Fri, Nov 28, 2014 at 5:46 PM, Dan Van Der Ster
> <daniel.vanderster@xxxxxxx> wrote:
>> Hi Andrei,
>> Yes, I’m testing from within the guest.
>>
>> Here is an example. First, I do 2MB reads when max_sectors_kb=512, and
>> we see each read is split into four (fio sees 25 iops, though iostat
>> reports 100 smaller iops):
>>
>> # echo 512 > /sys/block/vdb/queue/max_sectors_kb   # this is the default
>> # fio --readonly --name /dev/vdb --rw=read --size=1G --ioengine=libaio \
>>       --direct=1 --runtime=10s --blocksize=2m
>> /dev/vdb: (g=0): rw=read, bs=2M-2M/2M-2M/2M-2M, ioengine=libaio, iodepth=1
>> fio-2.0.13
>> Starting 1 process
>> Jobs: 1 (f=1): [R] [100.0% done] [51200K/0K/0K /s] [25/0/0 iops] [eta 00m:00s]
>>
>> Meanwhile, iostat is reporting 100 iops with an average request size of
>> 1024 sectors (i.e. 512kB):
>>
>> Device:  rrqm/s  wrqm/s     r/s    w/s  rMB/s  wMB/s  avgrq-sz  avgqu-sz  await  svctm  %util
>> vdb        0.00    0.00  100.00   0.00  50.00   0.00   1024.00      3.02  30.25  10.00 100.00
>>
>> Now increase max_sectors_kb to 4MB, and the IOs are no longer split:
>>
>> # echo 4096 > /sys/block/vdb/queue/max_sectors_kb
>> # fio --readonly --name /dev/vdb --rw=read --size=1G --ioengine=libaio \
>>       --direct=1 --runtime=10s --blocksize=2m
>> /dev/vdb: (g=0): rw=read, bs=2M-2M/2M-2M/2M-2M, ioengine=libaio, iodepth=1
>> fio-2.0.13
>> Starting 1 process
>> Jobs: 1 (f=1): [R] [100.0% done] [200.0M/0K/0K /s] [100/0/0 iops] [eta 00m:00s]
>>
>> iostat reports 100 iops of 4096 sectors each (i.e. 2MB):
>>
>> Device:  rrqm/s  wrqm/s     r/s    w/s   rMB/s  wMB/s  avgrq-sz  avgqu-sz  await  svctm  %util
>> vdb      300.00    0.00  100.00   0.00  200.00   0.00   4096.00      0.99   9.94   9.94  99.40
>
> We set the hard request size limit to the rbd object size (4M typically):
>
>     blk_queue_max_hw_sectors(q, segment_size / SECTOR_SIZE);
>

Are you referring to librbd or krbd? My observations are limited to librbd at the moment.
(I didn’t try this on krbd.)

> but the block layer then sets the soft limit for fs requests to 512K:
>
>     BLK_DEF_MAX_SECTORS = 1024,
>
>     limits->max_sectors = min_t(unsigned int, max_hw_sectors,
>                                 BLK_DEF_MAX_SECTORS);
>
> which you are supposed to change on a per-device basis via sysfs. We
> could probably raise the soft limit to the rbd object size by default as
> well - I don't see any harm in that.

Indeed - there is a patch that does exactly that, which was being targeted for 3.19: https://lkml.org/lkml/2014/9/6/123

Cheers, Dan

_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
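[Editor's note: until a kernel with the default change lands, the per-device sysfs tuning discussed in the thread can be scripted. The sketch below is illustrative only; the device name `vdb` is an assumption (substitute your rbd-backed device). It reads the driver's hard limit (`max_hw_sectors_kb`) and raises the soft limit (`max_sectors_kb`) to match, which is roughly what the patch linked above would make the default.]

```shell
# Sketch: raise the block layer's soft request-size limit to the hardware
# limit, so large direct I/Os are no longer split into 512K pieces.
# DEV is an assumption -- replace with your device (vdb, rbd0, ...).
DEV=vdb
Q=/sys/block/$DEV/queue

if [ -r "$Q/max_hw_sectors_kb" ]; then
    hw=$(cat "$Q/max_hw_sectors_kb")   # hard limit set by the driver (4096 for 4M rbd objects)
    soft=$(cat "$Q/max_sectors_kb")    # soft limit, defaults to 512
    echo "$DEV: max_hw_sectors_kb=$hw max_sectors_kb=$soft"
    echo "$hw" > "$Q/max_sectors_kb"   # needs root
    result=tuned
else
    echo "$DEV: not present, nothing to do"
    result=skipped
fi
```

Note that the sysfs setting does not persist across reboots, so it would normally be reapplied at boot (e.g. from a udev rule or an init script).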