Re: RBD performance - tuning hints / parameter doc

Ah, those are just min and max.  Sync is also triggered when the
journal hits the half-full mark.  We could make the percentage
configurable in the future.
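
As a rough illustration of the half-full trigger (the journal size
below is only an example value, not a recommendation): with
'osd journal size = 1024' (megabytes), a sync would be forced once
roughly 512 MB of journal entries have accumulated, in addition to
the min/max interval timers.

  [osd]
  # example: a 1024 MB journal triggers a sync at the ~512 MB
  # half-full mark, in addition to the interval-based syncs
  osd journal size = 1024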
-Sam

On Thu, Aug 30, 2012 at 8:08 AM, Dieter Kasper <d.kasper@xxxxxxxxxxxx> wrote:
> Samuel,
>
> thank you very much for this explicit description!
>
> As far as I understand, the journal acts as a ring buffer in front of the OSD.
> Using time as the parameter to trigger syncs might not be the best fit for
> a dynamic storage subsystem. Under high workload, e.g. 10/20 for min/max
> might be optimal for 4 nodes with 10 OSDs each,
> but not after adding 4 additional nodes.
>
> Are there parameters to trigger the syncs to the OSD
> in relation to the fill level of the journal?
> e.g.
> filestore [min|max] sync percent:
>
> Do not sync before min-% full; sync after max-% full
>
> What would happen if I set "filestore [min|max] sync interval" to 999999?
> Will the journal sync start at 100% full, or at X%?
> What is 'X' by default?
> How can I set 'X'?
>
> Best Regards,
> -Dieter
>
>
> On Thu, Aug 30, 2012 at 12:34:43AM +0200, Samuel Just wrote:
>> filestore [min|max] sync interval:
>>
>> Periodically, the filestore needs to quiesce writes and do a syncfs
>> in order to create a consistent commit point up to which it can free
>> journal entries.  Syncing more frequently tends to reduce the time
>> required for each sync and reduces the amount of data that needs to
>> remain in the journal.  Less frequent syncs allow the backing
>> filesystem to better coalesce small writes and metadata updates,
>> hopefully resulting in more efficient syncs.  'filestore max sync
>> interval' defines the maximum time period between syncs; 'filestore
>> min sync interval' defines the minimum time period between syncs.
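>>
>> Both intervals take seconds and accept fractional values.  A minimal
>> sketch (the numbers below are illustrative, not recommendations):
>>
>>   [osd]
>>   filestore min sync interval = 0.01
>>   filestore max sync interval = 5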
>>
>> filestore flusher:
>>
>> The filestore flusher forces data from large writes to be written out
>> using sync_file_range before the sync, in order to (hopefully) reduce
>> the cost of the eventual sync.  In practice, disabling 'filestore
>> flusher' seems to improve performance in some cases.
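>>
>> A minimal fragment for experimenting with that (illustrative only;
>> measure with and without it on your own workload):
>>
>>   [osd]
>>   filestore flusher = false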
>>
>> filestore queue max ops:
>>
>> 'filestore queue max ops' defines the number of in-progress ops the
>> filestore will accept before blocking on queueing new ones.  This
>> mostly shouldn't have much of an effect on performance and should
>> probably be ignored.
>>
>> filestore op threads:
>>
>> 'filestore op threads' defines the number of threads used to submit
>> filesystem operations in parallel.
>>
>> journal dio:
>>
>> 'journal dio' enables using O_DIRECT for writing to the journal.
>> This should usually be enabled.  If possible, 'journal aio' should
>> also be enabled to allow use of libaio to do asynchronous writes.
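>>
>> A minimal fragment enabling both, per the advice above (illustrative):
>>
>>   [osd]
>>   journal dio = true
>>   journal aio = true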
>>
>> osd op threads:
>>
>> 'osd op threads' defines the size of the thread pool used to service
>> OSD operations such as client requests.  Increasing this may increase
>> the rate of request processing.
>>
>> osd disk threads:
>>
>> 'osd disk threads' defines the number of threads used to perform
>> background disk-intensive OSD operations such as scrubbing and snap
>> trimming.
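>>
>> Pulling the thread knobs together into one hedged sketch (the counts
>> below are illustrative starting points, not recommendations; tune
>> against your own workload):
>>
>>   [osd]
>>   filestore op threads = 2
>>   osd op threads = 2
>>   osd disk threads = 1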
>>
>> On Wed, Aug 29, 2012 at 12:29 PM, Dieter Kasper <d.kasper@xxxxxxxxxxxx> wrote:
>> > Hi Josh,
>> >
>> > thanks for the hint.
>> > Can you please spend a few words on the meaning of these parameters?
>> > - filestore min/max sync interval =     int/float? seconds? of what?
>> > - filestore flusher = false
>> > - filestore queue max ops = 10000
>> >         what is 'one op'?      queue in front of what?
>> > - filestore op threads =
>> >         what are useful values here?
>> >
>> > - journal dio = true/false
>> > - osd op threads =
>> > - osd disk threads =
>> >
>> >
>> > Kind Regards,
>> > -Dieter
>> >
>> >
>> > On Wed, Aug 29, 2012 at 07:37:36PM +0200, Josh Durgin wrote:
>> >> On 08/29/2012 01:50 AM, Alexandre DERUMIER wrote:
>> >> > Nice results!
>> >> > (Can you run the same benchmark from a qemu-kvm guest with the virtio driver?
>> >> > I did some benchmarks a few months ago with stephan priebe, and we were never able to get more than 20000 IOPS with a full-SSD 3-node cluster.)
>> >> >
>> >> >>> How can I set the variables that control when the journal data goes to the OSD? (after X seconds and/or when Y% full)
>> >> > I think you can try to tune these values
>> >> >
>> >> > filestore max sync interval = 30
>> >> > filestore min sync interval = 29
>> >> > filestore flusher = false
>> >> > filestore queue max ops = 10000
>> >>
>> >> Increasing filestore_op_threads might help as well.
>> >>
>> >> > ----- Mail original -----
>> >> >
>> >> > From: "Dieter Kasper" <d.kasper@xxxxxxxxxxxx>
>> >> > To: ceph-devel@xxxxxxxxxxxxxxx
>> >> > Cc: "Dieter Kasper (KD)" <d.kasper@xxxxxxxxxxxx>
>> >> > Sent: Tuesday, 28 August 2012 19:48:42
>> >> > Subject: RBD performance - tuning hints
>> >> >
>> >> > Hi,
>> >> >
>> >> > on my 4-node system (SSD + 10GbE, see bench-config.txt for details)
>> >> > I can observe a pretty nice rados bench performance
>> >> > (see bench-rados.txt for details):
>> >> >
>> >> > Bandwidth (MB/sec): 961.710
>> >> > Max bandwidth (MB/sec): 1040
>> >> > Min bandwidth (MB/sec): 772
>> >> >
>> >> >
>> >> > Also the bandwidth performance generated with
>> >> > fio --filename=/dev/rbd1 --direct=1 --rw=$io --bs=$bs --size=2G --iodepth=$threads --ioengine=libaio --runtime=60 --group_reporting --name=file1 --output=fio_${io}_${bs}_${threads}
>> >> >
>> >> > .... is acceptable, e.g.
>> >> > fio_write_4m_16 795 MB/s
>> >> > fio_randwrite_8m_128 717 MB/s
>> >> > fio_randwrite_8m_16 714 MB/s
>> >> > fio_randwrite_2m_32 692 MB/s
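>> >> >
>> >> > For reference, substituting one of those labels back into the
>> >> > template gives, e.g. for fio_randwrite_2m_32:
>> >> >
>> >> > fio --filename=/dev/rbd1 --direct=1 --rw=randwrite --bs=2m --size=2G --iodepth=32 --ioengine=libaio --runtime=60 --group_reporting --name=file1 --output=fio_randwrite_2m_32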
>> >> >
>> >> >
>> >> > But the write IOPS seem to be limited to around 19k ...
>> >> > (IOPS columns below: RBD 4M / 64k (= optimal_io_size))
>> >> > fio_randread_512_128 53286 55925
>> >> > fio_randread_4k_128 51110 44382
>> >> > fio_randread_8k_128 30854 29938
>> >> > fio_randwrite_512_128 18888 2386
>> >> > fio_randwrite_512_64 18844 2582
>> >> > fio_randwrite_8k_64 17350 2445
>> >> > (...)
>> >> > fio_read_4k_128 10073 53151
>> >> > fio_read_4k_64 9500 39757
>> >> > fio_read_4k_32 9220 23650
>> >> > (...)
>> >> > fio_read_4k_16 9122 14322
>> >> > fio_write_4k_128 2190 14306
>> >> > fio_read_8k_32 706 13894
>> >> > fio_write_4k_64 2197 12297
>> >> > fio_write_8k_64 3563 11705
>> >> > fio_write_8k_128 3444 11219
>> >> >
>> >> >
>> >> > Any hints for tuning the IOPS (read and/or write) would be appreciated.
>> >> >
>> >> > How can I set the variables that control when the journal data goes to the OSD? (after X seconds and/or when Y% full)
>> >> >
>> >> >
>> >> > Kind Regards,
>> >> > -Dieter