Ah, those are just the min and max. A sync is also triggered when the journal hits the half-full mark. We could make the percentage configurable in the future.
-Sam

On Thu, Aug 30, 2012 at 8:08 AM, Dieter Kasper <d.kasper@xxxxxxxxxxxx> wrote:
> Samuel,
>
> thank you very much for this explicit description!
>
> As far as I understand it, the journal acts as a ring buffer in front of the OSD.
> Using time as the parameter to trigger syncs might not be the best choice for
> a dynamic storage subsystem. Under a high workload, e.g. 10/20 for min/max
> might be optimal for 4 nodes with 10 OSDs each,
> but not after adding 4 additional nodes.
>
> Are there parameters to trigger the syncs to the OSD
> based on the fill level of the journal?
> e.g.
> filestore [min|max] sync percent:
>   do not sync before min-% full; sync after max-% full
>
> What would happen if I set "filestore [min|max] sync interval" to 999999?
> Will the journal sync start at 100% full, or at X%?
> What is 'X' by default?
> How can I set 'X'?
>
> Best Regards,
> -Dieter
>
>
> On Thu, Aug 30, 2012 at 12:34:43AM +0200, Samuel Just wrote:
>> filestore [min|max] sync interval:
>>
>> Periodically, the filestore needs to quiesce writes and do a syncfs in order
>> to create a consistent commit point up to which it can free journal entries.
>> Syncing more frequently tends to reduce the time required for each sync and
>> reduces the amount of data that needs to remain in the journal. Less frequent
>> syncs allow the backing filesystem to better coalesce small writes and
>> metadata updates, hopefully resulting in more efficient syncs. 'filestore max
>> sync interval' defines the maximum time period between syncs; 'filestore min
>> sync interval' defines the minimum time period between syncs.
>>
>> filestore flusher:
>>
>> The filestore flusher forces data from large writes to be written out with
>> sync_file_range before the sync, in order to (hopefully) reduce the cost of
>> the eventual sync. In practice, disabling 'filestore flusher' seems to
>> improve performance in some cases.
>>
>> filestore queue max ops:
>>
>> 'filestore queue max ops' defines the number of in-progress ops the filestore
>> will accept before blocking on queueing new ones. This mostly shouldn't have
>> much of an effect on performance and can usually be left alone.
>>
>> filestore op threads:
>>
>> 'filestore op threads' defines the number of threads used to submit
>> filesystem operations in parallel.
>>
>> journal dio:
>>
>> 'journal dio' enables the use of O_DIRECT for writing to the journal. This
>> should usually be enabled. If possible, 'journal aio' should also be enabled
>> to allow the use of libaio for asynchronous journal writes.
>>
>> osd op threads:
>>
>> 'osd op threads' defines the size of the thread pool used to service OSD
>> operations such as client requests. Increasing it may increase the rate of
>> request processing.
>>
>> osd disk threads:
>>
>> 'osd disk threads' defines the number of threads used to perform background
>> disk-intensive OSD operations such as scrubbing and snap trimming.
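>>
>> For illustration, all of these live in ceph.conf, typically in the [osd]
>> section. A minimal sketch (the values below are only placeholders to
>> experiment with, not tuned recommendations):
>>
>>   [osd]
>>       filestore min sync interval = .01
>>       filestore max sync interval = 5
>>       filestore flusher = false
>>       filestore queue max ops = 500
>>       filestore op threads = 2
>>       journal dio = true
>>       journal aio = true
>>       osd op threads = 2
>>       osd disk threads = 1
>>
>> After editing, restart the OSDs, or inject a value at runtime with something
>> like:
>>
>>   ceph osd tell \* injectargs '--filestore_max_sync_interval 5'
>>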
>> On Wed, Aug 29, 2012 at 12:29 PM, Dieter Kasper <d.kasper@xxxxxxxxxxxx> wrote:
>> > Hi Josh,
>> >
>> > thanks for the hint.
>> > Can you please spend a few words on the meaning of these parameters?
>> > - filestore min/max sync interval = int/float? seconds? of what?
>> > - filestore flusher = false
>> > - filestore queue max ops = 10000
>> >   what is 'one op'? a queue in front of what?
>> > - filestore op threads =
>> >   what are useful values here?
>> >
>> > - journal dio = true/false
>> > - osd op threads =
>> > - osd disk threads =
>> >
>> >
>> > Kind Regards,
>> > -Dieter
>> >
>> >
>> > On Wed, Aug 29, 2012 at 07:37:36PM +0200, Josh Durgin wrote:
>> >> On 08/29/2012 01:50 AM, Alexandre DERUMIER wrote:
>> >> > Nice results!
>> >> > (Can you run the same benchmark from a qemu-kvm guest with the virtio
>> >> > driver? I did some benchmarking a few months ago with Stephan Priebe,
>> >> > and we were never able to get more than 20,000 IOPS with a full-SSD,
>> >> > 3-node cluster.)
>> >> >
>> >> >>> How can I set the variables that control when journal data goes to
>> >> >>> the OSD? (after X seconds and/or when Y %-full)
>> >> > I think you can try to tune these values:
>> >> >
>> >> > filestore max sync interval = 30
>> >> > filestore min sync interval = 29
>> >> > filestore flusher = false
>> >> > filestore queue max ops = 10000
>> >>
>> >> Increasing filestore_op_threads might help as well.
>> >>
>> >> > ----- Original Message -----
>> >> >
>> >> > From: "Dieter Kasper" <d.kasper@xxxxxxxxxxxx>
>> >> > To: ceph-devel@xxxxxxxxxxxxxxx
>> >> > Cc: "Dieter Kasper (KD)" <d.kasper@xxxxxxxxxxxx>
>> >> > Sent: Tuesday, 28 August 2012 19:48:42
>> >> > Subject: RBD performance - tuning hints
>> >> >
>> >> > Hi,
>> >> >
>> >> > on my 4-node system (SSD + 10GbE, see bench-config.txt for details)
>> >> > I can observe pretty nice rados bench performance
>> >> > (see bench-rados.txt for details):
>> >> >
>> >> > Bandwidth (MB/sec):     961.710
>> >> > Max bandwidth (MB/sec): 1040
>> >> > Min bandwidth (MB/sec): 772
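>> >> >
>> >> > (For reference, rados bench is invoked along the lines of
>> >> >
>> >> >   rados -p <pool> bench <seconds> write -t <concurrent-ops>
>> >> >
>> >> > e.g. 'rados -p rbd bench 60 write -t 16'; the pool, duration and
>> >> > concurrency here are placeholders, see bench-rados.txt for the exact
>> >> > settings.)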
>> >> >
>> >> > The bandwidth generated with
>> >> >
>> >> > fio --filename=/dev/rbd1 --direct=1 --rw=$io --bs=$bs --size=2G --iodepth=$threads --ioengine=libaio --runtime=60 --group_reporting --name=file1 --output=fio_${io}_${bs}_${threads}
>> >> >
>> >> > is also acceptable, e.g.:
>> >> >
>> >> > fio_write_4m_16        795 MB/s
>> >> > fio_randwrite_8m_128   717 MB/s
>> >> > fio_randwrite_8m_16    714 MB/s
>> >> > fio_randwrite_2m_32    692 MB/s
>> >> >
>> >> > But the write IOPS seem to be limited to around 19k ...
>> >> >
>> >> >                          RBD 4M    64k (= optimal_io_size)
>> >> > fio_randread_512_128     53286     55925
>> >> > fio_randread_4k_128      51110     44382
>> >> > fio_randread_8k_128      30854     29938
>> >> > fio_randwrite_512_128    18888      2386
>> >> > fio_randwrite_512_64     18844      2582
>> >> > fio_randwrite_8k_64      17350      2445
>> >> > (...)
>> >> > fio_read_4k_128          10073     53151
>> >> > fio_read_4k_64            9500     39757
>> >> > fio_read_4k_32            9220     23650
>> >> > (...)
>> >> > fio_read_4k_16            9122     14322
>> >> > fio_write_4k_128          2190     14306
>> >> > fio_read_8k_32             706     13894
>> >> > fio_write_4k_64           2197     12297
>> >> > fio_write_8k_64           3563     11705
>> >> > fio_write_8k_128          3444     11219
>> >> >
>> >> > Any hints for tuning the IOPS (read and/or write) would be appreciated.
>> >> >
>> >> > How can I set the variables that control when journal data goes to the
>> >> > OSD? (after X seconds and/or when Y %-full)
>> >> >
>> >> > Kind Regards,
>> >> > -Dieter
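
For anyone who wants to reproduce the fio sweep above: judging by the
result-file names, it can be driven by a small shell loop along these lines
(a sketch; the exact value sets are a guess from the file names):

  #!/bin/sh
  # Sweep rw pattern, block size and iodepth; one result file per combination.
  for io in read randread write randwrite; do
    for bs in 512 4k 8k 2m 4m 8m; do
      for threads in 16 32 64 128; do
        fio --filename=/dev/rbd1 --direct=1 --rw=$io --bs=$bs --size=2G \
            --iodepth=$threads --ioengine=libaio --runtime=60 \
            --group_reporting --name=file1 \
            --output=fio_${io}_${bs}_${threads}
      done
    done
  done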