Hi.

As you mentioned, bluestore_min_alloc_size can send data through the wal path.
Performance is better than writing directly to the SSDs (more than 10K IOPS).
However, bluestore performance is still lower than filestore (see below).
I think there are many performance options for bluestore, so I need to
understand them in order to see its real performance.
(If you have recommended options, please let me know.)

Anyway, my other observations are:

1. More than 70K IOPS is observed during the first 20~30 seconds of the
performance test. After that, performance drops significantly.
(One possible reason is that the metadata size (blob map, extent map) is
increasing.)

2. High latency (note that this is NVRAM).

Thanks.

Bluestore (master branch, 6/29, no configurations changed)
###################################
BW result
IO_Size    randwrite
4KB        234.42
###################################
IOPS result
IO_Size    randwrite
4KB        60006
###################################
Latency result
IO_Size    randwrite (ms)
4KB        9.595
###################################
CPU utilization
IO_Size    randwrite
4KB        52.53

Filestore (jewel, 10.2.1, no configurations changed)
###################################
BW result
IO_Size    randwrite
4KB        260.33
###################################
IOPS result
IO_Size    randwrite
4KB        66640
###################################
Latency result
IO_Size    randwrite (ms)
4KB        8.642
###################################
CPU utilization
IO_Size    randwrite
4KB        56.42

2016-06-27 21:31 GMT+09:00 Sage Weil <sage@xxxxxxxxxxxx>:
> On Mon, 27 Jun 2016, myoungwon oh wrote:
>> Hi, I have questions for bluestore (4K random write case).
>>
>> So far, we have used NVRAM(PCIe) as journal and SSD (SATA) as data
>> disk (filestore).
>> Therefore, we got performance gain from NVRAM journal.
>> However, current Bluestore design seems that data (4K aligned) is
>> written to data disk first, then metadata is written to WAL rocksdb.
>> This design can remove “double write” in objectstore, but in our case,
>> NVRAM can not be utilized fully.
>>
>> So, my questions are that
>>
>> 1. Can bluestore write WAL first as filestore?
>
> You can do it indirectly with bluestore_min_alloc_size=65536, which will
> send anything smaller than this value through the wal path.  Please let
> us know what effect this has on our latency/performance!
>
>> 2. If not, using bcache or flashcache for NVRAM on top of SSDs is right
>> answer?
>
> This is also possible, but I expect we'd like to make this work out of the
> box if we can!
>
> sage
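
A note on the figures reported at the top of this message: the short sketch below cross-checks them against each other. It rests on assumptions that the message does not state: that the BW column is in MiB/s, that the latency column is the mean per-request latency, and that the runs were close to steady state so that Little's law (requests in flight = throughput x latency) applies. The function names are hypothetical and exist only for this illustration.

```python
# Figures copied from the 4KB randwrite results in this message.
results = {
    "bluestore": {"bw": 234.42, "iops": 60006, "lat_ms": 9.595},
    "filestore": {"bw": 260.33, "iops": 66640, "lat_ms": 8.642},
}

IO_SIZE_BYTES = 4096  # 4KB requests


def implied_bw_mib(iops):
    """Bandwidth implied by the IOPS figure, ASSUMING the BW unit is MiB/s."""
    return iops * IO_SIZE_BYTES / 2**20


def implied_in_flight(iops, lat_ms):
    """Little's law: mean requests in flight = throughput * mean latency."""
    return iops * lat_ms / 1000.0


for name, r in results.items():
    print(
        f"{name}: reported BW {r['bw']:.2f}, "
        f"implied BW {implied_bw_mib(r['iops']):.2f}, "
        f"implied requests in flight {implied_in_flight(r['iops'], r['lat_ms']):.0f}"
    )
```

Under those assumptions, both runs work out to roughly the same number of requests in flight. If the load generator kept a fixed number of requests outstanding, the reported latency would be largely determined by that queue depth and the IOPS figure, so the latency gap between the two backends would mirror the IOPS gap and not be an independent measurement.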