RE: Questions for NVRAM+SATA SSDs with Bluestore

Could you please describe your test configuration in detail (e.g., how many OSDs, what replication factor, and where the NVRAM is used)? Also, how long did you run the test?

Thanks & Regards
Somnath

-----Original Message-----
From: ceph-devel-owner@xxxxxxxxxxxxxxx [mailto:ceph-devel-owner@xxxxxxxxxxxxxxx] On Behalf Of myoungwon oh
Sent: Thursday, June 30, 2016 1:40 AM
To: Sage Weil
Cc: ceph-devel@xxxxxxxxxxxxxxx
Subject: Re: Questions for NVRAM+SATA SSDs with Bluestore

Hi.


As you mentioned, bluestore_min_alloc_size can send data through the WAL path.

Performance is better than writing directly to the SSDs (an improvement of more than 10K IOPS).

However, Bluestore's performance is still lower than Filestore's (see the results below).

I think there are many performance-related options for Bluestore, so I need to understand them in order to see its real performance.

(If you have recommended settings, please let me know. A sketch of the change I am testing is below.)
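For reference, this is the change I am testing, as a minimal ceph.conf sketch. I am assuming the option belongs in the [osd] section and is picked up by newly created OSDs; please correct me if a restart alone is not enough:

    [osd]
    # route writes smaller than 64KB through the WAL (NVRAM) path
    bluestore_min_alloc_size = 65536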



Anyway, my other observations are:

1. More than 70K IOPS is observed during the first 20~30 seconds of the test; after that, performance drops significantly.

    (One suspected reason is that the metadata size (blob map, extent map) keeps growing; a sketch of how I plan to check this is below.)

2. High latency (note that the WAL device is NVRAM).
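To check whether metadata growth explains the drop, I can sample the OSD perf counters from the admin socket while the test runs and watch the bluestore/rocksdb statistics (a sketch; osd.0 is just an example id, and the exact counter names may differ between builds):

    # dump all perf counters for one OSD via the admin socket
    ceph daemon osd.0 perf dump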



Thanks.



Bluestore (master branch, 6/29, no configuration changes):

  IO size   randwrite BW (MB/s)   IOPS    latency (ms)   CPU util (%)
  4KB       234.42                60006   9.595          52.53



Filestore (jewel, 10.2.1, no configuration changes):

  IO size   randwrite BW (MB/s)   IOPS    latency (ms)   CPU util (%)
  4KB       260.33                66640   8.642          56.42
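For context, the workload behind both tables is 4KB random write. A hypothetical fio invocation of that shape against an RBD image (the client name, pool, image name, queue depth, and runtime here are illustrative placeholders, not my exact test parameters):

    fio --name=randwrite-4k --ioengine=rbd --clientname=admin \
        --pool=rbd --rbdname=fio-test --rw=randwrite --bs=4k \
        --iodepth=32 --direct=1 --time_based --runtime=300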


2016-06-27 21:31 GMT+09:00 Sage Weil <sage@xxxxxxxxxxxx>:
> On Mon, 27 Jun 2016, myoungwon oh wrote:
>> Hi, I have questions for bluestore (4K random write case).
>>
>> So far, we have used NVRAM (PCIe) as the journal and SATA SSDs as
>> the data disks (Filestore), so we get a performance gain from the
>> NVRAM journal.
>> However, the current Bluestore design seems to write data (4K
>> aligned) to the data disk first, and then write metadata to the
>> rocksdb WAL. This design removes the "double write" in the object
>> store, but in our case the NVRAM cannot be fully utilized.
>>
>> So, my questions are:
>>
>> 1. Can Bluestore write to the WAL first, as Filestore does?
>
> You can do it indirectly with bluestore_min_alloc_size=65536, which
> will send anything smaller than this value through the wal path.
> Please let us know what effect this has on your latency/performance!
>
>> 2. If not, is using bcache or flashcache to put the NVRAM on top of
>> the SSDs the right answer?
>
> This is also possible, but I expect we'd like to make this work out of
> the box if we can!
>
> sage
--