On 09/21/2017 05:03 PM, Mark Nelson wrote:
>
> On 09/21/2017 03:17 AM, Dietmar Rieder wrote:
>> On 09/21/2017 09:45 AM, Maged Mokhtar wrote:
>>> On 2017-09-21 07:56, Lazuardi Nasution wrote:
>>>
>>>> Hi,
>>>>
>>>> I'm still looking for the answer to these questions. Maybe someone
>>>> can share their thoughts on these. Any comment will be helpful too.
>>>>
>>>> Best regards,
>>>>
>>>> On Sat, Sep 16, 2017 at 1:39 AM, Lazuardi Nasution
>>>> <mrxlazuardin@xxxxxxxxx> wrote:
>>>>
>>>>     Hi,
>>>>
>>>>     1. Is it possible to configure osd_data not as a small
>>>>     partition on the OSD but as a folder (e.g. on the root disk)?
>>>>     If yes, how is that done with ceph-disk, and what are the
>>>>     pros/cons of doing so?
>>>>     2. Are the WAL & DB sizes calculated based on OSD size or on
>>>>     expected throughput, like the journal device of filestore? If
>>>>     not, what are the default values and the pros/cons of
>>>>     adjusting them?
>>>>     3. Does partition alignment matter on Bluestore, including for
>>>>     WAL & DB when using a separate device for them?
>>>>
>>>>     Best regards,
>>>>
>>>
>>> I am also looking for recommendations on wal/db partition sizes.
>>> Some hints:
>>>
>>> ceph-disk defaults, used in case it does not find
>>> bluestore_block_wal_size or bluestore_block_db_size in the config
>>> file:
>>>
>>> wal = 512MB
>>>
>>> db = if bluestore_block_size (the data size) is in the config file,
>>> it uses 1/100 of it; else it uses 1GB.
>>>
>>> There is also a presentation by Sage back in March, see page 16:
>>>
>>> https://www.slideshare.net/sageweil1/bluestore-a-new-storage-backend-for-ceph-one-year-in
>>>
>>> wal: 512 MB
>>>
>>> db: "a few" GB
>>>
>>> The wal size is probably not debatable: it acts like a journal for
>>> small block sizes, which are constrained by iops, hence 512 MB is
>>> more than enough. We will probably see more on the db size in the
>>> future.
>>
>> This is what I understood so far.
>> I wonder if it makes sense to set the db size as big as possible,
>> i.e. divide the entire db device by the number of OSDs it will
>> serve.
>>
>> E.g. 10 OSDs / 1 NVME (800GB):
>>
>> (800GB - 10x1GB wal) / 10 = ~79GB db size per OSD
>>
>> Is this smart/stupid?
>
> Personally I'd use 512MB-2GB for the WAL (larger buffers reduce write
> amplification but mean larger memtables and potentially higher
> overhead scanning through memtables). 4x256MB buffers work pretty
> well, but that means memory overhead too. Beyond that, I'd devote the
> entire rest of the device to DB partitions.

Thanks for your suggestion, Mark!

So, just to make sure I understood this right: you'd use a separate
512MB-2GB WAL partition for each OSD and the entire rest for DB
partitions. In the example case with 10 HDD OSDs and 1 NVME, that
would be 10 WAL partitions of 512MB-2GB each and 10 equal-sized DB
partitions consuming the rest of the NVME.

Thanks
  Dietmar

--
_________________________________________
D i e t m a r  R i e d e r, Mag.Dr.
Innsbruck Medical University
Biocenter - Division for Bioinformatics
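
P.S. To make the numbers above concrete: the sizes ceph-disk uses can
be pinned in ceph.conf via the two options Maged mentioned (values are
in bytes). A minimal, untested sketch, using a 1GB WAL (within Mark's
512MB-2GB range) and the ~79GB DB from the arithmetic above:

    [osd]
    # WAL size in bytes: 1 GB, within the suggested 512MB-2GB range
    bluestore_block_wal_size = 1073741824
    # DB size in bytes: ~79 GB = (800GB - 10x1GB) / 10 from the example
    bluestore_block_db_size = 84825604096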
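
Carving the NVMe by hand would then look something like this (the
device name is just an example; in practice leave a little headroom
for GPT metadata, so the last DB partition may come out slightly
smaller):

    # partitions 1-10: 1GB WALs; partitions 11-20: 79GB DBs
    # (one WAL/DB pair per OSD)
    for i in $(seq 1 10); do
        sgdisk --new=${i}:0:+1G --change-name=${i}:"osd-${i}-wal" /dev/nvme0n1
    done
    for i in $(seq 11 20); do
        sgdisk --new=${i}:0:+79G --change-name=${i}:"osd-$((i-10))-db" /dev/nvme0n1
    done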
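
Each OSD would then be prepared against its pair of NVMe partitions,
along these lines (device and partition names are placeholders):

    # /dev/sdb is one of the HDDs; p1/p11 are its WAL/DB partitions
    # from the layout above
    ceph-disk prepare --bluestore \
        --block.wal /dev/nvme0n1p1 --block.db /dev/nvme0n1p11 \
        /dev/sdb

Alternatively, passing the whole device (--block.wal /dev/nvme0n1
--block.db /dev/nvme0n1) should let ceph-disk create the partitions
itself, sized by the two config options above, which would make the
sgdisk step unnecessary.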