Re: CephFS and many small files

On Mon, Apr 1, 2019 at 4:04 AM Paul Emmerich <paul.emmerich@xxxxxxxx> wrote:
>
> There are no problems with mixing bluestore_min_alloc_size values; the
> setting lives an abstraction layer below the cluster's view of multiple
> OSDs. (Also, you always have a mix when combining SSDs and HDDs.)
>
> I'm not sure about the real-world impact of a lower min alloc size or
> the rationale behind the default values for HDDs (64 kb) and SSDs (16 kb).

The min_alloc_size in BlueStore controls which IO requests allocate
new space on disk versus being data-journaled in the WAL and then
applied as a read-modify-write over an existing block.

Hard drives have the size set higher because random IOs are more
expensive for them than the cost of streaming the data out twice.
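
For a rough sense of the numbers, here is a back-of-the-envelope sketch
in plain Python. It is only an illustration: the 8 million file count
and the 64 kb / 16 kb defaults are taken from the messages quoted below,
and replication and metadata overhead are ignored.

# Minimum space consumed when every small file occupies at least one
# min_alloc_size unit (single copy, no replication, no metadata).
KIB = 1024
GIB = 1024 ** 3

num_files = 8_000_000            # small files, from the test quoted below
defaults = {
    "HDD (64 KiB)": 64 * KIB,    # default bluestore_min_alloc_size on HDD
    "SSD (16 KiB)": 16 * KIB,    # default bluestore_min_alloc_size on SSD
}

for label, alloc in defaults.items():
    print(f"{label}: {num_files * alloc / GIB:.0f} GiB minimum allocation")

# -> HDD (64 KiB): 488 GiB  (the ~512 GB quick estimate quoted below)
# -> SSD (16 KiB): 122 GiB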
-Greg

>
> Paul
>
> --
> Paul Emmerich
>
> Looking for help with your Ceph cluster? Contact us at https://croit.io
>
> croit GmbH
> Freseniusstr. 31h
> 81247 München
> www.croit.io
> Tel: +49 89 1896585 90
>
> On Mon, Apr 1, 2019 at 10:36 AM Clausen, Jörn <jclausen@xxxxxxxxx> wrote:
> >
> > Hi Paul!
> >
> > Thanks for your answer. Yep, bluestore_min_alloc_size and your
> > calculation sound very reasonable to me :)
> >
> > Am 29.03.2019 um 23:56 schrieb Paul Emmerich:
> > > Are you running on HDDs? The minimum allocation size is 64kb by
> > > default here. You can control that via the parameter
> > > bluestore_min_alloc_size during OSD creation.
> > > 64 kb times 8 million files is 512 GB which is the amount of usable
> > > space you reported before running the test, so that seems to add up.
> >
> > My test cluster is virtualized on vSphere, but the OSDs are reported as
> > HDDs. And our production cluster also uses HDDs only. All OSDs use the
> > default value for bluestore_min_alloc_size.
> >
> > If we do end up tinkering with bluestore_min_alloc_size: as this
> > probably cannot be changed after OSD creation, we would need to
> > replace all OSDs in a rolling fashion. Should we expect any problems
> > while OSDs with different min_alloc_sizes coexist?
> >
> > > There's also some metadata overhead etc. You might want to consider
> > > enabling inline data in cephfs to handle small files in a
> > > space-efficient way (note that this feature is officially marked as
> > > experimental, though).
> > > http://docs.ceph.com/docs/master/cephfs/experimental-features/#inline-data
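
(As an illustrative aside: enabling the feature boils down to a single
"ceph fs set ... inline_data true" call against the filesystem. The
small Python sketch below just wraps that call for scripted test
setups; the filesystem name "cephfs" is only a placeholder, and since
the feature is experimental the CLI may additionally ask for a
confirmation flag.)

# Hedged sketch: toggle CephFS inline data via the ceph CLI.
import subprocess

def set_inline_data(fs_name: str, enabled: bool) -> None:
    subprocess.run(
        ["ceph", "fs", "set", fs_name, "inline_data",
         "true" if enabled else "false"],
        check=True,  # raise if the ceph command fails
    )

if __name__ == "__main__":
    set_inline_data("cephfs", True)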
> >
> > I'll give it a try on my test cluster.
> >
> > --
> > Jörn Clausen
> > Daten- und Rechenzentrum
> > GEOMAR Helmholtz-Zentrum für Ozeanforschung Kiel
> > Düsternbrookerweg 20
> > 24105 Kiel
> >
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com



