Re: inline_data (was: CephFS and many small files)

On Tue, Apr 2, 2019 at 9:10 PM Paul Emmerich <paul.emmerich@xxxxxxxx> wrote:
>
> On Tue, Apr 2, 2019 at 3:05 PM Yan, Zheng <ukernel@xxxxxxxxx> wrote:
> >
> > On Tue, Apr 2, 2019 at 8:23 PM Clausen, Jörn <jclausen@xxxxxxxxx> wrote:
> > >
> > > Hi!
> > >
> > > On 29.03.2019 at 23:56, Paul Emmerich wrote:
> > > > There's also some metadata overhead etc. You might want to consider
> > > > enabling inline data in cephfs to handle small files in a
> > > > storage-efficient way (note that this feature is officially marked as
> > > > experimental, though).
> > > > http://docs.ceph.com/docs/master/cephfs/experimental-features/#inline-data
> > >
> > > Is there something missing from the documentation? I have turned on this
> > > feature:
> > >
> >
> > I don't use this feature. We have no plans to mark it stable
> > (we will probably remove it in the future).
>
> We also don't use this feature in any of our production clusters
> (because it's marked experimental).
>
> But it seems like a really useful feature and I know of at least one
> real-world production cluster using this with great success...
> So why remove it?
>

With inline data, the MDS has to serve both data and metadata requests.
The feature is only suitable for small amounts of data.

>
> Paul
>
> >
> > Yan, Zheng
> >
> >
> >
> > > $ ceph fs dump | grep inline_data
> > > dumped fsmap epoch 1224
> > > inline_data     enabled
> > >
> > > I have reduced the size of the bonnie-generated files to 1 byte. But
> > > this is the situation halfway into the test: (output slightly shortened)
> > >
> > > $ rados df
> > > POOL_NAME      USED OBJECTS CLONES   COPIES
> > > fs-data     3.2 MiB 3390041      0 10170123
> > > fs-metadata 772 MiB    2249      0     6747
> > >
> > > total_objects    3392290
> > > total_used       643 GiB
> > > total_avail      957 GiB
> > > total_space      1.6 TiB
> > >
> > > i.e. bonnie has created a little over 3 million files, for which the
> > > same number of objects was created in the data pool. So the raw usage is
> > > again at more than 500 GB.
> > >
> > > If the data was inlined, I would expect far fewer objects in the data
> > > pool - actually none at all - and maybe some more usage in the metadata
> > > pool.
> > >
> > > Do I have to restart any daemons after turning on inline_data? Am I
> > > missing anything else here?
> > >
> > > For the record:
> > >
> > > $ ceph versions
> > > {
> > >      "mon": {
> > >          "ceph version 14.2.0 (3a54b2b6d167d4a2a19e003a705696d4fe619afc) nautilus (stable)": 3
> > >      },
> > >      "mgr": {
> > >          "ceph version 14.2.0 (3a54b2b6d167d4a2a19e003a705696d4fe619afc) nautilus (stable)": 3
> > >      },
> > >      "osd": {
> > >          "ceph version 14.2.0 (3a54b2b6d167d4a2a19e003a705696d4fe619afc) nautilus (stable)": 16
> > >      },
> > >      "mds": {
> > >          "ceph version 14.2.0 (3a54b2b6d167d4a2a19e003a705696d4fe619afc) nautilus (stable)": 2
> > >      },
> > >      "overall": {
> > >          "ceph version 14.2.0 (3a54b2b6d167d4a2a19e003a705696d4fe619afc) nautilus (stable)": 24
> > >      }
> > > }
> > >
> > > --
> > > Jörn Clausen
> > > Daten- und Rechenzentrum
> > > GEOMAR Helmholtz-Zentrum für Ozeanforschung Kiel
> > > Düsternbrookerweg 20
> > > 24105 Kiel
> > >
> > > _______________________________________________
> > > ceph-users mailing list
> > > ceph-users@xxxxxxxxxxxxxx
> > > http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
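
For reference, the arithmetic behind the quoted `rados df` figures can be
sketched as below. The numbers are copied from the output above; the
variable and function names are only illustrative. Note that total_used
is cluster-wide, so it also includes the metadata pool and anything else
stored on the OSDs, which makes the per-copy figure an upper-bound
estimate rather than a measurement.

```python
# Rough arithmetic on the `rados df` figures quoted in this thread.
# Assumption: total_used is treated as if it were all caused by fs-data,
# which overstates the per-object cost somewhat.

GIB = 1024 ** 3
KIB = 1024

data_objects = 3_390_041       # OBJECTS column for fs-data
data_copies = 10_170_123       # COPIES column for fs-data
total_used_bytes = 643 * GIB   # total_used (cluster-wide)


def replication_factor(objects: int, copies: int) -> float:
    """Number of stored copies per logical object."""
    return copies / objects


def raw_bytes_per_copy(total_used: int, copies: int) -> float:
    """Average raw space consumed per stored object copy."""
    return total_used / copies


rf = replication_factor(data_objects, data_copies)
per_copy_kib = raw_bytes_per_copy(total_used_bytes, data_copies) / KIB

print(f"replication factor: {rf:.1f}")
print(f"raw usage per object copy: {per_copy_kib:.1f} KiB")
```

Each 1-byte file therefore appears to cost somewhat more than 64 KiB of
raw space per replica. That would be consistent with a 64 KiB allocation
unit on the OSDs, although nothing in this message confirms the OSD
settings of the cluster in question.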