Re: High mem with Luminous/Bluestore

On Wed, 18 Oct 2017, Wido den Hollander wrote:
> > On 18 October 2017 at 13:48, Hans van den Bogert <hansbogert@xxxxxxxxx> wrote:
> > 
> > 
> > Indeed it shows ssd in the OSD's metadata.
> > 
> >     "bluestore_bdev_type": "ssd",
> > 
> > 
> > Then I misunderstood the role of the device class in CRUSH; I expected the
> > OSD to actually choose its settings according to the CRUSH device class.
> > 
> > I'll try to force the OSDs to behave like HDDs and monitor the memory usage.
> > 
> 
> The device class in CRUSH is only used by clients; the OSD doesn't use it
> as a config reference.

Right.  The real relationship is that the OSD detects the device 
type and uses that to (1) automatically set the device class and 
(2) select which defaults to use.  If the user changes the device class, 
that only affects CRUSH.
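
A quick way to see the two views side by side (osd.5 is just the id from 
Hans's earlier mail, and the grep patterns are only a convenience):

    $ ceph osd metadata 5 | grep -E 'rotational|bdev_type'   # what the OSD detected for itself
    $ ceph osd tree | grep -w osd.5                          # the CRUSH class, used only for placement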
 
> You could just set 'bluestore_cache_size' to any value (bytes) you like.
> 
> But as Mark also replied, 8GB is still a lot of memory for BlueStore
> to be using, so that's odd.

There was a bug in 12.2.0 and 12.2.1 with BlueStore's memory accounting--it 
wasn't including the object attrs, leading to the OSD overshooting its 
memory target by a wide margin.  You can tell whether this problem is 
affecting you by looking at 'ceph daemon osd.N dump_mempools' and checking 
for a large value for 'buffer_anon'; it is normally a few tens of MB at 
most, but with the bug it can grow to a couple of GB.
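
As a concrete check against the mempool dump Hans already posted (osd.5 
again; grep is just a quick way to pull out the buffer_anon entry):

    $ ceph daemon osd.5 dump_mempools | grep -A 2 buffer_anon
        "buffer_anon": {
            "items": 14440810,
            "bytes": 1804695070

The ~1.8GB of buffer_anon in the dump above is consistent with this 
accounting bug.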

The latest luminous branch fixes it, and 12.2.2 will include the fix.
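
For completeness, the bluestore_cache_size override Wido mentions is just a 
ceph.conf entry; a rough sketch (the 4GB figure is purely an example value, 
not a recommendation):

    [osd]
    # explicit BlueStore cache cap in bytes; when non-zero this overrides
    # the bluestore_cache_size_hdd / bluestore_cache_size_ssd defaults
    bluestore_cache_size = 4294967296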

sage


> 
> > 
> > Thanks,
> > 
> > Hans
> > 
> > On Wed, Oct 18, 2017 at 11:56 AM, Wido den Hollander <wido@xxxxxxxx> wrote:
> > 
> > >
> > > > On 18 October 2017 at 11:41, Hans van den Bogert <hansbogert@xxxxxxxxx> wrote:
> > > >
> > > >
> > > > Hi All,
> > > >
> > > > I've converted 2 nodes with 4 HDD/OSDs each from Filestore to Bluestore.
> > > > I expected somewhat higher memory usage/RSS values; however, I see, imo,
> > > > huge memory usage for all OSDs on both nodes.
> > > >
> > > > Small snippet from `top`
> > > >   PID USER   PR NI    VIRT    RES   SHR S %CPU %MEM    TIME+ COMMAND
> > > >  4652 ceph   20   0 9840236 8.443g 21364 S  0.7 27.1 31:21.15 /usr/bin/ceph-osd -f --cluster ceph --id 5 --setuser ceph --setgroup ceph
> > > >
> > > >
> > > > The only deviation from a conventional install is that we use bcache for
> > > > our HDDs. Bcache by default is recognized as an 'ssd' in CRUSH. I've
> > > > manually set the class to 'hdd'.
> > > >
> > > > Small snippet from `ceph osd tree`
> > > >       -3        7.27399     host osd02
> > > >      5   hdd  1.81850         osd.5      up  1.00000 1.00000
> > > >
> > > > So I would expect around 2GB of usage according to the rules of thumb in the
> > > > documentation and Sage's comments about the bluestore cache parameters for
> > > > HDDs; yet we're now seeing usage of more than 8GB after less than 1 day
> > > > of runtime for this OSD.  Is this a memory leak?
> > >
> > > Although you've set the class to HDD, the OSD probably still sees
> > > itself as an SSD-backed OSD.
> > >
> > > Test with:
> > >
> > > $ ceph osd metadata 5
> > >
> > > It will show:
> > >
> > > "bluestore_bdev_rotational": "0",
> > > "bluestore_bdev_type": "ssd",
> > >
> > > The default for SSD OSDs is 3GB, see:
> > > http://docs.ceph.com/docs/master/rados/configuration/bluestore-config-ref/
> > >
> > > bluestore_cache_size_ssd is set to 3GB, so it will use at least 3GB.
> > >
> > > I agree, 5GB above the 3GB is a lot of memory, but could you check the OSD
> > > metadata first?
> > >
> > > >
> > > > Having read the other threads, Sage recommends also sending the mempool
> > > > dump:
> > > >
> > > > {
> > > >     "bloom_filter": {
> > > >         "items": 0,
> > > >         "bytes": 0
> > > >     },
> > > >     "bluestore_alloc": {
> > > >         "items": 5732656,
> > > >         "bytes": 5732656
> > > >     },
> > > >     "bluestore_cache_data": {
> > > >         "items": 10659,
> > > >         "bytes": 481820672
> > > >     },
> > > >     "bluestore_cache_onode": {
> > > >         "items": 1106714,
> > > >         "bytes": 752565520
> > > >     },
> > > >     "bluestore_cache_other": {
> > > >         "items": 412675997,
> > > >         "bytes": 1388849420
> > > >     },
> > > >     "bluestore_fsck": {
> > > >         "items": 0,
> > > >         "bytes": 0
> > > >     },
> > > >     "bluestore_txc": {
> > > >         "items": 5,
> > > >         "bytes": 3600
> > > >     },
> > > >     "bluestore_writing_deferred": {
> > > >         "items": 21,
> > > >         "bytes": 225280
> > > >     },
> > > >     "bluestore_writing": {
> > > >         "items": 2,
> > > >         "bytes": 188146
> > > >     },
> > > >     "bluefs": {
> > > >         "items": 951,
> > > >         "bytes": 50432
> > > >     },
> > > >     "buffer_anon": {
> > > >         "items": 14440810,
> > > >         "bytes": 1804695070
> > > >     },
> > > >     "buffer_meta": {
> > > >         "items": 10754,
> > > >         "bytes": 946352
> > > >     },
> > > >     "osd": {
> > > >         "items": 155,
> > > >         "bytes": 1869920
> > > >     },
> > > >     "osd_mapbl": {
> > > >         "items": 16,
> > > >         "bytes": 288280
> > > >     },
> > > >     "osd_pglog": {
> > > >         "items": 284680,
> > > >         "bytes": 91233440
> > > >     },
> > > >     "osdmap": {
> > > >         "items": 14287,
> > > >         "bytes": 731680
> > > >     },
> > > >     "osdmap_mapping": {
> > > >         "items": 0,
> > > >         "bytes": 0
> > > >     },
> > > >     "pgmap": {
> > > >         "items": 0,
> > > >         "bytes": 0
> > > >     },
> > > >     "mds_co": {
> > > >         "items": 0,
> > > >         "bytes": 0
> > > >     },
> > > >     "unittest_1": {
> > > >         "items": 0,
> > > >         "bytes": 0
> > > >     },
> > > >     "unittest_2": {
> > > >         "items": 0,
> > > >         "bytes": 0
> > > >     },
> > > >     "total": {
> > > >         "items": 434277707,
> > > >         "bytes": 4529200468
> > > >     }
> > > > }
> > > >
> > > > Regards,
> > > >
> > > > Hans
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


