Re: High mem with Luminous/Bluestore

Memory usage is still quite high here even with a large onode cache! Are you using erasure coding? I was recently able to reproduce a bug in bluestore that causes excessive memory usage during large writes with EC, but I haven't tracked down exactly what's going on yet.
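
In case it helps narrow that down, a quick (generic) way to check is the pool listing; erasure-coded pools show "erasure" instead of "replicated" in their detail line:

$ ceph osd pool ls detail | grep erasure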

Mark

On 10/18/2017 06:48 AM, Hans van den Bogert wrote:
Indeed, it shows ssd in the OSD's metadata.

    "bluestore_bdev_type": "ssd",


Then I misunderstood the role of the device class in CRUSH; I expected the OSD
to configure itself according to its CRUSH device class.

I'll try to force the OSDs to behave like HDDs and monitor the memory usage.
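
If I understand the cache options correctly, one way to do that would be to override the autodetected size explicitly, along these lines (the 1073741824 just mirrors the documented 1GB HDD default, not a tuned value):

$ ceph tell osd.* injectargs '--bluestore_cache_size=1073741824'

and put the same bluestore_cache_size under [osd] in ceph.conf so it survives restarts; I'm not sure the injected value is picked up by the cache without restarting the OSDs.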


Thanks,

Hans

On Wed, Oct 18, 2017 at 11:56 AM, Wido den Hollander <wido@xxxxxxxx> wrote:


    > On 18 October 2017 at 11:41, Hans van den Bogert <hansbogert@xxxxxxxxx> wrote:
    >
    >
    > Hi All,
    >
    > I've converted 2 nodes with 4 HDD/OSDs each from Filestore to Bluestore. I
    > expected somewhat higher memory usage/RSS values, however I see, imo, a
    > huge memory usage for all OSDs on both nodes.
    >
    > Small snippet from `top`:
    >  PID  USER  PR  NI    VIRT    RES    SHR  S  %CPU %MEM    TIME+  COMMAND
    > 4652  ceph  20   0 9840236 8.443g  21364  S   0.7 27.1 31:21.15  /usr/bin/ceph-osd -f --cluster ceph --id 5 --setuser ceph --setgroup ceph
    >
    >
    > The only deviation from a conventional install is that we use bcache for
    > our HDDs. Bcache by default is recognized as an 'ssd' in CRUSH. I've
    > manually set the class to 'hdd'.
    >
    > Small snippet from `ceph osd tree`
    >       -3        7.27399     host osd02
    >      5   hdd  1.81850         osd.5      up  1.00000 1.00000
    >
    > So I would expect around 2GB of usage according to rules of thumb in the
    > documentation and Sage's comments about the bluestore cache parameters for
    > HDDs; yet we're now seeing a usage of more than 8GB after less than 1 day
    > of runtime for this OSD. Is this a memory leak?

    Although you've set the class to HDD, the OSD probably still sees
    itself as an SSD-backed OSD.

    Test with:

    $ ceph osd metadata 5

    It will show:

    "bluestore_bdev_rotational": "0",
    "bluestore_bdev_type": "ssd",

    The default for SSD OSDs is 3GB, see:
    http://docs.ceph.com/docs/master/rados/configuration/bluestore-config-ref/

    bluestore_cache_size_ssd is set to 3GB, so it will use at least 3GB.

    I agree, 5GB above the 3GB is a lot of memory, but could you check
    the OSD metadata first?
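
    To double-check from the daemon side, the admin socket can also show
    which cache size the running OSD is applying (run this on the host
    where osd.5 lives):

    $ ceph daemon osd.5 config get bluestore_cache_size
    $ ceph daemon osd.5 config get bluestore_cache_size_ssd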

    >
    > Having read the other threads, I'm also including the mempool dump that Sage recommends sending:
    >
    > {
    >     "bloom_filter": {
    >         "items": 0,
    >         "bytes": 0
    >     },
    >     "bluestore_alloc": {
    >         "items": 5732656,
    >         "bytes": 5732656
    >     },
    >     "bluestore_cache_data": {
    >         "items": 10659,
    >         "bytes": 481820672
    >     },
    >     "bluestore_cache_onode": {
    >         "items": 1106714,
    >         "bytes": 752565520
    >     },
    >     "bluestore_cache_other": {
    >         "items": 412675997,
    >         "bytes": 1388849420
    >     },
    >     "bluestore_fsck": {
    >         "items": 0,
    >         "bytes": 0
    >     },
    >     "bluestore_txc": {
    >         "items": 5,
    >         "bytes": 3600
    >     },
    >     "bluestore_writing_deferred": {
    >         "items": 21,
    >         "bytes": 225280
    >     },
    >     "bluestore_writing": {
    >         "items": 2,
    >         "bytes": 188146
    >     },
    >     "bluefs": {
    >         "items": 951,
    >         "bytes": 50432
    >     },
    >     "buffer_anon": {
    >         "items": 14440810,
    >         "bytes": 1804695070
    >     },
    >     "buffer_meta": {
    >         "items": 10754,
    >         "bytes": 946352
    >     },
    >     "osd": {
    >         "items": 155,
    >         "bytes": 1869920
    >     },
    >     "osd_mapbl": {
    >         "items": 16,
    >         "bytes": 288280
    >     },
    >     "osd_pglog": {
    >         "items": 284680,
    >         "bytes": 91233440
    >     },
    >     "osdmap": {
    >         "items": 14287,
    >         "bytes": 731680
    >     },
    >     "osdmap_mapping": {
    >         "items": 0,
    >         "bytes": 0
    >     },
    >     "pgmap": {
    >         "items": 0,
    >         "bytes": 0
    >     },
    >     "mds_co": {
    >         "items": 0,
    >         "bytes": 0
    >     },
    >     "unittest_1": {
    >         "items": 0,
    >         "bytes": 0
    >     },
    >     "unittest_2": {
    >         "items": 0,
    >         "bytes": 0
    >     },
    >     "total": {
    >         "items": 434277707,
    >         "bytes": 4529200468
    >     }
    > }
    >
    > Regards,
    >
    > Hans
    > _______________________________________________
    > ceph-users mailing list
    > ceph-users@xxxxxxxxxxxxxx
    > http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com




_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com



