kraken-bluestore 11.2.0 memory leak issue

jaylinuxgeek@xxxxxxxxx (Jay Linux) · Mon, 20 Feb 2017 11:18:10 +0530

Hello Shinobu,

We have already raised a ticket for this issue. FYI -
http://tracker.ceph.com/issues/18924

Thanks
Jayaram

On Mon, Feb 20, 2017 at 12:36 AM, Shinobu Kinjo <skinjo at redhat.com> wrote:

> Please open a ticket at http://tracker.ceph.com, if you haven't yet.
>
> On Thu, Feb 16, 2017 at 6:07 PM, Muthusamy Muthiah
> <muthiah.muthusamy at gmail.com> wrote:
> > Hi Wido,
> >
> > Thanks for the information. Please let us know if this is a bug.
> > As a workaround, we will set bluestore_cache_size to a small value (100MB).
> >
> > Thanks,
> > Muthu
> >
> > On 16 February 2017 at 14:04, Wido den Hollander <wido at 42on.com> wrote:
> >>
> >>
> >> > Op 16 februari 2017 om 7:19 schreef Muthusamy Muthiah
> >> > <muthiah.muthusamy at gmail.com>:
> >> >
> >> >
> >> > Thanks, Ilya Letkowski, for the information. We will change this value
> >> > accordingly.
> >> >
> >>
> >> What I understand from yesterday's performance meeting is that this seems
> >> like a bug. Lowering this buffer reduces memory use, but the root cause
> >> seems to be memory not being freed: a few bytes of a larger allocation
> >> remain allocated, which prevents the whole buffer from being freed.
> >>
> >> Tried:
> >>
> >> debug_mempools = true
> >>
> >> $ ceph daemon osd.X dump_mempools
> >>
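A rough sketch of how the dump_mempools output could be ranked by size to
see which pool is growing. The JSON layout is an assumption (pools either as
top-level keys or nested under "mempool"/"by_pool", each with a "bytes"
field) and should be checked against what your release actually prints. The
pool names in SAMPLE are hypothetical placeholders, not real Ceph pool names.

```python
import json


def pools_by_bytes(dump: dict) -> list:
    """Return (pool_name, bytes) pairs, largest first."""
    # Assumed layout: pools nested under mempool/by_pool, else at top level.
    pools = dump.get("mempool", {}).get("by_pool", dump)
    rows = [
        (name, stats["bytes"])
        for name, stats in pools.items()
        if name != "total" and isinstance(stats, dict) and "bytes" in stats
    ]
    return sorted(rows, key=lambda row: row[1], reverse=True)


# Hypothetical sample data, only to show the shape the function expects.
SAMPLE = json.loads("""
{
  "pool_a": {"items": 10, "bytes": 1024},
  "pool_b": {"items": 20, "bytes": 2048},
  "total":  {"items": 30, "bytes": 3072}
}
""")

if __name__ == "__main__":
    for name, size in pools_by_bytes(SAMPLE):
        print(f"{name:20s} {size:>12d} bytes")
```

To use it on a live OSD, redirect the output of the command above into a
file and load that with json.load() instead of SAMPLE. Comparing two dumps
taken some minutes apart should show which pool, if any, keeps growing.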
> >> Might want to view the YouTube video of yesterday when it's online:
> >> https://www.youtube.com/channel/UCno-Fry25FJ7B4RycCxOtfw/videos
> >>
> >> Wido
> >>
> >> > Thanks,
> >> > Muthu
> >> >
> >> > On 15 February 2017 at 17:03, Ilya Letkowski <mj12.svetzari at gmail.com>
> >> > wrote:
> >> >
> >> > > Hi, Muthusamy Muthiah
> >> > >
> >> > > I'm not totally sure that this is a memory leak.
> >> > > We had the same problems with bluestore on ceph v11.2.0.
> >> > > Reducing the bluestore cache helped us to solve it and stabilize OSD
> >> > > memory consumption at the 3GB level.
> >> > >
> >> > > Perhaps this will help you:
> >> > >
> >> > > bluestore_cache_size = 104857600
> >> > >
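For reference, the value quoted above is 100 MiB expressed in bytes. A small
sketch of the conversion, in case a different size is wanted. The helper
names are made up for illustration, and placing the option under an [osd]
section is an assumption about where it is set in ceph.conf.

```python
MIB = 1024 * 1024  # bytes per MiB


def mib_to_bytes(mib: int) -> int:
    """Convert a size in MiB to bytes."""
    return mib * MIB


def cache_size_line(mib: int) -> str:
    """Format the config line for a cache size given in MiB."""
    return f"bluestore_cache_size = {mib_to_bytes(mib)}"


if __name__ == "__main__":
    print("[osd]")               # assumed section; adjust to your layout
    print(cache_size_line(100))  # bluestore_cache_size = 104857600
```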
> >> > >
> >> > >
> >> > > On Tue, Feb 14, 2017 at 11:52 AM, Muthusamy Muthiah <
> >> > > muthiah.muthusamy at gmail.com> wrote:
> >> > >
> >> > >> Hi All,
> >> > >>
> >> > >> Across our 5-node cluster running Ceph 11.2.0, we are encountering
> >> > >> memory leak issues.
> >> > >>
> >> > >> Cluster details: 5 nodes with 24/68 disks per node, EC: 4+1, RHEL 7.2
> >> > >>
> >> > >> Some traces from sar are below, and the memory utilisation graph is
> >> > >> attached.
> >> > >>
> >> > >> (16:54:42)[cn2.c1 sa] # sar -r
> >> > >> 07:50:01 kbmemfree kbmemused %memused kbbuffers kbcached kbcommit %commit kbactive kbinact kbdirty
> >> > >> 10:20:01  32077264 132754368    80.54     16176  3040244 77767024   47.18 51991692 2676468     260
> >> > >> 10:30:01  32208384 132623248    80.46     16176  3048536 77832312   47.22 51851512 2684552      12
> >> > >> 10:40:01  32067244 132764388    80.55     16176  3059076 77832316   47.22 51983332 2694708     264
> >> > >> 10:50:01  30626144 134205488    81.42     16176  3064340 78177232   47.43 53414144 2693712       4
> >> > >> 11:00:01  28927656 135903976    82.45     16176  3074064 78958568   47.90 55114284 2702892      12
> >> > >> 11:10:01  27158548 137673084    83.52     16176  3080600 80553936   48.87 56873664 2708904      12
> >> > >> 11:20:01  26455556 138376076    83.95     16176  3080436 81991036   49.74 57570280 2708500       8
> >> > >> 11:30:01  26002252 138829380    84.22     16176  3090556 82223840   49.88 58015048 2718036      16
> >> > >> 11:40:01  25965924 138865708    84.25     16176  3089708 83734584   50.80 58049980 2716740      12
> >> > >> 11:50:01  26142888 138688744    84.14     16176  3089544 83800100   50.84 57869628 2715400      16
> >> > >>
> >> > >> ...
> >> > >> ...
> >> > >>
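To put a number on the trend in the sar samples above, here is a minimal
sketch that takes the first and last kbmemused values and works out the
growth over the interval. The (time, kbmemused) pairs are copied from the
table. Note that kbmemused is system-wide, so this is only an upper bound on
what ceph-osd itself is adding.

```python
from datetime import datetime

# (time, kbmemused) pairs copied from the sar -r output above
SAMPLES = [
    ("10:20:01", 132754368),
    ("10:30:01", 132623248),
    ("10:40:01", 132764388),
    ("10:50:01", 134205488),
    ("11:00:01", 135903976),
    ("11:10:01", 137673084),
    ("11:20:01", 138376076),
    ("11:30:01", 138829380),
    ("11:40:01", 138865708),
    ("11:50:01", 138688744),
]


def growth(samples):
    """Return (delta_kb, minutes, kb_per_minute) between first and last sample."""
    fmt = "%H:%M:%S"
    t0, used0 = samples[0]
    t1, used1 = samples[-1]
    elapsed = datetime.strptime(t1, fmt) - datetime.strptime(t0, fmt)
    minutes = elapsed.total_seconds() / 60
    delta_kb = used1 - used0
    return delta_kb, minutes, delta_kb / minutes


if __name__ == "__main__":
    delta_kb, minutes, rate = growth(SAMPLES)
    gib_per_hour = rate * 60 / 1024**2
    print(f"kbmemused grew by {delta_kb} kB over {minutes:.0f} min "
          f"({gib_per_hour:.2f} GiB/h)")
```

This uses only the two endpoints, so it ignores the plateau at the start and
end of the window. A fit over all samples would be the next step if a more
careful estimate is needed.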
> >> > >> In the attached graph, memory utilisation by ceph-osd increases
> >> > >> during the soak test. When usage approaches the system limit of
> >> > >> 128GB RAM, we see the dmesg logs below reporting out-of-memory
> >> > >> conditions. OSD.3 was killed by the OOM killer and then started
> >> > >> again.
> >> > >>
> >> > >> [Tue Feb 14 03:51:02 2017] tp_osd_tp invoked oom-killer: gfp_mask=0x280da, order=0, oom_score_adj=0
> >> > >> [Tue Feb 14 03:51:02 2017] tp_osd_tp cpuset=/ mems_allowed=0-1
> >> > >> [Tue Feb 14 03:51:02 2017] CPU: 20 PID: 11864 Comm: tp_osd_tp Not tainted 3.10.0-327.13.1.el7.x86_64 #1
> >> > >> [Tue Feb 14 03:51:02 2017] Hardware name: HP ProLiant XL420 Gen9/ProLiant XL420 Gen9, BIOS U19 09/12/2016
> >> > >> [Tue Feb 14 03:51:02 2017]  ffff8819ccd7a280 0000000030e84036 ffff881fa58f7528 ffffffff816356f4
> >> > >> [Tue Feb 14 03:51:02 2017]  ffff881fa58f75b8 ffffffff8163068f ffff881fa3478360 ffff881fa3478378
> >> > >> [Tue Feb 14 03:51:02 2017]  ffff881fa58f75e8 ffff8819ccd7a280 0000000000000001 000000000001f65f
> >> > >> [Tue Feb 14 03:51:02 2017] Call Trace:
> >> > >> [Tue Feb 14 03:51:02 2017]  [<ffffffff816356f4>] dump_stack+0x19/0x1b
> >> > >> [Tue Feb 14 03:51:02 2017]  [<ffffffff8163068f>] dump_header+0x8e/0x214
> >> > >> [Tue Feb 14 03:51:02 2017]  [<ffffffff8116ce7e>] oom_kill_process+0x24e/0x3b0
> >> > >> [Tue Feb 14 03:51:02 2017]  [<ffffffff8116c9e6>] ? find_lock_task_mm+0x56/0xc0
> >> > >> [Tue Feb 14 03:51:02 2017]  [<ffffffff8116d6a6>] out_of_memory+0x4b6/0x4f0
> >> > >> [Tue Feb 14 03:51:02 2017]  [<ffffffff81173885>] __alloc_pages_nodemask+0xa95/0xb90
> >> > >> [Tue Feb 14 03:51:02 2017]  [<ffffffff811b792a>] alloc_pages_vma+0x9a/0x140
> >> > >> [Tue Feb 14 03:51:02 2017]  [<ffffffff811976c5>] handle_mm_fault+0xb85/0xf50
> >> > >> [Tue Feb 14 03:51:02 2017]  [<ffffffff811957fb>] ? follow_page_mask+0xbb/0x5c0
> >> > >> [Tue Feb 14 03:51:02 2017]  [<ffffffff81197c2b>] __get_user_pages+0x19b/0x640
> >> > >> [Tue Feb 14 03:51:02 2017]  [<ffffffff8119843d>] get_user_pages_unlocked+0x15d/0x1f0
> >> > >> [Tue Feb 14 03:51:02 2017]  [<ffffffff8106544f>] get_user_pages_fast+0x9f/0x1a0
> >> > >> [Tue Feb 14 03:51:02 2017]  [<ffffffff8121de78>] do_blockdev_direct_IO+0x1a78/0x2610
> >> > >> [Tue Feb 14 03:51:02 2017]  [<ffffffff81218c40>] ? I_BDEV+0x10/0x10
> >> > >> [Tue Feb 14 03:51:02 2017]  [<ffffffff8121ea65>] __blockdev_direct_IO+0x55/0x60
> >> > >> [Tue Feb 14 03:51:02 2017]  [<ffffffff81218c40>] ? I_BDEV+0x10/0x10
> >> > >> [Tue Feb 14 03:51:02 2017]  [<ffffffff81219297>] blkdev_direct_IO+0x57/0x60
> >> > >> [Tue Feb 14 03:51:02 2017]  [<ffffffff81218c40>] ? I_BDEV+0x10/0x10
> >> > >> [Tue Feb 14 03:51:02 2017]  [<ffffffff8116af63>] generic_file_aio_read+0x6d3/0x750
> >> > >> [Tue Feb 14 03:51:02 2017]  [<ffffffffa038ad5c>] ? xfs_iunlock+0x11c/0x130 [xfs]
> >> > >> [Tue Feb 14 03:51:02 2017]  [<ffffffff811690db>] ? unlock_page+0x2b/0x30
> >> > >> [Tue Feb 14 03:51:02 2017]  [<ffffffff81192f21>] ? __do_fault+0x401/0x510
> >> > >> [Tue Feb 14 03:51:02 2017]  [<ffffffff8121970c>] blkdev_aio_read+0x4c/0x70
> >> > >> [Tue Feb 14 03:51:02 2017]  [<ffffffff811ddcfd>] do_sync_read+0x8d/0xd0
> >> > >> [Tue Feb 14 03:51:02 2017]  [<ffffffff811de45c>] vfs_read+0x9c/0x170
> >> > >> [Tue Feb 14 03:51:02 2017]  [<ffffffff811df182>] SyS_pread64+0x92/0xc0
> >> > >> [Tue Feb 14 03:51:02 2017]  [<ffffffff81645e89>] system_call_fastpath+0x16/0x1b
> >> > >>
> >> > >>
> >> > >> Feb 14 03:51:40 fr-paris kernel: Out of memory: Kill process 7657 (ceph-osd) score 45 or sacrifice child
> >> > >> Feb 14 03:51:40 fr-paris kernel: Killed process 7657 (ceph-osd) total-vm:8650208kB, anon-rss:6124660kB, file-rss:1560kB
> >> > >> Feb 14 03:51:41 fr-paris systemd: ceph-osd@3.service: main process exited, code=killed, status=9/KILL
> >> > >> Feb 14 03:51:41 fr-paris systemd: Unit ceph-osd@3.service entered failed state.
> >> > >> Feb 14 03:51:41 fr-paris systemd: ceph-osd@3.service failed.
> >> > >> Feb 14 03:51:41 fr-paris systemd: cassandra.service: main process exited, code=killed, status=9/KILL
> >> > >> Feb 14 03:51:41 fr-paris systemd: Unit cassandra.service entered failed state.
> >> > >> Feb 14 03:51:41 fr-paris systemd: cassandra.service failed.
> >> > >> Feb 14 03:51:41 fr-paris ceph-mgr: 2017-02-14 03:51:41.978878 7f51a3154700 -1 mgr ms_dispatch osd_map(7517..7517 src has 6951..7517) v3
> >> > >> Feb 14 03:51:42 fr-paris systemd: Device dev-disk-by\x2dpartlabel-ceph\x5cx20block.device appeared twice with different sysfs paths /sys/devices/pci0000:00/0000:00:03.2/0000:03:00.0/host0/target0:0:0/0:0:0:9/block/sdj/sdj2 and /sys/devices/pci0000:00/0000:00:03.2/0000:03:00.0/host0/target0:0:0/0:0:0:4/block/sde/sde2
> >> > >> Feb 14 03:51:42 fr-paris ceph-mgr: 2017-02-14 03:51:42.992477 7f51a3154700 -1 mgr ms_dispatch osd_map(7518..7518 src has 6951..7518) v3
> >> > >> Feb 14 03:51:43 fr-paris ceph-mgr: 2017-02-14 03:51:43.508990 7f51a3154700 -1 mgr ms_dispatch mgrdigest v1
> >> > >> Feb 14 03:51:48 fr-paris ceph-mgr: 2017-02-14 03:51:48.508970 7f51a3154700 -1 mgr ms_dispatch mgrdigest v1
> >> > >> Feb 14 03:51:53 fr-paris ceph-mgr: 2017-02-14 03:51:53.509592 7f51a3154700 -1 mgr ms_dispatch mgrdigest v1
> >> > >> Feb 14 03:51:58 fr-paris ceph-mgr: 2017-02-14 03:51:58.509936 7f51a3154700 -1 mgr ms_dispatch mgrdigest v1
> >> > >> Feb 14 03:52:01 fr-paris systemd: ceph-osd@3.service holdoff time over, scheduling restart.
> >> > >> Feb 14 03:52:02 fr-paris systemd: Starting Ceph object storage daemon osd.3...
> >> > >> Feb 14 03:52:02 fr-paris systemd: Started Ceph object storage daemon osd.3.
> >> > >> Feb 14 03:52:02 fr-paris numactl: 2017-02-14 03:52:02.307106 7f1e499bb940 -1 WARNING: the following dangerous and experimental features are enabled: bluestore,rocksdb
> >> > >> Feb 14 03:52:02 fr-paris numactl: 2017-02-14 03:52:02.317687 7f1e499bb940 -1 WARNING: the following dangerous and experimental features are enabled: bluestore,rocksdb
> >> > >> Feb 14 03:52:02 fr-paris numactl: starting osd.3 at - osd_data /var/lib/ceph/osd/ceph-3 /var/lib/ceph/osd/ceph-3/journal
> >> > >> Feb 14 03:52:02 fr-paris numactl: 2017-02-14 03:52:02.333522 7f1e499bb940 -1 WARNING: experimental feature 'bluestore' is enabled
> >> > >> Feb 14 03:52:02 fr-paris numactl: Please be aware that this feature is experimental, untested,
> >> > >> Feb 14 03:52:02 fr-paris numactl: unsupported, and may result in data corruption, data loss,
> >> > >> Feb 14 03:52:02 fr-paris numactl: and/or irreparable damage to your cluster.  Do not use
> >> > >> Feb 14 03:52:02 fr-paris numactl: feature with important data.
> >> > >>
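The "Killed process" line in the log above carries the memory figures for
the OSD at the moment it was killed. A small sketch for pulling those out of
syslog, so they can be compared across OOM events. The regular expression is
written against the line format shown in this thread only; other kernel
versions may add or rename fields.

```python
import re

# Matches the kernel's "Killed process" line as it appears in this thread.
KILLED_RE = re.compile(
    r"Killed process (?P<pid>\d+) \((?P<comm>[^)]+)\) "
    r"total-vm:(?P<total_vm>\d+)kB, "
    r"anon-rss:(?P<anon_rss>\d+)kB, "
    r"file-rss:(?P<file_rss>\d+)kB"
)


def parse_killed(line: str):
    """Return the fields of a 'Killed process' line, or None if no match."""
    match = KILLED_RE.search(line)
    if match is None:
        return None
    return {
        "pid": int(match.group("pid")),
        "comm": match.group("comm"),
        "total_vm_kb": int(match.group("total_vm")),
        "anon_rss_kb": int(match.group("anon_rss")),
        "file_rss_kb": int(match.group("file_rss")),
    }


LINE = (
    "Feb 14 03:51:40 fr-paris kernel: Killed process 7657 (ceph-osd) "
    "total-vm:8650208kB, anon-rss:6124660kB, file-rss:1560kB"
)

if __name__ == "__main__":
    info = parse_killed(LINE)
    print(f"{info['comm']} pid {info['pid']}: "
          f"anon-rss {info['anon_rss_kb'] / 1024**2:.2f} GiB")
```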
> >> > >> This seems to happen only in 11.2.0 and not in 11.1.x. Could you
> >> > >> please help us resolve this issue, either by suggesting a config
> >> > >> change to limit memory use by ceph-osd, or by confirming whether
> >> > >> this is a bug in the current Kraken release?
> >> > >>
> >> > >> Thanks,
> >> > >> Muthu
> >> > >>
> >> > >> _______________________________________________
> >> > >> ceph-users mailing list
> >> > >> ceph-users at lists.ceph.com
> >> > >> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
> >> > >>
> >> > >>
> >> > >
> >> > >
> >> > > --
> >> > > Best regards
> >> > >
> >> > > Ilya Letkouski
> >> > >
> >> > > Phone, Viber: +375 29 3237335
> >> > >
> >> > > Minsk, Belarus (GMT+3)
> >> > >
> >
> >
> >
> >
>