Re: Memory leak in Ceph OSD?

Hi Alex,

I can see your bug report: https://tracker.ceph.com/issues/23462

If the settings in that report also apply to your comment here, then your BlueStore cache size limit is set to 5 GB per OSD, which adds up to 90 GB of RAM across 18 OSDs for the BlueStore cache alone.

There is also additional memory overhead per OSD, so you should not expect much free memory, if any at all.

Can you reduce the BlueStore cache size limits and check whether the out-of-memory issue still occurs?
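
As a rough sketch of that arithmetic: the 5 GB cache limit, the 18 OSDs and the 128 GB node size are the figures from this thread, while the 1 GiB per-OSD overhead is only an assumed placeholder, not a measured value. The function name is hypothetical.

```python
# Rough memory budget for one OSD node; a sketch, not a sizing tool.
GIB = 1024 ** 3


def cache_budget(num_osds, cache_bytes_per_osd, overhead_bytes_per_osd, node_ram_bytes):
    """Return (cache_total, overall_total, headroom) in bytes."""
    cache_total = num_osds * cache_bytes_per_osd
    overall_total = cache_total + num_osds * overhead_bytes_per_osd
    headroom = node_ram_bytes - overall_total
    return cache_total, overall_total, headroom


if __name__ == "__main__":
    # 18 OSDs and a 5 GB cache limit are from the thread; 1 GiB overhead is assumed.
    cache, total, headroom = cache_budget(18, 5 * GIB, 1 * GIB, 128 * GIB)
    print(f"cache only:     {cache / GIB:.0f} GiB")
    print(f"with overhead:  {total / GIB:.0f} GiB")
    print(f"headroom left:  {headroom / GIB:.0f} GiB")
```

Whatever headroom remains also has to cover the kernel, the page cache and every other process on the node, so a larger real per-OSD overhead uses it up quickly.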


Thanks,

Igor


On 3/26/2018 5:09 PM, Alex Gorbachev wrote:
On Wed, Mar 21, 2018 at 2:26 PM, Kjetil Joergensen <kjetil@xxxxxxxxxxxx> wrote:
I retract my previous statement(s).

My current suspicion is that this is less a leak than load-driven
growth: after enough waiting, usage generally seems to settle around some
equilibrium. We do seem to sit at ceph-osd RSS ~ 2.4 x mempools, which
is on the high side (I see documentation alluding to expecting ~1.5x).
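
To make that ratio concrete, here is a minimal sketch. It assumes the mempool total (in bytes) has already been read from the OSD's mempool dump and that RSS comes from the `VmRSS` line of `/proc/<pid>/status`. The function names and the sample numbers are hypothetical, chosen only to reproduce a ratio of about 2.4.

```python
import re


def parse_vmrss_bytes(status_text):
    """Extract VmRSS (reported in kB) from /proc/<pid>/status text, as bytes."""
    match = re.search(r"^VmRSS:\s+(\d+)\s+kB", status_text, re.MULTILINE)
    if match is None:
        raise ValueError("no VmRSS line found")
    return int(match.group(1)) * 1024


def rss_to_mempool_ratio(rss_bytes, mempool_total_bytes):
    """Ratio of process RSS to the memory accounted for by the mempools."""
    if mempool_total_bytes <= 0:
        raise ValueError("mempool total must be positive")
    return rss_bytes / mempool_total_bytes


if __name__ == "__main__":
    # Hypothetical sample: 1 GiB tracked in mempools, about 2.4 GiB resident.
    sample_status = "Name:\tceph-osd\nVmRSS:\t 2516582 kB\n"
    ratio = rss_to_mempool_ratio(parse_vmrss_bytes(sample_status), 1024 ** 3)
    print(f"RSS / mempools = {ratio:.2f}")
```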

-KJ

On Mon, Mar 19, 2018 at 3:05 AM, Konstantin Shalygin <k0ste@xxxxxxxx> wrote:

We don't run compression as far as I know, so that wouldn't be it. We do
actually run a mix of bluestore & filestore - due to the rest of the
cluster predating a stable bluestore by some amount.


12.2.2 -> 12.2.4 on 2018/03/10: I don't see any increase in memory usage.
No compression of any kind, of course.



http://storage6.static.itmages.com/i/18/0319/h_1521453809_9131482_859b1fb0a5.png

I am seeing the following entries under load. There should be plenty of
RAM on a node with 128 GB of RAM and 18 OSDs:

Mar 26 07:55:32 roc04r-sc3a085 kernel: [733474.193331] winbindd cpuset=/ mems_allowed=0-1
Mar 26 07:55:32 roc04r-sc3a085 kernel: [733474.193337] CPU: 3 PID: 3406 Comm: winbindd Not tainted 4.14.14-041414-generic #201801201219
Mar 26 07:55:32 roc04r-sc3a085 kernel: [733474.193338] Hardware name: Supermicro X9DRi-LN4+/X9DR3-LN4+/X9DRi-LN4+/X9DR3-LN4+, BIOS 3.2 03/04/2015
Mar 26 07:55:32 roc04r-sc3a085 kernel: [733474.193339] Call Trace:
Mar 26 07:55:32 roc04r-sc3a085 kernel: [733474.193347]  dump_stack+0x5c/0x85
Mar 26 07:55:32 roc04r-sc3a085 kernel: [733474.193351]  dump_header+0x94/0x229
Mar 26 07:55:32 roc04r-sc3a085 kernel: [733474.193355]  ? do_try_to_free_pages+0x2a1/0x330
Mar 26 07:55:32 roc04r-sc3a085 kernel: [733474.193357]  ? get_page_from_freelist+0xa3/0xb20
Mar 26 07:55:32 roc04r-sc3a085 kernel: [733474.193359]  oom_kill_process+0x213/0x410
Mar 26 07:55:32 roc04r-sc3a085 kernel: [733474.193361]  out_of_memory+0x2af/0x4d0
Mar 26 07:55:32 roc04r-sc3a085 kernel: [733474.193363]  __alloc_pages_slowpath+0xab2/0xe40
Mar 26 07:55:32 roc04r-sc3a085 kernel: [733474.193366]  __alloc_pages_nodemask+0x261/0x280
Mar 26 07:55:32 roc04r-sc3a085 kernel: [733474.193370]  filemap_fault+0x33f/0x6b0
Mar 26 07:55:32 roc04r-sc3a085 kernel: [733474.193373]  ? filemap_map_pages+0x18a/0x3a0
Mar 26 07:55:32 roc04r-sc3a085 kernel: [733474.193376]  ext4_filemap_fault+0x2c/0x40
Mar 26 07:55:32 roc04r-sc3a085 kernel: [733474.193379]  __do_fault+0x19/0xe0
Mar 26 07:55:32 roc04r-sc3a085 kernel: [733474.193381]  __handle_mm_fault+0xcd6/0x1180
Mar 26 07:55:32 roc04r-sc3a085 kernel: [733474.193383]  handle_mm_fault+0xaa/0x1f0
Mar 26 07:55:32 roc04r-sc3a085 kernel: [733474.193387]  __do_page_fault+0x25d/0x4e0
Mar 26 07:55:32 roc04r-sc3a085 kernel: [733474.193391]  ? page_fault+0x36/0x60
Mar 26 07:55:32 roc04r-sc3a085 kernel: [733474.193393]  page_fault+0x4c/0x60
Mar 26 07:55:32 roc04r-sc3a085 kernel: [733474.193396] RIP: 0033:0x56443d3d1239
Mar 26 07:55:32 roc04r-sc3a085 kernel: [733474.193397] RSP: 002b:00007ffe6e44b3a0 EFLAGS: 00010246
Mar 26 07:55:32 roc04r-sc3a085 kernel: [733474.193399] Mem-Info:
Mar 26 07:55:32 roc04r-sc3a085 kernel: [733474.193407] active_anon:30843938 inactive_anon:1403277 isolated_anon:0
Mar 26 07:55:32 roc04r-sc3a085 kernel: [733474.193407]  active_file:121 inactive_file:977 isolated_file:18
Mar 26 07:55:32 roc04r-sc3a085 kernel: [733474.193407]  unevictable:3203 dirty:2 writeback:0 unstable:0
Mar 26 07:55:32 roc04r-sc3a085 kernel: [733474.193407]  slab_reclaimable:51522 slab_unreclaimable:95924
Mar 26 07:55:32 roc04r-sc3a085 kernel: [733474.193407]  mapped:2926 shmem:5220 pagetables:77204 bounce:0
Mar 26 07:55:32 roc04r-sc3a085 kernel: [733474.193407]  free:328371 free_pcp:0 free_cma:0
Mar 26 07:55:32 roc04r-sc3a085 kernel: [733474.193411] Node 0 active_anon:61155956kB inactive_anon:3014752kB active_file:864kB inactive_file:1432kB unevictable:10440kB isolated(anon):0kB isolated(file):80kB mapped:7648kB dirty:0kB writeback:0kB shmem:14460kB shmem_thp: 0kB shmem_pmdmapped: 0kB anon_thp: 0kB writeback_tmp:0kB unstable:0kB all_unreclaimable? no
Mar 26 07:55:32 roc04r-sc3a085 kernel: [733474.193414] Node 1 active_anon:62219796kB inactive_anon:2598356kB active_file:0kB inactive_file:2476kB unevictable:2372kB isolated(anon):0kB isolated(file):0kB mapped:4056kB dirty:8kB writeback:0kB shmem:6420kB shmem_thp: 0kB shmem_pmdmapped: 0kB anon_thp: 0kB writeback_tmp:0kB unstable:0kB all_unreclaimable? no
Mar 26 07:55:32 roc04r-sc3a085 kernel: [733474.193416] Node 0 DMA free:15896kB min:124kB low:152kB high:180kB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB writepending:0kB present:15980kB managed:15896kB mlocked:0kB kernel_stack:0kB pagetables:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB
Mar 26 07:55:32 roc04r-sc3a085 kernel: [733474.193420] lowmem_reserve[]: 0 1889 64319 64319 64319
Mar 26 07:55:32 roc04r-sc3a085 kernel: [733474.193424] Node 0 DMA32 free:265308kB min:15732kB low:19664kB high:23596kB active_anon:1642352kB inactive_anon:63060kB active_file:0kB inactive_file:0kB unevictable:0kB writepending:0kB present:2045868kB managed:1980300kB mlocked:0kB kernel_stack:48kB pagetables:832kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB
Mar 26 07:55:32 roc04r-sc3a085 kernel: [733474.193428] lowmem_reserve[]: 0 0 62430 62430 62430
Mar 26 07:55:32 roc04r-sc3a085 kernel: [733474.193432] Node 0 Normal free:507908kB min:507928kB low:634908kB high:761888kB active_anon:59513604kB inactive_anon:2951692kB active_file:732kB inactive_file:1720kB unevictable:10440kB writepending:0kB present:65011712kB managed:63934936kB mlocked:10440kB kernel_stack:16392kB pagetables:164944kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB
Mar 26 07:55:32 roc04r-sc3a085 kernel: [733474.193436] lowmem_reserve[]: 0 0 0 0 0
Mar 26 07:55:32 roc04r-sc3a085 kernel: [733474.193440] Node 1 Normal free:524372kB min:524784kB low:655980kB high:787176kB active_anon:62219796kB inactive_anon:2598356kB active_file:504kB inactive_file:1392kB unevictable:2372kB writepending:8kB present:67108864kB managed:66056740kB mlocked:2372kB kernel_stack:17912kB pagetables:143040kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB
Mar 26 07:55:32 roc04r-sc3a085 kernel: [733474.193444] lowmem_reserve[]: 0 0 0 0 0
Mar 26 07:55:32 roc04r-sc3a085 kernel: [733474.193447] Node 0 DMA: 0*4kB 1*8kB (U) 1*16kB (U) 0*32kB 2*64kB (U) 1*128kB (U) 1*256kB (U) 0*512kB 1*1024kB (U) 1*2048kB (M) 3*4096kB (M) = 15896kB
Mar 26 07:55:32 roc04r-sc3a085 kernel: [733474.193459] Node 0 DMA32: 403*4kB (UME) 238*8kB (UME) 196*16kB (UME) 102*32kB (UME) 56*64kB (UME) 24*128kB (UE) 25*256kB (UM) 11*512kB (UME) 4*1024kB (UE) 6*2048kB (UM) 54*4096kB (UM) = 266172kB
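
The Mem-Info summary in the log above is reported in pages, which makes it hard to compare against the 128 GB node size. The sketch below converts those counters to GiB. It assumes 4 KiB pages, which matches the log: the two per-node active_anon figures (61155956 kB + 62219796 kB) equal 30843938 pages x 4 kB. The helper names are hypothetical.

```python
import re

PAGE_SIZE_KIB = 4  # assumption: 4 KiB pages, consistent with the per-node kB figures


def parse_mem_info(text):
    """Return {counter: pages} for each name:value pair in the Mem-Info summary."""
    return {name: int(value) for name, value in re.findall(r"(\w+):(\d+)\b", text)}


def pages_to_gib(pages):
    """Convert a page count to GiB using the assumed page size."""
    return pages * PAGE_SIZE_KIB / (1024 * 1024)


if __name__ == "__main__":
    # Page counters copied from the Mem-Info lines of the log above.
    summary = (
        "active_anon:30843938 inactive_anon:1403277 isolated_anon:0 "
        "active_file:121 inactive_file:977 isolated_file:18 free:328371"
    )
    counters = parse_mem_info(summary)
    for name in ("active_anon", "inactive_anon", "active_file", "inactive_file", "free"):
        print(f"{name:>14}: {pages_to_gib(counters[name]):8.2f} GiB")
```

Read this way, nearly all of the node's memory is anonymous (process) memory and the file cache is almost empty, which is the pattern one would expect if the OSD processes themselves account for the usage.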




k



--
Kjetil Joergensen <kjetil@xxxxxxxxxxxx>
SRE, Medallia Inc

_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com




