On Mon, Mar 26, 2018 at 3:08 PM, Igor Fedotov <ifedotov@xxxxxxx> wrote:
> Hi Alex,
>
> I can see your bug report: https://tracker.ceph.com/issues/23462
>
> If the settings from that report also apply to your comment here, then
> you have the bluestore cache size limit set to 5 GB, which totals 90 GB
> of RAM across 18 OSDs for the BlueStore cache alone.
>
> There is also additional memory overhead per OSD, so the amount of free
> memory you should expect isn't that much. If any at all...
>
> Can you reduce the bluestore cache size limits and check whether the
> out-of-memory issue still happens?
>

Thank you Igor, reducing to 3 GB now and will advise. I did not realize
there is additional memory overhead on top of the 90 GB; the nodes each
have 128 GB.
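
For reference, a minimal sketch of the change I have in mind, assuming
the limit is set cluster-wide in ceph.conf and the OSDs are restarted
afterwards (the bluestore_cache_size_hdd/_ssd variants could be used
instead to target specific device types):

    [osd]
    # cap the BlueStore cache at 3 GiB per OSD (3221225472 bytes);
    # 18 OSDs x 3 GB = 54 GB of cache, leaving headroom out of 128 GB
    bluestore_cache_size = 3221225472

followed by a restart of the OSDs on each node, e.g.
"systemctl restart ceph-osd.target".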
--
Alex Gorbachev
Storcium

>
> Thanks,
>
> Igor
>
>
> On 3/26/2018 5:09 PM, Alex Gorbachev wrote:
>>
>> On Wed, Mar 21, 2018 at 2:26 PM, Kjetil Joergensen <kjetil@xxxxxxxxxxxx>
>> wrote:
>>>
>>> I retract my previous statement(s).
>>>
>>> My current suspicion is that this isn't a leak so much as it is
>>> load-driven; after enough waiting, it generally seems to settle around
>>> some equilibrium. We do seem to sit at mempools x 2.4 ~ ceph-osd RSS,
>>> which is on the higher side (I see documentation alluding to
>>> expecting ~1.5x).
>>>
>>> -KJ
>>>
>>> On Mon, Mar 19, 2018 at 3:05 AM, Konstantin Shalygin <k0ste@xxxxxxxx>
>>> wrote:
>>>>
>>>>> We don't run compression as far as I know, so that wouldn't be it.
>>>>> We do actually run a mix of bluestore & filestore - due to the rest
>>>>> of the cluster predating a stable bluestore by some amount.
>>>>
>>>> 12.2.2 -> 12.2.4 on 2018/03/10: I don't see an increase in memory
>>>> usage. No compression, of course.
>>>>
>>>> http://storage6.static.itmages.com/i/18/0319/h_1521453809_9131482_859b1fb0a5.png
>>>>
>> I am seeing these entries under load - there should be plenty of RAM on
>> a node with 128 GB RAM and 18 OSDs:
>>
>> Mar 26 07:55:32 roc04r-sc3a085 kernel: [733474.193331] winbindd cpuset=/ mems_allowed=0-1
>> Mar 26 07:55:32 roc04r-sc3a085 kernel: [733474.193337] CPU: 3 PID: 3406 Comm: winbindd Not tainted 4.14.14-041414-generic #201801201219
>> Mar 26 07:55:32 roc04r-sc3a085 kernel: [733474.193338] Hardware name: Supermicro X9DRi-LN4+/X9DR3-LN4+/X9DRi-LN4+/X9DR3-LN4+, BIOS 3.2 03/04/2015
>> Mar 26 07:55:32 roc04r-sc3a085 kernel: [733474.193339] Call Trace:
>> Mar 26 07:55:32 roc04r-sc3a085 kernel: [733474.193347]  dump_stack+0x5c/0x85
>> Mar 26 07:55:32 roc04r-sc3a085 kernel: [733474.193351]  dump_header+0x94/0x229
>> Mar 26 07:55:32 roc04r-sc3a085 kernel: [733474.193355]  ? do_try_to_free_pages+0x2a1/0x330
>> Mar 26 07:55:32 roc04r-sc3a085 kernel: [733474.193357]  ? get_page_from_freelist+0xa3/0xb20
>> Mar 26 07:55:32 roc04r-sc3a085 kernel: [733474.193359]  oom_kill_process+0x213/0x410
>> Mar 26 07:55:32 roc04r-sc3a085 kernel: [733474.193361]  out_of_memory+0x2af/0x4d0
>> Mar 26 07:55:32 roc04r-sc3a085 kernel: [733474.193363]  __alloc_pages_slowpath+0xab2/0xe40
>> Mar 26 07:55:32 roc04r-sc3a085 kernel: [733474.193366]  __alloc_pages_nodemask+0x261/0x280
>> Mar 26 07:55:32 roc04r-sc3a085 kernel: [733474.193370]  filemap_fault+0x33f/0x6b0
>> Mar 26 07:55:32 roc04r-sc3a085 kernel: [733474.193373]  ? filemap_map_pages+0x18a/0x3a0
>> Mar 26 07:55:32 roc04r-sc3a085 kernel: [733474.193376]  ext4_filemap_fault+0x2c/0x40
>> Mar 26 07:55:32 roc04r-sc3a085 kernel: [733474.193379]  __do_fault+0x19/0xe0
>> Mar 26 07:55:32 roc04r-sc3a085 kernel: [733474.193381]  __handle_mm_fault+0xcd6/0x1180
>> Mar 26 07:55:32 roc04r-sc3a085 kernel: [733474.193383]  handle_mm_fault+0xaa/0x1f0
>> Mar 26 07:55:32 roc04r-sc3a085 kernel: [733474.193387]  __do_page_fault+0x25d/0x4e0
>> Mar 26 07:55:32 roc04r-sc3a085 kernel: [733474.193391]  ? page_fault+0x36/0x60
>> Mar 26 07:55:32 roc04r-sc3a085 kernel: [733474.193393]  page_fault+0x4c/0x60
>> Mar 26 07:55:32 roc04r-sc3a085 kernel: [733474.193396] RIP: 0033:0x56443d3d1239
>> Mar 26 07:55:32 roc04r-sc3a085 kernel: [733474.193397] RSP: 002b:00007ffe6e44b3a0 EFLAGS: 00010246
>> Mar 26 07:55:32 roc04r-sc3a085 kernel: [733474.193399] Mem-Info:
>> Mar 26 07:55:32 roc04r-sc3a085 kernel: [733474.193407] active_anon:30843938 inactive_anon:1403277 isolated_anon:0
>> Mar 26 07:55:32 roc04r-sc3a085 kernel: [733474.193407]  active_file:121 inactive_file:977 isolated_file:18
>> Mar 26 07:55:32 roc04r-sc3a085 kernel: [733474.193407]  unevictable:3203 dirty:2 writeback:0 unstable:0
>> Mar 26 07:55:32 roc04r-sc3a085 kernel: [733474.193407]  slab_reclaimable:51522 slab_unreclaimable:95924
>> Mar 26 07:55:32 roc04r-sc3a085 kernel: [733474.193407]  mapped:2926 shmem:5220 pagetables:77204 bounce:0
>> Mar 26 07:55:32 roc04r-sc3a085 kernel: [733474.193407]  free:328371 free_pcp:0 free_cma:0
>> Mar 26 07:55:32 roc04r-sc3a085 kernel: [733474.193411] Node 0 active_anon:61155956kB inactive_anon:3014752kB active_file:864kB inactive_file:1432kB unevictable:10440kB isolated(anon):0kB isolated(file):80kB mapped:7648kB dirty:0kB writeback:0kB shmem:14460kB shmem_thp: 0kB shmem_pmdmapped: 0kB anon_thp: 0kB writeback_tmp:0kB unstable:0kB all_unreclaimable? no
>> Mar 26 07:55:32 roc04r-sc3a085 kernel: [733474.193414] Node 1 active_anon:62219796kB inactive_anon:2598356kB active_file:0kB inactive_file:2476kB unevictable:2372kB isolated(anon):0kB isolated(file):0kB mapped:4056kB dirty:8kB writeback:0kB shmem:6420kB shmem_thp: 0kB shmem_pmdmapped: 0kB anon_thp: 0kB writeback_tmp:0kB unstable:0kB all_unreclaimable? no
>> Mar 26 07:55:32 roc04r-sc3a085 kernel: [733474.193416] Node 0 DMA free:15896kB min:124kB low:152kB high:180kB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB writepending:0kB present:15980kB managed:15896kB mlocked:0kB kernel_stack:0kB pagetables:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB
>> Mar 26 07:55:32 roc04r-sc3a085 kernel: [733474.193420] lowmem_reserve[]: 0 1889 64319 64319 64319
>> Mar 26 07:55:32 roc04r-sc3a085 kernel: [733474.193424] Node 0 DMA32 free:265308kB min:15732kB low:19664kB high:23596kB active_anon:1642352kB inactive_anon:63060kB active_file:0kB inactive_file:0kB unevictable:0kB writepending:0kB present:2045868kB managed:1980300kB mlocked:0kB kernel_stack:48kB pagetables:832kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB
>> Mar 26 07:55:32 roc04r-sc3a085 kernel: [733474.193428] lowmem_reserve[]: 0 0 62430 62430 62430
>> Mar 26 07:55:32 roc04r-sc3a085 kernel: [733474.193432] Node 0 Normal free:507908kB min:507928kB low:634908kB high:761888kB active_anon:59513604kB inactive_anon:2951692kB active_file:732kB inactive_file:1720kB unevictable:10440kB writepending:0kB present:65011712kB managed:63934936kB mlocked:10440kB kernel_stack:16392kB pagetables:164944kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB
>> Mar 26 07:55:32 roc04r-sc3a085 kernel: [733474.193436] lowmem_reserve[]: 0 0 0 0 0
>> Mar 26 07:55:32 roc04r-sc3a085 kernel: [733474.193440] Node 1 Normal free:524372kB min:524784kB low:655980kB high:787176kB active_anon:62219796kB inactive_anon:2598356kB active_file:504kB inactive_file:1392kB unevictable:2372kB writepending:8kB present:67108864kB managed:66056740kB mlocked:2372kB kernel_stack:17912kB pagetables:143040kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB
>> Mar 26 07:55:32 roc04r-sc3a085 kernel: [733474.193444] lowmem_reserve[]: 0 0 0 0 0
>> Mar 26 07:55:32 roc04r-sc3a085 kernel: [733474.193447] Node 0 DMA: 0*4kB 1*8kB (U) 1*16kB (U) 0*32kB 2*64kB (U) 1*128kB (U) 1*256kB (U) 0*512kB 1*1024kB (U) 1*2048kB (M) 3*4096kB (M) = 15896kB
>> Mar 26 07:55:32 roc04r-sc3a085 kernel: [733474.193459] Node 0 DMA32: 403*4kB (UME) 238*8kB (UME) 196*16kB (UME) 102*32kB (UME) 56*64kB (UME) 24*128kB (UE) 25*256kB (UM) 11*512kB (UME) 4*1024kB (UE) 6*2048kB (UM) 54*4096kB (UM) = 266172kB
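
To keep an eye on this after the change, a rough check one could run on
the OSD node (assuming access to the OSD admin sockets; osd.0 below is
just a placeholder id) is to compare each OSD's mempool totals against
its resident size, along the lines of the mempools x ~2.4 ratio Kjetil
mentioned above:

    # memory tracked by the OSD's internal mempools (see the "total" section)
    ceph daemon osd.0 dump_mempools
    # resident set size (kB) of the running ceph-osd processes
    ps -o pid=,rss=,args= -C ceph-osd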
>>
>>>>
>>>> k
>>>
>>>
>>> --
>>> Kjetil Joergensen <kjetil@xxxxxxxxxxxx>
>>> SRE, Medallia Inc
>>>
>>> _______________________________________________
>>> ceph-users mailing list
>>> ceph-users@xxxxxxxxxxxxxx
>>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>>>
>> _______________________________________________
>> ceph-users mailing list
>> ceph-users@xxxxxxxxxxxxxx
>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
>
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com