Ah, I was actually mistaken. After running without Valgrind, it seems I
had simply misjudged how much Valgrind was slowing it down. I'll leave it
to run overnight as suggested.

On Tue, Nov 8, 2016 at 10:44 PM, bobobo1618@xxxxxxxxx <bobobo1618@xxxxxxxxx> wrote:
> Okay, I left it for 3h and it seemed to actually stabilise at around
> 2.3G: http://ix.io/1DEK
>
> This was only after disabling other services on the system, however.
> Generally this much RAM isn't available to Ceph (hence the OOM
> previously).
>
> On Tue, Nov 8, 2016 at 9:00 AM, Mark Nelson <mnelson@xxxxxxxxxx> wrote:
>> It should be running much slower through valgrind, so it probably won't
>> accumulate very quickly. That was the problem with the earlier trace:
>> there wasn't enough memory used yet to really get us out of the weeds.
>> If it's still accumulating quickly, try to wait until the OSD is up to
>> 4+ GB RSS if you can. I usually kill the valgrind/osd process with
>> SIGTERM to make sure the output is preserved. Not sure what will happen
>> with the OOM killer, as I haven't let it get that far before killing.
>>
>> Mark
>>
>> On 11/08/2016 10:37 AM, bobobo1618@xxxxxxxxx wrote:
>>> Unfortunately I don't think overnight is possible. The OOM killer will
>>> kill it in hours, if not minutes. Will the output be preserved/usable
>>> if the process is uncleanly terminated?
>>>
>>> On 8 Nov 2016 08:33, "Mark Nelson" <mnelson@xxxxxxxxxx> wrote:
>>>
>>> Heya,
>>>
>>> Sorry, I got distracted with other stuff yesterday. Any chance you
>>> could run this for longer? It's tough to tell what's going on from
>>> this run, unfortunately. Maybe overnight if possible.
>>>
>>> Thanks!
>>> Mark
>>>
>>> On 11/08/2016 01:10 AM, bobobo1618@xxxxxxxxx wrote:
>>>
>>> Just bumping this and CCing directly since I foolishly broke the
>>> threading on my reply.
>>>
>>> On 4 Nov. 2016 8:40 pm, "bobobo1618@xxxxxxxxx" <bobobo1618@xxxxxxxxx> wrote:
>>>
>>> > Then you can view the output data with ms_print or with
>>> > massif-visualizer. This may help narrow down where in the code we
>>> > are using the memory.
>>>
>>> Done! I've dumped the output from ms_print here: http://ix.io/1CrS
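>>>
>>> For reference, the massif data was gathered and dumped roughly like
>>> this (the flags and the OSD id here are approximate, from memory, not
>>> an exact record of what I ran):
>>>
>>>     valgrind --tool=massif /usr/bin/ceph-osd -f -i 0   # writes massif.out.<pid>
>>>     ms_print massif.out.<pid>                          # text tree, excerpted below
>>>     massif-visualizer massif.out.<pid>                 # or browse it interactively
>>>     kill -TERM <valgrind-pid>                          # stop cleanly so the output is preserved
>>>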
>>> It seems most of the memory comes from here:
>>>
>>> 92.78% (998,248,799B) (heap allocation functions) malloc/new/new[], --alloc-fns, etc.
>>> ->46.63% (501,656,678B) 0xD38936: ceph::buffer::create_aligned(unsigned int, unsigned int) (in /usr/bin/ceph-osd)
>>> | ->45.07% (484,867,174B) 0xDAFED9: AsyncConnection::process() (in /usr/bin/ceph-osd)
>>> | | ->45.07% (484,867,174B) 0xC410EB: EventCenter::process_events(int) (in /usr/bin/ceph-osd)
>>> | | ->45.07% (484,867,174B) 0xC45210: ??? (in /usr/bin/ceph-osd)
>>> | | ->45.07% (484,867,174B) 0xC6FA31D: execute_native_thread_routine (thread.cc:83)
>>> | | ->45.07% (484,867,174B) 0xBE06452: start_thread (in /usr/lib/libpthread-2.24.so)
>>> | | ->45.07% (484,867,174B) 0xCFCA7DD: clone (in /usr/lib/libc-2.24.so)
>>> | |
>>> | ->01.56% (16,789,504B) in 6 places, all below massif's threshold (1.00%)
>>> |
>>> ->22.70% (244,179,072B) 0x9C9807: BitMapZone::init(long, long, bool) (in /usr/bin/ceph-osd)
>>> | ->22.70% (244,179,072B) 0x9CACED: BitMapAreaLeaf::init(long, long, bool) (in /usr/bin/ceph-osd)
>>> | ->22.70% (244,179,072B) 0x9CAE88: BitMapAreaLeaf::BitMapAreaLeaf(long, long, bool) (in /usr/bin/ceph-osd)
>>> | ->22.67% (243,924,992B) 0x9CAF79: BitMapAreaIN::init(long, long, bool) (in /usr/bin/ceph-osd)
>>> | | ->12.46% (134,086,656B) 0x9CAFBE: BitMapAreaIN::init(long, long, bool) (in /usr/bin/ceph-osd)
>>> | | | ->12.46% (134,086,656B) 0x9CB237: BitAllocator::init_check(long, long, bmap_alloc_mode, bool, bool) (in /usr/bin/ceph-osd)
>>> | | | ->12.46% (134,086,656B) 0x9CB431: BitAllocator::BitAllocator(long, long, bmap_alloc_mode, bool) (in /usr/bin/ceph-osd)
>>> | | | ->12.46% (134,086,656B) 0x9C5C32: BitMapAllocator::BitMapAllocator(long, long) (in /usr/bin/ceph-osd)
>>> | | | ->12.46% (134,086,656B) 0x968FF1: Allocator::create(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, long, long) (in /usr/bin/ceph-osd)
>>> | | | ->12.46% (134,086,656B) 0x87F65C: BlueStore::_open_alloc() (in /usr/bin/ceph-osd)
>>> | | | ->12.46% (134,086,656B) 0x8D8CDD: BlueStore::mount() (in /usr/bin/ceph-osd)
>>> | | | ->12.46% (134,086,656B) 0x4C15EA: OSD::init() (in /usr/bin/ceph-osd)
>>> | | | ->12.46% (134,086,656B) 0x40854C: main (in /usr/bin/ceph-osd)
>>> | | |
>>> | | ->10.21% (109,838,336B) 0x9CB00B: BitMapAreaIN::init(long, long, bool) (in /usr/bin/ceph-osd)
>>> | | ->10.21% (109,838,336B) 0x9CB237: BitAllocator::init_check(long, long, bmap_alloc_mode, bool, bool) (in /usr/bin/ceph-osd)
>>> | | ->10.21% (109,838,336B) 0x9CB431: BitAllocator::BitAllocator(long, long, bmap_alloc_mode, bool) (in /usr/bin/ceph-osd)
>>> | | ->10.21% (109,838,336B) 0x9C5C32: BitMapAllocator::BitMapAllocator(long, long) (in /usr/bin/ceph-osd)
>>> | | ->10.21% (109,838,336B) 0x968FF1: Allocator::create(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, long, long) (in /usr/bin/ceph-osd)
>>> | | ->10.21% (109,838,336B) 0x87F65C: BlueStore::_open_alloc() (in /usr/bin/ceph-osd)
>>> | | ->10.21% (109,838,336B) 0x8D8CDD: BlueStore::mount() (in /usr/bin/ceph-osd)
>>> | | ->10.21% (109,838,336B) 0x4C15EA: OSD::init() (in /usr/bin/ceph-osd)
>>> | | ->10.21% (109,838,336B) 0x40854C: main (in /usr/bin/ceph-osd)
>>> | |
>>> | ->00.02% (254,080B) in 1+ places, all below ms_print's threshold (01.00%)
>>> |
>>> ->12.77% (137,350,728B) 0x9CACD8: BitMapAreaLeaf::init(long, long, bool) (in /usr/bin/ceph-osd)
>>> | ->12.77% (137,350,728B) 0x9CAE88: BitMapAreaLeaf::BitMapAreaLeaf(long, long, bool) (in /usr/bin/ceph-osd)
>>> | ->12.75% (137,207,808B) 0x9CAF79: BitMapAreaIN::init(long, long, bool) (in /usr/bin/ceph-osd)
>>> | | ->07.01% (75,423,744B) 0x9CAFBE: BitMapAreaIN::init(long, long, bool) (in /usr/bin/ceph-osd)
>>> | | | ->07.01% (75,423,744B) 0x9CB237: BitAllocator::init_check(long, long, bmap_alloc_mode, bool, bool) (in /usr/bin/ceph-osd)
>>> | | | ->07.01% (75,423,744B) 0x9CB431: BitAllocator::BitAllocator(long, long, bmap_alloc_mode, bool) (in /usr/bin/ceph-osd)
>>> | | | ->07.01% (75,423,744B) 0x9C5C32: BitMapAllocator::BitMapAllocator(long, long) (in /usr/bin/ceph-osd)
>>> | | | ->07.01% (75,423,744B) 0x968FF1: Allocator::create(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, long, long) (in /usr/bin/ceph-osd)
>>> | | | ->07.01% (75,423,744B) 0x87F65C: BlueStore::_open_alloc() (in /usr/bin/ceph-osd)
>>> | | | ->07.01% (75,423,744B) 0x8D8CDD: BlueStore::mount() (in /usr/bin/ceph-osd)
>>> | | | ->07.01% (75,423,744B) 0x4C15EA: OSD::init() (in /usr/bin/ceph-osd)
>>> | | | ->07.01% (75,423,744B) 0x40854C: main (in /usr/bin/ceph-osd)
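>>>
>>> Reading the tree: the ~46% under ceph::buffer::create_aligned() comes
>>> in via AsyncConnection::process(), so it grows as the messenger
>>> handles traffic, while the ~35% (~381 MB) under BitMapZone::init() and
>>> BitMapAreaLeaf::init() is all rooted at BlueStore::mount() ->
>>> BlueStore::_open_alloc(), i.e. allocated once when the OSD mounts.
>>> That part looks like the bitmap allocator's fixed footprint rather
>>> than a leak. As a rough sanity check, assuming one tracked bit per
>>> min_alloc_size block (a guess at the layout; the device size below is
>>> hypothetical, not mine):
>>>
>>>     # blocks = device size / block size; then bits -> bytes -> MiB
>>>     echo $(( ((4 * 2**40) / 4096) / 8 / 2**20 ))   # => 128 MiB of raw bitmap for 4 TiB
>>>
>>> The ~381 MB observed is a few times that raw figure, which would be
>>> consistent with per-zone/leaf bookkeeping on top of the bitmap itself.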

_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com