Here it is after running overnight (~9h): http://ix.io/1DNi

On Tue, Nov 8, 2016 at 11:00 PM, bobobo1618@xxxxxxxxx <bobobo1618@xxxxxxxxx> wrote:
> Ah, I was actually mistaken. After running without Valgrind, it seems
> I just estimated how slowed down it was. I'll leave it to run
> overnight as suggested.
>
> On Tue, Nov 8, 2016 at 10:44 PM, bobobo1618@xxxxxxxxx <bobobo1618@xxxxxxxxx> wrote:
>> Okay, I left it for 3h and it seemed to actually stabilise at around
>> 2.3G: http://ix.io/1DEK
>>
>> This was only after disabling other services on the system, however.
>> Generally this much RAM isn't available to Ceph (hence the OOM
>> previously).
>>
>> On Tue, Nov 8, 2016 at 9:00 AM, Mark Nelson <mnelson@xxxxxxxxxx> wrote:
>>> It should be running much slower through valgrind, so it probably won't
>>> accumulate very quickly. That was the problem with the earlier trace: there
>>> wasn't enough memory used yet to really get us out of the weeds. If it's
>>> still accumulating quickly, try to wait until the OSD is up to 4+GB RSS if
>>> you can. I usually kill the valgrind/osd process with SIGTERM to make sure
>>> the output is preserved. Not sure what will happen with the OOM killer, as I
>>> haven't let it get that far before killing.
>>>
>>> Mark
>>>
>>> On 11/08/2016 10:37 AM, bobobo1618@xxxxxxxxx wrote:
>>>> Unfortunately I don't think overnight is possible. The OOM will kill it
>>>> in hours, if not minutes. Will the output be preserved/usable if the
>>>> process is uncleanly terminated?
>>>>
>>>> On 8 Nov 2016 08:33, "Mark Nelson" <mnelson@xxxxxxxxxx> wrote:
>>>>> Heya,
>>>>>
>>>>> Sorry, got distracted with other stuff yesterday. Any chance you
>>>>> could run this for longer? It's tough to tell what's going on from
>>>>> this run, unfortunately. Maybe overnight if possible.
>>>>>
>>>>> Thanks!
>>>>> Mark
>>>>>
>>>>> On 11/08/2016 01:10 AM, bobobo1618@xxxxxxxxx wrote:
>>>>>> Just bumping this and CCing directly since I foolishly broke the
>>>>>> threading on my reply.
>>>>>>
>>>>>> On 4 Nov. 2016 8:40 pm, "bobobo1618@xxxxxxxxx" <bobobo1618@xxxxxxxxx> wrote:
>>>>>>> > Then you can view the output data with ms_print or with
>>>>>>> > massif-visualizer. This may help narrow down where in the code we
>>>>>>> > are using the memory.
>>>>>>>
>>>>>>> Done! I've dumped the output from ms_print here: http://ix.io/1CrS
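>>>>>>>
>>>>>>> For reference, the run and the dump were along these lines (the osd
>>>>>>> id, cluster name and output paths below are placeholders, not the
>>>>>>> exact invocation):
>>>>>>>
>>>>>>>   # run the OSD in the foreground under massif (much slower than usual)
>>>>>>>   valgrind --tool=massif --massif-out-file=/tmp/massif.out.osd0 \
>>>>>>>       ceph-osd -f --cluster ceph -i 0
>>>>>>>
>>>>>>>   # stop it with SIGTERM so the massif output gets written out
>>>>>>>   kill -TERM <pid-of-valgrind>
>>>>>>>
>>>>>>>   # render the snapshots as text, or browse them interactively
>>>>>>>   ms_print /tmp/massif.out.osd0 > massif-osd0.txt
>>>>>>>   massif-visualizer /tmp/massif.out.osd0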
>>>>>>>
>>>>>>> It seems most of the memory comes from here:
>>>>>>>
>>>>>>> 92.78% (998,248,799B) (heap allocation functions) malloc/new/new[], --alloc-fns, etc.
>>>>>>> ->46.63% (501,656,678B) 0xD38936: ceph::buffer::create_aligned(unsigned int, unsigned int) (in /usr/bin/ceph-osd)
>>>>>>> | ->45.07% (484,867,174B) 0xDAFED9: AsyncConnection::process() (in /usr/bin/ceph-osd)
>>>>>>> | | ->45.07% (484,867,174B) 0xC410EB: EventCenter::process_events(int) (in /usr/bin/ceph-osd)
>>>>>>> | | ->45.07% (484,867,174B) 0xC45210: ??? (in /usr/bin/ceph-osd)
>>>>>>> | | ->45.07% (484,867,174B) 0xC6FA31D: execute_native_thread_routine (thread.cc:83)
>>>>>>> | | ->45.07% (484,867,174B) 0xBE06452: start_thread (in /usr/lib/libpthread-2.24.so)
>>>>>>> | | ->45.07% (484,867,174B) 0xCFCA7DD: clone (in /usr/lib/libc-2.24.so)
>>>>>>> | |
>>>>>>> | ->01.56% (16,789,504B) in 6 places, all below massif's threshold (1.00%)
>>>>>>> |
>>>>>>> ->22.70% (244,179,072B) 0x9C9807: BitMapZone::init(long, long, bool) (in /usr/bin/ceph-osd)
>>>>>>> | ->22.70% (244,179,072B) 0x9CACED: BitMapAreaLeaf::init(long, long, bool) (in /usr/bin/ceph-osd)
>>>>>>> | ->22.70% (244,179,072B) 0x9CAE88: BitMapAreaLeaf::BitMapAreaLeaf(long, long, bool) (in /usr/bin/ceph-osd)
>>>>>>> | ->22.67% (243,924,992B) 0x9CAF79: BitMapAreaIN::init(long, long, bool) (in /usr/bin/ceph-osd)
>>>>>>> | | ->12.46% (134,086,656B) 0x9CAFBE: BitMapAreaIN::init(long, long, bool) (in /usr/bin/ceph-osd)
>>>>>>> | | | ->12.46% (134,086,656B) 0x9CB237: BitAllocator::init_check(long, long, bmap_alloc_mode, bool, bool) (in /usr/bin/ceph-osd)
>>>>>>> | | | ->12.46% (134,086,656B) 0x9CB431: BitAllocator::BitAllocator(long, long, bmap_alloc_mode, bool) (in /usr/bin/ceph-osd)
>>>>>>> | | | ->12.46% (134,086,656B) 0x9C5C32: BitMapAllocator::BitMapAllocator(long, long) (in /usr/bin/ceph-osd)
>>>>>>> | | | ->12.46% (134,086,656B) 0x968FF1: Allocator::create(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, long, long) (in /usr/bin/ceph-osd)
>>>>>>> | | | ->12.46% (134,086,656B) 0x87F65C: BlueStore::_open_alloc() (in /usr/bin/ceph-osd)
>>>>>>> | | | ->12.46% (134,086,656B) 0x8D8CDD: BlueStore::mount() (in /usr/bin/ceph-osd)
>>>>>>> | | | ->12.46% (134,086,656B) 0x4C15EA: OSD::init() (in /usr/bin/ceph-osd)
>>>>>>> | | | ->12.46% (134,086,656B) 0x40854C: main (in /usr/bin/ceph-osd)
>>>>>>> | | |
>>>>>>> | | ->10.21% (109,838,336B) 0x9CB00B: BitMapAreaIN::init(long, long, bool) (in /usr/bin/ceph-osd)
>>>>>>> | | ->10.21% (109,838,336B) 0x9CB237: BitAllocator::init_check(long, long, bmap_alloc_mode, bool, bool) (in /usr/bin/ceph-osd)
>>>>>>> | | ->10.21% (109,838,336B) 0x9CB431: BitAllocator::BitAllocator(long, long, bmap_alloc_mode, bool) (in /usr/bin/ceph-osd)
>>>>>>> | | ->10.21% (109,838,336B) 0x9C5C32: BitMapAllocator::BitMapAllocator(long, long) (in /usr/bin/ceph-osd)
>>>>>>> | | ->10.21% (109,838,336B) 0x968FF1: Allocator::create(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, long, long) (in /usr/bin/ceph-osd)
>>>>>>> | | ->10.21% (109,838,336B) 0x87F65C: BlueStore::_open_alloc() (in /usr/bin/ceph-osd)
>>>>>>> | | ->10.21% (109,838,336B) 0x8D8CDD: BlueStore::mount() (in /usr/bin/ceph-osd)
>>>>>>> | | ->10.21% (109,838,336B) 0x4C15EA: OSD::init() (in /usr/bin/ceph-osd)
>>>>>>> | | ->10.21% (109,838,336B) 0x40854C: main (in /usr/bin/ceph-osd)
>>>>>>> | |
>>>>>>> | ->00.02% (254,080B) in 1+ places, all below ms_print's threshold (01.00%)
>>>>>>> |
>>>>>>> ->12.77% (137,350,728B) 0x9CACD8: BitMapAreaLeaf::init(long, long, bool) (in /usr/bin/ceph-osd)
>>>>>>> | ->12.77% (137,350,728B) 0x9CAE88: BitMapAreaLeaf::BitMapAreaLeaf(long, long, bool) (in /usr/bin/ceph-osd)
>>>>>>> | ->12.75% (137,207,808B) 0x9CAF79: BitMapAreaIN::init(long, long, bool) (in /usr/bin/ceph-osd)
>>>>>>> | | ->07.01% (75,423,744B) 0x9CAFBE: BitMapAreaIN::init(long, long, bool) (in /usr/bin/ceph-osd)
>>>>>>> | | | ->07.01% (75,423,744B) 0x9CB237: BitAllocator::init_check(long, long, bmap_alloc_mode, bool, bool) (in /usr/bin/ceph-osd)
>>>>>>> | | | ->07.01% (75,423,744B) 0x9CB431: BitAllocator::BitAllocator(long, long, bmap_alloc_mode, bool) (in /usr/bin/ceph-osd)
>>>>>>> | | | ->07.01% (75,423,744B) 0x9C5C32: BitMapAllocator::BitMapAllocator(long, long) (in /usr/bin/ceph-osd)
>>>>>>> | | | ->07.01% (75,423,744B) 0x968FF1: Allocator::create(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, long, long) (in /usr/bin/ceph-osd)
>>>>>>> | | | ->07.01% (75,423,744B) 0x87F65C: BlueStore::_open_alloc() (in /usr/bin/ceph-osd)
>>>>>>> | | | ->07.01% (75,423,744B) 0x8D8CDD: BlueStore::mount() (in /usr/bin/ceph-osd)
>>>>>>> | | | ->07.01% (75,423,744B) 0x4C15EA: OSD::init() (in /usr/bin/ceph-osd)
>>>>>>> | | | ->07.01% (75,423,744B) 0x40854C: main (in /usr/bin/ceph-osd)
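>>>>>>>
>>>>>>> Putting rough numbers on that: about 485MB (45.07%) is
>>>>>>> ceph::buffer::create_aligned() reached from AsyncConnection::process(),
>>>>>>> and about 380MB (22.70% + 12.77%) is the BitAllocator/BitMapZone
>>>>>>> structures built at mount time via BlueStore::_open_alloc(). If it
>>>>>>> helps to correlate, the allocator-related options on a running OSD can
>>>>>>> be checked over the admin socket (osd.0 below is just an example):
>>>>>>>
>>>>>>>   ceph daemon osd.0 config show | grep -i allocator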

_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com