Just to update, this is still an issue as of the latest Git commit (64bcf92e87f9fbb3045de49b7deb53aca1989123).
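
For anyone comparing the ms_print dumps in this thread, a rough Python sketch follows. It is not part of Valgrind or Ceph; the names parse_entries and top_level are made up for illustration. It assumes each tree entry looks like "->PCT% (BYTESB) ...", as in the quoted output below, and it strips leading mail quote markers first. Where the mail client has wrapped an entry, the description is cut off at the line break, but the percentage and byte count are still read correctly.

```python
import re

# One ms_print tree entry, e.g.
#   "| ->45.07% (484,867,174B) 0xDAFED9: AsyncConnection::process() (in ...)"
ENTRY = re.compile(
    r"^(?P<bars>[|\s]*)->(?P<pct>\d+\.\d+)% \((?P<bytes>[\d,]+)B\) (?P<rest>.*)$"
)
# Leading mail quote markers such as ">> >>>>> "
QUOTE = re.compile(r"^[>\s]+")


def parse_entries(text):
    """Return (depth, percent, bytes, description) for each tree entry.

    Depth is the number of '|' markers before '->'. Indentation is often
    lost in mail, so treat it as a rough proxy for the real nesting depth.
    """
    entries = []
    for line in text.splitlines():
        line = QUOTE.sub("", line)
        m = ENTRY.match(line)
        if m:
            entries.append((
                m.group("bars").count("|"),
                float(m.group("pct")),
                int(m.group("bytes").replace(",", "")),
                m.group("rest").strip(),
            ))
    return entries


def top_level(text):
    """Entries directly under the heap root (no '|' markers)."""
    return [e for e in parse_entries(text) if e[0] == 0]


# A few lines copied from the quoted ms_print output in this thread.
sample = """\
>> >>>>> ->46.63% (501,656,678B) 0xD38936:
>> >>>>> ceph::buffer::create_aligned(unsigned int, unsigned int) (in
>> >>>>> | ->45.07% (484,867,174B) 0xDAFED9:
>> >>>>> ->22.70% (244,179,072B) 0x9C9807: BitMapZone::init(long,
>> >>>>> ->12.77% (137,350,728B) 0x9CACD8: BitMapAreaLeaf::init(long,
"""

for depth, pct, nbytes, desc in top_level(sample):
    print(f"{pct:6.2f}%  {nbytes / 2**20:8.1f} MiB  {desc}")
```

Running this over two dumps taken a few hours apart and comparing the top-level byte counts should show which call path is actually growing, as opposed to the allocator bitmaps that are set up once at mount time.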
On Fri, Nov 11, 2016 at 1:31 PM, bobobo1618@xxxxxxxxx <bobobo1618@xxxxxxxxx> wrote:
Here's another: http://termbin.com/smnm
On Fri, Nov 11, 2016 at 1:28 PM, Sage Weil <sweil@xxxxxxxxxx> wrote:
> On Fri, 11 Nov 2016, bobobo1618@xxxxxxxxx wrote:
>> Any more data needed?
>>
>> On Wed, Nov 9, 2016 at 9:29 AM, bobobo1618@xxxxxxxxx
>> <bobobo1618@xxxxxxxxx> wrote:
>> > Here it is after running overnight (~9h): http://ix.io/1DNi
>
> I'm getting a 500 on that URL...
>
> sage
>
>
>> >
>> > On Tue, Nov 8, 2016 at 11:00 PM, bobobo1618@xxxxxxxxx
>> > <bobobo1618@xxxxxxxxx> wrote:
>> >> Ah, I was actually mistaken. After running without Valgrind, it seems
>> >> I just underestimated how much Valgrind slowed it down. I'll leave it
>> >> to run overnight as suggested.
>> >>
>> >> On Tue, Nov 8, 2016 at 10:44 PM, bobobo1618@xxxxxxxxx
>> >> <bobobo1618@xxxxxxxxx> wrote:
>> >>> Okay, I left it for 3h and it seemed to actually stabilise at around
>> >>> 2.3G: http://ix.io/1DEK
>> >>>
>> >>> This was only after disabling other services on the system however.
>> >>> Generally this much RAM isn't available to Ceph (hence the OOM
>> >>> previously).
>> >>>
>> >>> On Tue, Nov 8, 2016 at 9:00 AM, Mark Nelson <mnelson@xxxxxxxxxx> wrote:
>> >>>> It should be running much slower through valgrind so probably won't
>> >>>> accumulate very quickly. That was the problem with the earlier trace, there
>> >>>> wasn't enough memory used yet to really get us out of the weeds. If it's
>> >>>> still accumulating quickly, try to wait until the OSD is up to 4+GB RSS if
>> >>>> you can. I usually kill the valgrind/osd process with SIGTERM to make sure
>> >>>> the output is preserved. Not sure what will happen with OOM killer as I
>> >>>> haven't let it get that far before killing.
>> >>>>
>> >>>> Mark
>> >>>>
>> >>>> On 11/08/2016 10:37 AM, bobobo1618@xxxxxxxxx wrote:
>> >>>>>
>> >>>>> Unfortunately I don't think overnight is possible. The OOM will kill it
>> >>>>> in hours, if not minutes. Will the output be preserved/usable if the
>> >>>>> process is uncleanly terminated?
>> >>>>>
>> >>>>>
>> >>>>> On 8 Nov 2016 08:33, "Mark Nelson" <mnelson@xxxxxxxxxx> wrote:
>> >>>>>
>> >>>>> Heya,
>> >>>>>
>> >>>>> Sorry got distracted with other stuff yesterday. Any chance you
>> >>>>> could run this for longer? It's tough to tell what's going on from
>> >>>>> this run unfortunately. Maybe overnight if possible.
>> >>>>>
>> >>>>> Thanks!
>> >>>>> Mark
>> >>>>>
>> >>>>>
>> >>>>>
>> >>>>> On 11/08/2016 01:10 AM, bobobo1618@xxxxxxxxx wrote:
>> >>>>>
>> >>>>> Just bumping this and CCing directly since I foolishly broke the
>> >>>>> threading on my reply.
>> >>>>>
>> >>>>>
>> >>>>> On 4 Nov. 2016 8:40 pm, "bobobo1618@xxxxxxxxx" <bobobo1618@xxxxxxxxx> wrote:
>> >>>>>
>> >>>>> > Then you can view the output data with ms_print or with
>> >>>>> > massif-visualizer. This may help narrow down where in the
>> >>>>> > code we are using the memory.
>> >>>>>
>> >>>>> Done! I've dumped the output from ms_print here:
>> >>>>> http://ix.io/1CrS
>> >>>>>
>> >>>>> It seems most of the memory comes from here:
>> >>>>>
>> >>>>> 92.78% (998,248,799B) (heap allocation functions) malloc/new/new[], --alloc-fns, etc.
>> >>>>> ->46.63% (501,656,678B) 0xD38936: ceph::buffer::create_aligned(unsigned int, unsigned int) (in /usr/bin/ceph-osd)
>> >>>>> | ->45.07% (484,867,174B) 0xDAFED9: AsyncConnection::process() (in /usr/bin/ceph-osd)
>> >>>>> | | ->45.07% (484,867,174B) 0xC410EB: EventCenter::process_events(int) (in /usr/bin/ceph-osd)
>> >>>>> | | ->45.07% (484,867,174B) 0xC45210: ??? (in /usr/bin/ceph-osd)
>> >>>>> | | ->45.07% (484,867,174B) 0xC6FA31D: execute_native_thread_routine (thread.cc:83)
>> >>>>> | | ->45.07% (484,867,174B) 0xBE06452: start_thread (in /usr/lib/libpthread-2.24.so)
>> >>>>> | | ->45.07% (484,867,174B) 0xCFCA7DD: clone (in /usr/lib/libc-2.24.so)
>> >>>>> | |
>> >>>>> | ->01.56% (16,789,504B) in 6 places, all below massif's threshold (1.00%)
>> >>>>> |
>> >>>>> ->22.70% (244,179,072B) 0x9C9807: BitMapZone::init(long, long, bool) (in /usr/bin/ceph-osd)
>> >>>>> | ->22.70% (244,179,072B) 0x9CACED: BitMapAreaLeaf::init(long, long, bool) (in /usr/bin/ceph-osd)
>> >>>>> | ->22.70% (244,179,072B) 0x9CAE88: BitMapAreaLeaf::BitMapAreaLeaf(long, long, bool) (in /usr/bin/ceph-osd)
>> >>>>> | ->22.67% (243,924,992B) 0x9CAF79: BitMapAreaIN::init(long, long, bool) (in /usr/bin/ceph-osd)
>> >>>>> | | ->12.46% (134,086,656B) 0x9CAFBE: BitMapAreaIN::init(long, long, bool) (in /usr/bin/ceph-osd)
>> >>>>> | | | ->12.46% (134,086,656B) 0x9CB237: BitAllocator::init_check(long, long, bmap_alloc_mode, bool, bool) (in /usr/bin/ceph-osd)
>> >>>>> | | | ->12.46% (134,086,656B) 0x9CB431: BitAllocator::BitAllocator(long, long, bmap_alloc_mode, bool) (in /usr/bin/ceph-osd)
>> >>>>> | | | ->12.46% (134,086,656B) 0x9C5C32: BitMapAllocator::BitMapAllocator(long, long) (in /usr/bin/ceph-osd)
>> >>>>> | | | ->12.46% (134,086,656B) 0x968FF1: Allocator::create(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, long, long) (in /usr/bin/ceph-osd)
>> >>>>> | | | ->12.46% (134,086,656B) 0x87F65C: BlueStore::_open_alloc() (in /usr/bin/ceph-osd)
>> >>>>> | | | ->12.46% (134,086,656B) 0x8D8CDD: BlueStore::mount() (in /usr/bin/ceph-osd)
>> >>>>> | | | ->12.46% (134,086,656B) 0x4C15EA: OSD::init() (in /usr/bin/ceph-osd)
>> >>>>> | | | ->12.46% (134,086,656B) 0x40854C: main (in /usr/bin/ceph-osd)
>> >>>>> | | |
>> >>>>> | | ->10.21% (109,838,336B) 0x9CB00B: BitMapAreaIN::init(long, long, bool) (in /usr/bin/ceph-osd)
>> >>>>> | | ->10.21% (109,838,336B) 0x9CB237: BitAllocator::init_check(long, long, bmap_alloc_mode, bool, bool) (in /usr/bin/ceph-osd)
>> >>>>> | | ->10.21% (109,838,336B) 0x9CB431: BitAllocator::BitAllocator(long, long, bmap_alloc_mode, bool) (in /usr/bin/ceph-osd)
>> >>>>> | | ->10.21% (109,838,336B) 0x9C5C32: BitMapAllocator::BitMapAllocator(long, long) (in /usr/bin/ceph-osd)
>> >>>>> | | ->10.21% (109,838,336B) 0x968FF1: Allocator::create(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, long, long) (in /usr/bin/ceph-osd)
>> >>>>> | | ->10.21% (109,838,336B) 0x87F65C: BlueStore::_open_alloc() (in /usr/bin/ceph-osd)
>> >>>>> | | ->10.21% (109,838,336B) 0x8D8CDD: BlueStore::mount() (in /usr/bin/ceph-osd)
>> >>>>> | | ->10.21% (109,838,336B) 0x4C15EA: OSD::init() (in /usr/bin/ceph-osd)
>> >>>>> | | ->10.21% (109,838,336B) 0x40854C: main (in /usr/bin/ceph-osd)
>> >>>>> | |
>> >>>>> | ->00.02% (254,080B) in 1+ places, all below ms_print's threshold (01.00%)
>> >>>>> |
>> >>>>> ->12.77% (137,350,728B) 0x9CACD8: BitMapAreaLeaf::init(long, long, bool) (in /usr/bin/ceph-osd)
>> >>>>> | ->12.77% (137,350,728B) 0x9CAE88: BitMapAreaLeaf::BitMapAreaLeaf(long, long, bool) (in /usr/bin/ceph-osd)
>> >>>>> | ->12.75% (137,207,808B) 0x9CAF79: BitMapAreaIN::init(long, long, bool) (in /usr/bin/ceph-osd)
>> >>>>> | | ->07.01% (75,423,744B) 0x9CAFBE: BitMapAreaIN::init(long, long, bool) (in /usr/bin/ceph-osd)
>> >>>>> | | | ->07.01% (75,423,744B) 0x9CB237: BitAllocator::init_check(long, long, bmap_alloc_mode, bool, bool) (in /usr/bin/ceph-osd)
>> >>>>> | | | ->07.01% (75,423,744B) 0x9CB431: BitAllocator::BitAllocator(long, long, bmap_alloc_mode, bool) (in /usr/bin/ceph-osd)
>> >>>>> | | | ->07.01% (75,423,744B) 0x9C5C32: BitMapAllocator::BitMapAllocator(long, long) (in /usr/bin/ceph-osd)
>> >>>>> | | | ->07.01% (75,423,744B) 0x968FF1: Allocator::create(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, long, long) (in /usr/bin/ceph-osd)
>> >>>>> | | | ->07.01% (75,423,744B) 0x87F65C: BlueStore::_open_alloc() (in /usr/bin/ceph-osd)
>> >>>>> | | | ->07.01% (75,423,744B) 0x8D8CDD: BlueStore::mount() (in /usr/bin/ceph-osd)
>> >>>>> | | | ->07.01% (75,423,744B) 0x4C15EA: OSD::init() (in /usr/bin/ceph-osd)
>> >>>>> | | | ->07.01% (75,423,744B) 0x40854C: main (in /usr/bin/ceph-osd)
>> >>>>>
>> >>>>
>> _______________________________________________
>> ceph-users mailing list
>> ceph-users@xxxxxxxxxxxxxx
>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>>
>>