Re: OSDs get killed by OOM when other host goes down

Yeah, if it's not memory reported by the mempools, that means it's something we aren't tracking. Perhaps temporary allocations in some dark corner of the code, or possibly rocksdb (though 38GB of RAM is obviously excessive). Heap stats are a good idea. If neither the heap stats nor the mempool stats are helpful (and if debug bluestore = 5 and debug prioritycache = 5 don't indicate any obvious problems with the autotuning code), it may require valgrind or some other method to figure out where the memory is going. If the memory is growing rapidly, wallclock profiling may help if you can catch where the allocations are being made.
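
For reference, a sketch of how the data points listed above could be gathered (osd.12 is a placeholder id, and these are standard ceph config / admin-socket calls rather than anything specific to this cluster; the "ceph daemon" calls have to run on the host, or inside the container, that owns the OSD):

    # Raise the logging mentioned above, then watch the OSD log for autotuner output
    ceph config set osd.12 debug_bluestore 5/5
    ceph config set osd.12 debug_prioritycache 5/5

    # Snapshot tcmalloc's view of the heap next to the mempool accounting
    ceph daemon osd.12 heap stats
    ceph daemon osd.12 dump_mempools

    # Drop the logging back to defaults once enough has been captured
    ceph config rm osd.12 debug_bluestore
    ceph config rm osd.12 debug_prioritycache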


Mark


On 11/16/21 2:42 PM, Josh Baergen wrote:
Hi Marius,

> However, docker stats reports 38GB for that container.
> There is a huge gap between what RAM is being used by the container and
> what ceph daemon osd.xxx dump_mempools reports.
Take a look at "ceph daemon osd.XX heap stats" and see what it says.
You might try "ceph daemon osd.XX heap release"; I didn't think that
was supposed to be necessary with BlueStore, though. This is reaching
the end of the sort of problems I know how to track down, so maybe
others have some ideas.
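
A minimal sketch of those two calls (osd.XX is the placeholder from above; as with dump_mempools, they go through the admin socket on the OSD's host or inside its container):

    ceph daemon osd.XX heap stats     # tcmalloc's totals, to compare against dump_mempools
    ceph daemon osd.XX heap release   # ask tcmalloc to return freed pages to the OS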

> How can I check if trim happens?
I'm not sure how to dig into this, but if your "up" count = your "in"
count in "ceph -s", it should be trimming.
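
(A hedged one-liner for checking that, assuming a standard deployment; it shows the same numbers "ceph -s" reports, just without the rest of the status output:

    ceph osd stat    # prints the osd up/in counts; equal counts are the condition described above
)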

Josh
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx

