On Tue, Nov 18, 2014 at 12:54 AM, Craig Lewis <clewis@xxxxxxxxxxxxxxxxxx> wrote:
> I did have a problem in my secondary cluster that sounds similar to yours.
> I was using XFS, and traced my problem back to 64 kB inodes (osd mkfs
> options xfs = -i size=64k). This showed up as a lot of "XFS: possible
> memory allocation deadlock in kmem_alloc" messages in the kernel logs. I
> was able to keep things limping along by flushing the cache frequently,
> but I eventually re-formatted every OSD to get rid of the 64k inodes.
>
> After I finished the reformat, I had problems because of deep-scrubbing.
> While reformatting, I disabled deep-scrubbing. Once I re-enabled it, Ceph
> wanted to deep-scrub the whole cluster, and sometimes 90% of my OSDs would
> be doing a deep-scrub. I'm manually deep-scrubbing now, trying to spread
> out the schedule a bit. Once this finishes in a few days, I should be able
> to re-enable deep-scrubbing and keep my HEALTH_OK.

Would you mind re-testing with 64k inodes, following my hints or the hints in the URLs referenced at http://marc.info/?l=linux-mm&m=141607712831090&w=2? For my part, I am no longer observing the lock loop after setting min_free_kbytes to half a gigabyte per OSD. Even if your locks have a different cause, it may be worth trying anyway.
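To illustrate the min_free_kbytes part (just a sketch; the host here is assumed to carry 4 OSDs, so scale the value to your own OSD count):

    # reserve 4 x 512 MB = 2097152 kB for the kernel's free memory reserve
    sysctl -w vm.min_free_kbytes=2097152
    # persist the setting across reboots
    echo "vm.min_free_kbytes = 2097152" >> /etc/sysctl.conf

You can check the value currently in effect with "sysctl vm.min_free_kbytes" before and after the change.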
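And if you do re-test with 64k inodes, you can confirm what an OSD filesystem was actually formatted with (the mount path below is only an example, adjust it to your layout):

    xfs_info /var/lib/ceph/osd/ceph-0 | grep isize

The meta-data line should show isize=65536 for 64k inodes.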