Re: OOM's on the Ceph client machine

On Wed, Oct 13, 2010 at 10:29:43AM -0700, Sage Weil wrote:
> There have been a number of memory leak fixes since then, at least one of 
> which may be causing your problem (it was caused by an uninitialized 
> variable and didn't usually trigger for us, but may in your environment).  
> Can you retry with the latest mainline?  The benchmark completes without 
> problems in my test environment.

Sure.  I may not be able to retry with the latest mainline until early
next week, but I'll definitely move to 2.6.36 in the near future.

> If fsync on a single file in journal-less ext4 doesn't do any extra work, 
> I would just put the (preallocated) journal file together with the data on 
> each disk.  Usually that's bad news because of the journal flushing, but 
> you shouldn't have that problem.  Alternatively, you could use a small 
> separate partition on the same spindle.

I'm currently reformatting the Ceph cluster to put the journal for
/dev/sdX3 on /disk/sdX3/ceph.journal, so I'll try that test first, and
see what difference that makes.  That way I can make one change at a
time and see what difference each change in my cluster configuration
actually gives me.

BTW, this might be a good time to report a small problem I found.  If
the journal file doesn't exist when you run mkcephfs, cosd will attempt
to create it for you.  But it creates it as a 4k file and then loops
forever in FileJournal::wrap_read_bl() on line 808, because get_top()
and header.max_size are both 4096, which turns it into an expensive
while (1) loop.  This completely stalls the mkcephfs operation, and it
took me a while to debug.

It might be nice if cosd either (a) failed outright when the journal
file is missing or too small, or (b) created a journal file with some
suitable default size when it is started in mkfs mode and the journal
file does not exist.
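
To make (a) and (b) concrete, here is a minimal sketch of the kind of
check I have in mind.  It is not cosd's actual code: the function name,
the enum, and both size limits are hypothetical, and it only uses plain
POSIX calls (stat, open, posix_fallocate).

```cpp
#include <cerrno>
#include <string>
#include <fcntl.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

// Hypothetical limits for illustration only; not Ceph's real values.
static const off_t kMinJournalBytes = 1 << 20;        // 1 MiB
static const off_t kDefaultJournalBytes = 100 << 20;  // 100 MiB

enum class JournalCheck { Ok, Created, Missing, TooSmall, Error };

// Validate the journal file at `path`.
//  (a) An existing file smaller than min_bytes is rejected, and a
//      missing file is rejected outside of mkfs mode.
//  (b) In mkfs mode a missing file is created and preallocated to
//      default_bytes instead of being left at a tiny size.
JournalCheck check_or_create_journal(const std::string& path,
                                     bool mkfs_mode,
                                     off_t min_bytes = kMinJournalBytes,
                                     off_t default_bytes = kDefaultJournalBytes)
{
  struct stat st;
  if (::stat(path.c_str(), &st) == 0)
    return st.st_size >= min_bytes ? JournalCheck::Ok
                                   : JournalCheck::TooSmall;

  if (errno != ENOENT)
    return JournalCheck::Error;    // stat failed for some other reason
  if (!mkfs_mode)
    return JournalCheck::Missing;  // never create a journal implicitly

  int fd = ::open(path.c_str(), O_CREAT | O_EXCL | O_WRONLY, 0600);
  if (fd < 0)
    return JournalCheck::Error;

  // posix_fallocate returns an error number directly (0 on success).
  int r = ::posix_fallocate(fd, 0, default_bytes);
  ::close(fd);
  if (r != 0) {
    ::unlink(path.c_str());        // don't leave a truncated journal behind
    return JournalCheck::Error;
  }
  return JournalCheck::Created;
}
```

With something like this, the caller can turn Missing or TooSmall into
a clear error message up front, rather than spinning later in the
journal read path.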

For stuff like this, I assume the right thing to do is to just open a
bug in tracker.newdream.net?  Are there any project-specific customs I
should be aware of?

Thanks,

						- Ted