On 05/14/2013 10:44 AM, Alex Elder wrote: > On 05/09/2013 09:42 AM, Jim Schutt wrote: >> Ceph's encode_caps_cb() worked hard to not call __page_cache_alloc while >> holding a lock, but it's spoiled because ceph_pagelist_addpage() always >> calls kmap(), which might sleep. Here's the result: > > I finally took a close look at this today, Jim. Sorry > for the delay. > No worries - thanks for taking a look. > The issue is formatting the reconnect message--which will > hold an arbitrary amount of data and therefore which we'll > need to do some allocation (and kmap) for--in the face of > having to hold the flock spinlock while doing so. > > And as you found, ceph_pagelist_addpage(), which is called > by ceph_pagelist_append(), calls kmap() even if it doesn't > need to allocate anything. This means that despite reserving > the pages, those pages are in the free list and because they'll > need to be the subject of kmap() their preallocation doesn't > help. > > Your solution was to pre-allocate a buffer, format the locks > into that buffer while holding the lock, then append the > buffer contents to a pagelist after releasing the lock. You > check for a changing (increasing) lock count while you format > the locks, which is good. > > So... Given that, I think your change looks good. It's a shame > we can't format directly into the pagelist buffer but this won't > happen much so it's not a big deal. I have a few small suggestions, > below. > > I do find some byte order bugs though. They aren't your doing, > but I think they ought to be fixed first, as a separate patch > that would precede this one. The bug is that the lock counts > that are put into the buffer (num_fcntl_locks and num_flock_locks) > are not properly byte-swapped. I'll point it out inline > in your code, below. > > I'll say that what you have is OK. Consider my suggestions, and > if you choose not to fix the byte order bugs, please let me know. I'll happily fix up a v2 series with your suggestions addressed. Thanks for catching those issues. Stay tuned... Thanks -- Jim -- To unsubscribe from this list: send the line "unsubscribe ceph-devel" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html