* Peter Zijlstra (peterz@xxxxxxxxxxxxx) wrote:
> On Thu, 2010-05-13 at 12:31 -0400, Mathieu Desnoyers wrote:
> >
> > In addition, this would play well with mmap() too: we can simply add a
> > ring_buffer_get_mmap_offset() method to the backend (exported through another
> > ioctl) that would let user-space know the start of the mmap'd buffer range
> > currently owned by the reader. So we can inform user-space of the currently
> > owned page range without even changing the underlying memory map.
>
> I still think keeping refs to splice pages is tricky at best. Suppose
> they're spliced into the pagecache of a file, it could stay there for a
> long time under some conditions.
>
> Also, the splice-client (say the pagecache) and the mmap will both want
> the pageframe to contain different information.

[CCing memory management specialists]

You bring up a very interesting point. Let me describe what I want to achieve,
and see what others have to say about it:

I want the ring buffer to allocate pages only at ring buffer creation (never
while tracing). There are a few reasons why I want to do that, ranging from
improved performance to limited system disturbance.

Now let's suppose we have the synchronization mechanism (detailed in the
original thread, but not relevant to this part of the discussion) that lets us
give the pages to the ring buffer "reader", which sends them to splice() so it
can use them as write buffers. Let's also suppose that the ring buffer reader
blocks until the pages are written to the disk (synchronous write). In my
scheme, the reader still has pointers to these pages.

The point you bring up here is that when the ring buffer "reader" is woken up,
these pages could still be in the page cache.
So when the reader gives these pages back to the ring buffer (so they can be
used for writing again), the page cache may still hold a reference to them. The
pages in the page cache and the version on disk could then be out of sync,
which could lead to trace file corruption in the worst case.

So I have three questions here:

1 - Could we enforce removal of these pages from the page cache by calling
    "page_cache_release()" before giving these pages back to the ring buffer?

2 - Or is there a page flag we could specify when we allocate them to ask for
    these pages to never be put in the page cache? (They should still be usable
    as write buffers.)

3 - Is there something more we need to do to grab a reference on the pages
    before passing them to splice(), so that when we call page_cache_release()
    they don't get reclaimed?

Thanks,

Mathieu

--
Mathieu Desnoyers
Operating System Efficiency R&D Consultant
EfficiOS Inc.
http://www.efficios.com