Re: stable page writes: wait_on_page_writeback and packet signing

Jeff Layton <jlayton@xxxxxxxxxx> · Thu, 10 Mar 2011 08:47:24 -0500

On Thu, 10 Mar 2011 08:32:09 -0500
Chris Mason <chris.mason@xxxxxxxxxx> wrote:

> Excerpts from Jeff Layton's message of 2011-03-10 08:16:38 -0500:
> > On Thu, 10 Mar 2011 04:26:31 -0800 (PST)
> > Chris Mason <chris.mason@xxxxxxxxxx> wrote:
> > 
> > > Excerpts from Steve French's message of 2011-03-09 17:13:06 -0500:
> > > > On Wed, Mar 9, 2011 at 3:58 PM, Chris Mason <chris.mason@xxxxxxxxxx> wrote:
> > > > > Excerpts from Dave Chinner's message of 2011-03-09 16:51:48 -0500:
> > > > >> On Wed, Mar 09, 2011 at 01:44:24PM -0600, Steve French wrote:
> > > > >> > Have alternative approaches, other than using wait_on_page_writeback,
> > > > >> > been considered for solving the stable page write problem in similar
> > > > >> > cases (since only about 1 out of 5 linux file systems uses this call
> > > > >> > today).
> > > > >>
> > > > >> I think that is incorrect. write_cache_pages() does:
> > > > >>
> > > > >>  929                         lock_page(page);
> > > > >> .....
> > > > >>  950                         if (PageWriteback(page)) {
> > > > >>  951                                 if (wbc->sync_mode != WB_SYNC_NONE)
> > > > >>  952                                         wait_on_page_writeback(page);
> > > > >>  953                                 else
> > > > >>  954                                         goto continue_unlock;
> > > > >>  955                         }
> > > > >>  956
> > > > >>  957                         BUG_ON(PageWriteback(page));
> > > > >>  958                         if (!clear_page_dirty_for_io(page))
> > > > >>  959                                 goto continue_unlock;
> > > > >>  960
> > > > >>  961                         trace_wbc_writepage(wbc, mapping->backing_dev_info);
> > > > >>  962                         ret = (*writepage)(page, wbc, data);
> > > > >>
> > > > >> so every filesystem using the generic_writepages code already does
> > > > >> this check and wait before .writepage is called. Hence only the
> > > > >> filesystems that do not use generic_writepages() or
> > > > >> mpage_writepages() need a specific check, and that means most
> > > > >> filesystems are actually waiting on writeback pages correctly.
> > > > >
> > > > > But checking here just means we don't start writeback on a page that is
> > > > > writeback, which is a good idea but not really related to stable pages?
> > > > >
> > > > > stable pages means we don't let mmap'd pages or file_write muck around
> > > > > with the pages while they are in writeback, so we need to wait in
> > > > > file_write and page_mkwrite.
> > > > 
> > > > Isn't the file_write case covered by the i_mutex as
> > > > Documentation/filesystems/Locking implies (for write_begin/write_end).
> > > > 
> > > 
> > > Does cifs take i_mutex before writepage? The disk based filesystems
> > > don't.  So, i_mutex protects file_write from other procs jumping into
> > > file_write, but it doesn't protect writeback from file_write jumping in
> > > and changing the pages while they are being sent to storage (or over the
> > > wire).
> > > 
> > > Basically the model needs to be:
> > > 
> > > file_write:
> > >     lock the page
> > >     wait on page writeback
> > > 
> > >     < new writeback cannot start because of the page lock >
> > >     copy_from_user
> > >     unlock the page
> > > 
> > > We also use page_mkwrite to get notified when userland wants to change
> > > some page it has given to mmap.  That needs to wait on page writeback as
> > > well.
> > > 
> > 
> > No, cifs doesn't take the i_mutex in writepage, but the page is locked.
> > cifs_write_begin calls grab_cache_page_write_begin, which returns a
> > locked page and it's not unlocked until cifs_write_end.
> 
> Ah ok, so you've got the page locked the whole time it is being sent
> over the wire?  The disk based filesystems split it and drop the page
> lock once the page is set writeback, which is why we need the extra
> waits.
> 
> So in your case you should just need a page_mkwrite that locks the page.
> 

Right, or do a wait_on_page_writeback. I think that may have a little
less overhead since we won't need to unlock it and it may mean less
serialization if there are other contenders for the page lock.

-- 
Jeff Layton <jlayton@xxxxxxxxxx>
--
To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html