Re: stable page writes: wait_on_page_writeback and packet signing

Trond Myklebust <Trond.Myklebust@xxxxxxxxxx> · Wed, 09 Mar 2011 20:41:03 -0500

On Wed, 2011-03-09 at 16:01 -0600, Steve French wrote: 
> On Wed, Mar 9, 2011 at 3:51 PM, Dave Chinner <david@xxxxxxxxxxxxx> wrote:
> > On Wed, Mar 09, 2011 at 01:44:24PM -0600, Steve French wrote:
> >> Following up on the discussion about how to avoid the copy into a
> >> temporary buffer for the case when a file system has to sign a page
> >> (or list of pages) that is going to be passed in an iovec to be
> >> written to the network or disk, I noticed that a few file systems do
> >> issue wait_on_page_writeback (nfs in nfs_writepages for example).
> >> Apparently some areas are being investigated to add something similar
> >> for ext4 for disk adapters that do crc checks on data being sent down
> >> to the disk.   In the cifs case it looks like cifs_writepages already
> >> does:
> >>
> >> if (wbc->sync_mode != WB_SYNC_NONE)
> >>                                 wait_on_page_writeback(page);
> 
> <snip>
> 
> > Sounds like a case for the same dirty page lifecycle as NFS: clean
> > -> dirty -> writeback -> unstable -> clean. i.e. the page is
> > unstable after the issuing of the IO until the response from the
> > server so the page can't be reclaimed while the IO is still in
> > progress at the server...
> 
> Except we don't need to wait that long with the page locked
> ie for a response from the cifs server (such as Samba or Windows
> or NetApp), just need to wait for it to get on the wire.
> Waiting for us to get the server response would
> take 10 or 100 times longer.   In any case we can't resend
> the same request to the server (the signature changes on the
> resend since the sequence number is incremented on every
> request/response so we have to recalc the checksum anyway) and
> cifs requests can't get lost (as with nfs over udp).  Keeping
> a page locked for 10milliseconds seems like a bad idea - but
> it is a little more complicated to implement (for the cifs case)
> so that we end page writeback (for the non-WB_SYNC)
> as quickly as reasonably possible so we don't kill perf.

So what if the server crashes, or you get some other transient error?

The NFS unstable write mechanism is there in order to deal with
imperfect servers that occasionally crash and lose cached data. If all
we had to deal with was perfect situations where all WRITE requests
succeed, then life would be much simpler...

Trond
-- 
Trond Myklebust
Linux NFS client maintainer

NetApp
Trond.Myklebust@xxxxxxxxxx
www.netapp.com

--
To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html