Re: [RFC] relaxed barrier semantics

Vivek Goyal <vgoyal@xxxxxxxxxx> · Thu, 29 Jul 2010 16:02:17 -0400

On Thu, Jul 29, 2010 at 10:42:25AM +0200, Christoph Hellwig wrote:
> On Wed, Jul 28, 2010 at 10:43:34PM -0400, Vivek Goyal wrote:
> > I guess we will require something like set_buffer_preflush_fua() kind of
> > operation so that we preflush the cache to make sure everything before
> > commit block is on platter and then do commit block write with FUA
> > to make sure commit block is on platter.
> 
> No more messing with buffer flags for barriers / cache flush options
> please.  It's a flag for the I/O submission, not buffer state.  See
> my patch from June to remove BH_Ordered if you're interested.

> 
> > This is assuming that before issuing commit block request we have waited
> > for completion of rest of the journal data. This will make sure none of
> > that journal data is in request queue. Then if we issue commit with 
> > preflush and FUA, it should make sure all the journal blocks are on
> > disk and then commit block is on disk.
> > 
> > So as long as we wait in filesystem for completion of the requests commit
> > block is dependent on, before we issue commit request, we should not
> > require request queue drain and preflush and FUA write probably should
> > be fine.
> 
> We do not require the drain for that case.  The flush is more difficult,
> because it's entirely possible that we have state that we require to be
> on disk before writing out a log buffer.  For XFS that's two cases:
> 
>  (1) we require the actual file data to be on disk before logging the
>      file size update to avoid stale data exposure in case the log
>      buffer hits the disk before the data
>  (2) we require that the buffers writing back metadata actually made it
>      to disk before pushing the log tail
> 
> (1) means we'll always need a pre-flush when a log buffer contains a size
> update from an appending write.
> (2) means we need more complicated tracking of the tail lsn, e.g.
> by caching it somewhere and only updating the cached value after a
> cache flush happened, with a way to force one if needed.
> 
> All that is at least as complicated as it sounds.  While I have a
> working prototype just going with the relaxed barriers as a first step
> is probably.
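
To check my understanding of case (2), here is a rough userspace model of
the "cached tail lsn" idea. This is only a sketch: struct tail_tracker and
all the helper names are made up for illustration and are not XFS code. The
assumption is that the tail may only advance to an LSN whose metadata
writeback had completed before a cache flush was issued, so the value is
snapshotted when the flush starts.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model, not XFS code. */
struct tail_tracker {
	uint64_t durable_tail;	/* tail LSN covered by a completed cache flush */
	uint64_t pending_tail;	/* tail LSN implied by completed writeback */
};

/* Metadata writeback up to 'lsn' completed; it may still sit in the cache. */
static void tail_writeback_done(struct tail_tracker *t, uint64_t lsn)
{
	if (lsn > t->pending_tail)
		t->pending_tail = lsn;
}

/* Snapshot what the flush we are about to issue will cover. */
static uint64_t tail_flush_start(const struct tail_tracker *t)
{
	return t->pending_tail;
}

/* The flush finished: only the snapshotted value is known to be durable. */
static void tail_flush_done(struct tail_tracker *t, uint64_t snapshot)
{
	if (snapshot > t->durable_tail)
		t->durable_tail = snapshot;
}

/* True if pushing the tail to 'wanted' would need a (forced) cache flush. */
static bool tail_needs_flush(const struct tail_tracker *t, uint64_t wanted)
{
	return wanted > t->durable_tail;
}
```

Writeback that completes while a flush is in flight is deliberately not
promoted by that flush, which is the part that makes the tracking awkward.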

There are so many mails on this topic now that I am kind of lost. I guess
this has already been asked but I will ask one more time.

It looks like you still want to go with option 2, where you will scan the
file system code for any requirement of DRAIN semantics and, if everything is
fine, mark the request queue as NONE for devices not supporting volatile
caches.

This solves the problem on devices with WCE=0, but what about devices with
WCE=1? If file systems don't require DRAIN semantics anyway, then we
should not require it on devices with WCE=1 either, should we?

If so, then why not go with another variant of barriers which doesn't
perform DRAIN and just does PREFLUSH + FUA (or a post flush for devices not
supporting FUA)? File systems can then slowly move to using this
non-draining barrier wherever appropriate.

The advantage here is that it should save us request queue DRAIN even
on devices with WCE=1. 

Am I missing something very obvious here?

Vivek