RE: [nfsv4] 4.1 client - LAYOUTCOMMIT & close

On Wed, 2010-07-07 at 16:39 -0400, Daniel.Muntz@xxxxxxx wrote:
> To bring this discussion full circle, since we agree that a compliant
> server can implement a scheme where written data does not become visible
> until after a LAYOUTCOMMIT, do we also agree that LAYOUTCOMMIT is a
> "MUST" from a compliant client (independent of layout type)?

Yes. I would agree that the client cannot rely on the updates being made
visible if it fails to send the LAYOUTCOMMIT. My point was simply that a
compliant server MUST also have a valid strategy for dealing with the
case where the client doesn't send it.
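
One way a server might handle that case can be sketched in Python. This is an illustration only, not any real server's implementation: the `File` and `CachingClient` classes, the `lease_expired` hook, and the `keep_data` switch are hypothetical names. The sketch replays the two-client scenario quoted further down. It assumes the server either publishes the orphaned write together with a change attribute bump, or discards it.

```python
from dataclasses import dataclass, field


@dataclass
class File:
    """Hypothetical server-side state for one file (MDS view plus DS view)."""
    committed: bytes = b""      # data visible to every client
    change_attr: int = 0        # change attribute as reported by GETATTR
    pending: dict = field(default_factory=dict)  # client id -> uncommitted DS write

    def write_via_ds(self, client: str, data: bytes) -> None:
        # The data reaches the data server but is not published yet.
        self.pending[client] = data

    def layoutcommit(self, client: str) -> None:
        # Publish the data and bump the change attribute in the same step.
        if client in self.pending:
            self.committed = self.pending.pop(client)
            self.change_attr += 1

    def lease_expired(self, client: str, keep_data: bool) -> None:
        # Assumed recovery policy for a writer that died before LAYOUTCOMMIT:
        # publish the write (with an attribute bump) or throw it away.
        if keep_data:
            self.layoutcommit(client)
        else:
            self.pending.pop(client, None)


@dataclass
class CachingClient:
    """Hypothetical client that revalidates its cache on OPEN (close-to-open)."""
    cached_attr: int = -1
    cached_data: bytes = b""

    def open_and_read(self, f: File) -> bytes:
        if f.change_attr != self.cached_attr:  # cache is stale, refetch
            self.cached_attr = f.change_attr
            self.cached_data = f.committed
        return self.cached_data


def run_scenario(keep_data: bool) -> bytes:
    f = File(committed=b"old")
    c1 = CachingClient()
    c1.open_and_read(f)                    # client 1: OPEN, READ, CLOSE
    f.write_via_ds("client2", b"new")      # client 2: LAYOUTGET, WRITE via DS
    f.lease_expired("client2", keep_data)  # client 2 dies, no LAYOUTCOMMIT
    return c1.open_and_read(f)             # client 1: OPEN, verify change_attr, READ
```

Under either policy, client 1 reads exactly what the server has published, because the data never becomes visible without a matching change attribute update.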

Cheers
  Trond

>   -Dan
> 
> > -----Original Message-----
> > From: nfsv4-bounces@xxxxxxxx [mailto:nfsv4-bounces@xxxxxxxx] On Behalf Of Trond Myklebust
> > Sent: Wednesday, July 07, 2010 7:04 AM
> > To: Benny Halevy
> > Cc: andros@xxxxxxxxxx; linux-nfs@xxxxxxxxxxxxxxx; Garth Gibson; Brent Welch; NFSv4
> > Subject: Re: [nfsv4] 4.1 client - LAYOUTCOMMIT & close
> > 
> > On Wed, 2010-07-07 at 16:51 +0300, Benny Halevy wrote:
> > > On Jul. 07, 2010, 16:18 +0300, Trond Myklebust <Trond.Myklebust@xxxxxxxxxx> wrote:
> > > > On Wed, 2010-07-07 at 09:06 -0400, Trond Myklebust wrote:
> > > >> On Wed, 2010-07-07 at 15:05 +0300, Benny Halevy wrote:
> > > >>> On Jul. 06, 2010, 23:40 +0300, Trond Myklebust <trond.myklebust@xxxxxxxxxx> wrote:
> > > >>>> On Tue, 2010-07-06 at 15:20 -0400, Daniel.Muntz@xxxxxxx wrote:
> > > >>>>> The COMMIT to the DS, ttbomk, commits data on the DS. I see it as
> > > >>>>> orthogonal to updating the metadata on the MDS (but perhaps I'm wrong).
> > > >>>>> As sjoshi@bluearc mentioned, the LAYOUTCOMMIT provides a synchronization
> > > >>>>> point, so even if the non-clustered server does not want to update
> > > >>>>> metadata on every DS I/O, the LAYOUTCOMMIT could also be a trigger to
> > > >>>>> execute whatever synchronization mechanism the implementer wishes to put
> > > >>>>> in the control protocol.
> > > >>>>
> > > >>>> As far as I'm aware, there are no exceptions in RFC5661 that would allow
> > > >>>> pNFS servers to break the rule that any visible change to the data must
> > > >>>> be atomically accompanied with a change attribute update.
> > > >>>>
> > > >>>
> > > >>> Trond, I'm not sure how this rule you mentioned is specified.
> > > >>>
> > > >>> See more in section 12.5.4 and 12.5.4.1. LAYOUTCOMMIT and change/time_modify
> > > >>> in particular:
> > > >>>
> > > >>>    For some layout protocols, the storage device is able to notify the
> > > >>>    metadata server of the occurrence of an I/O; as a result, the change
> > > >>>    and time_modify attributes may be updated at the metadata server.
> > > >>>    For a metadata server that is capable of monitoring updates to the
> > > >>>    change and time_modify attributes, LAYOUTCOMMIT processing is not
> > > >>>    required to update the change attribute.  In this case, the metadata
> > > >>>    server must ensure that no further update to the data has occurred
> > > >>>    since the last update of the attributes; file-based protocols may
> > > >>>    have enough information to make this determination or may update the
> > > >>>    change attribute upon each file modification.  This also applies for
> > > >>>    the time_modify attribute.  If the server implementation is able to
> > > >>>    determine that the file has not been modified since the last
> > > >>>    time_modify update, the server need not update time_modify at
> > > >>>    LAYOUTCOMMIT.  At LAYOUTCOMMIT completion, the updated attributes
> > > >>>    should be visible if that file was modified since the latest previous
> > > >>>    LAYOUTCOMMIT or LAYOUTGET
> > > >>
> > > >> I know. However the above paragraph does not state that the server
> > > >> should make those changes visible to clients other than the one that is
> > > >> writing.
> > > >>
> > > >> Section 18.32.4 states that writes will cause the time_modified and
> > > >> change attributes to be updated (if and only if the file data is
> > > >> modified). Several other sections rely on this behaviour, including
> > > >> section 10.3.1, section 11.7.2.2, and section 11.7.7.
> > > >>
> > > >> The only 'special behaviour' that I see allowed for pNFS is in section
> > > >> 13.10, which states that clients can't expect to see changes
> > > >> immediately, but that they must be able to expect close-to-open
> > > >> semantics to work. Again, if this is to be the case, then the server
> > > >> _must_ be able to deal with the case where client 1 dies before it can
> > > >> issue the LAYOUTCOMMIT.
> > > 
> > > Agreed.
> > > 
> > > >>
> > > >>
> > > >>>> As I see it, if your server allows one client to read data that may have
> > > >>>> been modified by another client that holds a WRITE layout for that range
> > > >>>> then (since that is a visible data change) it should provide a change
> > > >>>> attribute update irrespective of whether or not a LAYOUTCOMMIT has been
> > > >>>> sent.
> > > >>>
> > > >>> the requirement for the server in WRITE's implementation section
> > > >>> is quite weak: "It is assumed that the act of writing data to a file will
> > > >>> cause the time_modified and change attributes of the file to be updated."
> > > >>>
> > > >>> The difference here is that for pNFS the written data is not guaranteed
> > > >>> to be visible until LAYOUTCOMMIT.  In a broader sense, assuming the clients
> > > >>> are caching dirty data and use a write-behind cache, application-written data
> > > >>> may be visible to other processes on the same host but not to others until
> > > >>> fsync() or close() - open-to-close semantics are the only thing the client
> > > >>> guarantees, right?  Issuing LAYOUTCOMMIT on fsync() and close() ensure the
> > > >>> data is committed to stable storage and is visible to all other clients in
> > > >>> the cluster.
> > > >>
> > > >> See above. I'm not disputing your statement that 'the written data is
> > > >> not guaranteed to be visible until LAYOUTCOMMIT'. I am disputing an
> > > >> assumption that 'the written data may be visible without an accompanying
> > > >> change attribute update'.
> > > > 
> > > > 
> > > > In other words, I'd expect the following scenario to give the same
> > > > results in NFSv4.1 w/pNFS as it does in NFSv4:
> > > 
> > > That's a strong requirement that may limit the scalability of the server.
> > >
> > > The spirit of the pNFS operations, at least from Panasas perspective was that
> > > the data is transient until LAYOUTCOMMIT, meaning it may or may not be visible
> > > to clients other than the one who wrote it, and its associated metadata MUST
> > > be updated and describe the new data only on LAYOUTCOMMIT and until then it's
> > > undefined, i.e. it's up to the server implementation whether to update it or not.
> > >
> > > Without locking, what do the stronger semantics buy you?
> > > Even if a client verified the change_attribute new data may become visible
> > > at any time after the GETATTR if the file/byte range aren't locked.
> > 
> > There is no locking needed in the scenario below: it is ordinary
> > close-to-open semantics.
> > 
> > The point is that if you remove the one and only way that clients have
> > to determine whether or not their data caches are valid, then they can
> > no longer cache data at all, and server scalability will be shot to
> > smithereens anyway.
> > 
> > Trond
> > 
> > > Benny
> > > 
> > > > 
> > > > Client 1			Client 2
> > > > ========			========
> > > > 
> > > > OPEN foo
> > > > READ
> > > > CLOSE
> > > > 				OPEN
> > > > 				LAYOUTGET ...
> > > > 				WRITE via DS
> > > > 				<dies>...
> > > > OPEN foo
> > > > verify change_attr
> > > > READ if above WRITE is visible
> > > > CLOSE
> > > > 
> > > > Trond
> > > > _______________________________________________
> > > > nfsv4 mailing list
> > > > nfsv4@xxxxxxxx
> > > > https://www.ietf.org/mailman/listinfo/nfsv4
> > 

