On Wed, 2019-02-27 at 12:52 -0500, David Noveck wrote:
> > However note that the counter argument to what you state above is
> > that _if_ the server requires a layoutcommit before it will
> > acknowledge a file size change, then pNFS is likely to underperform
> > for applications such as databases or VMs where each record is
> > required to be written in stable mode.
> > IOW: If all writes that need to be stable are also required to be
> > acknowledged with a layoutcommit (to the MDS),
>
> But it is not true that *all* writes that need to be stable are also
> required to be acknowledged with a layoutcommit (to the MDS). Only
> those that potentially change the file size require this.

That's true for POSIX O_DSYNC writes, but it is not true for O_SYNC. In
the latter case, the timestamps are required to be updated
synchronously as well, which implies a layoutcommit.

> > then your ability to
> > scale out your server will be in doubt
>
> For many applications, particularly databases, it will be easy to make
> sure that the writes that potentially change the file size are few and
> far between.

If the database uses O_DSYNC, yes.

> On Tue, Feb 26, 2019 at 8:12 PM Trond Myklebust
> <trondmy@xxxxxxxxxxxxxxx> wrote:
> > On Wed, 2019-02-27 at 00:13 +0000, Rick Macklem wrote:
> > > Trond Myklebust wrote:
> > > [stuff snipped]
> > > > Please see the Errata ID 2751
> > > > http://www.rfc-editor.org/errata/eid2751
> > >
> > > I'll admit I hadn't seen this errata before. However, it seems to
> > > be specific to the File Layout. For the Flexible File Layout...
> > >
> > > When I look in RFC-8435, I cannot find anything that states that a
> > > LayoutCommit is only required for case(s) where a Commit to the
> > > Storage Server is required.
> > > Sec. 2.1 clearly states that a Commit to the Storage Server is
> > > required before the client does a LayoutCommit when the write(s)
> > > were not done FILE_SYNC.
> > > However, I do not see any indication that the LayoutCommit is not
> > > to be done for the case where the write(s) are done FILE_SYNC.
> > >
> > > FF_FLAGS_NO_LAYOUTCOMMIT can be used to indicate to a client that
> > > LayoutCommits are not required, but this is not dependent on how
> > > the write(s) to the Storage Server were done.
> > >
> > > The only way a Flexible File layout Metadata server can know what
> > > the current file size is (when a read/write layout is issued to a
> > > client) is to do a Getattr to the Storage Server.
> > > If a client is not required to do a LayoutCommit when the write(s)
> > > to the Storage Server are done FILE_SYNC, then the Metadata server
> > > must do Getattr RPCs to the Storage Server whenever it needs an up
> > > to date file size if a read/write layout is issued to a client.
> > >
> > > This can result in a lot of overhead that can be avoided by
> > > requiring the LayoutCommit to be done by a client after writing to
> > > a Storage Server, irrespective of the need for a Commit to the
> > > Storage Server.
> > > As such, I would rather not have this errata applied to RFC-8435.
> > >
> >
> > Fair enough. I agree that the errata in question only applies to the
> > pNFS files layout, however you were talking about RFC5661 and
> > whether or not we were interpreting that correctly. Since RFC5661
> > only refers to the behaviour of the pNFS files layout, I assumed
> > that was what you were referring to.
> >
> > For flexfiles we may have a bug in the layoutcommit case. However
> > note that the counter argument to what you state above is that _if_
> > the server requires a layoutcommit before it will acknowledge a file
> > size change, then pNFS is likely to underperform for applications
> > such as databases or VMs where each record is required to be written
> > in stable mode.
> > IOW: If all writes that need to be stable are also required to be
> > acknowledged with a layoutcommit (to the MDS), then your ability to
> > scale out your server will be in doubt.
> >

-- 
Trond Myklebust
Linux NFS client maintainer, Hammerspace
trond.myklebust@xxxxxxxxxxxxxxx
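
The O_DSYNC versus O_SYNC point made at the top of this message can be illustrated with a small counting model. This is only a toy sketch, not anything taken from the NFS client or server: the function name `layoutcommits_needed` and the (offset, length) representation of a write are hypothetical. It simply assumes, as argued in the thread, that a stable O_DSYNC write needs a layoutcommit only when it may change the file size, while an O_SYNC write always needs one because the timestamps must be updated synchronously too.

```python
def layoutcommits_needed(writes, initial_size, mode):
    """Count layoutcommits for a sequence of stable writes (toy model).

    writes: iterable of (offset, length) pairs (hypothetical representation).
    initial_size: file size in bytes before the first write.
    mode: "O_DSYNC" or "O_SYNC".
    """
    if mode not in ("O_DSYNC", "O_SYNC"):
        raise ValueError(f"unknown mode: {mode!r}")
    size = initial_size
    commits = 0
    for offset, length in writes:
        end = offset + length
        extends_file = end > size
        # Assumption from the thread: O_SYNC also requires synchronous
        # timestamp updates, so every write implies a layoutcommit.
        # O_DSYNC needs one only when the file size may change.
        if mode == "O_SYNC" or extends_file:
            commits += 1
        size = max(size, end)
    return commits


# Database-like workload: 1000 in-place 4 KiB record updates inside a
# preallocated 4 MiB file, followed by a single append at the end.
RECORD = 4096
FILE_SIZE = 4 * 1024 * 1024
workload = [(i * RECORD, RECORD) for i in range(1000)]
workload.append((FILE_SIZE, RECORD))

dsync = layoutcommits_needed(workload, FILE_SIZE, "O_DSYNC")
sync = layoutcommits_needed(workload, FILE_SIZE, "O_SYNC")
print(dsync, sync)
```

Under these assumptions the O_DSYNC case needs a layoutcommit only for the one appending write, whereas the O_SYNC case needs one per write, which is the scaling concern raised for the MDS.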