> On Nov 27, 2023, at 11:36 AM, Christoph Hellwig <hch@xxxxxxxxxxxxx> wrote:
>
> On Mon, Nov 27, 2023 at 03:28:16PM +0000, Tao Lyu wrote:
>>
>> O_APPEND | O_DIRECT can be used to bypass the client cache for multiple threads writing data without caring about ordering (e.g., logs).
>>
>> Yes, to support O_APPEND | O_DIRECT, NFS must first support APPEND.
>> But the key point is that it looks like NFS already supports O_APPEND.
>> I can successfully open a file with "O_RDWR|O_APPEND".
>>
>> My confusion is why NFS supports O_RDWR and O_APPEND individually but does not support this combination.

O_DIRECT is supposed to not depend on any cached information, including the file size. The client needs to know the file size to form an NFS WRITE with the correct offset, which is what makes the write an appending write. File sizes are managed on the server, so the server needs to know that the client is requesting an appending write in order to know where to put the payload.

> Well, it does support O_RDWR|O_APPEND, just not with O_DIRECT?
>
> Btw, I think an APPEND operation in NFS would be a very good idea, and
> I'd love to work with interested parties in the IETF on it.

You can write and submit a personal draft that describes it; it wouldn't need to be more than a few pages. The hard part would be accumulating use case descriptions.

I think you could create a proof of concept by including a VERIFY operation in front of the WRITE to ensure the WRITE occurs only if the offset argument in the WRITE agrees with the file's size on the server. If the VERIFY fails, the client grabs the updated file size and tries again.

> Note that
> we (Damien to be specific) plan to add support to Linux to also report
> the actual offset an O_APPEND write wrote to through io_uring as we
> have various use cases for out-of-place write data stores for that.
> It would be great to also support that programming model over NFS.

--
Chuck Lever
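
P.S. To make the VERIFY-before-WRITE idea above concrete, here is a rough sketch of the client-side retry loop, written as a self-contained Python simulation and not as real NFS client code. Everything in it is hypothetical: MockServer, getattr_size(), verify_then_write() and append() are names made up for illustration. The mock also treats the VERIFY + WRITE pair as atomic, which is an assumption of the sketch and not a statement about what a real server guarantees.

```python
import threading


class MockServer:
    """Hypothetical stand-in for a server that holds a single file."""

    def __init__(self):
        self._data = bytearray()
        self._lock = threading.Lock()

    def getattr_size(self):
        # Models the client fetching the current file size from the server.
        with self._lock:
            return len(self._data)

    def verify_then_write(self, expected_size, payload):
        # Models VERIFY(size == expected_size) placed in front of
        # WRITE(offset=expected_size). ASSUMPTION: the pair is atomic here.
        # Returns True if the WRITE was applied, False if the VERIFY failed.
        with self._lock:
            if len(self._data) != expected_size:
                return False
            self._data += payload
            return True


def append(server, payload, max_retries=1000):
    """Append payload without trusting any cached size; return its offset."""
    for _ in range(max_retries):
        size = server.getattr_size()          # grab the updated file size
        if server.verify_then_write(size, payload):
            return size                       # payload landed at this offset
        # VERIFY failed: another writer extended the file, so try again.
    raise RuntimeError("append did not succeed within max_retries")
```

A caller would do something like `offset = append(server, b"log line\n")`. Returning the offset mirrors the "report the actual offset" programming model mentioned in the quoted text, though only within this toy model. Under heavy contention the loop can retry many times, which is one reason a dedicated APPEND operation might be preferable to this workaround.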