On Thu, Aug 1, 2019 at 11:41 AM Olga Kornievskaia <olga.kornievskaia@xxxxxxxxx> wrote:
>
> On Thu, Aug 1, 2019 at 11:13 AM J. Bruce Fields <bfields@xxxxxxxxxxxx> wrote:
> >
> > On Thu, Aug 01, 2019 at 10:12:11AM -0400, Olga Kornievskaia wrote:
> > > On Wed, Jul 31, 2019 at 5:51 PM J. Bruce Fields <bfields@xxxxxxxxxx> wrote:
> > > >
> > > > On Wed, Jul 31, 2019 at 05:10:01PM -0400, Olga Kornievskaia wrote:
> > > > > I'm having difficulty with this patch because there is no good way to
> > > > > know when the copy_notify stateid can be freed. What I can propose is
> > > > > to have the Linux client send a FREE_STATEID with the copy_notify
> > > > > stateid and use that as the trigger to free the state. In that case,
> > > > > I'll keep a reference on the parent until the FREE_STATEID is
> > > > > received.
> > > > >
> > > > > This is not in the spec (though seems like a good idea to tell the
> > > > > source server it's ok to clean up) so other implementations might not
> > > > > choose this approach so we'll have problems with stateids sticking
> > > > > around.
> > > >
> > > > https://tools.ietf.org/html/rfc7862#page-71
> > > >
> > > > "If the cnr_lease_time expires while the destination server is
> > > > still reading the source file, the destination server is allowed
> > > > to finish reading the file. If the cnr_lease_time expires
> > > > before the destination server uses READ or READ_PLUS to begin
> > > > the transfer, the source server can use NFS4ERR_PARTNER_NO_AUTH
> > > > to inform the destination server that the cnr_lease_time has
> > > > expired."
> > > >
> > > > The spec doesn't really define what "is allowed to finish reading the
> > > > file" means, but I think the source server should decide somehow whether
> > > > the target's done. And "hasn't sent a read in cnr_lease_time" seems
> > > > like a pretty good conservative definition that would be easy to
> > > > enforce.
> > >
> > > "hasn't sent a read in cnr_lease_time" is already enforced.
> > >
> > > The problem is when the copy did start in normal time, it might take
> > > unknown time to complete. If we limit copies to all be done within a
> > > cnr_lease_time or even some number of that, we'll get into problems
> > > when files are large enough or network is slow enough that it will
> > > make this method unusable.
> >
> > No, I'm just suggesting that if it's been more than cnr_lease_time since
> > the target server last sent a read using this stateid, then we could
> > free the stateid.
>
> That's reasonable. Let me do that.

Now that I need a global list for the copy_notify stateids, do you have
a preference for keeping it on the nfs4_client structure or the nfsd_net
structure? I store async copies under the nfs4_client structure, but the
laundromat traverses things in the nfsd_net structure.

>
> > > > Worst case, if the network goes down for a couple minutes and
> > > > the target tries to pick up a copy where it left off, it'll get
> > > > PARTNER_NO_AUTH. I assume that results in the same error being returned
> > > > to the client, at which point the client knows that the copy_notify stateid
> > > > may have installed and can do what it chooses to recover (like send a
> > > > new copy_notify).
> > >
> > > Yes the client recovers but the cost of setting up the source server
> > > to destination is huge so any retries would kill the performance.
> >
> > In the rare case when the server goes an entire cnr_lease_time between
> > reads, the performance hit of recovery won't be an issue.
> >
> > > > The FREE_STATEID might also be a good idea, but I guess we can't count
> > > > on it.
> > > >
> > > > Maybe the spec could use some errata to clarify that FREE_STATEID is
> > > > allowed on copy_notify stateids, that clients should send it when
> > > > they're done, and that servers are allowed to expire copy_notify
> > > > stateid's even after their first use.
> > >
> > > FREE_STATEID is for a stateid
> >
> > The discussion of FREE_STATEID in 4.1 says "The FREE_STATEID operation
> > is used to free a stateid that no longer has any associated locks
> > (including opens, byte-range locks, delegations, and layouts)." A
> > clarification that it can be used for any stateid would be nice. (Is
> > that true? Do we want it for COPY stateid's too?)
>
> We don't need it for the COPY stateids as there is an OFFLOAD_CANCEL if
> the client wants to stop, otherwise, the destination server has no
> problems with knowing when to free the copy stateid.
> >
> > --b.
> >
> > > which a copy_notify (or copy) stateid is so I don't see anything that
> > > really needs any extra stating.
> > >
> > > I think what's needed is specifying that for COPY_NOTIFY a client must
> > > do a FREE_STATEID when it's done with a stateid.
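
For reference, here is a rough userspace model of the idle-timeout rule agreed on above: a copy_notify stateid stays valid as long as the destination server keeps sending reads, and is freed once more than cnr_lease_time has passed since the last one. This is only a sketch in Python, not nfsd code. The class name, method names, and the single flat table are all made up for illustration, and time is passed in explicitly so the behavior is easy to follow.

```python
from dataclasses import dataclass, field

NFS4_OK = "NFS4_OK"
NFS4ERR_PARTNER_NO_AUTH = "NFS4ERR_PARTNER_NO_AUTH"


@dataclass
class CopyNotifyTable:
    """Toy model of a source server's copy_notify stateids (hypothetical names)."""
    cnr_lease_time: float                          # seconds
    last_used: dict = field(default_factory=dict)  # stateid -> time of last use

    def copy_notify(self, stateid, now):
        # Issuing the stateid starts the idle clock.
        self.last_used[stateid] = now

    def read(self, stateid, now):
        # A READ from the destination server refreshes the idle clock,
        # unless the stateid is gone or has already been idle too long.
        last = self.last_used.get(stateid)
        if last is None or now - last > self.cnr_lease_time:
            self.last_used.pop(stateid, None)
            return NFS4ERR_PARTNER_NO_AUTH
        self.last_used[stateid] = now
        return NFS4_OK

    def free_stateid(self, stateid):
        # Optional early release, if the client does send FREE_STATEID.
        return self.last_used.pop(stateid, None) is not None

    def laundromat(self, now):
        # Periodic sweep: free every stateid idle for more than cnr_lease_time.
        expired = [s for s, t in self.last_used.items()
                   if now - t > self.cnr_lease_time]
        for s in expired:
            del self.last_used[s]
        return expired


table = CopyNotifyTable(cnr_lease_time=90)
table.copy_notify("cn1", now=0)
table.read("cn1", now=60)    # within the lease, clock refreshed
table.read("cn1", now=140)   # long copy, but only 80s since the last read
table.laundromat(now=231)    # 91s idle, so "cn1" is freed here
table.read("cn1", now=232)   # a late read now gets PARTNER_NO_AUTH
```

With this rule a copy can run for any length of time as long as the gap between reads stays under cnr_lease_time, and FREE_STATEID is just an optimization. The sketch deliberately does not answer the nfs4_client versus nfsd_net question; it only shows what the sweep has to check.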