On Thu, Aug 01, 2019 at 10:12:11AM -0400, Olga Kornievskaia wrote:
> On Wed, Jul 31, 2019 at 5:51 PM J. Bruce Fields <bfields@xxxxxxxxxx> wrote:
> >
> > On Wed, Jul 31, 2019 at 05:10:01PM -0400, Olga Kornievskaia wrote:
> > > I'm having difficulty with this patch because there is no good way to
> > > know when the copy_notify stateid can be freed. What I can propose is
> > > to have the linux client send a FREE_STATEID with the copy_notify
> > > stateid and use that as the trigger to free the state. In that case,
> > > I'll keep a reference on the parent until the FREE_STATEID is
> > > received.
> > >
> > > This is not in the spec (though seems like a good idea to tell the
> > > source server it's ok to clean up) so other implementations might not
> > > choose this approach so we'll have problems with stateids sticking
> > > around.
> >
> > https://tools.ietf.org/html/rfc7862#page-71
> >
> > "If the cnr_lease_time expires while the destination server is
> > still reading the source file, the destination server is allowed
> > to finish reading the file. If the cnr_lease_time expires
> > before the destination server uses READ or READ_PLUS to begin
> > the transfer, the source server can use NFS4ERR_PARTNER_NO_AUTH
> > to inform the destination server that the cnr_lease_time has
> > expired."
> >
> > The spec doesn't really define what "is allowed to finish reading the
> > file" means, but I think the source server should decide somehow whether
> > the target's done. And "hasn't sent a read in cnr_lease_time" seems
> > like a pretty good conservative definition that would be easy to
> > enforce.
>
> "hasn't send a read in cnr_lease_time" is already enforced.
>
> The problem is when the copy did start in normal time, it might take
> unknown time to complete. If we limit copies to all be done with in a
> cnr_lease_time or even some number of that, we'll get into problems
> when files are large enough or network is slow enough that it will
> make this method unusable.

No, I'm just suggesting that if it's been more than cnr_lease_time since
the target server last sent a read using this stateid, then we could
free the stateid.

> > Worst case, if the network goes down for a couple minutes and
> > the target tries to pick up a copy where it left off, it'll get
> > PARTNER_NO_AUTH. I assume that results in the same error being returned
> > the client, at which point the client knows that the copy_notify stateid
> > may have installed and can do what it chooses to recover (like send a
> > new copy_notify).
>
> Yes the client recovers but the cost of setting up the source server
> to destination is huge so any retries would kill the performance.

In the rare case when the server goes an entire cnr_lease_time between
reads, the performance hit of recovery won't be an issue.

> > The FREE_STATEID might also be a good idea, but I guess we can't count
> > on it.
> >
> > Maybe the spec could use some errata to clarify that FREE_STATEID is
> > allowed on copy_notify stateids, that clients should send it when
> > they're done, and that servers are allowed to expire copy_notify
> > stateid's even after their first use.
>
> FREE_STATEID is for a stateid

The discussion of FREE_STATEID in 4.1 says "The FREE_STATEID operation
is used to free a stateid that no longer has any associated locks
(including opens, byte-range locks, delegations, and layouts)." A
clarification that it can be used for any stateid would be nice. (Is
that true? Do we want it for COPY stateid's too?)

--b.

> which a copy_notify (or copy) stateid is so I don't see anything that
> really needs any extra stating.
>
> I think what's needed is specifying that for COPY_NOTIFY a client must
> do a FREE_STATEID when its done with a stateid.
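
For illustration only, the expiry rule proposed in the reply above (free the
copy_notify stateid once more than cnr_lease_time has passed since the target
last sent a read with it) might be sketched roughly as below. This is not the
nfsd code: struct cn_state and the cn_* helpers are hypothetical names, and
times are assumed to be plain seconds from some monotonic clock.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical per-stateid bookkeeping; not a real nfsd structure. */
struct cn_state {
	uint64_t last_read;   /* time of issue, then of the latest READ (seconds) */
	uint64_t lease_time;  /* cnr_lease_time handed out by COPY_NOTIFY (seconds) */
};

/* Start the clock when the copy_notify stateid is issued. */
void cn_init(struct cn_state *s, uint64_t now, uint64_t lease_time)
{
	assert(lease_time > 0);
	s->last_read = now;
	s->lease_time = lease_time;
}

/* Call on every READ or READ_PLUS that uses the stateid. */
void cn_note_read(struct cn_state *s, uint64_t now)
{
	if (now > s->last_read)
		s->last_read = now;
}

/*
 * True once MORE than lease_time has elapsed since the last read (or since
 * issue, if no read ever arrived).  A long copy never expires as long as
 * reads keep coming at least once per lease period.
 */
bool cn_expired(const struct cn_state *s, uint64_t now)
{
	return now > s->last_read && now - s->last_read > s->lease_time;
}
```

A periodic cleanup pass could call cn_expired() on each outstanding
copy_notify stateid and free those for which it returns true; a read that
arrives afterwards would then get NFS4ERR_PARTNER_NO_AUTH, as in the worst
case discussed above.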