> On Oct 7, 2020, at 10:34 AM, Frank Filz <ffilzlnx@xxxxxxxxxxxxxx> wrote:
>
>> -----Original Message-----
>> From: J. Bruce Fields [mailto:bfields@xxxxxxxxxxxx]
>> Maybe I overlooked the obvious: if Chrome holds a lock on that file when you
>> suspend, and if you stay in suspend for longer than the NFSv4 lease time (default
>> 90 seconds), then the client will lose its lease, hence any file locks. I think these
>> days the client then returns EIO on any further IO to that file descriptor.
>>
>> Maybe there's some way to turn off that locking as a workaround.
>>
>> The simplest thing we can do to help might be implementing "courteous server"
>> behavior: instead of automatically removing locks after a client's lease expires,
>> it can wait until there's an actual lock conflict. That might be enough for your
>> case.
>>
>> There's been a little planning done and it's not a big project, but I don't think it's
>> actually at the top of anyone's todo list right now, so I'm not sure when that will
>> get done.
>
> I've had courtesy locks on my back burner for Ganesha though I hadn't thought about that there might actually be an important practical issue.

We've found that instantly bringing the hammer down on NFSv4 leases has negative operational consequences in environments where minutes-long network partitions are part of life. Extending the lease period impacts the length an NFS server is in grace after a reboot, so it's not always a good solution.

> Does any other server implement them? If we suggest this as a solution to the Chrome suspend issue, it might be good to assure that the major server vendors implement this.

We think OnTAP does, at least.

> There is a problem with the courtesy locks for this solution though... The clientid is still going to be expired, and the locks are associated with the clientid, so unless we allow courtesy re-instatement of expired clientids, courtesy locks don't actually solve the problem...

An NFSv4 server is not required to expire a lease after the lease period expires. A courteous server would simply allow a conflicting lock request to take an expired lock after a client's lease expired. If no conflicting lock operations occur, then the missing client could come back and find its lease state intact (unless of course the server has restarted or purged the lease for other reasons).

Oracle has an open design document that can be posted here for more comment and review. We agree that this is much better server behavior and would like more server implementations to adopt it.

> Option - use NFSv3 instead :-) The lack of lock expiry due to AWOL client would work in a suspended client's favor...

Note also that a suspended client could be a VM, for example, VirtualBox allows saving and suspending a VM in running state.

>
> Interesting problem...
>
> Frank
>

--
Chuck Lever
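
For illustration only, the courteous behavior described above could be modeled roughly as follows. This is a toy sketch, not code from any NFS server: the names CourteousLockTable, register, renew and lock are hypothetical, time is passed in explicitly as seconds, and treating every lock request as a lease renewal is a simplification. The only value taken from the thread is the 90-second default lease time.

```python
from dataclasses import dataclass, field

LEASE_SECONDS = 90  # default lease time mentioned in the thread


@dataclass
class Client:
    client_id: str
    last_renewal: float
    locks: set = field(default_factory=set)  # paths this client has locked


class CourteousLockTable:
    """Toy model: an expired client keeps its state until a conflict occurs."""

    def __init__(self):
        self.clients = {}   # client_id -> Client
        self.owner_of = {}  # path -> client_id of the lock holder

    def register(self, client_id, now):
        self.clients[client_id] = Client(client_id, now)

    def renew(self, client_id, now):
        """Return True if the client's state survived, False if it was purged."""
        client = self.clients.get(client_id)
        if client is None:
            return False
        # Courtesy: accept the renewal even if the lease period has passed.
        client.last_renewal = now
        return True

    def _expired(self, client, now):
        return now - client.last_renewal > LEASE_SECONDS

    def _purge(self, client):
        # Drop the client record and every lock associated with it.
        for path in client.locks:
            del self.owner_of[path]
        del self.clients[client.client_id]

    def lock(self, client_id, path, now):
        """Return True if the lock is granted to client_id."""
        if not self.renew(client_id, now):
            return False  # requester has no state on this server
        holder_id = self.owner_of.get(path)
        if holder_id is not None and holder_id != client_id:
            holder = self.clients[holder_id]
            if not self._expired(holder, now):
                return False  # real conflict with a live client
            # The holder's lease has lapsed and someone else wants the lock:
            # only now is the expired client's state removed.
            self._purge(holder)
        self.owner_of[path] = client_id
        self.clients[client_id].locks.add(path)
        return True
```

In this model a client that sleeps past the lease period and then calls renew() finds its locks intact as long as nobody asked for a conflicting lock in the meantime; if somebody did, renew() reports that the state is gone.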