On Wed, 2021-06-30 at 11:22 -0400, J. Bruce Fields wrote:
> On Tue, Jun 29, 2021 at 01:51:43PM +0000, Chuck Lever III wrote:
> >
> > > On Jun 29, 2021, at 9:48 AM, Olga Kornievskaia <aglo@xxxxxxxxx> wrote:
> > >
> > > On Tue, Jun 29, 2021 at 8:58 AM Chuck Lever III <chuck.lever@xxxxxxxxxx> wrote:
> > > >
> > > > > On Jun 28, 2021, at 6:06 PM, Trond Myklebust <trondmy@xxxxxxxxxxxxxxx> wrote:
> > > > >
> > > > > On Mon, 2021-06-28 at 16:23 -0400, Olga Kornievskaia wrote:
> > > > > > Hi folks,
> > > > > >
> > > > > > I have a general question of why the client doesn't throw away
> > > > > > the cached server's capabilities on server reboot. Say a client
> > > > > > mounted a server when the server didn't support security_labels,
> > > > > > then the server was rebooted and support was enabled. Client
> > > > > > re-establishes its clientid/session, recovers state, but assumes
> > > > > > all the old capabilities apply. A remount is required to clear
> > > > > > old/find new capabilities. The opposite is true that a
> > > > > > capability could be removed (but I'm assuming that's a less
> > > > > > practical example).
> > > > > >
> > > > > > I'm curious what are the problems of clearing server
> > > > > > capabilities and rediscovering them on reboot? Is it because a
> > > > > > local filesystem could never have its attributes changed and
> > > > > > thus a network file system can't either?
> > > > > >
> > > > > > Thank you.
> > > > >
> > > > > In my opinion, the client should aim for the absolute minimum
> > > > > overhead on a server reboot. The goal should be to recover state
> > > > > and get I/O started again as quickly as possible.
> > > >
> > > > I 100% agree with the above. However...
> > > >
> > > > > Detection of new features, etc can wait until the client needs
> > > > > to restart.
> > > >
> > > > A server reboot can be part of a failover to a different server. I
> > > > think capability discovery needs to happen as part of server reboot
> > > > recovery, it can't be optimized away.
> > >
> > > Can you clarify what you mean by a "failover to a different server"?
> >
> > IP-based failover means that a server can crash, and its partner can
> > detect that and take over the IP address and exports of the failed
> > server. The replacement server doesn't have to have exactly the same
> > set of capabilities.
>
> So it could also lose capabilities?
>
> I'm a little nervous about server features being changed out from under
> the client while the client has the server mounted.
>
> But, I don't know, looking quickly through the list of NFS_CAP_*
> definitions in nfs_fs_sb.h, I'm not coming up with a case where we
> couldn't handle it, maybe it's OK.
>
> --b.

I'm not taking any patches for the server reboot case. If someone wants
to do it for the migration case, then fine: that's not a case that is
common or that requires performance.
However reprobing all mounted filesystems on every server reboot is
NACKed.

-- 
Trond Myklebust
Linux NFS client maintainer, Hammerspace
trond.myklebust@xxxxxxxxxxxxxxx
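
The policy stated in the reply above (keep the cached capabilities across a server reboot, reprobe only when a filesystem migrates) can be sketched roughly as follows. This is an illustration only, not code from the Linux NFS client: every `demo_*` and `DEMO_*` name is hypothetical, the capability bits merely stand in for the kernel's NFS_CAP_* flags, and the probe function just stores a value supplied by the caller instead of querying a server.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical capability bits, standing in for the kernel's NFS_CAP_* flags. */
#define DEMO_CAP_SECURITY_LABEL (1u << 0)
#define DEMO_CAP_ACLS           (1u << 1)

/* Events after which the client has to recover its state. */
enum demo_recovery_event {
    DEMO_SERVER_REBOOT, /* the server restarted; clientid/session are re-established */
    DEMO_MIGRATION,     /* the filesystem moved to a different server */
};

/* Hypothetical per-mount record of what the server advertised. */
struct demo_server {
    uint32_t caps;        /* capability bits cached by the last probe */
    unsigned probe_count; /* how many probes have been issued */
};

/*
 * Stand-in for asking the server what it supports. A real client would
 * send requests; here the caller passes in the advertised set.
 */
static void demo_probe_caps(struct demo_server *srv, uint32_t advertised)
{
    srv->caps = advertised;
    srv->probe_count++;
}

/*
 * Recovery policy: a reboot keeps the cached capabilities so that I/O can
 * resume with minimal overhead; only a migration triggers a reprobe.
 * Returns true if the capabilities were reprobed.
 */
static bool demo_recover(struct demo_server *srv,
                         enum demo_recovery_event event,
                         uint32_t advertised)
{
    if (event != DEMO_MIGRATION)
        return false;

    demo_probe_caps(srv, advertised);
    return true;
}
```

Under this sketch, a server that gains security label support across a reboot is still seen with its old capability set until a migration or a remount, which is the behaviour the original question describes.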