On Tue, May 14, 2024 at 5:36 PM Frank Filz <ffilzlnx@xxxxxxxxxxxxxx> wrote:
>
> > On May 14, 2024, at 2:56 PM, Olga Kornievskaia <aglo@xxxxxxxxx> wrote:
> > >
> > > Hi folks,
> > >
> > > Given that not everything for NFSv3 has a specification, I post a
> > > question here (as it concerns linux v3 (client) implementation) but I
> > > ask a generic question with respect to NOTIFY sent by an NFS server.
> >
> > There is a standard:
> >
> > https://pubs.opengroup.org/onlinepubs/9629799/chap11.htm
> >
> > > A NOTIFY message that is sent by an NFS server upon reboot has a
> > > monitor name and a state. This "state" is an integer and is modified
> > > on each server reboot. My question is: what about state value
> > > uniqueness? Is there somewhere some notion that this value has to be
> > > unique (as in say a random value).
> > >
> > > Here's a problem. Say a client has 2 mounts to ip1 and ip2 (both
> > > representing the same DNS name) and acquires a lock per mount. Now say
> > > each of those servers reboot. Once up they each send a NOTIFY call and
> > > each use a timestamp as basis for their "state" value -- which very
> > > likely is to produce the same value for 2 servers rebooted at the same
> > > time (or for the linux server that looks like a counter). On the
> > > client side, once the client processes the 1st NOTIFY call, it updates
> > > the "state" for the monitor name (ie a client monitors based on a DNS
> > > name which is the same for ip1 and ip2) and then in the current code,
> > > because the 2nd NOTIFY has the same "state" value this NOTIFY call
> > > would be ignored. The linux client would never reclaim the 2nd lock
> > > (but the application obviously would never know it's missing a lock)
> > > --- data corruption.
> > >
> > > Who is to blame: is the server not allowed to send "non-unique" state
> > > value? Or is the client at fault here for some reason?
> >
> > The state value is supposed to be specific to the monitored host. If the client is
> > indeed ignoring the second reboot notification, that's incorrect behavior, IMO.
>
> If you are using multiple server IP addresses with the same DNS name, you may want to set:
>
> sysctl fs.nfs.nsm_use_hostnames=0
>
> The NLM will register with statd using the IP address as name instead of host name. Then your two IP addresses will each have a separate monitor entry and state value monitored.

In my setup I already have this set to 0. But I'll look around the code to see what it is supposed to do.

>
> Frank
>
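
For anyone following along, the scenario can be modelled with a small sketch. This is an illustration only, not the lockd or statd code: the MonitorTable class, its methods, the host name, the addresses and the state values 3 and 5 are all made up for the example. It assumes the client remembers one state per monitor name and drops a NOTIFY whose state equals the remembered one, as described in the original question.

```python
from dataclasses import dataclass, field


@dataclass
class MonitorTable:
    """Toy model of a client-side monitor list (hypothetical, not kernel code)."""
    use_hostnames: bool                               # stand-in for nsm_use_hostnames
    last_state: dict = field(default_factory=dict)    # monitor key -> last seen state
    reclaimed: list = field(default_factory=list)     # server IPs whose locks were reclaimed

    def key(self, hostname, ip):
        # One entry per DNS name, or one entry per IP address.
        return hostname if self.use_hostnames else ip

    def monitor(self, hostname, ip, state):
        # Record the state seen when monitoring starts; keep an existing entry.
        self.last_state.setdefault(self.key(hostname, ip), state)

    def notify(self, hostname, ip, new_state):
        k = self.key(hostname, ip)
        if self.last_state.get(k) == new_state:
            return False          # same state as remembered: NOTIFY is ignored
        self.last_state[k] = new_state
        self.reclaimed.append(ip)  # assumption: reclaim only for the notifying server
        return True


def simulate(use_hostnames):
    table = MonitorTable(use_hostnames)
    # Two mounts, two addresses, one DNS name (example values).
    table.monitor("nfs.example.com", "192.0.2.1", 3)
    table.monitor("nfs.example.com", "192.0.2.2", 3)
    # Both servers reboot and happen to send the same new state.
    table.notify("nfs.example.com", "192.0.2.1", 5)
    table.notify("nfs.example.com", "192.0.2.2", 5)
    return table.reclaimed


print(simulate(use_hostnames=True))
print(simulate(use_hostnames=False))
```

With name-based keys only the first address ends up in the reclaimed list; with per-address keys, which is what Frank describes for nsm_use_hostnames=0, both do. Whether the real client behaves this way with the sysctl set to 0 is the part that still needs checking in the code.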