On 18 May 2019, at 22:15, Xuewei Zhang wrote:
On Sat, May 18, 2019 at 5:09 AM Benjamin Coddington
<bcodding@xxxxxxxxxx> wrote:
On 17 May 2019, at 17:45, Xuewei Zhang wrote:
Seems this patch introduced a bug in how lock protocol handles
GRANTED_MSG in nfs.
Yes, you're right: it's broken, and broken badly because we find
conflicting locks based on lockd's fl_pid and lockd's fl_owner, which is
current->files. That means that clients are not differentiated, and that
means that v3 locks are broken.
Thanks a lot for the quick response and confirming the problem!
I'd really like to see the fl_pid value make sense on the server when we
show it to userspace, so I think that we should stuff the svid in
fl_owner. Clearly I need to be more careful making changes here, so I am
going to take my time fixing this, and I won't get to it until Monday. A
revert would get us back to safe behavior.
From my limited understanding, b8eee0e90f97 ("lockd: Show pid of lockd
for remote locks") exists only to fix lockd after 9d5b86ac13c5
("fs/locks: Remove fl_nspid and use fs-specific..."). But I don't see
anything wrong in 9d5b86ac13c5 ("fs/locks: Remove fl_nspid and use
fs-specific..."). Could you let me know what the problem is? Thanks a lot!
If 9d5b86ac13c5 ("fs/locks: Remove fl_nspid and use fs-specific...") is
correct, we probably don't need to add another fixing patch. Perhaps
reverting b8eee0e90f97 ("lockd: Show pid of lockd for remote locks")
would be the best way then.
I think we have an existing problem: the NLM server is setting fl_owner
to current->files and (before the bad patch) fl_pid to svid.
That means that we can't tell the difference between locks from
different clients that may have the same svid. The bad patch just made
the problem far more likely to occur, which is what you're now noticing.
What needs to happen is that we generate our own fl_owner_t based on the
nlm_host and svid, so that we end up with unique fl_owner for each
client/svid pair, the same way that nlmclnt does. That way the nlm
server can let the kernel do the lock matching based on unique fl_owner
for each client/svid.
The mechanism in clntproc.c for all this can probably be shared, so I'll
try to make that common code.
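
For illustration only, the per-client owner lookup described above could
be modelled in userspace C roughly as follows. This is a minimal sketch,
not kernel code and not the code in clntproc.c: struct host, struct
lock_owner, find_owner() and free_owners() are hypothetical stand-ins for
an nlm_host, an fl_owner_t-style token, and their helpers. The point it
shows is that keying the owner on the (host, svid) pair gives two clients
that happen to send the same svid two distinct owners, while repeated
requests from one client with one svid map back to the same owner.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical model of one lock owner; its address plays the role
 * of a unique owner token for a (client, svid) pair. */
struct lock_owner {
	struct lock_owner *next;
	uint32_t svid;			/* svid as sent by the client */
};

/* Hypothetical model of one client host with its own owner list. */
struct host {
	struct lock_owner *owners;
};

/* Return the owner for (h, svid), creating it on first use.
 * Returns NULL only if allocation fails. */
static struct lock_owner *find_owner(struct host *h, uint32_t svid)
{
	struct lock_owner *o;

	assert(h != NULL);
	for (o = h->owners; o != NULL; o = o->next)
		if (o->svid == svid)
			return o;	/* same client, same svid: reuse */

	o = malloc(sizeof(*o));
	if (o == NULL)
		return NULL;
	o->svid = svid;
	o->next = h->owners;		/* owners are scoped to this host */
	h->owners = o;
	return o;
}

/* Release every owner attached to a host. */
static void free_owners(struct host *h)
{
	assert(h != NULL);
	while (h->owners != NULL) {
		struct lock_owner *o = h->owners;

		h->owners = o->next;
		free(o);
	}
}
```

Because each list hangs off its own host, comparing owner pointers is
enough to tell clients apart even when their svid values collide, which
is the property lock matching needs here.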
Jeff, any words of wisdom?
Ben