On 20 May 2019, at 9:12, Benjamin Coddington wrote:
On 18 May 2019, at 22:15, Xuewei Zhang wrote:
On Sat, May 18, 2019 at 5:09 AM Benjamin Coddington <bcodding@xxxxxxxxxx> wrote:
On 17 May 2019, at 17:45, Xuewei Zhang wrote:
Seems this patch introduced a bug in how lock protocol handles
GRANTED_MSG in nfs.
Yes, you're right: it's broken, and broken badly because we find
conflicting locks based on lockd's fl_pid and lockd's fl_owner, which is
current->files. That means that clients are not differentiated, and that
means that v3 locks are broken.
Thanks a lot for the quick response and confirming the problem!
I'd really like to see the fl_pid value make sense on the server when we
show it to userspace, so I think that we should stuff the svid in
fl_owner. Clearly I need to be more careful making changes here, so I am
going to take my time fixing this, and I won't get to it until Monday. A
revert would get us back to safe behavior.
From my limited understanding, b8eee0e90f97 ("lockd: Show pid of lockd
for remote locks") exists only for fixing lockd in 9d5b86ac13c5
("fs/locks: Remove fl_nspid and use fs-specific...").
But I don't see anything wrong in 9d5b86ac13c5 ("fs/locks: Remove
fl_nspid and use fs-specific..."). Could you let me know what's the
problem? Thanks a lot!
If 9d5b86ac13c5 ("fs/locks: Remove fl_nspid and use fs-specific...") is
correct, we probably don't need to add another fixing patch. Perhaps
reverting b8eee0e90f97 ("lockd: Show pid of lockd for remote locks")
would be the best way then.
I think we have an existing problem: the NLM server is setting fl_owner
to current->files and (before the bad patch) fl_pid to svid. That means
that we can't tell the difference between locks from different clients
that may have the same svid. The bad patch just made the problem far more
likely to occur, which is what you're now noticing.
Ok, I just noticed that we set fl_owner to the nlm_host in
nlm4svc_retrieve_args, so things are not as dire as I thought. What would
be nice is a sane set of tests for NLM.
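
To make the three cases discussed above concrete, here is a rough
userspace sketch. It is not kernel code: the struct, enum, and function
names are made up for illustration, and it assumes that two locks count
as the same owner when both fl_owner and fl_pid match. The last case also
assumes fl_pid carries the svid, as it did before the bad patch.

```c
#include <stdbool.h>
#include <stdint.h>

/* Simplified stand-in for the owner fields of a lock (hypothetical). */
struct model_lock {
    const void *fl_owner; /* opaque owner token */
    uint32_t fl_pid;      /* pid recorded on the lock */
};

/* Stand-ins for lockd's own current->files and pid (made-up values). */
static const int lockd_files = 0;
#define LOCKD_PID 1234u

enum owner_scheme {
    LOCKD_FOR_BOTH, /* fl_owner = lockd's files, fl_pid = lockd's pid */
    FILES_AND_SVID, /* fl_owner = lockd's files, fl_pid = client svid */
    HOST_AND_SVID   /* fl_owner = per-client host, fl_pid = client svid */
};

/* Assumed comparison: same owner only if both fields are equal. */
static bool model_same_owner(const struct model_lock *a,
                             const struct model_lock *b)
{
    return a->fl_owner == b->fl_owner && a->fl_pid == b->fl_pid;
}

/* Fill in the owner fields for a request from (host, svid). */
static struct model_lock model_make_lock(enum owner_scheme scheme,
                                         const void *host, uint32_t svid)
{
    struct model_lock fl;

    switch (scheme) {
    case LOCKD_FOR_BOTH:
        fl.fl_owner = &lockd_files;
        fl.fl_pid = LOCKD_PID;
        break;
    case FILES_AND_SVID:
        fl.fl_owner = &lockd_files;
        fl.fl_pid = svid;
        break;
    default: /* HOST_AND_SVID */
        fl.fl_owner = host;
        fl.fl_pid = svid;
        break;
    }
    return fl;
}

/* True if two requests are indistinguishable under the given scheme. */
static bool model_requests_collide(enum owner_scheme scheme,
                                   const void *host_a, uint32_t svid_a,
                                   const void *host_b, uint32_t svid_b)
{
    struct model_lock a = model_make_lock(scheme, host_a, svid_a);
    struct model_lock b = model_make_lock(scheme, host_b, svid_b);

    return model_same_owner(&a, &b);
}
```

In this model, LOCKD_FOR_BOTH makes every request look like the same
owner, FILES_AND_SVID confuses two clients only when their svids happen
to match, and HOST_AND_SVID keeps different clients apart even with equal
svids.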
Since we already were placing the nlm_host in fl_owner, I think reverting
9d5b86ac13c5 at this point is the proper thing to do.
Ben