Re: BAD_SEQID drops state_owner, but lock stateid can still be found

Benjamin Coddington <bcodding@xxxxxxxxxx> · Thu, 19 Mar 2015 06:13:04 -0400 (EDT)

On Wed, 18 Mar 2015, Benjamin Coddington wrote:

> I'm working on a RHEL6 bug where the client gets stuck in a
> WRITE,BAD_STATEID loop forever, and it looks like what happens is the write
> continually picks a delegated write lock stateid in nfs4_select_rw_stateid()
> for the write which never gets cleaned up in the state machine because a
> previous OPEN during recovery had a BAD_SEQID come in which dropped the
> state_owner.
>
> Does it make sense to try to find lock stateids and set NFS_LOCK_LOST if
> we're going to drop the state_owner?
>
> It may be entirely impossible to reproduce the problem upstream, but I
> figured I'd ask while I try..

I was wrong about what's actually happening here, so please ignore me
for the time being.  I didn't see that a local lock on a write delegation
doesn't have an ls_stateid, which makes perfect sense.  I still think
there's a problem, but this description of it is inaccurate.

TL;DR - Ben bashes head against state machine, gets it wrong.

Ben