Re: Session timeout on RHEL6.2

Trond Myklebust <Trond.Myklebust@xxxxxxxxxx> · Sun, 25 Dec 2011 14:25:08 +0100

On Sun, 2011-12-25 at 14:03 +0200, Benny Halevy wrote: 
> On 2011-12-25 11:47, Trond Myklebust wrote:
> > On Sun, 2011-12-25 at 06:37 +0200, Benny Halevy wrote: 
> >> On 2011-12-21 22:11, Tigran Mkrtchyan wrote:
> >>> On Wed, Dec 21, 2011 at 2:57 PM, Trond Myklebust
> >>> <Trond.Myklebust@xxxxxxxxxx> wrote:
> >>>> On Wed, 2011-12-21 at 10:24 +0100, Tigran Mkrtchyan wrote:
> >>>>> Dear friends,
> >>>>>
> >>>>> We are observing strange behavior with RHEL 6.2:
> >>>>>
> >>>>> Our server lease time is 90 seconds. I can see that the client
> >>>>> sends SEQUENCE every 60 seconds, and this goes on for some hours (~8).
> >>>>> At some point the client sends SEQUENCE after 127 seconds and
> >>>>> gets, as expected, EXPIRED.
> >>>>
> >>>> Why shouldn't the client be allowed to let the lease expire if nothing
> >>>> is using that filesystem?
> >>>>
> >>>>> At this point I have to blame myself.
> >>>>> The client comes back with EXCHANGE_ID using the same clientid.
> >>>>> We had not garbage-collected the clientid internally, as this happens
> >>>>> after 2*LEASE_TIME,
> >>>>> and we return EXPIRED. This ping-pong never ends.
> >>>>>
> >>>>> This is probably mostly a bug on my side. Nevertheless, we never observed
> >>>>> a late SEQUENCE with kernels > 2.6.39. A short packet dump is attached.
> >>>>>
> >>>>> I can open a bug at RHEL if required.
> >>>>
> >>>> I wouldn't consider that a bug.
> >>>
> >>> As I said, there is a bug in EXCHANGE_ID processing (case 3) on my
> >>> side. But to me it sounds strange that the client, after more than 8
> >>> hours of sending only SEQUENCE, decided to send one of them later than
> >>> the lease time, especially since we did not see this with other kernels.
> >>
> >> I'm inclined to agree.  The client can let the lease expire for sure,
> >> and that's not a bug, but the fact that the client sent the SEQUENCE operation
> >> after the lease had expired indicates it might not be aware of that fact,
> >> and that seems to be a client bug.
> >>
> >> That said, I don't think that letting the lease expire when the client is idle
> >> is the most polite thing to do. Why let the server clean up after the client
> >> and revert to possibly un-optimized recovery paths rather than orderly
> >> destruction of the state by the client?
> > 
> > There are plenty of cases where the client can be idle for hours or even
> > _days_. What's the point of pinging the server all the time after
> > working hours?
> > 
> > If someone wants to code up a DESTROY_SESSION and DESTROY_CLIENTID in
> > order to make it formal, then fine, however note that we don't even do
> > that on a full unmount today.
> > 
> 
> The heavy lifting is releasing locks and returning layouts and delegations.
> Sending DESTROY_{SESSION,CLIENTID} would be nice to have, but I don't think
> it's the most important issue.

Actually, that requirement to return state is what makes
DESTROY_CLIENTID a completely useless operation.
Forget what I said then: it's too stupid to implement...
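
For reference, the timing reported at the start of this thread can be
sketched roughly as below. This is only an illustrative model, not the
code of any server or of the Linux client: the function name
lease_state and the state labels are hypothetical, and the only values
taken from the thread are the 90 second lease, the 60 and 127 second
SEQUENCE intervals, and the 2*LEASE_TIME garbage-collection point.

```python
LEASE_TIME = 90  # seconds; the server lease time reported in the thread


def lease_state(seconds_since_renewal, lease_time=LEASE_TIME):
    """Classify a client record by the time since its last lease renewal.

    Hypothetical model of the behaviour described in the thread:
      - "valid":       renewed within the lease time
      - "expired":     lease has lapsed, but the server still holds the
                       clientid record (not yet garbage collected)
      - "collectable": 2 * lease_time has passed, so the server may
                       garbage-collect the clientid
    """
    if seconds_since_renewal <= lease_time:
        return "valid"
    if seconds_since_renewal < 2 * lease_time:
        return "expired"
    return "collectable"


# Intervals taken from the report: regular SEQUENCE every 60 s, then one
# late SEQUENCE after 127 s; 180 s is the 2 * LEASE_TIME boundary.
for interval in (60, 127, 180):
    print(interval, lease_state(interval))
```

In this model the late SEQUENCE at 127 seconds lands in the window
where the lease has lapsed but the clientid has not yet been collected,
which is the window in which the reported EXPIRED ping-pong occurs.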

-- 
Trond Myklebust
Linux NFS client maintainer

NetApp
Trond.Myklebust@xxxxxxxxxx
www.netapp.com
