> -----Original Message-----
> From: Chuck Lever III <chuck.lever@xxxxxxxxxx>
>
> Almost. The protocol requires:
>
> After the client reboots, when it opens its first file, the client
> does a SETCLIENTID or EXCHANGE_ID to establish its lease on the
> server. All OPEN and LOCK state is managed under the umbrella of
> that lease (and that includes all files that client is managing).
> The client keeps the lease alive by renewing the lease every minute.
>
> If the client reboots (ie, does a subsequent SETCLIENTID or
> EXCHANGE_ID with a new boot verifier), the server has to purge all
> open file state for that client.
>
> If the client fails to renew its lease, the server is free to do
> what it wants -- it can purge the client's lease immediately, or
> it can wait until conflicting opens or locks come from other
> clients and then purge some or all of that client's lease.
>
> If the client can't or doesn't CLOSE that file, it will remain
> on the server until the client tells it (implicitly by not
> renewing or explicitly with a fresh ID) that the state is no
> longer needed; or until the server reboots and the client does
> not re-establish the OPEN state.

So, in general, this is true:

- A lease is not "issued" for every file opened
- A lease is not "issued" for every user running on an NFS-client host
- In general, one lease is issued / managed for each NFS-client host

( If this is true, my server vendor is probably not forgetting to do
something they should be doing. )

> But again, we need some way to confirm exactly how this is
> happening. Can you post your script, or capture client-server
> network traffic while the script does its thing?
>

The script is about as simple as "hello world":

import sys
import fileinput
import os.path
import re
import time

def main():
    StartID=int(raw_input("Enter Start ID: "))
    TestDir=os.path.normcase('/nashome/r/romero/stuckopentest/dataout')
    FPlist=[]

    # open 2000 files and leave them open
    for x in range(StartID, StartID+2000):
        TestFilePath=os.path.join(TestDir, "TestFile-" + str(x))
        print(TestFilePath)
        # open file, append file pointer to list
        FPlist.append(open(TestFilePath,"w"))

    # sleep for longer than the Krb ticket lifetime
    # 2000 files will be "stuck open" on the server
    time.sleep(60*60*24)

main()

NOTE: I don't expect people on this list to debug my issue.

My reasons for posting:

- Determine if my NAS vendor might be accidentally not doing something
  they should be. ( I now don't really think this is the case. )

- Determine if this is a known behavior common to all NFS
  implementations ( Linux, ....etc ) and, if so, let you determine
  whether this is a problem that should be addressed in the spec and
  the implementations.
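
As a sanity check on my reading of the lease rules quoted above, here is a toy Python model of them. It is only a sketch of the description in this thread, not how any real server works: the class name, the method names, and the 90-second lease time are all assumptions made up for illustration, and it purges expired leases eagerly even though the quoted text says a server may wait.

```python
# Toy model of the lease rules quoted in this thread. Not a real server:
# every class and method name below is hypothetical.

class ToyLeaseServer(object):
    def __init__(self, lease_seconds=90):
        # Assumed lease time; the real value is server configuration.
        self.lease_seconds = lease_seconds
        # client_id -> {"verifier": ..., "renewed": ..., "opens": set()}
        self.clients = {}

    def set_client_id(self, client_id, boot_verifier, now):
        """Stand-in for SETCLIENTID / EXCHANGE_ID: one lease per client."""
        lease = self.clients.get(client_id)
        if lease is None or lease["verifier"] != boot_verifier:
            # New client, or a known client with a new boot verifier
            # (it rebooted): all previous open state is dropped.
            lease = {"verifier": boot_verifier, "opens": set()}
            self.clients[client_id] = lease
        lease["renewed"] = now

    def open_file(self, client_id, path):
        # Opens are tracked under the client's single lease.
        self.clients[client_id]["opens"].add(path)

    def close_file(self, client_id, path):
        self.clients[client_id]["opens"].discard(path)

    def renew(self, client_id, now):
        self.clients[client_id]["renewed"] = now

    def purge_expired(self, now):
        """Drop every lease not renewed within lease_seconds."""
        expired = [cid for cid, lease in self.clients.items()
                   if now - lease["renewed"] > self.lease_seconds]
        for cid in expired:
            del self.clients[cid]
        return expired

    def open_count(self, client_id):
        lease = self.clients.get(client_id)
        return len(lease["opens"]) if lease else 0


server = ToyLeaseServer()
server.set_client_id("client-a", "boot-1", now=0)
for x in range(2000):
    server.open_file("client-a", "TestFile-%d" % x)
print(server.open_count("client-a"))    # 2000 opens under one lease

server.renew("client-a", now=60)
print(server.purge_expired(now=100))    # [] because the lease was renewed

server.set_client_id("client-a", "boot-2", now=120)
print(server.open_count("client-a"))    # 0 after a new boot verifier
```

In this model the 2000 opens from my script stay on the server for as long as the client keeps renewing its one lease, which matches the behavior I am seeing.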