RE: Zombie / Orphan open files

> -----Original Message-----
> From: Chuck Lever III <chuck.lever@xxxxxxxxxx>
> 
> Almost. The protocol requires:
> 
> After the client reboots, when it opens its first file, the client
> does a SETCLIENTID or EXCHANGE_ID to establish its lease on the
> server. All OPEN and LOCK state is managed under the umbrella of
> that lease (and that includes all files that client is managing).
> The client keeps the lease alive by renewing the lease every minute.
> 
> If the client reboots (ie, does a subsequent SETCLIENTID or
> EXCHANGE_ID with a new boot verifier), the server has to purge all
> open file state for that client.
> 
> If the client fails to renew its lease, the server is free to do
> what it wants -- it can purge the client's lease immediately, or
> it can wait until conflicting opens or locks come from other
> clients and then purge some or all of that client's lease.
> 
> If the client can't or doesn't CLOSE that file, it will remain
> on the server until the client tells it (implicitly by not
> renewing or explicitly with a fresh ID) that the state is no
> longer needed; or until the server reboots and the client does
> not re-establish the OPEN state.

So, in general, this is true:
  - A lease is not "issued" for every file opened
  - A lease is not "issued" for every user running on an NFS-client host
  - In general, one lease is issued / managed for each NFS-client host
(If this is true, my server vendor is probably not forgetting to do
  something they should be doing.)
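
To make that summary concrete, here is a toy model of the bookkeeping
described in the quoted text. It is only a sketch, not any real server's
implementation: the class name LeaseTable, its method names, and the
90-second lease period are all made up for illustration, and
purge_expired() picks just one of the behaviors the server is allowed
to choose (purging as soon as the lease lapses).

```python
import time


class LeaseTable:
    """Toy model of per-client lease state on a server (illustrative only)."""

    def __init__(self, lease_seconds=90):
        # lease_seconds is an arbitrary value chosen for this sketch.
        self.lease_seconds = lease_seconds
        # One entry per client host, not per file and not per user.
        self.clients = {}

    def set_client_id(self, client_id, boot_verifier, now=None):
        """Model SETCLIENTID / EXCHANGE_ID: establish or refresh the lease."""
        now = time.time() if now is None else now
        entry = self.clients.get(client_id)
        if entry is None or entry["verifier"] != boot_verifier:
            # New client, or same client with a new boot verifier (reboot):
            # all previous open state for that client is discarded.
            entry = {"verifier": boot_verifier, "renewed": now, "opens": set()}
            self.clients[client_id] = entry
        else:
            entry["renewed"] = now
        return entry

    def open(self, client_id, path):
        """Record an OPEN under the client's single lease."""
        self.clients[client_id]["opens"].add(path)

    def close(self, client_id, path):
        """Record a CLOSE; without it the open stays on the server."""
        self.clients[client_id]["opens"].discard(path)

    def renew(self, client_id, now=None):
        """Keep the lease alive."""
        now = time.time() if now is None else now
        self.clients[client_id]["renewed"] = now

    def purge_expired(self, now=None):
        """Drop every client whose lease has lapsed; return their ids."""
        now = time.time() if now is None else now
        expired = [cid for cid, e in self.clients.items()
                   if now - e["renewed"] > self.lease_seconds]
        for cid in expired:
            del self.clients[cid]
        return expired
```

In this model, 2000 opens from one host are 2000 entries in a single
"opens" set under one lease. As long as that host keeps renewing, none
of them go away unless it sends a CLOSE or comes back with a new boot
verifier.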


> But again, we need some way to confirm exactly how this is
> happening. Can you post your script, or capture client-server
> network traffic while the script does its thing?
> 

The script is about as simple as "hello world":

import sys
import fileinput
import os.path
import re
import time

def main():

   StartID=int(raw_input("Enter Start ID: "))

   TestDir=os.path.normcase('/nashome/r/romero/stuckopentest/dataout')

   FPlist=[]

   # open 2000 files and leave them open
   for x in range(StartID, StartID+2000):

      TestFilePath=os.path.join(TestDir, "TestFile-" + str(x))
      print(TestFilePath)

      # open the file and append the file object to the list
      FPlist.append(open(TestFilePath,"w"))



   # sleep for longer than the Kerberos ticket lifetime;
   # the 2000 files will be "stuck open" on the server
   time.sleep(60*60*24)


main()
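
The script above uses raw_input, so it needs Python 2 as written. For
anyone who wants to try the same thing on Python 3, a rough equivalent
might look like the sketch below. The function name open_and_hold and
its parameters are hypothetical names introduced here; the behavior is
meant to match the original (open the files for writing and never close
them).

```python
import os


def open_and_hold(test_dir, start_id, count=2000):
    """Open `count` files for writing and return the still-open file objects.

    The caller must keep the returned list alive; if the file objects are
    garbage-collected, Python closes them and nothing stays open.
    """
    handles = []
    for x in range(start_id, start_id + count):
        path = os.path.join(test_dir, "TestFile-" + str(x))
        print(path)
        # Open the file and keep the file object so it is not closed.
        handles.append(open(path, "w"))
    return handles
```

To reproduce the original run, call it with the same directory and a
start ID read via input(), for example
held = open_and_hold('/nashome/r/romero/stuckopentest/dataout', start_id),
and then time.sleep(60*60*24) while "held" is still in scope.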


NOTE:

I don't expect people on this list to debug my issue.

My reasons for posting:

- Determine if my NAS vendor might be accidentally
  not doing something they should be.
  ( I now don't really think this is the case. )


- Determine if this is a known behavior common to all NFS implementations
   ( Linux, ....etc ) and, if so, determine whether it is a problem that
   should be addressed in the spec and the implementations.





























