RE: Zombie / Orphan open files

> From: Jeff Layton <jlayton@xxxxxxxxxx>
 
> What do you mean by "zombie / orphan" here? Do you mean files that have
> been sillyrenamed [1] to ".nfsXXXXXXX" ? Or are you simply talking about
> clients that are holding files open for a long time?

Hi Jeff

.... clients that are holding files open for a long time

Here's a complete summary:

On my NAS appliances, I noticed that average usage of the relevant memory pool
never went down. I suspected some sort of "leak" or "file-stuck-open" scenario.

I hypothesized that frequent disruptions of NFS-client-to-NFS-server communications
would explain the memory-pool behavior I was seeing, and that Kerberos credential
expiration was the most likely frequent disruptor.

I ran a simple python test script that (1) opened enough files that I could see an obvious jump
in the relevant NAS memory pool metric, then (2) slept for less than the
Kerberos ticket lifetime, then (3) exited without explicitly closing the files.
The result: after the script exited, usage of the relevant server-side memory pool decreased by
the expected amount.

Then I ran a simple python test script that (1) opened enough files that I could see an obvious jump
in the relevant NAS memory pool metric, then (2) slept for longer than the
Kerberos ticket lifetime, then (3) exited without explicitly closing the files.
The result: after the script exited, usage of the relevant server-side memory pool did not decrease.
(The files opened by the script were permanently "stuck open", depleting the server-side pool resource.)

In a large campus environment, usage of the relevant memory pool will eventually get so
high that a server-side reboot will be needed.

I'm working with my NAS vendor (who is very helpful); however, if the NFS server and client specifications
don't define an official way to handle this very real problem, there is not much a NAS server vendor can safely or responsibly do.

If there is currently no formal/official way of handling this issue (server-side pool exhaustion due to a "disappearing" client),
is this a problem worth solving (at a level lower than the application level)?

If client applications were all well behaved (i.e., didn't leave files open for long periods of time), we wouldn't have a significant issue.
Assuming applications aren't going to be well behaved, are there good general ways of solving this on either the client or server side?

Thanks

Andy
