Hi Bruce,
On 27.09.2021 17:53, J. Bruce Fields wrote:
On Mon, Sep 27, 2021 at 08:10:31AM +0200, Salvatore Bonaccorso wrote:
We recently got the following traces on an NFS server, but I'm not sure
how to debug this further. Any hints?
The server creates and opens a file in two steps, though it should
really be a single atomic operation.
That means there's a small possibility somebody could intervene and do
something like change the permissions:
[5746893.904448] ------------[ cut here ]------------
[5746893.910050] nfsd4_process_open2 failed to open newly-created file! status=10008
10008 is NFS4ERR_DELAY, so maybe somebody managed to get a delegation
before we finished opening?
We should be able to prevent that....
In your setup are there processes quickly opening new files created by
others?
This is very possible. The NFS server is used as a "scratch" space
accessible from the compute cluster, where people can have multiple jobs
running simultaneously through Slurm and accessing the data. So it is
possible that a user creates new files from one running instance and
accesses them quickly from the other nodes.
So far I have been unable to artificially trigger the issue, but is there
anything I can try out to get more information useful for you?
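The access pattern described above (one job creating new files, another
opening them as soon as they appear) could be approximated with a sketch
like the one below. This is only an illustration under assumptions: the
function names, the "job-N.dat" file names and the directory are
hypothetical, and there is no guarantee it triggers the warning. To
resemble the real workload, creator() and opener() would have to run on
different NFS clients against the same export; run() only combines them in
one process for a quick local check.

```python
import os
import threading


def creator(directory, count):
    """Create `count` new files one after another (hypothetical names)."""
    created = []
    for i in range(count):
        path = os.path.join(directory, f"job-{i}.dat")
        # O_EXCL ensures each open really creates a new file.
        fd = os.open(path, os.O_CREAT | os.O_EXCL | os.O_WRONLY, 0o644)
        os.write(fd, b"x")
        os.close(fd)
        created.append(path)
    return created


def opener(directory, count, opened):
    """Poll the directory and open every new file as soon as it is seen."""
    seen = set()
    while len(seen) < count:
        for name in os.listdir(directory):
            if name in seen:
                continue
            try:
                fd = os.open(os.path.join(directory, name), os.O_RDONLY)
            except OSError:
                continue  # not openable yet, retry on the next pass
            os.close(fd)
            seen.add(name)
            opened.append(name)


def run(directory, count=100):
    """Local stand-in: opener in a thread, creator in the main thread."""
    opened = []
    t = threading.Thread(target=opener, args=(directory, count, opened))
    t.start()
    creator(directory, count)
    t.join()
    return opened
```

On a cluster, one would presumably start opener() on one node first and
then creator() on another, both pointing at the same scratch directory.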
Regards,
Salvatore