Re: nfs cache bug (when the server deletes a file, the nfs client can still read it)

Myklebust, Trond wrote:
> On Fri, 2012-11-23 at 13:23 +0800, fanchaoting wrote:
>> Myklebust, Trond wrote:
>>> On Fri, 2012-11-23 at 12:18 +0800, fanchaoting wrote:
>>>> Hi, everyone. I found a serious bug in the NFS cache.
>>>>
>>>> After the server deletes a file, an NFS client can still read it.
>>>>
>>>> The steps to reproduce are as follows.
>>>>
>>>> ip: 192.168.0.19  nfs-client
>>>> ip: 192.168.0.20  nfs-client
>>>> ip: 192.168.0.21  nfs-server
>>>>
>>>> ############################################################################
>>>>
>>>> /usr/bin/ssh -n 192.168.0.21 service nfs start
>>>> /usr/bin/ssh -n 192.168.0.21 /usr/sbin/rpc.idmapd
>>>> /usr/bin/ssh -n 192.168.0.19 /usr/sbin/rpc.idmapd
>>>> /usr/bin/ssh -n 192.168.0.20 /usr/sbin/rpc.idmapd
>>>> /usr/bin/ssh -n 192.168.0.21 "rm -rf /nfsroot; mkdir -p /nfsroot"
>>>> /usr/bin/ssh -n 192.168.0.21 /usr/sbin/exportfs -au
>>>> /usr/bin/ssh -n 192.168.0.21 /usr/sbin/exportfs -i -o insecure,no_root_squash,rw,fsid=0 *:/nfsroot
>>>> /usr/bin/ssh -n 192.168.0.19 "test -d /nfsroot || (rm -rf /nfsroot; mkdir -p /nfsroot)"
>>>> /usr/bin/ssh -n 192.168.0.20 "test -d /nfsroot || (rm -rf /nfsroot; mkdir -p /nfsroot)"
>>>>
>>>> /usr/bin/ssh -n 192.168.0.19 umount /nfsroot
>>>> /usr/bin/ssh -n 192.168.0.20 umount /nfsroot
>>>> cmd="echo \"hello world\" > /nfsroot/tmpfile"
>>>> /usr/bin/ssh -n 192.168.0.21 $cmd
>>>> /usr/bin/ssh -n 192.168.0.21 mkdir /nfsroot/tmpdir
>>>> /usr/bin/ssh -n 192.168.0.21 touch /nfsroot/tmpdir/tmpdfile
>>>> /usr/bin/ssh -n 192.168.0.19 mount -t nfs4 192.168.0.21:/ /nfsroot
>>>> /usr/bin/ssh -n 192.168.0.20 mount -t nfs4 192.168.0.21:/ /nfsroot
>>>> /usr/bin/ssh -n 192.168.0.21 cat  /nfsroot/tmpfile
>>>> /usr/bin/ssh -n 192.168.0.21 ls -l /nfsroot/tmpdir
>>>> /usr/bin/ssh -n 192.168.0.19 cat  /nfsroot/tmpfile
>>>> /usr/bin/ssh -n 192.168.0.19 ls -l  /nfsroot/tmpdir
>>>> /usr/bin/ssh -n 192.168.0.20 cat  /nfsroot/tmpfile
>>>> /usr/bin/ssh -n 192.168.0.20 ls -l  /nfsroot/tmpdir
>>>> /usr/bin/ssh -n 192.168.0.19 cat /nfsroot/tmpfile > /dev/null
>>>> /usr/bin/ssh -n 192.168.0.20 cat /nfsroot/tmpfile > /dev/null
>>>> /usr/bin/ssh -n 192.168.0.21 rm -rf /nfsroot/tmpfile
>>>> echo -e "sleep 60~~~~~~~~\n"
>>>> sleep 60
>>>>
>>>> #############################################################################
>>>>
>>>> Finally, on 192.168.0.19 I run:
>>>>
>>>> #cat /nfsroot/tmpfile           <-- the NFS server has deleted the file, but the NFS client can still read it
>>>>  hello world
>>>>
>>>>
>>>> I think that when the NFS server deletes the file,
>>>> the server should notify the NFS client,
>>>> but the upstream kernel doesn't do this.
>>> So is this a problem with the client or the server? In other words, if
>>> you use a different server/client combination, do you see a different
>>> result?
>>>
>> I think the problem is on the server. When the server deletes the file,
>> it should notify the client immediately.
> 
> There is no notification mechanism in NFS; on open(), the client is
> supposed to revalidate its cached information and the server is supposed
> to return an ESTALE error if the filehandle is no longer valid. Either
> one of these 2 mechanisms (client revalidation or server reply) could be
> going wrong here, which is why I'm asking.
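As a rough illustration of the check described above, a client-side probe could look like the sketch below. probe_open is a hypothetical helper name, not something from the kernel or nfs-utils, and it relies on the error text printed by cat, which is an assumption about the userland (older glibc prints "Stale NFS file handle"). On a client that still serves the file from its cache, the probe would be expected to report open-ok even after the server has removed the file.

```shell
# Hypothetical helper: open a file for reading and classify the result.
# Assumes cat reports errors as "<prefix>: <strerror text>" in the C locale.
probe_open() {
    # Capture only cat's stderr; the file contents are discarded.
    if err=$(LC_ALL=C cat -- "$1" 2>&1 >/dev/null); then
        echo "open-ok"
        return 0
    fi
    case $err in
        *": No such file or directory")
            echo "missing" ;;      # ENOENT: the name no longer exists
        *": Stale file handle"|*": Stale NFS file handle")
            echo "stale" ;;        # ESTALE: the server rejected the filehandle
        *)
            echo "error: $err" ;;  # anything else is reported verbatim
    esac
    return 1
}

# Example use on a client after the server has deleted the file:
#   probe_open /nfsroot/tmpfile
```

A result of "missing" or "stale" means revalidation and the server reply both worked; "open-ok" after the deletion points at one of the two mechanisms failing.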

I found J. Bruce Fields's patch (break delegations on unlink), which may solve this problem,
but I didn't find it in the upstream kernel.

I think that when the server deletes the file, it should recall the delegation, so that the NFS client replies with a DELEGRETURN.


> 
>>> Which kernels are you using in your tests for the client and server?
>>>
>> I found the problem in kernel 3.7.0-rc6.
> 
> OK. Do you have a non-Linux server or client available that you can use
> for testing? Alternatively, could you use wireshark to capture a dump of
> the NFS traffic between the client and server?

In the Wireshark capture, I found that the correct traffic should contain an OPEN operation,
but now I can't find the OPEN operation.
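To compare a failing run with a working one on the wire, a capture along these lines could be taken on the client. This is only a sketch: the helper name nfs_capture, the DRY_RUN switch, the interface name and the output path are assumptions for illustration, and 2049 is the standard NFS port.

```shell
# Hypothetical wrapper around tcpdump for capturing NFS traffic.
#   $1 = network interface, $2 = NFS server address, $3 = output pcap file
# With DRY_RUN set, the command is printed instead of executed.
nfs_capture() {
    # -s 0 keeps full packets so the NFSv4 COMPOUND bodies can be decoded.
    ${DRY_RUN:+echo} tcpdump -i "$1" -s 0 -w "$3" host "$2" and port 2049
}

# Example (needs root): start the capture, run "cat /nfsroot/tmpfile"
# in another shell, then stop tcpdump with Ctrl-C.
#   nfs_capture eth0 192.168.0.21 /tmp/nfscache.pcap
```

The resulting file can be opened in Wireshark and filtered with the display filter "nfs" to check whether the client sends an OPEN when the file is read.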

I found that the function nfs4_open_prepare has a problem. In the old kernel,
it can run

...snip...

  rpc_call_start(task);

...snip...


> 

Attachment: nfscache_cache_wrong.dump
Description: Binary data

Attachment: nfscache_cache_right.dump
Description: Binary data

