J. Bruce Fields <bfields <at> fieldses.org> writes:

> On Tue, May 12, 2015 at 12:37:10AM +0200, gianpietro.sella <at> unipd.it wrote:
> > > On Sun, May 10, 2015 at 11:28:25AM +0200, gianpietro.sella <at> unipd.it wrote:
> > >> Hi, sorry for my bad English.
> > >> I am testing an active/passive NFS cluster (2 nodes).
> > >> I followed these instructions for NFS:
> > >>
> > >> https://access.redhat.com/documentation/en-US/Red_Hat_Enterprise_Linux/7/html/High_Availability_Add-On_Administration/s1-resourcegroupcreatenfs-HAAA.html
> > >>
> > >> I use CentOS 7.1 on the nodes.
> > >> The 2 nodes of the cluster share the same iSCSI volume.
> > >> The NFS cluster works very well.
> > >> I have only one problem.
> > >> I mount the folder exported by the NFS cluster on my client node
> > >> (NFSv3 protocol).
> > >> I write a big data file (70GB) into the NFS folder:
> > >> dd if=/dev/zero bs=1M count=70000 > /Instances/output.dat
> > >> Before the write is finished I put the active node into standby status.
> > >> Then the resources migrate to the other node.
> > >> When the dd write finishes the file is ok.
> > >> I delete the file output.dat.
> > >
> > > So, the dd and the later rm are both run on the client, and the rm after
> > > the dd has completed and exited?  And the rm doesn't happen till after
> > > the first migration is completely finished?  What version of NFS are you
> > > using?
> > >
> > > It sounds like a sillyrename problem, but I don't see the explanation.
> > >
> > > --b.
> > >
> >
> > Hi Bruce, thanks for your answer.
> > Yes, the dd command and the rm command (both run on the client node)
> > finish without error.
> > I use NFSv3, but it is the same with the NFSv4 protocol.
> > The OS is CentOS 7.1, the NFS package is nfs-utils-1.3.0-0.8.el7.x86_64.
> > The pacemaker configuration is:
> >
> > pcs resource create nfsclusterlv LVM volgrpname=nfsclustervg exclusive=true --group nfsclusterha
> >
> > pcs resource create nfsclusterdata Filesystem device="/dev/nfsclustervg/nfsclusterlv" directory="/nfscluster" fstype="ext4" --group nfsclusterha
> >
> > pcs resource create nfsclusterserver nfsserver nfs_shared_infodir=/nfscluster/nfsinfo nfs_no_notify=true --group nfsclusterha
> >
> > pcs resource create nfsclusterroot exportfs clientspec=192.168.61.0/255.255.255.0 options=rw,sync,no_root_squash directory=/nfscluster/exports fsid=0 --group nfsclusterha
> >
> > pcs resource create nfsclusternova exportfs clientspec=192.168.61.0/255.255.255.0 options=rw,sync,no_root_squash directory=/nfscluster/exports/nova fsid=1 --group nfsclusterha
> >
> > pcs resource create nfsclusterglance exportfs clientspec=192.168.61.0/255.255.255.0 options=rw,sync,no_root_squash directory=/nfscluster/exports/glance fsid=2 --group nfsclusterha
> >
> > pcs resource create nfsclustervip IPaddr2 ip=192.168.61.180 cidr_netmask=24 --group nfsclusterha
> >
> > pcs resource create nfsclusternotify nfsnotify source_host=192.168.61.180 --group nfsclusterha
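For context, a minimal sketch of the client-side reproduction implied by this configuration. The VIP (192.168.61.180) and the nova export path come from the pcs commands above, and the mount point /Instances is taken from the dd command quoted earlier; the exact mount options and the of= form of dd are only assumptions:

    # mount the clustered export through the floating IP, NFSv3 with synchronous writes
    mount -t nfs -o vers=3,sync 192.168.61.180:/nfscluster/exports/nova /Instances

    # write a large file, then remove it once the write has finished
    dd if=/dev/zero of=/Instances/output.dat bs=1M count=70000
    rm /Instances/output.dat
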
> >
> > Now I have done the following test.
> > NFS cluster with 2 nodes.
> > The first node is in standby state.
> > The second node is in active state.
> > I mount the empty (no used space) exported volume on the client with the
> > NFSv3 protocol (with the NFSv4 protocol it is the same).
> > On the client I write a big file (70GB) into the mounted directory with dd
> > (but it is the same with the cp command).
> > While the command is writing the file I disable the nfsnotify, IPaddr2,
> > exportfs and nfsserver resources in this order (pcs resource disable ...)
> > and then I enable the resources (pcs resource enable ...) in the reverse
> > order.
> > When I disable the resources the write freezes, when I enable the resources
> > the write restarts without error.
> > When the write command is finished I delete the file.
> > The mounted directory is empty and the used space of the exported volume is
> > 0, this is ok.
> > Now I repeat the test, but this time I also disable/enable the Filesystem
> > resource: I disable the nfsnotify, IPaddr2, exportfs, nfsserver and
> > Filesystem resources (the write freezes), then enable them in the reverse
> > order (the write restarts without error).
> > When the write command is finished I delete the file.
> > Now the mounted directory is empty (no file) but the used space is not 0,
> > it is 70GB.
> > This is not ok.
> > Now I execute the following command on the active node of the cluster where
> > the volume is exported with NFS:
> > mount -o remount /dev/nfsclustervg/nfsclusterlv
> > where /dev/nfsclustervg/nfsclusterlv is the exported volume (iSCSI volume
> > configured with LVM).
> > After this command the used space seen in the mounted directory on the
> > client is 0, this is ok.
> > I think that the problem is the Filesystem resource on the active node of
> > the cluster.
> > But it is very strange.
>
> So, the only difference between the "good" and "bad" cases was the
> addition of the stop/start of the filesystem resource?  I assume that's
> equivalent to an umount/mount.

Yes, that is correct.

> I guess the server's dentry for that file is hanging around for a little
> while for some reason.  We've run across at least one problem of that
> sort before (see d891eedbc3b1 "fs/dcache: allow d_obtain_alias() to
> return unhashed dentries").
>
> In both cases after the restart the first operation the server will get
> for that file is a write with a filehandle, and it will have to look up
> that filehandle to find the file.  (Whereas without the restart the
> initial discovery of the file will be a lookup by name.)
>
> In the "good" case the server already has a dentry cached for that file,
> in the "bad" case the umount/mount means that we'll be doing a
> cold-cache lookup of that filehandle.
>
> I wonder if the test case can be narrowed down any further....  Is the
> large file necessary?  If it's needed only to ensure the writes are
> actually sent to the server promptly then it might be enough to do the
> nfs mount with -osync.

I use the sync option, the problem is the same.

> Instead of the cluster migration or restart, it might be possible to
> reproduce the bug just with a
>
>     echo 2 >/proc/sys/vm/drop_caches
>
> run on the server side while the dd is in progress--I don't know if that
> will reliably drop the one dentry, though.  Maybe do a few of those in a
> row.

No, with the echo command it is not possible to reproduce the problem.

> --b.
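
If the 70GB reappears after a delete like this, a few server-side checks on the active node may show whether the space is held by an NFS silly-rename file or by an unlinked but still-referenced inode. This is only a suggested sketch, assuming the /nfscluster mount point used by the Filesystem resource above:

    # compare what the filesystem reports with what the directory tree accounts for
    df -h /nfscluster
    du -sh /nfscluster/exports

    # look for silly-rename leftovers (.nfsXXXX files) under the export
    find /nfscluster/exports -name '.nfs*' -ls

    # list files on this filesystem that are unlinked but still held open
    # (/nfscluster must be the mount point for lsof to scope the check to it)
    lsof +L1 /nfscluster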