On Fri, Jul 30, 2010 at 10:07:27PM -0400, Jason Keltz wrote:
> On 28/07/2010 1:42 PM, J. Bruce Fields wrote:
> >On Wed, Jul 28, 2010 at 09:44:48AM -0400, Jason Keltz wrote
> >>My list of NFS exports has been gradually growing over the years.
> >>Right now, for example, my home directories are exported to around
> >>800 hosts. (although only a relatively small subset of those will
> >>mount at the same time...). I used to just add hosts to
> >>/etc/exports on the file server, and run "exportfs -r", and
> >>everything would be fine. New systems would be able to mount
> >>everything perfectly, and existing systems would not be affected at
> >>all. As the list has grown, I've been noticing a problem. Now, when
> >>I run exportfs -r, there is an approximate 7-10 second hang on the
> >>systems that have already mounted the share, and then everything
> >>returns to normal. This doesn't happen *while* exportfs -r is
> >>running, but just after it exits. I figured that maybe exportfs was
> >>"unexporting"/re-exporting to hosts that already had the share in
> >>use which might have caused the problem, so I tried to manually
> >>add/remove hosts thinking that this would only affect those hosts,
> >>but it did not. Exporting to one new host still causes the hang on
> >>all existing hosts.
> >>
> >>Since I have multiple exports to all of the hosts, adding one new
> >>host can hang things for a while. I can see that reducing the list
> >>of exports, or hosts would reduce the delay. What I am wondering is
> >>if there is a better way that I can add hosts without affecting
> >>connectivity to existing hosts?
> >>
> >>The NFS server itself is pretty powerful -- dual quad core box, lots
> >>of memory, many NFS threads, exclusive NFS server, etc... I am
> >>running an older RHEL4 release though, so it would have an older
> >>kernel/NFS system. Maybe this issue has been solved in newer
> >>releases.
> >
> >There have been fixes in this area, though I don't see any that I'm sure
> >would address your problem.
> >If you could test with the latest nfs-utils
> >(ideally, with the latest nfs-utils and kernel) and let us know the
> >result, that would be helpful.
> >
> >The -t option to rpc.mountd (may need a newer nfs-utils?) may also help.
> >
> >Also worth filing an RHEL bug.
>
> Hi Bruce,
>
> I backported the -t option to RHEL4 by looking at the latest
> nfs-utils, but it didn't fix the problem.
> I'm having trouble compiling the latest nfs-utils for RHEL4 because
> a couple of changed libraries...
>
> What I have learned:
>
> 1) whether exportfs -r, or manually add a single host with exportfs,
> or even remove a host with exportfs -u, the delay to all the clients
> is the same. The delay doesn't change depending on the share.
> 2) the delay doesn't happen while exportfs is running. It happens
> immediately afterwards, and when it does happen, an strace of
> rpc.mountd shows that rpc.mountd is busy resolving every single
> hostname in etab.. on one of our NFS servers, this means a total of
> 13,000 DNS requests... on another system, that's over 30,000 DNS
> requests (and around a 30 second delay to all shares). Once
> rpc.mountd stops burdening the DNS, that's exactly when activity on
> all the shares returns.
> 3) I've tried to change /etc/exports to use just IP... but exportfs
> happily switches etab back to using hostnames, and then mountd does
> all the lookups again...
>
> I suppose that the reason why exportfs doesn't convert etab to just
> use IPs in the first place is because a name can resolve to multiple
> IPs... but if I start with a list of IPs in /etc/exports, it would
> be nice if they just stayed like that in etab, and if mountd could
> use them as is... what's the point of all the DNS requests? (first
> to generate etab, then from mountd a second time!)
>
> The only thing I can think to try at this point would be to see if I
> populated /etc/hosts locally on the file server to see if the timing
> works better than the DNS requests.
>
> If someone has any suggestions, I'd love to hear them.

Did you ever figure out anything more about the problem?

--b.
--
To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
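
For anyone wanting to try the /etc/hosts experiment mentioned in the quoted message, here is a rough sketch of how the client names could be pulled out of etab, resolved once, and printed as hosts-file lines, with the total lookup time reported for comparison against the hang. Nothing here comes from the thread itself: the function names are made up for illustration, the etab location (/var/lib/nfs/etab) and its "path, whitespace, client(options)" line layout are assumptions, and it only helps if the server's resolver is configured to consult local files before DNS. It is not claimed to fix the hang.

```python
import socket
import time

# Assumed location of the export table written by exportfs.
ETAB_PATH = "/var/lib/nfs/etab"


def parse_etab_hosts(text):
    """Return unique client names from etab-style text, in first-seen order.

    Assumes each line looks like: /path<whitespace>client(options)
    """
    hosts = []
    seen = set()
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        fields = line.split(None, 1)
        if len(fields) != 2:
            continue
        client = fields[1].split("(", 1)[0].strip()
        # Skip wildcards, netgroups and network/netmask entries.
        if not client or any(c in client for c in "*?@/"):
            continue
        if client not in seen:
            seen.add(client)
            hosts.append(client)
    return hosts


def hosts_file_lines(hosts, resolve=socket.gethostbyname):
    """Resolve each name once; return (hosts-file lines, names that failed)."""
    lines = []
    failed = []
    for name in hosts:
        try:
            addr = resolve(name)
        except OSError:  # socket.gaierror is a subclass of OSError
            failed.append(name)
            continue
        if addr == name:
            # Entry was already an IP address; nothing to add.
            continue
        lines.append(f"{addr}\t{name}")
    return lines, failed


if __name__ == "__main__":
    with open(ETAB_PATH) as f:
        hosts = parse_etab_hosts(f.read())
    start = time.monotonic()
    lines, failed = hosts_file_lines(hosts)
    elapsed = time.monotonic() - start
    print(f"# {len(hosts)} clients looked up in {elapsed:.1f}s, "
          f"{len(failed)} failed")
    for line in lines:
        print(line)
```

The output is meant to be reviewed by hand before anything is appended to /etc/hosts. Since a name can resolve to multiple IPs, as noted in the quoted message, and gethostbyname returns only one IPv4 address, multi-homed clients would need separate handling.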