> One last reply to myself. One of the failures my test scripts triggered turned out to actually be due to my NFS RW mount options.
>
> OLD RW NFS mount options:
>     "rw,noatime,nocto,actimeo=3600,lookupcache=all,nolock,tcp,vers=3"
>
> NEW options that work better:
>     "rw,noatime,nolock,tcp,vers=3"
>
> I had copied the RO NFS options we use, which try to be aggressive about caching: the RO root image doesn't change much and we want it as fast as possible. Those options are not appropriate for RW areas that do change (even though it's a single image file we care about).
>
> So now my test scripts run clean. But since what we see on the larger systems happens right after reboot, the caching shouldn't matter; in the real problem case, the RW work is done once, right after reboot.
>
> FWIW, I attached my current test scripts; my last batch had some errors.
>
> The search continues for the actual problem, which I'm struggling to reproduce at 366 NFS clients. I believe yesterday's post about actual HANGS describes the real problem we're tracking. I hit that once in my test scripts - only once. Otherwise my script was hitting a "file doesn't really exist even though it's cached" issue, which was tricking my scripts.
>
> In any case, I'm changing the RW NFS options we use regardless.
>
> Erik
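> For reference, the corresponding fstab entries would look roughly like this (server name and paths are placeholders, not our real layout):
>
>     # RO root image: aggressive caching is fine here, it rarely changes
>     srv:/export/ro  /ro  nfs  ro,noatime,nocto,actimeo=3600,lookupcache=all,nolock,tcp,vers=3  0 0
>
>     # RW area: no nocto and no hour-long actimeo, so close-to-open
>     # consistency applies and attribute caches revalidate normally
>     srv:/export/rw  /rw  nfs  rw,noatime,nolock,tcp,vers=3  0 0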
Attachment:
nfs-issues.tar.xz
Description: application/xz
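Roughly, the failure mode that was tricking the scripts looks like this (a sketch only, not the attached scripts; the mount point, file name, and node name are invented for illustration):

    f=/mnt/rw/flagfile
    stat "$f"                  # this client caches the lookup and attributes
    ssh node2 "rm -f $f"       # file removed via another client
    # With lookupcache=all and actimeo=3600, the stale cached entry can
    # keep this succeeding for up to an hour after the file is gone:
    stat "$f" && echo "cache says it exists, but it doesn't"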