Re: Read/Write NFS I/O performance degraded by FLUSH_STABLE page flushing

On Apr 30, 2009, at 4:12 PM, Brian R Cowan wrote:

Hello all,

This is my first post, so please be gentle.... I have been working with a
customer who is attempting to build their product in ClearCase dynamic
views on Linux. When they went from Red Hat Enterprise Linux 4 (Update 5)
to Red Hat Enterprise Linux 5 (Update 2), their build performance degraded
dramatically. When troubleshooting the issue, we noticed that links on
RHEL 5 caused an incredible number of "STABLE" 4KB NFS writes even though
the storage we were writing to was EXPLICITLY mounted async. (This made
RHEL 5 nearly 5x slower than RHEL 4.5 in this area...)

On consultation with some internal resources, we found this change in the
2.6 kernel:

http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commitdiff;h=ab0a3dbedc51037f3d2e22ef67717a987b3d15e2

In here it looks like the NFS client is forcing sync writes any time a
write of less than the NFS write size occurs. We tested this hypothesis by
setting the write size to 2KB. The "STABLE" writes went away and link
times came back down out of the stratosphere. We built a modified kernel
based on the RHEL 5.2 kernel (that ONLY backed out this change) and we
got a 33% improvement in overall build speeds. In my case, I see almost
identical build times between the two OSes when we use this modified
kernel on RHEL 5.
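
For illustration only, the hypothesis above can be modelled in a few lines of Python. This is a simplified sketch, not the kernel code from that commit: the function name `choose_stability` and the rule "use a stable write when all the dirty data fits in one WRITE RPC" are assumptions made for the example, and the real client logic may take other conditions into account.

```python
def choose_stability(dirty_bytes: int, wsize: int) -> str:
    """Toy model of the suspected heuristic (hypothetical, not kernel code).

    dirty_bytes: amount of dirty data being flushed, e.g. one 4KB page.
    wsize:       the NFS write size negotiated for the mount.
    """
    # Ceiling division: how many WRITE RPCs are needed to send the data.
    rpcs_needed = -(-dirty_bytes // wsize)
    # Assumption: a flush that fits in a single WRITE is sent as a stable
    # write; anything larger is sent unstable and committed afterwards.
    return "STABLE" if rpcs_needed <= 1 else "UNSTABLE"


if __name__ == "__main__":
    # A 4KB page with a 32KB write size fits in one RPC.
    print(choose_stability(4096, 32768))
    # The same page with a 2KB write size needs two RPCs.
    print(choose_stability(4096, 2048))
```

Under this model, dropping the write size to 2KB pushes a 4KB flush over the single-RPC threshold, which would be consistent with the "STABLE" writes disappearing in our test.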

Now, why am I posing this to the list? I need to understand *why* that
change was made. On the face of it, simply backing out that patch would be
perfect. I'm paranoid. I want to make sure that this is the ONLY reason:
"/* For single writes, FLUSH_STABLE is more efficient */ "

It seems more accurate to say that they *aren't* more efficient, but
rather are "safer, but slower."

They are more efficient from the point of view that only a single RPC is needed for a complete write. The WRITE and COMMIT are done in a single request.

I don't think the issue here is whether the write is stable, but whether the NFS client has to block the application for it. A stable write that is asynchronous to the application is faster than WRITE + COMMIT.

So it's not "stable" that is holding you up, it's "synchronous." Those are orthogonal concepts.
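
A rough way to picture the two axes, as a Python sketch. It is only a counting model under stated assumptions: the names `Flush`, `rpc_count` and `rpcs_app_waits_for` are made up for illustration, a single WRITE is assumed to carry all the data, and per-RPC latency on the server is ignored.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Flush:
    stable: bool       # server persists the data before replying to WRITE
    synchronous: bool  # the application is blocked until the flush is done


def rpc_count(flush: Flush) -> int:
    """RPCs needed to get the data safely on disk (hypothetical model)."""
    # Stable: one WRITE does it. Unstable: WRITE followed by COMMIT.
    return 1 if flush.stable else 2


def rpcs_app_waits_for(flush: Flush) -> int:
    """Round trips the application itself has to sit through."""
    # Only a synchronous flush blocks the application; an asynchronous
    # one proceeds in the background regardless of stability.
    return rpc_count(flush) if flush.synchronous else 0


if __name__ == "__main__":
    for stable in (True, False):
        for synchronous in (True, False):
            f = Flush(stable=stable, synchronous=synchronous)
            print(f, rpc_count(f), rpcs_app_waits_for(f))
```

In this model, stability only changes how many RPCs go over the wire, while the time the application is held up depends on whether the flush is synchronous.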

I know that this is a 3+ year old update, but RHEL 4 is based on a 2.4
kernel,

Nope, RHEL 4 is 2.6.9.  RHEL 3 is 2.4.20-ish.

and SLES 9 is based on something in the same ballpark. And our
customers see problems when they go to SLES 10/RHEL 5 from the prior major
distro version.

--
Chuck Lever
chuck[dot]lever[at]oracle[dot]com
