Peter Staubach wrote:
Doug Hughes wrote:
Peter Staubach wrote:
J. Bruce Fields wrote:
On Thu, Aug 28, 2008 at 01:27:53PM -0700, Andrew Morton wrote:
(switched to email. Please respond via emailed reply-to-all, not via the
bugzilla web interface).
On Thu, 28 Aug 2008 11:41:08 -0700 (PDT)
bugme-daemon@xxxxxxxxxxxxxxxxxxx wrote:
NFS client writes to Sun Solaris 10 U4 server. At some point in time,
there is an empty portion of the output file from the writer containing
missing data (shows as NULL bytes from another NFS client issuing a
tail -f on the file being written). Confirmed that the file as it exists
on the NFS server is sparse, missing bytes (not necessarily a multiple of
512 or 1024; one sample is a gap of 3818 bytes, another is 1895 bytes,
another is 423 bytes).
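To measure gaps like the ones reported above, a scan for runs of NUL bytes along these lines may help. This is only a sketch: the names find_nul_runs and scan_file are hypothetical, and it assumes the log normally contains no legitimate NUL bytes, which seems reasonable for text output but is not confirmed in the report.

```python
import re


def find_nul_runs(data, min_len=1):
    """Return (offset, length) for every run of NUL bytes >= min_len long."""
    # Greedy match, so each hit is one maximal run of consecutive NUL bytes.
    pattern = re.compile(rb"\x00{%d,}" % min_len)
    return [(m.start(), m.end() - m.start()) for m in pattern.finditer(data)]


def scan_file(path, min_len=1):
    """Read a whole file and report its NUL runs (hypothetical helper)."""
    # Reads the file into memory; a very large log would need chunked reads.
    with open(path, "rb") as fh:
        return find_nul_runs(fh.read(), min_len)
```

Running scan_file on the same path from the writing client and from a second client would show whether both see the same offsets and lengths.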
Seems like something that could happen if for example two write RPCs
got reordered on the network. That's not necessarily a bug--the NFS
client isn't required to wait for confirmation of every previous write
before sending the next one.
If two RPCs got reordered on the network, and they encompass all the
data, then there shouldn't be any missing data. It seems to me like
pieces of data are just being skipped, for whatever reason, but I
haven't exhaustively examined the NFS network data.
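The two positions above can be illustrated with a toy model. This is a sketch only: the server file is modelled as an in-memory buffer, and replay is a hypothetical name, not anything from the NFS code. Replaying the same offset-addressed writes in a different order gives the same final contents, but a reader that looks between the two writes sees a zero-filled gap.

```python
def replay(writes):
    """Apply (offset, payload) writes in order; return a snapshot after each."""
    buf = bytearray()
    snapshots = []
    for offset, payload in writes:
        end = offset + len(payload)
        if len(buf) < end:
            # Writing past the current end extends the file with zero bytes.
            buf.extend(b"\x00" * (end - len(buf)))
        buf[offset:end] = payload
        snapshots.append(bytes(buf))
    return snapshots


in_order = replay([(0, b"AAAA"), (4, b"BBBB")])
reordered = replay([(4, b"BBBB"), (0, b"AAAA")])
```

Here in_order[-1] and reordered[-1] are identical, while reordered[0] begins with four NUL bytes. A gap that persists after every write has been applied would therefore point at data that was never sent, not at reordering alone.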
However if the client isn't flushing dirty data to the server before
returning from close, then that's a violation of NFS's close-to-open
semantics:...
This is not confirmed yet. No solid cases of data not being present
after close.
If you do a read of the entire file from the NFS client doing the
writing, it causes the non-flushed writes to be instantly flushed to the
server, followed by an NFS3 commit operation. The data can then be seen
on all other NFS clients.
If you do an open of the file alone, no flush.
If you do an open and a close, no flush.
... so this "close, no flush" could be a bug (depending on who is doing
that close when--I don't completely understand the described situation).
I suspect that this last might depend upon 1) what options were used
when the file system was mounted and 2) how the file was opened. The
flush-on-close wouldn't be needed if the file was opened read-only.
No special options on open. Here are the mount options:
retry=1000,tcp,noatime,nosuid,nodev,dirsync,timeo=100,rsize=32768,wsize=32768,hard,intr
It seems a little odd that the holes aren't page aligned or page
sized multiples.
Indeed. And the time for them to actually get to the server is
indeterminate (days is not uncommon). We have not as yet confirmed
that some of the data never gets sent to the server until close.
What application is being used to generate the file which is showing
these holes?
namd and some custom code developed in-house for chemistry research
(at the very least)
Do these applications use mmap() or generate the file contents
serially or randomly?
Thanx...
Open file at beginning. Write, write, write, write, write (no seek, no
offset, entirely serial), run a very long time, end.
strace excerpt:
16:42:56.143512 write(8, "1948900 47.1225 0 0 0 47.7759 0 "..., 118) = 118
16:43:01.845742 write(8, "1949000 47.0474 0 0 0 47.8865 0 "..., 116) = 116
16:43:07.481889 write(8, "1949100 47.045 0 0 0 48.0742 0 0"..., 116) = 116
16:43:13.150555 write(8, "1949200 47.1848 0 0 0 47.8868 0 "..., 116) = 116
16:43:18.788863 write(8, "1949300 47.251 0 0 0 47.7743 0 0"..., 113) = 113
16:43:24.429424 write(8, "1949400 47.2722 0 0 0 47.6937 0 "..., 118) = 118
16:43:30.057179 write(8, "1949500 47.4865 0 0 0 47.6251 0 "..., 117) = 117
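For anyone trying to reproduce this, the access pattern in the strace excerpt (small sequential writes on a single descriptor, no seeks) can be approximated with a sketch like the one below. The name serial_writer and the sync_every parameter are hypothetical; the optional fsync is an assumption added for experimentation, to compare behaviour with and without an explicit flush, and is not something the applications in question are known to do.

```python
import os


def serial_writer(path, lines, sync_every=0):
    """Write lines one after another with no seeks; return total bytes written.

    If sync_every is non-zero, call fsync after every sync_every-th write.
    """
    written = 0
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
    try:
        for count, line in enumerate(lines, start=1):
            view = memoryview(line.encode("ascii"))
            # os.write may accept fewer bytes than offered, so loop until done.
            while len(view) > 0:
                n = os.write(fd, view)
                view = view[n:]
                written += n
            if sync_every and count % sync_every == 0:
                os.fsync(fd)
    finally:
        os.close(fd)
    return written
```

Adding a delay of a few seconds between writes, as the timestamps in the trace suggest, would bring it closer to the real workload.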