Re: nfs problem

Eyal Lebedinsky <fedora@xxxxxxxxxxxxxx> · Fri, 2 Jun 2017 22:07:03 +1000

On 02/06/17 21:48, Roger Heflin wrote:
If the machine that mounts the file and runs tail has already read the
last block, and new data is then added to that block so quickly that
the file's timestamp does not change, the NFS client will not know the
block has changed and will not reread it (it is already in cache).
If this is that bug/feature, NFS has, I think, worked this way pretty
much forever. At a larger scale: with 2 hosts each writing every other
block, if the timestamp does not change then each node will see the
other's blocks as empty because of its cache, at least until the
timestamp changes from what it knows it wrote. The trick my previous
job implemented was to make sure the timestamp on the file moved ahead
at least one second, so that the clients knew the file had changed. But
if tail is actively reading the file while things are being written
into it, I don't see a way for that to work well.
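
The timestamp trick described above could be sketched roughly as follows.
This is only an illustration and not the code used at that job: the function
name append_and_bump_mtime and the min_step_ns parameter are hypothetical,
and it assumes the writer runs where it is allowed to set the file's mtime.
Note that it may push the mtime slightly into the future.

```python
import os

def append_and_bump_mtime(path, data, min_step_ns=1_000_000_000):
    """Append data, then ensure mtime advanced by at least min_step_ns.

    Hypothetical helper: the one-second default mirrors the "at least
    one second" rule described in the text.
    """
    # mtime before the write (0 if the file does not exist yet)
    before = os.stat(path).st_mtime_ns if os.path.exists(path) else 0

    with open(path, "ab") as f:
        f.write(data)
        f.flush()
        os.fsync(f.fileno())  # push the data out before touching the timestamp

    st = os.stat(path)
    if st.st_mtime_ns - before < min_step_ns:
        # The write alone did not move mtime far enough, so set it
        # explicitly to one step past the previous value.
        os.utime(path, ns=(st.st_atime_ns, before + min_step_ns))

    return os.stat(path).st_mtime_ns
```

Two appends made within the same second then still leave the file with
timestamps at least one second apart, which is the change the clients are
expected to notice.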

What you are describing sounds like a variant of this issue.

Thanks Roger,

Interesting, though I wonder why it worked very well until the latest kernel
series (3.10/3.11) which started showing the problem. Looks like a new "feature"
to me.

BTW, the server is also the time server and the two are well synchronised. When
a zero block shows up it can take a minute or two before the real data shows up.
I use 'less' to view the file, hit refresh (Shift-G) and soon a line of zeroes
comes along. I keep refreshing for a few minutes until the good data shows.

When I originally noticed the problem (a monitoring script started showing garbage),
the monitored file was updated once a minute and it needed to be updated two or
three times before the real data was exported, which I consider rather a long time
for a file to present wrong content (over nfs).
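
A monitoring script could at least recognise the symptom (a run of zero
bytes where text is expected) before acting on what it read. The following
is a minimal sketch, not the script attached to the bug report; the names
find_nul_runs and tail_has_zero_block, the 4096-byte tail and the 16-byte
threshold are all assumptions chosen for illustration.

```python
import os

def find_nul_runs(data, min_len=16):
    """Return (offset, length) for each run of NUL bytes of at least min_len."""
    runs = []
    start = None
    for i, b in enumerate(data):
        if b == 0:
            if start is None:
                start = i          # a new run of zeroes begins here
        elif start is not None:
            if i - start >= min_len:
                runs.append((start, i - start))
            start = None
    # handle a run that reaches the end of the buffer
    if start is not None and len(data) - start >= min_len:
        runs.append((start, len(data) - start))
    return runs

def tail_has_zero_block(path, tail_bytes=4096, min_len=16):
    """True if the last tail_bytes of the file contain a long run of NULs."""
    with open(path, "rb") as f:
        f.seek(0, os.SEEK_END)
        size = f.tell()
        f.seek(max(0, size - tail_bytes))
        return bool(find_nul_runs(f.read(), min_len))
```

This only detects the bad read; it does not make the client fetch the real
data any sooner.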

Maybe there is an export (or mount) option I can use?

Also, I could not find a reference to this problem when I investigated the issue
initially, and as such I assumed it was my setup. But the server (f19) had no
updates or changes for a long while. It is clearly the new kernels exposing this,
and I tested more than one client machine to verify that they also show the issue.

Eyal

On Fri, Jun 2, 2017 at 5:36 AM, Eyal Lebedinsky <fedora@xxxxxxxxxxxxxx> wrote:
On 16/04/17 17:13, Eyal Lebedinsky wrote:

I asked on AskFedora and got no response. Hoping this list is more active.

https://ask.fedoraproject.org/en/question/104010/nfs-showing-bad-data-f24/

I last (17/04) reported the problem on redhat bugzilla
        https://bugzilla.redhat.com/show_bug.cgi?id=1442797
I have had no reaction since. I updated the report with a test script and also
verified I have the same problem using the latest f25 (4.11.3-100.fc25). See the
details there.

Is the bugzilla active at all?

Does anyone have a server running an old kernel (3.8, 3.9) that they can test
against? I would like to confirm that this is not just my own setup having the
issue.

TIA

--
Eyal Lebedinsky (eyal@xxxxxxxxxxxxxx)

--
Eyal Lebedinsky (fedora@xxxxxxxxxxxxxx)
