On Dec 5, 2008, at 12:29 PM, J. Bruce Fields wrote:
On Fri, Dec 05, 2008 at 11:38:24AM -0500, bfields wrote:
On Fri, Dec 05, 2008 at 08:26:44AM -0500, Steve Dickson wrote:
J. Bruce Fields wrote:
I think it would still be valuable to replace the 'sync' with two
'fsync's, one of the file, one on the directory.
Sure, may as well.--b.
Something similar to this:
diff -up nfs-utils/utils/statd/sm-notify.c.orig nfs-utils/utils/
statd/sm-notify.c
--- nfs-utils/utils/statd/sm-notify.c.orig 2008-11-17
15:06:13.000000000 -0500
+++ nfs-utils/utils/statd/sm-notify.c 2008-12-05
08:21:52.000000000 -0500
@@ -211,12 +211,6 @@ usage: fprintf(stderr,
backup_hosts(_SM_DIR_PATH, _SM_BAK_PATH);
get_hosts(_SM_BAK_PATH);
- /* If there are not hosts to notify, just exit */
- if (!hosts) {
- nsm_log(LOG_DEBUG, "No hosts to notify; exiting");
- return 0;
- }
This was still a huge boot-time win in the common case, so now that
we've committed to it I'd rather not regress. Let's just skip the
sync()s/fsncy()s in the !hosts case--that looks to me like the
simplest
correct solution for now.
My argument for correctness: if we don't sync in that case, then on
reboot the rename that updates the state will either have happened or
(if a crash comes too soon) not.
It is OK for that update to not happen as long as we're assured it
happens before the first lock request is made or replied to, or the
first monitor request completes, as, in the absence of any notifies,
those are the only points at which the new state will be exposed to
the
outside world.
The first lock request will also require an upcall to statd. So we're
OK as long as any monitor requests (from either the local kernel or
remote peers) do a sync.
And statd should be doing a sync before responding to any monitor
request. As long as the SM_DIR is on the same filesystem as the state
file, that would do the job.... But now that I look, I see statd is
using an open with O_SYNC to ensure the new statd record hits stable
storage. Which we can't count on being enough.
How about adding an explicit fsync() of the state file (and parent
directory) to statd's first succesful creation of a statd record,
together with a comment explaining this? So around about line 194 in
utils/statd/monitor.c:sm_mon_1_svc()?
In fact, we could delete the sync entirely and do the same before the
first notification, and then we wouldn't have to wait for the sync in
the case host records are present either.... (statd would, but
perhaps
we could still get other work done in the mean time).
(Am I missing something?)
This all might work, but I think we're adding a lot of complexity as a
workaround. Someone should fix the real problem, which is the
implementation of sync().
That would probably have other benefits too.
--
Chuck Lever
chuck[dot]lever[at]oracle[dot]com
--
To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html