On 2/10/21 10:55 AM, Jeff King wrote:
On Tue, Feb 09, 2021 at 12:39:22PM -0500, Jeff Hostetler wrote:
Yeah, in my version two processes could still create uniquely named
sockets and then do the rename trick. But they capture the inode
number of the socket before they do that. They periodically lstat
the socket to see if the inode number has changed and if so, assume
it has been stolen from them. (A bit of a hack, I admit.)
OK, that makes more sense. I saw the mention of the inode stuff in a
comment, but I didn't see it in the code (I guess if it's a periodic
check it's not in that initial socket creation function).
Yeah, there's a very slow poll(2) loop in the listen/accept thread
that watches for that and new connections (and quit messages).
And I was assuming that 2 servers starting at about the same time
are effectively equivalent -- it doesn't matter which one dies, since
they both should have the same amount of cached state. Unlike the
case where a long-running server (with lots of state) is replaced by
a newcomer.
Yeah, I agree with that notion in general. I do think it would be easier
to reason about if the creation were truly race-proof (probably with a
dot-lock; see below), rather than the later "check if we got replaced"
thing. OTOH, that "check" strategy covers a variety of cases (including
that somebody tried to ping us, decided we weren't alive due to a
timeout or some other system reason, and then replaced our socket).
Another strategy there could be having the daemon just decide to quit if
nobody contacts it for N time units. It is, after all, a cache. Even if
nobody replaces the socket, it probably makes sense to eventually decide
that the memory we're holding isn't going to good use.
I have the poll(2) loop set to recheck the inode for theft every 60
seconds (randomly chosen).
Assuming the socket isn't stolen, I want to leave any thoughts of
an auto-shutdown to the application layer above it. My next patch
series will use this ipc mechanism to build a FSMonitor daemon that
will watch the filesystem for changes and then be able to quickly
respond to a `git status`, so it is important that it be allowed to
run without any clients for a while (such a during a build). Yes,
memory concerns are important, so I do want it to auto-shutdown if
the socket is stolen (or the workdir is deleted).
I am wondering if we can use the .LOCK file magic to our advantage
here (in sort of an off-label use). If we have the server create a
lockfile "<path>.LOCK" and if successful leave it open/locked for the
life of the server (rather than immediately renaming it onto <path>)
and let the normal shutdown code rollback/delete the lockfile in the
cleanup/atexit.
If the server successfully creates the lockfile, then unlink+create
the socket at <path>.
I don't even think this is off-label. Though the normal use is for the
.lock file to get renamed into place as the official file, there are a
few other places where we use it solely for mutual exclusion. You just
always end with rollback_lock_file(), and never "commit" it.
So something like:
1. Optimistically see if socket "foo" is present and accepting
connections.
2. If not, then take "foo.lock". If somebody else is holding it, loop
with a timeout waiting for them to come alive.
3. Assuming we got the lock, then either unlink+create the socket as
"foo", or rename-into-place. I don't think it matters that much
which.
4. Rollback "foo.lock", unlinking it.
Then one process wins the lock and creates the socket, while any
simultaneous creators spin in step 2, and eventually connect to the
winner.
That would give us the unique/exclusive creation (on the lock) that
we need. Then wrap that with all the edge case cleanup code to
create/delete/manage the peer socket. Basically if the lock exists,
there should be a live server listening to the socket (unless there
was a crash...).
I think you'd want to delete the lock as soon as you're done with the
setup. That reduces the chances that a dead server (e.g., killed by a
power outage without the chance to clean up after itself) leaves a stale
lock sitting around.
Thanks this helps. I've got a version now that is a slight variation on
what you have here that seems to work nicely and has the short-lived
lock file. I'll post this shortly.
Jeff