Re: [PATCH 18/19] index-helper: autorun

On Thu, 2016-03-17 at 15:43 +0100, Johannes Schindelin wrote:
> Hi Duy,
> 
> On Thu, 17 Mar 2016, Duy Nguyen wrote:
> 
> > On Thu, Mar 17, 2016 at 1:27 AM, Johannes Schindelin
> > <Johannes.Schindelin@xxxxxx> wrote:
> > > I am much more concerned about concurrent accesses and the
> > > communication between the Git processes and the index-helper.
> > > Writing to the .pid file sounds very fragile to me, in particular
> > > when multiple processes can poke the index-helper in succession
> > > and some readers are unaware that the index is being refreshed.
> > 
> > It's not that bad.
> 
> Well, the way I read the code it is possible that:
> 
> 1. Git process 1 starts, reading the index
> 2. Git process 2 starts, poking the index-helper
> 3. The index-helper updates the .pid file (why not set a bit in the
>    shared memory?) with a prefix "W"
> 4. Git process 2 reads the .pid file and waits for the "W" to go away
>    (what if index-helper is not fast enough to write the "W"?)
> 5. Git process 1 accesses the index, happily oblivious that it is being
>    updated and the data is in an inconsistent state


That's not quite how I understand it.  It's more like MVCC.  Writes to
the index go to a new index file, and index files are identified by
their SHA.  Reads from the index go into a new shm segment, also
identified by SHA, so a reader should never see a half-updated index.
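
To make the MVCC comparison concrete, here is a toy sketch in Python
(not the actual index-helper code, which is C).  SnapshotStore and its
methods are made-up names, and a dict stands in for index files and
shm segments.  It only illustrates that a write creates a new snapshot
keyed by its hash instead of changing the one a reader already holds:

```python
import hashlib


class SnapshotStore:
    """Hypothetical stand-in for index files / shm segments keyed by hash."""

    def __init__(self):
        self._snapshots = {}  # hex digest -> immutable bytes
        self.current = None   # digest of the newest snapshot

    def write(self, data: bytes) -> str:
        # A write never modifies an existing snapshot; it adds a new one
        # whose key is derived from the content.
        digest = hashlib.sha1(data).hexdigest()
        self._snapshots[digest] = bytes(data)
        self.current = digest
        return digest

    def read(self, digest: str) -> bytes:
        # A reader that remembered an older digest still gets that version.
        return self._snapshots[digest]


store = SnapshotStore()
v1 = store.write(b"index v1")
reader_view = store.current      # process 1 starts reading here
v2 = store.write(b"index v2")    # process 2 triggers an update

assert v1 != v2
assert store.read(reader_view) == b"index v1"
assert store.read(store.current) == b"index v2"
```

In this sketch process 1 keeps a consistent view of "index v1" even
after the update, which is the property I mean.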

The "W" is set only once -- it just means "this index helper knows how
to talk to watchman".  It's a compile-time option.

(I'm going to change this anyway when I switch to named pipes.)

The watchman data is shared independently; if it's not ready in time
(whatever that means -- it's 1s in the current code), then read-cache
should fall back to brute-force checking every file.
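
The fallback could look roughly like the sketch below, again in Python
and with made-up names (files_to_check, a queue standing in for
whatever channel delivers the watchman data).  The 1s default mirrors
the timeout mentioned above; everything else is an assumption, not the
real read-cache logic:

```python
import queue


def files_to_check(watchman_results, all_files, timeout=1.0):
    """Return the files whose state must be verified.

    watchman_results: a queue.Queue that may eventually hold a set of
    changed paths (hypothetical stand-in for the shared watchman data).
    all_files: every path in the index, used for the brute-force path.
    """
    try:
        # Data arrived in time: only the reported files need checking.
        changed = watchman_results.get(timeout=timeout)
        return sorted(changed)
    except queue.Empty:
        # Not ready in time: fall back to checking every file.
        return sorted(all_files)
```

With data already queued, only the reported paths come back; with an
empty queue the call waits for the timeout and then returns every
path.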

> > We should have protection in place to deal with this and fall back
> > to reading directly from file when things get suspicious.
> 
> I really want to prevent that. I know of use cases where the index
> weighs 300MB, and falling back to reading it directly *really* hurts.

> > But I agree that sending UNIX signals (or PostMessage) is not
> > really good communication.
> 
> Yeah, I really would like two-way communication instead. Named pipes?
> They'd have the advantage that you could use the full path to the
> index as identifier.
> 
> The way I read the current code, we would actually create a different
> shared memory every time the index changes because its checksum is
> part of the shared memory's "path"...
> 
> Ciao,
> Dscho