On Thu, Jan 18, 2018 at 1:16 AM, Jonathan Nieder <jrnieder@xxxxxxxxx> wrote:
> Hi,
>
> Duy Nguyen wrote:
>> On Wed, Jan 17, 2018 at 4:42 AM, Brandon Williams <bmwill@xxxxxxxxxx> wrote:
>
>>> IIUC Split index is an index extension
>>> that can be enabled to limit the size of the index file that is written
>>> when making changes to the index. It breaks the index into two pieces,
>>> index (which contains only changes) and sharedindex.XXXXX (which
>>> contains unchanged information) where 'XXXXX' is a value found in the
>>> index file. If we don't do anything fancy then these two files live
>>> next to one another in a repository's git directory at $GIT_DIR/index
>>> and $GIT_DIR/sharedindex.XXXXX. This seems to work all well and fine
>>> except that this isn't always the case and the read_index_from function
>>> takes this into account by enabling a caller to specify a path to where
>>> the index file is located. We can do this by specifying the index file
>>> we want to use by setting GIT_INDEX_FILE.
> [...]
>>> In this case if I were to specify a location of an
>>> index file in my home directory '~/index' and be using the split index
>>> feature then the corresponding sharedindex file would live in my
>>> repository's git directory '~/project/.git/sharedindex.XXXXX'. So the
>>> sharedindex file is always located relative to the project's git
>>> directory and not the index file itself, which is kind of confusing.
>>> Maybe a better design would be to have the sharedindex file located
>>> relative to the index file.
>>
>> That adds more problems. Now when you move the index file around you
>> have to move the shared index file too (think about atomic rename
>> which we use in plenty of places, we can't achieve that by moving two
>> files). A new dependency to $GIT_DIR is not that confusing to me, the
>> index file is useless anyway if you don't have access to
>> $GIT_DIR/objects. There was always the option to _not_ split the index
>> when $GIT_INDEX_FILE is specified. I think I did consider that but I
>> dropped it because we'd lose the performance gain by splitting.
>
> Can you elaborate a little more on this?
>
> At first glance, it seems simpler to say "paths in index extensions
> named in the index file are relative to the location of the index
> file" and to make moving the index file also require moving the shared
> index file, exactly as you say. So at least from a "principle of
> least surprise" perspective I would be tempted to go that way.
>
> It's true that we rely on atomic rename in plenty of places, but only
> within a directory. (Filesystem boundaries, NFS, etc. mean that atomic
> renames across directories are a lost cause.)
>
> Fortunately index files (including temp index files used by scripts)
> tend to only be in $GIT_DIR, for exactly that reason. So I am
> wondering if switching to index-file-relative semantics would be an
> invasive move and what the pros and cons of such a move are.

I think it gets messier. Now you have to move two files. If the first
move succeeds but the second one fails, recovery may involve un-moving
the first file, but its old content is already gone. We probably can
get around that. But since the shared index is assumed big and heavy,
I just went with "store it in the place it's going to be and never
move it anywhere ever (until nobody uses it, then it's deleted)".
-- 
Duy
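
To make the two-file failure mode above concrete, here is a minimal
sketch in Python. It is only an illustration, not how Git handles its
index. The suffixes AAAAA and BBBBB, the helper names, and the fail_on
knob are all made up for the example. It assumes index-file-relative
semantics, with a consistent old pair at the destination and a
consistent new pair being moved in.

```python
import os
import tempfile


def write(path, text):
    with open(path, "w") as f:
        f.write(text)


def read(path):
    with open(path) as f:
        return f.read()


def move_pair(src_dir, dst_dir, names, fail_on=None):
    """Move each named file from src_dir to dst_dir, one rename at a time.

    fail_on is a hypothetical knob that simulates the rename of that
    file failing; nothing like it exists in Git.
    """
    for name in names:
        if name == fail_on:
            raise OSError("simulated failure moving " + name)
        # Each os.replace is atomic on its own (same filesystem), but
        # nothing makes the pair of renames atomic.
        os.replace(os.path.join(src_dir, name), os.path.join(dst_dir, name))


def demo():
    with tempfile.TemporaryDirectory() as root:
        src = os.path.join(root, "src")
        dst = os.path.join(root, "dst")
        os.mkdir(src)
        os.mkdir(dst)

        # Old, consistent pair at the destination. The index content here
        # is just the name of the shared index it depends on.
        write(os.path.join(dst, "index"), "sharedindex.AAAAA")
        write(os.path.join(dst, "sharedindex.AAAAA"), "old shared entries")

        # New, consistent pair waiting to be moved into place.
        write(os.path.join(src, "index"), "sharedindex.BBBBB")
        write(os.path.join(src, "sharedindex.BBBBB"), "new shared entries")

        try:
            # The first rename succeeds and overwrites the old index;
            # the second rename fails.
            move_pair(src, dst, ["index", "sharedindex.BBBBB"],
                      fail_on="sharedindex.BBBBB")
        except OSError:
            pass

        wanted = read(os.path.join(dst, "index"))
        return {
            "index_refers_to": wanted,
            "shared_present": os.path.exists(os.path.join(dst, wanted)),
        }


if __name__ == "__main__":
    print(demo())
```

After the failed second rename, the index at the destination names a
shared index that is not there, and the old index content has already
been overwritten, so there is nothing left to move back. Keeping the
shared index in one fixed place avoids the second rename entirely.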