Re: [PATCH v2 1/2] tmp-objdir: new API for creating temporary writable databases

On Sun, Dec 05, 2021 at 11:43:07PM -0800, Junio C Hamano wrote:
> "Neeraj Singh via GitGitGadget" <gitgitgadget@xxxxxxxxx> writes:
> 
> > @@ -331,10 +332,14 @@ static void update_relative_gitdir(const char *name,
> >  				   void *data)
> >  {
> >  	char *path = reparent_relative_path(old_cwd, new_cwd, get_git_dir());
> > +	struct tmp_objdir *tmp_objdir = tmp_objdir_unapply_primary_odb();
> >  	trace_printf_key(&trace_setup_key,
> >  			 "setup: move $GIT_DIR to '%s'",
> >  			 path);
> > +
> >  	set_git_dir_1(path);
> 
> If a blank line needs to be added, have it between the variable
> declarations and the first statement (i.e. before the above call to
> "trace_printf_key()").
> 

Will fix.

> > +	if (tmp_objdir)
> > +		tmp_objdir_reapply_primary_odb(tmp_objdir, old_cwd, new_cwd);
> >  	free(path);
> >  }
> 
> This is called during set_git_dir(), which happens fairly early in
> the set-up sequence.  I wonder if there is a real use case that
> creates a tmp-objdir that early in the process to require this
> unapply-reapply sequence.
> 

Without this code, a test was failing; I believe it was
"--refresh triggers late setup_work_tree" in t2107-update-index-basic.sh.

This problem came up after applying: https://lore.kernel.org/git/4a40fd4a29a468b9ce320bc7b22f19e5a526fad6.1637020263.git.gitgitgadget@xxxxxxxxx/

I thought it would be best to fix this in the tmp-objdir code so that
callers could plug/unplug bulk checkin without any subtle surprises.
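
For illustration, here is a minimal sketch of the unapply/reapply idea
around a directory change. It is not the code from the patch: every
name below (sketch_odb, sketch_tmp_objdir, sketch_apply and so on) is
hypothetical, and the real functions take old_cwd/new_cwd and recompute
relative paths, which this sketch replaces with a caller-supplied path.

```c
#include <stddef.h>

/* Hypothetical stand-ins; not the real git structures. */
struct sketch_odb {
	const char *path;
};

struct sketch_tmp_objdir {
	struct sketch_odb tmp;   /* the temporary object directory */
	struct sketch_odb *prev; /* primary ODB in effect before apply */
};

static struct sketch_odb *primary_odb;
static struct sketch_tmp_objdir *active_tmp;

/* Make the temporary directory the primary ODB, remembering the old one. */
static void sketch_apply(struct sketch_tmp_objdir *t)
{
	t->prev = primary_odb;
	primary_odb = &t->tmp;
	active_tmp = t;
}

/*
 * Restore the previous primary ODB. Returns the temporary objdir that
 * was active, or NULL if none was, so the caller can reapply it later.
 */
static struct sketch_tmp_objdir *sketch_unapply(void)
{
	struct sketch_tmp_objdir *t = active_tmp;

	if (!t)
		return NULL;
	primary_odb = t->prev;
	active_tmp = NULL;
	return t;
}

/* Reinstall the temporary ODB under its path as seen from the new cwd. */
static void sketch_reapply(struct sketch_tmp_objdir *t, const char *new_path)
{
	t->tmp.path = new_path;
	sketch_apply(t);
}
```

The point of the shape is that the code moving $GIT_DIR only ever sees
the real primary ODB, and callers of the temporary objdir do not need to
know that a chdir happened in between.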

> > @@ -1809,8 +1846,11 @@ int hash_object_file(const struct git_hash_algo *algo, const void *buf,
> >  /* Finalize a file on disk, and close it. */
> >  static void close_loose_object(int fd)
> >  {
> > -	if (fsync_object_files)
> > -		fsync_or_die(fd, "loose object file");
> > +	if (!the_repository->objects->odb->will_destroy) {
> > +		if (fsync_object_files)
> > +			fsync_or_die(fd, "loose object file");
> 
> OK, so we omit fsync because these newly created loose objects may
> not survive and instead get discarded.  Presumably when we migrate
> them to the real object store, we'll make sure they hit the disk
> platter in some other way?
> 
> 	... goes and cheats by reading ahead ...
> 
> Ahh, ok, new objects created in a temporary object store that is
> marked with the will_destroy bit are not allowed to migrate to the
> real object store, so there is no point to fsync them.
> 
> set_temporary_primary_odb() and tmp_objdir_replace_primary_odb() can
> mark the temporary one to be throw-away, but unfortunately there is
> no caller in this step, so it is a bit hard to see when a throw-away
> object store is useful.  I guess remerge-diff wants to do tentative
> merges that create new objects in a throw-away object directory,
> because it is logically a read-only operation.
> 

Yes, this code is there exactly for remerge-diff and anyone doing something
similar in the future.
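
As a rough sketch of the decision in the close_loose_object() hunk
quoted above: fsync is skipped whenever the primary object directory is
marked as throw-away. The struct and function names here are
hypothetical, and the sketch only returns the decision; the patch itself
calls fsync_or_die() on the file descriptor.

```c
/* Hypothetical stand-in for the object directory; not the git struct. */
struct sketch_odb {
	int will_destroy; /* nonzero: objects here are never migrated */
};

/*
 * Decide whether a newly written loose object needs an fsync.
 * Objects in a throw-away directory are discarded, so flushing them
 * to stable storage would be wasted work.
 */
static int sketch_should_fsync(const struct sketch_odb *odb,
			       int fsync_object_files)
{
	if (odb->will_destroy)
		return 0;
	return fsync_object_files != 0;
}
```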

> > diff --git a/tmp-objdir.c b/tmp-objdir.c
> > index b8d880e3626..3d38eeab66b 100644
> > --- a/tmp-objdir.c
> > +++ b/tmp-objdir.c
> > @@ -1,5 +1,6 @@
> >  #include "cache.h"
> >  #include "tmp-objdir.h"
> > +#include "chdir-notify.h"
> >  #include "dir.h"
> >  #include "sigchain.h"
> >  #include "string-list.h"
> > @@ -11,6 +12,8 @@
> >  struct tmp_objdir {
> >  	struct strbuf path;
> >  	struct strvec env;
> > +	struct object_directory *prev_odb;
> > +	int will_destroy;
> 
> The other one was a one-bit unsigned bitfield, but this is a full
> integer.  I somehow think that the other one can and should be a
> full integer, too---it's not like there are tons of bits that need to
> be stored in the structure or we will have tons of instances of the
> structure that storing many bits compactly matters.
> 

The principle I was trying to follow here is that the only flag in a
structure might as well be a full integer, but when we have two or more
it might be worth combining them into a single machine word.  Given that
these are not highly replicated structures, you're right that it's not
a big benefit.

I'll switch everything to an int and call it good.
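
To make the trade-off concrete, a small sketch comparing the two
layouts. The structs and field names are hypothetical, not the ones in
the patch, and the exact sizes are implementation-defined, so the sketch
computes the difference rather than assuming one.

```c
#include <stddef.h>

/* Two flags packed as one-bit bitfields (hypothetical names). */
struct sketch_packed {
	unsigned int flag_a : 1;
	unsigned int flag_b : 1;
};

/* The same two flags as full integers. */
struct sketch_plain {
	int flag_a;
	int flag_b;
};

/* Bytes saved per instance by packing the flags into bitfields. */
static size_t sketch_bytes_saved(void)
{
	return sizeof(struct sketch_plain) - sizeof(struct sketch_packed);
}
```

With only a handful of such structures alive at once, the saving is a
few bytes in total, which is why plain ints are the simpler choice here.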

Given that this patch series introduces functions with no users, are you
going to hold off on putting this into 'next' until another next-worthy
patch series is ready?  I've already reworked the batch mode stuff on GitHub,
but I'll need to do a lot more testing before sending it to the list.

Thanks,
Neeraj


