Re: [RFC] Introducing different handling for small/large transactions

On 01/15/2015 11:36 PM, Stefan Beller wrote:
> For everyday use we want git to be fast. Creating one commit should not
> touch the packed refs file. If we do other stuff involving more than
> one ref, we may accept touching the packed refs file and have a process
> which takes slightly longer but can handle more complex requests correctly,
> such as renaming into and from directories (topic/1 -> topic and reverse).
> Renaming is currently not part of the transaction API because of the (D/F)
> problems. This proposed change would enable having renames being part of
> the transactions API.
> 
> A transaction covers creating, deleting and updating a ref and its reflog.
> Renaming would be a deletion followed by creating a new ref atomically.

A rename is a little bit more than a generic delete+create pair; it also
moves the reflog from the old name to the new name. Is your plan to add
an extra "rename" operation to the refs-transactions API, or to
automatically detect delete+create pairs and treat them as renames?
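
To make the difference concrete, here is a minimal sketch of a toy in-memory model. It is not git's refs API: the dictionaries and the function names `delete_then_create` and `rename` are hypothetical, and the model assumes that deleting a ref discards its reflog.

```python
# Toy in-memory model; not git's refs API. All names here are hypothetical.

def delete_then_create(refs, reflogs, old, new):
    """Generic delete+create: the value moves, the reflog does not."""
    value = refs.pop(old)
    reflogs.pop(old, None)  # assumption: deleting a ref discards its reflog
    refs[new] = value
    reflogs[new] = [f"created at {value}"]


def rename(refs, reflogs, old, new):
    """Rename: same ref update, but the old reflog is carried over."""
    value = refs.pop(old)
    history = reflogs.pop(old, [])
    refs[new] = value
    reflogs[new] = history + [f"renamed {old} to {new}"]
```

In this model both functions leave `refs` in the same state; only `rename` keeps the earlier reflog entries under the new name.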

> So here is my proposal for small transactions:
> (just one ref [and/or reflog] touched):
> 
> In ref_transaction_update:
> 	* create $REF.lock file
> 	* write new content to the lock file
> 
> In ref_transaction_commit
> 	* commit the .lock file to its destination
> 	* in case this is a deletion:
> 		* remove the loose ref
> 		* and repack the packed refs file if necessary

The above describes the current algorithm, but FYI it is not entirely
correct. Deleting the loose ref first might expose an old version of
the reference in the packed-refs file (which might even point at an
object that has been garbage-collected). So the reference has to be
deleted from the packed-refs file before the loose ref is deleted.
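
The ordering issue can be sketched with a toy model in which two dictionaries stand in for the loose refs and the packed-refs file. The names `resolve`, `delete_loose_first` and `delete_packed_first` are hypothetical, and the only behavior assumed is that a loose ref shadows a packed entry of the same name.

```python
# Toy model; the dictionaries stand in for loose ref files and packed-refs.

def resolve(loose, packed, name):
    """A loose ref shadows the packed entry of the same name."""
    if name in loose:
        return loose[name]
    return packed.get(name)


def delete_loose_first(loose, packed, name):
    """Yield what a concurrent reader would see after each step."""
    loose.pop(name, None)
    yield resolve(loose, packed, name)  # the stale packed value is visible
    packed.pop(name, None)
    yield resolve(loose, packed, name)


def delete_packed_first(loose, packed, name):
    """Same deletion, with the packed entry removed first."""
    packed.pop(name, None)
    yield resolve(loose, packed, name)  # the loose value still shadows
    loose.pop(name, None)
    yield resolve(loose, packed, name)
```

With a loose value `"new"` and a packed value `"old"`, the loose-first order lets a reader briefly see `"old"`, while the packed-first order only ever shows `"new"` and then nothing.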

However, it is important that the packed-ref lock be held during the
whole procedure, so that a pack-refs process doesn't rewrite the loose
ref version of the reference into the (now-unlocked) packed-refs file,
causing the reference to survive its supposed deletion. (At least that
was the status a while ago; I don't know if recent changes to pack-refs
might have removed this problem in another way.)
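
A rough sketch of that race, again in a toy dictionary model rather than real git code. `pack_refs`, `delete_ref` and `survives` are hypothetical names, and the toy `pack_refs` only copies loose refs into the packed set, which is a simplification of what pack-refs does.

```python
# Toy model of the race; not real git code.

def pack_refs(loose, packed):
    """Simplified pack-refs: copy every loose ref into the packed set."""
    packed.update(loose)


def delete_ref(loose, packed, name, between_steps=None):
    packed.pop(name, None)  # step 1: rewrite packed-refs without the ref
    if between_steps:       # models releasing the packed-refs lock here
        between_steps()
    loose.pop(name, None)   # step 2: remove the loose ref


def survives(hold_lock):
    """Return True if the ref is still resolvable after the deletion."""
    name = "refs/heads/x"
    loose = {name: "abc"}
    packed = {name: "abc"}
    # With the lock held, a concurrent pack-refs cannot run in the gap.
    intruder = None if hold_lock else (lambda: pack_refs(loose, packed))
    delete_ref(loose, packed, name, intruder)
    return name in loose or name in packed
```

In this model `survives(False)` is true, because the packing step re-adds the ref between the two deletion steps, and `survives(True)` is false.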

But activating a new packed-refs file while still holding the
packed-refs lock is not supported by our current lockfile API. In fact,
working towards enabling this was one of the reasons for the big
lockfile refactoring that I did a while back. Though I never got as far
as fixing this bug.

> The larger transactions would be handled differently by relying
> on the packed refs file:
> In ref_transaction_update:
> 	* detect if we transition to a large transaction
> 	  (by having more than one entry in transaction->updates)
> 	  if so:
> 		* Pack all currently existing refs into the packed
> 		  refs file, commit the packed refs file and delete
> 		  all loose refs. This will avoid (d/f) conflicts.
> 
> 		* Keep the packed-refs file locked and move the first
> 		  transaction update into the packed-refs.lock file

NB: this requires not just one but two rewrites of the packed-refs file,
sharpening the performance concerns expressed elsewhere in this thread.

But couldn't one of the rewrites be avoided if the transaction doesn't
involve any deletes?

> 	* Any update(delete, create, update) is put into the locked
> 	  packed refs file.
> 	* Additionally we need to obtain the .lock for the loose refs
> 	  file to keep guarantees, though we should close the file
> 	  descriptor as we don't want to run out of file descriptors.
> 
> In ref_transaction_commit:
> 	* We only need to commit the packed refs file
> 	* Discard all other lock files as the changes get committed as a whole
> 	  by the packed refs file

Michael

-- 
Michael Haggerty
mhagger@xxxxxxxxxxxx
