suspected race between packing and fetch (single case study)

Hi,


Colleagues encouraged me to report a "personal" bug I've stumbled
across. It's "personal" because I wasn't able to create a minimal
reproducer, or even reproduce it with the same script on other
infrastructure. We suspect a race between packing and fetch. The
script I am using is at the bottom of the email.

The script creates a joint Git/git-annex repository A with a large
number of objects. Afterwards, a repository B is created, and A is
cloned into it.
Cloning fails initially. Errors look like this:

+ git clone --progress ../A /tmp/B/subds
Cloning into '/tmp/B/subds'...
fatal: failed to copy file to '/tmp/B/subds/.git/objects/44/93d6041a44b5a7280875ec9b6ecd78fbab7b6e': No such file or directory

Running "ps aux -H | grep git" before and after cloning shows garbage
collection and packing processes still running in repo A, which is why
we suspect a race. Here is the script output that shows the processes:
+ cd B
+ ps aux -H
+ grep git
adina     674763  0.0  0.0   6152   836 pts/5    S+   16:38   0:00           grep git
adina     674071  0.0  0.0   9584  2788 ?        Ss   16:38   0:00 /usr/lib/git-core/git gc --auto --no-quiet
adina     674072  0.0  0.0   9584  3884 ?        S    16:38   0:00 /usr/lib/git-core/git repack -d -l --no-write-bitmap-index
adina     674073  149  0.1 583780 20564 ?        R    16:38   0:02         /usr/lib/git-core/git pack-objects --local --delta-base-offset .git/objects/pack/.tmp-674072-pack --keep-true-parents --honor-pack-keep --non-empty --all --reflog --indexed-objects --unpacked  --incremental
+ git clone --progress ../A /tmp/B/subds
Cloning into '/tmp/B/subds'...
fatal: failed to copy file to '/tmp/B/subds/.git/objects/14/5a4c6775684788ecf51e5d745ac19ad5b204e3': No such file or directory
+ ps aux -H
+ grep git
adina     674774  0.0  0.0   6152   896 pts/5    S+   16:38   0:00           grep git
adina     674071  0.0  0.0   9584  2788 ?        Ss   16:38   0:00 /usr/lib/git-core/git gc --auto --no-quiet
adina     674072 11.0  0.0  11160  3884 ?        R    16:38   0:00 /usr/lib/git-core/git repack -d -l --no-write-bitmap-index
bash script.sh  65.71s user 29.53s system 94% cpu 1:40.71 total



Both A and B are completely sane repositories: git fsck shows nothing
out of the ordinary, and I can clone them fine in any situation but the
scripted workflow. If I add a short "sleep" between creating A and
cloning A into B, the error vanishes.

I have been able to trigger this reliably for a month with the script. I
am running git version 2.29.2 (but also saw this when downgrading to
version 2.24) on Debian testing (bullseye). Other than simply waiting a
bit before the clone, setting git config --global gc.autodetach false
removes the bug, too.
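
For reference, a minimal sketch of a narrower variant of that
workaround, which sets gc.autodetach for one repository instead of
globally. The report above only covers the --global form, so using the
per-repository scope for this bug is an assumption, and the temporary
repository is just a stand-in for A:

```shell
#!/bin/sh
set -e

# Throwaway repository standing in for A (hypothetical path).
repo=$(mktemp -d)
git init --quiet "$repo"

# Make "git gc --auto" run in the foreground for this repository only,
# so the command that triggers it does not return while packing is
# still in progress.
git -C "$repo" config gc.autodetach false

# Confirm the value Git will use for this repository.
autodetach=$(git -C "$repo" config --get gc.autodetach)
echo "gc.autodetach=$autodetach"
```

In the real workflow the config call would have to happen after A is
created and before the step that adds the large number of objects.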

I wonder if there is a way that Git could guard against cases where
background gc processes may still be running?
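
Until something like that exists, a script can approximate such a guard
itself. This is a rough sketch, assuming that a running "git gc" keeps a
gc.pid lock file in the repository's .git directory and removes it on
exit. wait_for_gc is a hypothetical helper, not a Git command. It is not
watertight: a stale gc.pid left behind by a crashed gc makes it wait
until the timeout, and a gc that has not yet taken its lock is missed.

```shell
#!/bin/sh

# wait_for_gc REPO [TIMEOUT]
# Hypothetical helper: poll once per second until REPO/.git/gc.pid is
# gone. Returns 0 once the file is absent, 1 if it still exists after
# TIMEOUT seconds (default: 60).
wait_for_gc() {
    gitdir="$1/.git"
    timeout="${2:-60}"
    waited=0
    while [ -e "$gitdir/gc.pid" ]; do
        if [ "$waited" -ge "$timeout" ]; then
            echo "gc still appears to be running in $1 after ${timeout}s" >&2
            return 1
        fi
        sleep 1
        waited=$((waited + 1))
    done
    return 0
}
```

In the script at the bottom of this email, the clone step would then
read: wait_for_gc ../A && git clone --progress ../A /tmp/B/subds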


For completeness, here is the script I am using to trigger this on my
machine. We didn't manage to reproduce the behavior on another machine,
and I didn't find a more minimal example (sorry :( ). The script
involves datalad (which uses git-annex):

#!/bin/sh

set -x

# this creates a joint git/git-annex repository
datalad create A && cd A
# this adds and extracts a tarball with ~13,000 JPEGs to the
# repository. Data is added to git-annex.
datalad download-url \
     --archive \
     --message "Download Imagenette dataset" \
     'https://s3.amazonaws.com/fast-ai-imageclas/imagenette2-160.tgz'
# this creates another joint git/git-annex repository
cd ../ && datalad create B
cd B
ps aux -H | grep git
git clone --progress ../A /tmp/B/subds
ps aux -H | grep git


Kind regards,
Adina







