Re: [RFC PATCH] object-file.c: batched disk flushes for stream_loose_object()

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Thu, Jun 9, 2022 at 5:30 PM Johannes Schindelin
<Johannes.Schindelin@xxxxxx> wrote:
>
> Hi,
>
> On Thu, 9 Jun 2022, Han Xin wrote:
>
> > Neeraj Singh[1] pointed out that if batch fsync is enabled, we should still
> > call prepare_loose_object_bulk_checkin() to potentially create the bulk checkin
> > objdir.
> >
> > 1. https://lore.kernel.org/git/7ba4858a-d1cc-a4eb-b6d6-4c04a5dd6ce7@xxxxxxxxx/
> >
> > Signed-off-by: Han Xin <chiyutianyi@xxxxxxxxx>
>
> I like a good commit message that is concise and yet has all the necessary
> information. Well done!
>
> > ---
> >  object-file.c                   |  3 +++
> >  t/t5351-unpack-large-objects.sh | 15 ++++++++++++++-
> >  2 files changed, 17 insertions(+), 1 deletion(-)
> >
> > diff --git a/object-file.c b/object-file.c
> > index 2dd828b45b..3a1be74775 100644
> > --- a/object-file.c
> > +++ b/object-file.c
> > @@ -2131,6 +2131,9 @@ int stream_loose_object(struct input_stream *in_stream, size_t len,
> >       char hdr[MAX_HEADER_LEN];
> >       int hdrlen;
> >
> > +     if (batch_fsync_enabled(FSYNC_COMPONENT_LOOSE_OBJECT))
> > +             prepare_loose_object_bulk_checkin();
> > +
>
> Makes sense.
>
> >       /* Since oid is not determined, save tmp file to odb path. */
> >       strbuf_addf(&filename, "%s/", get_object_directory());
> >       hdrlen = format_object_header(hdr, sizeof(hdr), OBJ_BLOB, len);
> > diff --git a/t/t5351-unpack-large-objects.sh b/t/t5351-unpack-large-objects.sh
> > index 461ca060b2..a66a51f7df 100755
> > --- a/t/t5351-unpack-large-objects.sh
> > +++ b/t/t5351-unpack-large-objects.sh
> > @@ -18,7 +18,10 @@ test_expect_success "create large objects (1.5 MB) and PACK" '
> >       test_commit --append foo big-blob &&
> >       test-tool genrandom bar 1500000 >big-blob &&
> >       test_commit --append bar big-blob &&
> > -     PACK=$(echo HEAD | git pack-objects --revs pack)
> > +     PACK=$(echo HEAD | git pack-objects --revs pack) &&
> > +     git verify-pack -v pack-$PACK.pack |
> > +         grep -E "commit|tree|blob" |
> > +             sed -n -e "s/^\([0-9a-f]*\).*/\1/p" >obj-list
>
> Here, I would recommend avoiding the pipe, to ensure that we would catch
> problems in the `verify-pack` invocation, and I think we can avoid the
> `grep` altogether:
>
>         git verify-pack -v pack-$PACK.pack >out &&
>         sed -n 's/^\([0-9a-f][0-9a-f]*\).*\(commit\|tree\|blob\)/\1/p' \
>                 <out >obj-list
>

Good suggestion. I will take it.

Thanks.
-Han Xin

> >  '
> >
> >  test_expect_success 'set memory limitation to 1MB' '
> > @@ -45,6 +48,16 @@ test_expect_success 'unpack big object in stream' '
> >       test_dir_is_empty dest.git/objects/pack
> >  '
> >
> > +BATCH_CONFIGURATION='-c core.fsync=loose-object -c core.fsyncmethod=batch'
> > +
> > +test_expect_success 'unpack big object in stream (core.fsyncmethod=batch)' '
> > +     prepare_dest 1m &&
> > +     git $BATCH_CONFIGURATION -C dest.git unpack-objects <pack-$PACK.pack &&
>
> I think the canonical way would be to use `test_config core.fsync ...`,
> but the presented way works, too.
>
> > +     test_dir_is_empty dest.git/objects/pack &&
> > +     git -C dest.git cat-file --batch-check="%(objectname)" <obj-list >current &&
>
> Good. The `--batch-check="%(objectname)"` part forces `cat-file` to read
> the actual object.
>
> > +     cmp obj-list current
> > +'
>
> My main question about this test case is whether it _actually_ verifies
> that the batch-mode `fsync()`ing took place.
>
> I kind of had expected to see Trace2 enabled and a `grep` for
> `fsync/hardware-flush`. Do you think that would still make sense to add?
>
> Thank you for working on the `fsync()` aspects of Git!
> Dscho
>

More rigorous inspection should be adopted.

Thanks.
-Han Xin

> > +
> >  test_expect_success 'do not unpack existing large objects' '
> >       prepare_dest 1m &&
> >       git -C dest.git index-pack --stdin <pack-$PACK.pack &&
> > --
> > 2.36.1
> >
> >



[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]

  Powered by Linux