ab/squelch-empty-fsync-traces & hx/unpack-streaming bug (was: What's cooking in git.git (Jul 2022, #04; Wed, 13))

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Wed, Jul 13 2022, Junio C Hamano wrote:

> * ab/squelch-empty-fsync-traces (2022-06-30) 1 commit
>  . trace2: don't include "fsync" events in all trace2 logs
>
>  Omit fsync-related trace2 entries when their values are all zero.
>
>  Breaks tests in hx/unpack-streaming with an interesting interaction.
>  source: <patch-v2-1.1-a1fc37de947-20220630T084607Z-avarab@xxxxxxxxx>

[...]

> * hx/unpack-streaming (2022-06-13) 6 commits
>   (merged to 'next' on 2022-07-08 at 4eb375ec2f)
>  + unpack-objects: use stream_loose_object() to unpack large objects
>  + core doc: modernize core.bigFileThreshold documentation
>  + object-file.c: add "stream_loose_object()" to handle large object
>  + object-file.c: factor out deflate part of write_loose_object()
>  + object-file.c: refactor write_loose_object() to several steps
>  + unpack-objects: low memory footprint for get_data() in dry_run mode
>
>  Allow large objects read from a packstream to be streamed into a
>  loose object file straight, without having to keep it in-core as a
>  whole.
>
>  Will merge to 'master'.
>  source: <cover.1654914555.git.chiyutianyi@xxxxxxxxx>

I hadn't had time to look at this until now. There's some interesting
behavior here.

The code to check the hardware flush was added in aaf81223f48
(unpack-objects: use stream_loose_object() to unpack large objects,
2022-06-11) (that series is now on master).

But as my ab/squelch-empty-fsync-traces notes we always add this to the
event, so the:

	grep fsync/hardware-flush trace2.txt &&

Is equivalent to:

	true &&

I.e. it's not testing worthwhile at all. The reason you're seeing a
failure is deu to 412e4caee38 (tests: disable fsync everywhere,
2021-10-29), i.e. our tests disable fsync(). What you have queued will
pass as:

	GIT_TEST_FSYNC=true ./t5351-unpack-large-objects.sh

But I think that would be meaningless, since we'll write out that on
FSYNC_HARDWARE_FLUSH whether we actually support "bulk" or not. AFAICT
the way to detect if we support "bulk" at all is to check for
fsync/writeout-only.

*Except* that we we unconditionally increment the "writeout only"
counter, even if we don't actually support that "bulk" mode. We're just
doing a regular fsync().

So, narrowly it looks easy to "fix" my ab/squelch-empty-fsync-traces, I
could apply this on top:

diff --git a/t/t5351-unpack-large-objects.sh b/t/t5351-unpack-large-objects.sh
index 8ce8aa3b147..29cab843eb9 100755
--- a/t/t5351-unpack-large-objects.sh
+++ b/t/t5351-unpack-large-objects.sh
@@ -53,8 +53,12 @@ BATCH_CONFIGURATION='-c core.fsync=loose-object -c core.fsyncmethod=batch'
 test_expect_success 'unpack big object in stream (core.fsyncmethod=batch)' '
 	prepare_dest 1m &&
 	GIT_TRACE2_EVENT="$(pwd)/trace2.txt" \
+	GIT_TEST_FSYNC=true \
 		git -C dest.git $BATCH_CONFIGURATION unpack-objects <pack-$PACK.pack &&
-	grep fsync/hardware-flush trace2.txt &&
+	grep fsync/ trace2.txt >wo.txt &&
+	sed -e "s/.*value\":\"//" -e "s/\".*//" <wo.txt >actual &&
+	test_write_lines 6 1 >expect &&
+	test_cmp expect actual &&
 	test_dir_is_empty dest.git/objects/pack &&
 	git -C dest.git cat-file --batch-check="%(objectname)" <obj-list >current &&
 	cmp obj-list current

But does this make any sense in the larger scheme of things?  I.e. the
trace2 logging isn't at all logging that we're actually doing with
fsync, but what we intended to do based on the application logic, is
that intended & OK or not?



[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]

  Powered by Linux