Re: t7527 intermittent failure on macOS APFS and possible fix

On 2022-08-12 11:08:30-0700, Junio C Hamano <gitster@xxxxxxxxx> wrote:
> Đoàn Trần Công Danh  <congdanhqx@xxxxxxxxx> writes:
> 
> > Running t7527 on macOS with an encrypted APFS filesystem,
> > I observe intermittent failures; however, when I manually check the
> > test cases, they all pass.
> >
> > I suspected a filesystem caching issue.
> > I added those syncs to the test steps and the tests pass.
> > I'm not sure if this is the intended "fix" for the tests
> > since we're testing the fsmonitor with t7527.
> >
> > Please advise!
> 
> fsmonitor experts, please do.
> 
> My gut feeling is that, unless the production code internally calls
> the equivalent of "sync" done here and the failures in tests are
> coming from the fact that the "sync" is missing in "test-tool
> fsmonitor-client" (i.e. test-tool does not emulate the real world
> closely enough and fails in cases where the machinery does not fail
> in the real world), these "sync" calls only sweep the problem under
> the rug.

It's my gut feeling, too.
Anyway, I'm running the test again to confirm your suggestion about
t/helper/test-fsmonitor-client.c

t7527.63 (Unicode nfc/nfd) also failed intermittently.
Here is the content of unicode.trace:
---- 8< ----
$ cat unicode.trace
statfs('/path/to/src/git/t/trash directory.t7527-builtin-fsmonitor/test_unicode') [type 0x0000001a][flags 0x04909080] 'apfs'
statfs('/path/to/src/git/t/trash directory.t7527-builtin-fsmonitor/test_unicode') [type 0x0000001a][flags 0x04909080] 'apfs'
Watching: worktree '/path/to/src/git/t/trash directory.t7527-builtin-fsmonitor/test_unicode'
statfs('/path/to/src/git/t/trash directory.t7527-builtin-fsmonitor/test_unicode') [type 0x0000001a][flags 0x04909080] 'apfs'
requested token: quit
---------- >8 ------------------

> 
> > P.S. When debugging, I also found out that
> > "test-tool fsmonitor-client query" doesn't write the final newline
> > character, which makes the output harder to read. The diff also adds
> > the final newline.
> >
> > ----- 8< -------
> > diff --git a/t/helper/test-fsmonitor-client.c b/t/helper/test-fsmonitor-client.c
> > index 54a4856c48..98d6cf1440 100644
> > --- a/t/helper/test-fsmonitor-client.c
> > +++ b/t/helper/test-fsmonitor-client.c
> > @@ -55,6 +55,7 @@ static int do_send_query(const char *token)
> >  
> >  	write_in_full(1, answer.buf, answer.len);
> >  	strbuf_release(&answer);
> > +	write_in_full(1, "\n", 1);
> >  
> >  	return 0;
> >  }
> > @@ -77,6 +78,7 @@ static int do_send_flush(void)
> >  
> >  	write_in_full(1, answer.buf, answer.len);
> >  	strbuf_release(&answer);
> > +	write_in_full(1, "\n", 1);
> >  
> >  	return 0;
> >  }
> 
> Aren't these protocol drivers?

The "answer" strbuf is the response from the fsmonitor daemon, I think.
The write_in_full() to fd 1 is test-tool writing that answer to
stdout.

> If the protocol is defined without
> the trailing LF, would it make sense to update only the sending end
> to do this?  Or does the protocol make it clear that a trailing LF,
> or lack of it, should be tolerated by all the implementations?
> 
> If we are absolutely sure that no implementation of the other side
> will get upset by seeing an extra LF, it would be fine, but as the
> original code wants to call write_in_full(), it would be
> preferable to do it this way instead, I suspect.
> 
> +	strbuf_complete(&answer, '\n');
> 	write_in_full(1, answer.buf, answer.len);
> 	strbuf_release(&answer);

This could work, since we don't send "answer" back to
fsmonitor-daemon.
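
For reference, my understanding is that strbuf_complete() appends the
terminator only when the buffer is non-empty and does not already end
with it, so an answer that already ends in LF would not get a second
one. Below is a minimal standalone sketch of that behaviour. It uses a
plain char buffer and a hypothetical complete_with() helper rather than
git's struct strbuf; the helper name and the fixed-capacity buffer are
assumptions for illustration only, not code from git.

```c
#include <stddef.h>

/*
 * Hypothetical stand-in for strbuf_complete(), working on a plain
 * buffer: append `term` to the `len` bytes in `buf` only when the
 * data is non-empty and does not already end with `term`.
 * `cap` is the buffer capacity; nothing is appended if it is full.
 * Returns the resulting length.
 */
static size_t complete_with(char *buf, size_t len, size_t cap, char term)
{
	if (len && buf[len - 1] != term && len < cap)
		buf[len++] = term;
	return len;
}
```

With this, "abc" becomes "abc\n", while "abc\n" and an empty buffer are
left untouched, which is why it seems safer than an unconditional
write_in_full(1, "\n", 1).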

-- 
Danh


