Re: [PATCH] usage: trace2 BUG() invocations

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 





On 2/5/21 7:51 AM, Derrick Stolee wrote:
On 2/5/2021 4:01 AM, Jeff King wrote:
On Thu, Feb 04, 2021 at 10:17:29PM -0800, Junio C Hamano wrote:

Jonathan Tan <jonathantanmy@xxxxxxxxxx> writes:

die() messages are traced in trace2, but BUG() messages are not. Anyone
tracking die() messages would have even more reason to track BUG().
Therefore, write to trace2 when BUG() is invoked.

Signed-off-by: Jonathan Tan <jonathantanmy@xxxxxxxxxx>
---
This was noticed when we observed at $DAYJOB that a certain BUG()
invocation [1] wasn't written to traces.

[1] https://lore.kernel.org/git/YBn3fxFe978Up5Ly@xxxxxxxxxx/
---
  t/helper/test-trace2.c   |  9 +++++++++
  t/t0210-trace2-normal.sh | 19 +++++++++++++++++++
  usage.c                  |  6 ++++++
  3 files changed, 34 insertions(+)

Sounds like a good idea.  Expert opinions?

I like the overall idea, but it does open the possibility of a BUG() in
the trace2 code looping infinitely.

I also like the idea. This infinite loop is scary.

We've had a similar problem on the die() side in the past, and solved it
with a recursion flag. But note it gets a bit non-trivial in the face of
threads. There's some discussion in 1ece66bc9e (run-command: use
thread-aware die_is_recursing routine, 2013-04-16).

That commit talks about a case where "die()" in a thread takes down the
thread but not the whole process. That wouldn't be true here (we'd
expect BUG() to take everything down). So a single counter might be OK
in practice, though I suspect we could trigger the problem racily
Likewise this is probably a lurking problem when other threaded code
calls die(), but we just don't do that often enough for anybody to have
noticed.

Would a simple "BUG() has been called" static suffice?

diff --git a/usage.c b/usage.c
index 1868a24f7a..0d2408f79e 100644
--- a/usage.c
+++ b/usage.c
@@ -265,7 +265,11 @@ int BUG_exit_code;
static NORETURN void BUG_vfl(const char *file, int line, const char *fmt, va_list params)
  {
+       static int in_bug = 0;
         char prefix[256];
+       if (in_bug)
+               abort();
+       in_bug = 1;
/* truncation via snprintf is OK here */
         if (file)

Note that the NOTRETURN means we can't no-op with something like

	if (in_bug)
		return;

so the trace2 call would want to be as close to the abort as
possible to avoid a silent failure. So, in the patch...

diff --git a/usage.c b/usage.c
index 1868a24f7a..16272c5348 100644
--- a/usage.c
+++ b/usage.c
@@ -273,6 +273,12 @@ static NORETURN void BUG_vfl(const char *file, int line, const char *fmt, va_lis
  	else
  		snprintf(prefix, sizeof(prefix), "BUG: ");
+ /*
+	 * We call this trace2 function first and expect it to va_copy 'params'
+	 * before using it (because an 'ap' can only be walked once).
+	 */
+	trace2_cmd_error_va(fmt, params);
+
  	vreportf(prefix, fmt, params);

We would want this vreportf() to be before the call to
trace2_cmd_error_va(), right?

There's a subtle quirk in the va_list stuff that we can only traverse
the list once.  So my trace2_ routines always do a `va_copy` and iterate
on it.  This leaves the original `params` untouched when it is handed to
`vreportf()`.

If we want to reorder the output, we'd need to va_copy it first.
(Or teach vreporf to always va_copy its arg.)


Thanks,
-Stolee




[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]

  Powered by Linux