On Tue, 2020-05-12 at 18:22 +0200, Dmitry Vyukov wrote: > On Tue, May 12, 2020 at 6:14 PM Paul E. McKenney <paulmck@xxxxxxxxxx> wrote: > > > > > > > > This feature will record first and last call_rcu() call stack and > > > > > > > > print two call_rcu() call stack in KASAN report. > > > > > > > > > > > > > > Suppose that a given rcu_head structure is passed to call_rcu(), then > > > > > > > the grace period elapses, the callback is invoked, and the enclosing > > > > > > > data structure is freed. But then that same region of memory is > > > > > > > immediately reallocated as the same type of structure and again > > > > > > > passed to call_rcu(), and that this cycle repeats several times. > > > > > > > > > > > > > > Would the first call stack forever be associated with the first > > > > > > > call_rcu() in this series? If so, wouldn't the last two usually > > > > > > > be the most useful? Or am I unclear on the use case? > > > > > > > > > > 2 points here: > > > > > > > > > > 1. With KASAN the object won't be immediately reallocated. KASAN has > > > > > 'quarantine' to delay reuse of heap objects. It is assumed that the > > > > > object is still in quarantine when we detect a use-after-free. In such > > > > > a case we will have proper call_rcu stacks as well. > > > > > It is possible that the object is not in quarantine already and was > > > > > reused several times (quarantine is not infinite), but then KASAN will > > > > > report non-sense stacks for allocation/free as well. So wrong call_rcu > > > > > stacks are less of a problem in such cases. > > > > > > > > > > 2. We would like to memorize 2 last call_rcu stacks regardless, but we > > > > > just don't have a good place for the index (bit which of the 2 is the > > > > > one to overwrite). Probably could shove it into some existing field, > > > > > but then will require atomic operations, etc. > > > > > > > > > > Nobody knows how well/bad it will work. I think we need to get the > > > > > first version in, deploy on syzbot, accumulate some base of example > > > > > reports and iterate from there. > > > > > > > > If I understood the stack-index point below, why not just move the > > > > previous stackm index to clobber the previous-to-previous stack index, > > > > then put the current stack index into the spot thus opened up? > > > > > > We don't have any index in this change (don't have memory for such index). > > > The pseudo code is" > > > > > > u32 aux_stacks[2]; // = {0,0} > > > > > > if (aux_stacks[0] != 0) > > > aux_stacks[0] = stack; > > > else > > > aux_stacks[1] = stack; > > > > I was thinking in terms of something like this: > > > > u32 aux_stacks[2]; // = {0,0} > > > > if (aux_stacks[0] != 0) { > > aux_stacks[0] = stack; > > } else { > > if (aux_stacks[1]) > > aux_stacks[0] = aux_stacks[1]; > > aux_stacks[1] = stack; > > } > > > > Whether this actually makes sense in real life, I have no idea. > > The theory is that you want the last two stacks. However, if these > > elements get cleared at kfree() time, then I could easily believe that > > the approach you already have (first and last) is the way to go. > > > > Just asking the question, not arguing for a change! > > Oh, this is so obvious... in hindsight! :) > > Walter, what do you think? > u32 aux_stacks[2]; // = {0,0} if (aux_stacks[0] != 0) { aux_stacks[0] = stack; } else { if (aux_stacks[1]) aux_stacks[0] = aux_stacks[1]; aux_stacks[1] = stack; } Hmm...why I think it will always cover aux_stacks[0] after aux_stacks[0] has stack, it should not record last two stacks? How about this: u32 aux_stacks[2]; // = {0,0} if (aux_stacks[1]) aux_stacks[0] = aux_stacks[1]; aux_stacks[1] = stack; > I would do this. I think latter stacks are generally more interesting > wrt shedding light on a bug. The first stack may even be "statically > known" (e.g. if object is always queued into a workqueue for some lazy > initialization during construction). I think it make more sense to record latter stack, too. Thanks for your and Paul's suggestion.