On Thu, Sep 07, 2023 at 08:11:48AM +0100, Lorenzo Stoakes wrote: > On Wed, Sep 06, 2023 at 10:46:08PM +0000, Joel Fernandes wrote: > > On Wed, Sep 06, 2023 at 08:23:18PM +0100, Lorenzo Stoakes wrote: > > > On Tue, 5 Sept 2023 at 12:47, Joel Fernandes <joel@xxxxxxxxxxxxxxxxx> wrote: > > > > > > > > On Tue, Sep 05, 2023 at 08:09:16AM +0100, Lorenzo Stoakes wrote: > > > > > On Mon, Sep 04, 2023 at 06:08:04PM +0000, Joel Fernandes (Google) wrote: > > > > > > It is unsafe to dump vmalloc area information when trying to do so from > > > > > > some contexts. Add a safer trylock version of the same function to do a > > > > > > best-effort VMA finding and use it from vmalloc_dump_obj(). > > > > > > > > > > It'd be nice to have more details as to precisely which contexts and what this > > > > > resolves. > > > > > > > > True. I was hoping the 'trylock' mention would be sufficient (example hardirq > > > > context interrupting a lock-held region) but you're right. > > > > > > > > > > [applied test robot feedback on unused function fix.] > > > > > > [applied Uladzislau feedback on locking.] > > > > > > > > > > > > Reported-by: Zhen Lei <thunder.leizhen@xxxxxxxxxxxxxxx> > > > > > > Cc: Paul E. McKenney <paulmck@xxxxxxxxxx> > > > > > > Cc: rcu@xxxxxxxxxxxxxxx > > > > > > Cc: Zqiang <qiang.zhang1211@xxxxxxxxx> > > > > > > Reviewed-by: Uladzislau Rezki (Sony) <urezki@xxxxxxxxx> > > > > > > Fixes: 98f180837a89 ("mm: Make mem_dump_obj() handle vmalloc() memory") > > > > > > Cc: stable@xxxxxxxxxxxxxxx > > > > > > Signed-off-by: Joel Fernandes (Google) <joel@xxxxxxxxxxxxxxxxx> > > > > > > --- > > > > > > mm/vmalloc.c | 26 ++++++++++++++++++++++---- > > > > > > 1 file changed, 22 insertions(+), 4 deletions(-) > > > > > > > > > > > > diff --git a/mm/vmalloc.c b/mm/vmalloc.c > > > > > > index 93cf99aba335..2c6a0e2ff404 100644 > > > > > > --- a/mm/vmalloc.c > > > > > > +++ b/mm/vmalloc.c > > > > > > @@ -4274,14 +4274,32 @@ void pcpu_free_vm_areas(struct vm_struct **vms, int nr_vms) > > > > > > #ifdef CONFIG_PRINTK > > > > > > bool vmalloc_dump_obj(void *object) > > > > > > { > > > > > > - struct vm_struct *vm; > > > > > > void *objp = (void *)PAGE_ALIGN((unsigned long)object); > > > > > > + const void *caller; > > > > > > + struct vm_struct *vm; > > > > > > + struct vmap_area *va; > > > > > > + unsigned long addr; > > > > > > + unsigned int nr_pages; > > > > > > > > > > > > - vm = find_vm_area(objp); > > > > > > - if (!vm) > > > > > > + if (!spin_trylock(&vmap_area_lock)) > > > > > > + return false; > > > > > > > > > > It'd be good to have a comment here explaining why we must trylock here. I am > > > > > also concerned that in the past this function would return false only if the > > > > > address was not a vmalloc one, but now it might just return false due to lock > > > > > contention and the user has no idea which it is? > > > > > > > > > > I'd want to at least output "vmalloc region cannot lookup lock contention" > > > > > vs. the below cannot find case. > > > > > > > > In the patch 2/2 we do print if the address looks like a vmalloc address even > > > > if the vmalloc look up fails. > > > > > > No, you output exactly what was output before, only changing what it > > > means and in no way differentiating between couldn't find vmalloc > > > area/couldn't get lock. > > > > 2/2 does this: > > - if (virt_addr_valid(object)) > > + if (is_vmalloc_addr(object)) > > + type = "vmalloc memory"; > > + else if (virt_addr_valid(object)) > > type = "non-slab/vmalloc memory"; > > > > This code is executed only if vmalloc_dump_obj() returns false. The > > is_vmalloc_addr() was added by 2/2 which is newly added right? > > > > You are right we are not differentiating between trylock failure and failure to > > find the vmalloc area. I was just saying, even though we don't differentiate, > > we do print "vmalloc memory" right? That wasn't being printed before. > > > > > > Also the reporter's usecase is not a common one. We only attempt to dump > > > > information if there was a debug objects failure (example if somebody did a > > > > double call_rcu). In such a situation, the patch will prevent a deadlock and > > > > still print something about the address. > > > > > > Right, but the function still purports to do X but does Y. > > > > > > > > > > > > Under heavy lock contention aren't you potentially breaking the ability to > > > > > introspect vmalloc addresses? Wouldn't it be better to explicitly detect the > > > > > contexts under which acquiring this spinlock is not appropriate? > > > > > > > > Yes this is a good point, but there's another case as well: PREEMPT_RT can > > > > sleep on lock contention (as spinlocks are sleeping) and we can't sleep from > > > > call_rcu() as it may be called in contexts that cannot sleep. So we handle > > > > that also using trylock. > > > > > > Right so somebody now has to find this email to realise that. I hate > > > implicit knowledge like this, it needs a comment. It also furthers the > > > point that it'd be useful to differentiate between the two. > > > > This is a valid point, and I acknowledged it in last email. A code comment could > > indeed be useful. > > Thanks, yeah this may seem trivial, but I am quite sensitive about things > being added to the code base that are neither described in commit msg nor > in a comment or elsewhere and become 'implicit' in a sense. > > So just a simple comment here would be helpful, and I'm glad we're in > agreement on that, will leave to you to do a follow up patch. > > > > > So I guess from an agreement standpoint, I agree: > > > > 1/2 could use an additional comment explaining why we need trylock (sighting > > the RT sleeping lock issue). > > > > 2/2 could update the existing code to convert "non-slab/vmalloc" to > > "non-slab/non-vmalloc". Note: that's an *existing* issue. > > Yeah sorry this whole thing was rather confusing, it did indeed (unclearly) > specify non-/non- in the past (on assumption dumping function would work), > addition of vmalloc check now makes that correct again, the phrasing is the > issue. > > You can leave this as-is as yeah, you're right, this was a pre-existing issue. > > virt_addr_valid() returns true for a slab addr, but kmem_valid_obj() is > checked above so already been ruled out, now you ruled out vmalloc. > > Just a bit tricksy. > > > > > The issue in 2/2 is not a new one so that can certainly be a separate patch. > > And while at it, we could update the comment in that patch as well. > > > > But the whole differentiating between trylock vs vmalloc area lookup failure > > is not that useful -- just my opinion fwiw! I honestly feel differentiating > > between trylock vs vmalloc area lookup failure complicates the code because > > it will require passing this information down from vmalloc_dump_obj() to the > > caller AFAICS and I am not sure if the person reading the debug will really > > care much. But I am OK with whatever the -mm community wants and I am happy > > to send out a new patch on top with the above that I agree on since Andrew > > took these 2 (but for the stuff I don't agree, I would appreciate if you > > could send a patch for review and I am happy to review it!). > > Ah right, I think maybe I wasn't clear, all I meant to suggest is to output > log output rather than feed anything back to caller, something like:- > > if (!spin_trylock(&vmap_area_lock)) { > pr_cont(" [couldn't acquire vmap lock]\n"); > ... > } > > My concern is that in the past this function would only return false if it > couldn't find the address in a VA, now it returns false also if you happen > to call it when the spinlock is locked, which might be confusing for > somebody debugging this. > > HOWEVER, since you now indicate that the address is vmalloc anyway, and you > _absolutely cannot_ give any further details safely, perhaps this > additional information is indeed not that usful. > > My concern was just feeling antsy that we suddenly don't do something > because a lock happens to be applied but as you say that cannot be helped > in certain contexts. > > So actually, leave this. > > > > > As you mentioned, this series is a stability fix and we can put touch-ups on > > top of it if needed, and there is also plenty of time till the next merge > > window. Allow me a few days and I'll do the new patch on top (I'd say dont > > bother to spend your time on it, I'll do it). > > Ack, I was just a little frustrated we didn't reach a resolution on review > (either deciding things could be deferred or having changes) before > merge. Obviously fine to prioritise, but would be good to have that > explicitly stated. > > > > > thanks, > > > > - Joel > > > > > > > > > > > > Anyway, so TL;DR:- > > 1. As we both agree, add a comment to explain why you need the spin trylock. > (there are no further steps :P) > > And I don't believe this actually needs any further changes after this > discussion*, so if you fancy doing a follow up to that effect that will > suffice for me thanks! > For PREEMPT_RT kernels we are not allowed to use "vmap parts" in non slepable context, this is just reality, because we use a sleep type of spinlock. I am not sure how urgent we need this fix. But to me it looks like debuging and corner case. Probably i am wrong and miss something. But if it is correct, i would just bailout for RT kernel and rework later in a more proper way. For example implement a safe way of RCU scan but this is also another story. -- Uladzislau Rezki