On 1/31/23 8:10 AM, David Hildenbrand wrote:
> On 31.01.23 16:04, Jens Axboe wrote:
>> On 1/31/23 8:02 AM, David Hildenbrand wrote:
>>> On 31.01.23 15:50, Jens Axboe wrote:
>>>> On 1/31/23 6:48 AM, David Hildenbrand wrote:
>>>>> On 31.01.23 14:41, David Howells wrote:
>>>>>> David Hildenbrand <david@xxxxxxxxxx> wrote:
>>>>>>
>>>>>>>>> percpu counters maybe - add them up at the point of viewing?
>>>>>>>> They are percpu, see my last email. But for every 108 changes (on
>>>>>>>> my system), they will do two atomic_long_adds(). So not very
>>>>>>>> useful for anything but low frequency modifications.
>>>>>>>>
>>>>>>>
>>>>>>> Can we just treat the whole acquired/released accounting as a debug mechanism
>>>>>>> to detect missing releases and do it only for debug kernels?
>>>>>>>
>>>>>>>
>>>>>>> The pcpu counter is an s8, so we have to flush on a regular basis and cannot
>>>>>>> really defer it any longer ... but I'm curious if it would be of any help to
>>>>>>> only have a single PINNED counter that goes into both directions (inc/dec on
>>>>>>> pin/release), to reduce the flushing.
>>>>>>>
>>>>>>> Of course, once we pin/release more than ~108 pages in one go or we switch
>>>>>>> CPUs frequently it won't be that much of a help ...
>>>>>>
>>>>>> What are the stats actually used for? Is it just debugging, or do we actually
>>>>>> have users for them (control groups spring to mind)?
>>>>>
>>>>> As it's really just "how many pinning events" vs. "how many unpinning
>>>>> events", I assume it's only for debugging.
>>>>>
>>>>> For example, if you pin the same page twice it would not get accounted
>>>>> as "a single page is pinned".
>>>>
>>>> How about something like the below then? I can send it out as a real
>>>> patch, will run a sanity check on it first but would be surprised if
>>>> this doesn't fix it.
>>>>
>>>>
>>>> diff --git a/mm/gup.c b/mm/gup.c
>>>> index f45a3a5be53a..41abb16286ec 100644
>>>> --- a/mm/gup.c
>>>> +++ b/mm/gup.c
>>>> @@ -168,7 +168,9 @@ struct folio *try_grab_folio(struct page *page, int refs, unsigned int flags)
>>>>  	 */
>>>>  	smp_mb__after_atomic();
>>>>  
>>>> +#ifdef CONFIG_DEBUG_VM
>>>>  	node_stat_mod_folio(folio, NR_FOLL_PIN_ACQUIRED, refs);
>>>> +#endif
>>>>  
>>>>  	return folio;
>>>>  }
>>>> @@ -180,7 +182,9 @@ struct folio *try_grab_folio(struct page *page, int refs, unsigned int flags)
>>>>  static void gup_put_folio(struct folio *folio, int refs, unsigned int flags)
>>>>  {
>>>>  	if (flags & FOLL_PIN) {
>>>> +#ifdef CONFIG_DEBUG_VM
>>>>  		node_stat_mod_folio(folio, NR_FOLL_PIN_RELEASED, refs);
>>>> +#endif
>>>>  		if (folio_test_large(folio))
>>>>  			atomic_sub(refs, folio_pincount_ptr(folio));
>>>>  		else
>>>> @@ -236,8 +240,9 @@ int __must_check try_grab_page(struct page *page, unsigned int flags)
>>>>  		} else {
>>>>  			folio_ref_add(folio, GUP_PIN_COUNTING_BIAS);
>>>>  		}
>>>> -
>>>> +#ifdef CONFIG_DEBUG_VM
>>>>  		node_stat_mod_folio(folio, NR_FOLL_PIN_ACQUIRED, 1);
>>>> +#endif
>>>>  	}
>>>>  
>>>>  	return 0;
>>>>
>>>
>>> We might want to hide the counters completely by defining them only
>>> with CONFIG_DEBUG_VM.
>>
>> Are all of them debug aids only? If so, yes we should just have
>> node_stat_* under CONFIG_DEBUG_VM.
>>
>
> Rather only these 2. Smth like:

Ah gotcha, that makes more sense to me. Will update the patch and send it
out.

-- 
Jens Axboe
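
For readers following the suggestion above, the idea reduces to compiling
the accounting out entirely when the debug option is off, so call sites need
no #ifdef at all. Below is a minimal, standalone sketch of that pattern
only, not the actual kernel change: MY_DEBUG_VM and the pin_stat_*()
helpers are hypothetical stand-ins for CONFIG_DEBUG_VM and the
NR_FOLL_PIN_* node stat updates done via node_stat_mod_folio().

	/*
	 * Illustrative sketch only. MY_DEBUG_VM stands in for
	 * CONFIG_DEBUG_VM; pin_stat_acquired()/pin_stat_released()
	 * stand in for the NR_FOLL_PIN_ACQUIRED/RELEASED updates.
	 */
	#include <stdio.h>

	#ifdef MY_DEBUG_VM
	static long nr_pin_acquired;
	static long nr_pin_released;
	#define pin_stat_acquired(refs)	(nr_pin_acquired += (refs))
	#define pin_stat_released(refs)	(nr_pin_released += (refs))
	#else
	/* compiled out: call sites stay clean, zero runtime cost */
	#define pin_stat_acquired(refs)	do { } while (0)
	#define pin_stat_released(refs)	do { } while (0)
	#endif

	static void fake_pin_pages(int refs)
	{
		/* ... pinning work would go here ... */
		pin_stat_acquired(refs);
	}

	static void fake_unpin_pages(int refs)
	{
		/* ... unpinning work would go here ... */
		pin_stat_released(refs);
	}

	int main(void)
	{
		fake_pin_pages(3);
		fake_unpin_pages(3);
	#ifdef MY_DEBUG_VM
		printf("acquired=%ld released=%ld\n",
		       nr_pin_acquired, nr_pin_released);
	#else
		printf("pin accounting compiled out\n");
	#endif
		return 0;
	}

Compiling with and without -DMY_DEBUG_VM shows the two configurations; in
the latter case the preprocessor discards the accounting before the
compiler ever sees it, which is the effect the thread is after for
non-debug kernels.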