On Wed, Jan 08, 2025 at 11:22:42AM +0900, Koichiro Den wrote: > On Tue, Jan 07, 2025 at 11:00:10AM +0000, Lorenzo Stoakes wrote: > > On Tue, Jan 07, 2025 at 06:29:03PM +0800, Huacai Chen wrote: > > > Hi, Lorenzo, > > > > > > On Tue, Jan 7, 2025 at 4:48 PM Lorenzo Stoakes > > > <lorenzo.stoakes@xxxxxxxxxx> wrote: > > > > > > > > On Tue, Jan 07, 2025 at 09:18:48AM +0800, Huacai Chen wrote: > > > > > Hi, all, I like simple fixes, so is this acceptable (based on an early > > > > > version from Koichiro Den)? > > > > > > > > No not at all. This is bizarre - in the mail you are replying to Koichiro > > > > agrees with me that the approach of his code that you've sent here (I don't > > > > know why) is fundamentally broken and suggest another. > > > > > > > > I am at a loss as to why you've sent this? Perhaps a miscommunication > > > > somewhere? :) > > > > > > > > In any case, please don't send '[PATCH] xxx' mails that aren't intended to > > > > be patches, a better form of this would be to say 'oh can we just do...' > > > > then to put this code in the mail underneath, without any '[PATCH]' prefix. > > > I wasn't in the CC list, and I also found the bug yesterday, so I can > > > only reply to this email with "git send-email --in-reply-to". This is > > > the reason why my reply looks so stranne. > > > > > > > > > > > But please do review the discussion - the below is insufficient as simple > > > > as it is (sadly) because the boot CPU's delayed work will never be > > > > executed. > > > Koichiro's simple fix causes the boot CPU's delayed work to never be > > > executed, this is obvious, and of course I have read the early > > > discussion. And so I improve it, with a "cpu_online()" checking, then > > > the boot CPU is unaffected. > > > > With respect Haucai, this is not how you engage in kernel discussion. You > > could simply have replied to the mail or given more information, you now > > have two people telling you this, please take it on board. > > > > And it caused me to miss your actually quite valuable suggestion so this is > > beneficial for all! :) > > > > ANYWAY, moving on to the technical side: > > > > > > > > Huacai > > > > > > > > > > > I will take a look at Koichiro's new approach as soon as I am able. > > > > > > > > Cheers! > > > > > > > > > --- > > > > > mm/vmstat.c | 7 ++++++- > > > > > 1 file changed, 6 insertions(+), 1 deletion(-) > > > > > > > > > > diff --git a/mm/vmstat.c b/mm/vmstat.c > > > > > index 0889b75cef14..1badc24a21ff 100644 > > > > > --- a/mm/vmstat.c > > > > > +++ b/mm/vmstat.c > > > > > @@ -2122,10 +2122,15 @@ static void __init start_shepherd_timer(void) > > > > > { > > > > > int cpu; > > > > > > > > > > - for_each_possible_cpu(cpu) > > > > > + for_each_possible_cpu(cpu) { > > > > > INIT_DEFERRABLE_WORK(per_cpu_ptr(&vmstat_work, cpu), > > > > > vmstat_update); > > > > > > > > > > + /* Will be enabled on vmstat_cpu_online() */ > > > > > + if (!cpu_online(cpu)) > > > > This might actually be workable as something simpler, on assumption there > > Sorry about late response. And thank you very much, Huacai. Your suggestion > looks great. Much simpler and intuitive. > > > can be no race here (I don't think so right?). > > IIUC, there is no race since smp_init() always runs before init_mm_internals(). Oops, critical typo. s/before/after/. > At this stage (i.e. start_shepherd_timer()), for cpuhp scenario, the secondary > CPUs haven’t even been brought up yet. So the simple 'cpu_online' check looks > simple and the best. > > > > > Koichiro - could you check this and see whether it resolves the issue and > > whether you feel this is sensible? > > I tested it and it seems to work well both on booting and cpuhp events. > > (By the way, in my previous email comment [1], I forgot to mention that I also > applied other unmerged patches, [2] and [3], just to be able to run such random > transisioning test. This time here, I just tested by switching CPUHP_ONLINE <-> > CPUHP_OFFLINE only.) > > [1]: https://lore.kernel.org/linux-mm/7ed97096-859e-46d0-8f27-16a2298a8914@lucifer.local/T/#m11d983715699d3cea295b8618aba7b6ccec4db55 > [2]: https://lore.kernel.org/all/20241220134421.3809834-1-koichiro.den@xxxxxxxxxxxxx/ > [3]: https://lore.kernel.org/all/20241220141538.4018232-1-koichiro.den@xxxxxxxxxxxxx/ > > > > > Might be worth expanding comment to say that we disable on offline, enable > > on online and we're just providing symmetry here. > > Sounds good. Like this? > > - for_each_possible_cpu(cpu) > + for_each_possible_cpu(cpu) { > INIT_DEFERRABLE_WORK(per_cpu_ptr(&vmstat_work, cpu), > vmstat_update); > > + /* > + * For secondary CPUs during CPU hotplug scenarios, > + * vmstat_cpu_online() will enable the work. > + * mm/vmstat:online enables and disables vmstat_work > + * symmetrically during CPU hotplug events. > + */ > + if (!cpu_online(cpu)) > + disable_delayed_work_sync(&per_cpu(vmstat_work, cpu)); > + } > > > Before submitting a revised patch, I'd like to confirm: > Huacai, would you be comfortable with me adding your Signed-off-by to the > commit, or would you prefer a Suggested-by tag instead? > > Thanks. > > -Koichiro Den > > > > > > > > + disable_delayed_work_sync(&per_cpu(vmstat_work, cpu)); > > > > > + } > > > > > + > > > > > schedule_delayed_work(&shepherd, > > > > > round_jiffies_relative(sysctl_stat_interval)); > > > > > } > > > > > -- > > > > > 2.43.5 > > > > > > > > > > > > Cheers, Lorenzo >