cc'ing Andrew On Mon, Apr 27, 2015 at 04:27:16PM -0400, Chris Metcalf wrote: > I've been out on vacation the last ten days, but picking this up > again now. > > I'll wait a bit before putting out a v10, and also address Uli's additional > emails. Meanwhile, who is the right person to eventually pick up this patchset > and push it up to Linus? Frederic, Don, Thomas, akpm? v9 is here: I usually resubmit watchdog changes with my signoff to Andrew. But would just my ACK be ok, Andrew? Cheers, Don > > https://lkml.org/lkml/2015/4/17/697 > > And I haven't heard any feedback on my fix to /proc/self/stat etc. to > avoid showing the PARKED threads in "R" state (patch 3/3 from that series). > > Thanks for any guidance. > > > On 04/22/2015 11:21 AM, Don Zickus wrote: > >On Wed, Apr 22, 2015 at 07:02:31AM -0400, Ulrich Obergfell wrote: > >>Chris, > >> > >>in https://lkml.org/lkml/2015/4/17/616 you stated: > >> > >> ">> + alloc_cpumask_var(&watchdog_cpumask_for_smpboot, GFP_KERNEL); > >> > > >> > alloc_cpumask_var could fail? > >> > >> Good catch; if I get a failure I'll just return early without trying to > >> start the watchdog, since clearly things are too memory-constrained > >> to enable that functionality anyway." > >> > >>Let's assume that (in spite of the memory constraints) the kernel would still > >>be able to make progress and get to a point where the system will be usable. > >>In this corner case, the following code would leave a NULL pointer behind in > >>watchdog_cpumask and in watchdog_cpumask_bits which could subsequently lead > >>to a crash. > >> > >> void __init lockup_detector_init(void) > >> { > >> set_sample_period(); > >>+ if (!alloc_cpumask_var(&watchdog_cpumask, GFP_KERNEL)) { > >>+ pr_err("Failed to allocate cpumask for watchdog"); > >>+ return; > >>+ } > >>+ watchdog_cpumask_bits = cpumask_bits(watchdog_cpumask); > >> > >>For example, proc_watchdog_cpumask() and the change that your patch introduces > >>in watchdog_enable_all_cpus() are not protected against a possible NULL pointer. > >>I think the code needs to be made safer. > >Or we could just statically allocate it > > > >static DECLARE_BITMAP(watchdog_cpumask, NR_CPUS) __read_mostly; > > > >Cheers, > >Don > > I think Don's suggestion is best here. It's too intrusive to try to check > for the out-of-memory condition everywhere in the code, just to guard > against the possibility that a system that is already out of memory while > starting the watchdog still has users trying to fiddle with the > /proc/sys/kernel/watchdog* knobs. > > The diff against v9 is just this (plus changing watchdog_cpumask to > &watchdog_cpumask in a bunch of places): > > diff --git a/kernel/watchdog.c b/kernel/watchdog.c > index 8875717b6616..ec742f38c90d 100644 > --- a/kernel/watchdog.c > +++ b/kernel/watchdog.c > @@ -57,8 +57,8 @@ int __read_mostly sysctl_softlockup_all_cpu_backtrace; > #else > #define sysctl_softlockup_all_cpu_backtrace 0 > #endif > -static cpumask_var_t watchdog_cpumask; > -unsigned long *watchdog_cpumask_bits; > +static struct cpumask __read_mostly; > +unsigned long *watchdog_cpumask_bits = cpumask_bits(watchdog_cpumask); > /* Helper for online, unparked cpus. */ > #define for_each_watchdog_cpu(cpu) \ > @@ -913,12 +913,6 @@ void __init lockup_detector_init(void) > { > set_sample_period(); > - if (!alloc_cpumask_var(&watchdog_cpumask, GFP_KERNEL)) { > - pr_err("Failed to allocate cpumask for watchdog"); > - return; > - } > - watchdog_cpumask_bits = cpumask_bits(watchdog_cpumask); > - > #ifdef CONFIG_NO_HZ_FULL > if (!cpumask_empty(tick_nohz_full_mask)) > pr_info("Disabling watchdog on nohz_full cores by default\n"); > > That said, presumably we need to schedule a cage match between Frederic and Don > to decide on whether it's best to statically allocate cpumasks or not :-) > > https://lkml.org/lkml/2015/4/16/416 > > My sense is that in this case it's appropriate, since it's much harder to > manage the failure case, whereas in the earlier discussion for > smpboot_update_cpumask_percpu_thread() it made sense to just give up and > return a quick ENOMEM. Also, in this case we have no locking issues. > -- > Chris Metcalf, EZChip Semiconductor > http://www.ezchip.com > -- To unsubscribe from this list: send the line "unsubscribe linux-doc" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html