On 12/31/2015 11:29 AM, Eduardo Valentin wrote:
> can we have a shorter title?
>
> On Tue, Dec 29, 2015 at 02:46:49PM +0530, Keerthy wrote:
>> Hi Nishanth,
>>
>
> <cut>
>>>
>>> I am not sure if this #ifdeffery is even needed.
>>>
>>>
>>> Eduardo, Rui: If this is not the suggested technique, maybe you guys
>>> could suggest how we could handle a case where userspace might be
>>> hungup due to some reason and a case where a critical temperature
>>> event in the middle of device probe was triggered?
>
> Orderly power off is supposed to take care of this. Looking at the code,
> it will force a shutdown in case execution of userland command fails:
>
> static int __orderly_poweroff(bool force)
> {
> 	int ret;
>
> 	ret = run_cmd(poweroff_cmd);
>
> 	if (ret && force) {
> 		pr_warn("Failed to start orderly shutdown: forcing the issue\n");
>
> 		/*
> 		 * I guess this should try to kick off some daemon to sync and
> 		 * poweroff asap. Or not even bother syncing if we're doing an
> 		 * emergency shutdown?
> 		 */
> 		emergency_sync();
> 		kernel_power_off();
> 	}

Yes, it will *IF* userspace fails. The condition I had tracked down
predates the following fix [1]. An example failure is here [2].

In this case, tmp102 is set up for X15 as in [3] and built as a module.
As the kernel starts up the filesystem and a modprobe of all modules
kicks off via udev rules, the probe of tmp102 (falsely) detects a
critical temperature condition. A shutdown attempt in the middle of
driver probe is always tricky business.

Looking at the log in [2]:

Line 472:
> thermal thermal_zone3: critical temperature reached(108 C),shutting down
The userspace trigger for shutdown takes place.

Line 495: INIT: Sending processes the TERM signal
Userspace starts shutting down services (but note that probes for other
devices were either in progress or queued up to complete).

At line 647 we are in a weird place: sysrq shows that the system is
idle and userspace is shut down, yet the system is still active.
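To make the *IF* concrete, here is a minimal userspace model of the check
quoted above. It is a sketch, not the kernel code: model_orderly_poweroff()
and enum poweroff_outcome are names made up for illustration, and
run_cmd_ret stands in for the return value of run_cmd(poweroff_cmd), which
I am assuming is 0 once the userland helper has been started.

```c
#include <stdbool.h>

/* Hypothetical outcome type, used only in this model. */
enum poweroff_outcome {
	LEFT_TO_USERSPACE,	/* kernel takes no further action */
	FORCED_BY_KERNEL	/* emergency_sync() + kernel_power_off() path */
};

/*
 * Model of the decision in the quoted __orderly_poweroff().
 * Assumption: run_cmd_ret == 0 means the helper was started,
 * non-zero means it could not be run at all.
 */
enum poweroff_outcome model_orderly_poweroff(int run_cmd_ret, bool force)
{
	/* Same condition as the quoted code: force only on a failed start. */
	if (run_cmd_ret && force)
		return FORCED_BY_KERNEL;

	/* Helper started (or force not requested): nothing else happens here. */
	return LEFT_TO_USERSPACE;
}
```

With run_cmd_ret == 0 the model returns LEFT_TO_USERSPACE regardless of
force. That matches what the log shows: the helper did start, so nothing in
this path forces the power-off when userspace later stalls.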
In this case, we got into this state thanks to a driver bug, but had it
been a real-world overtemperature event, device damage could take place
OR something much worse. The only alternative is to run a parallel
thread as a backup in case userspace fails to complete the job within
some given period of time, whatever the condition triggering the
problem may be.

I hope this explains the problem.

[1] https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=00917b5c55aeb01322d5ab51af8c025b82959224
[2] http://pastebin.ubuntu.com/14326688/
[3] https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/tree/arch/arm/boot/dts/am57xx-beagle-x15.dts#n738

--
Regards,
Nishanth Menon