On Mon, 30 May 2022 at 18:55, H. Nikolaus Schaller <hns@xxxxxxxxxxxxx> wrote:
>
> Hi Ulf,
> users did report a strange issue that the OMAP5 based Pyra does not
> shut down if a kernel 5.10.116 is used.
>
> Someone did a bisect and found that reverting
>
> 0d66b395210c5084c2b7324945062c1d1f95487a
>
> resp. upstream
>
> 66c915d09b942fb3b2b0cb2f56562180901fba17
>
> solves it.
>
> I could now confirm that it also happens with v5.18.0.
> But interestingly only on the Pyra handheld device and not
> on the omap5evm (which is supported by mainline).
>
> The symptom is:
>
> a) without revert
>
> root@letux:~# poweroff
>
> Broadcast message from root@letux (console) (Sat Jan 1 01:08:25 2000):
>
> The system is going down for system halt NOW!
> INIT: Sending processes the TERM signal
> root@letux:~# [info] Using makefile-style concurrent boot in runlevel 0.
> [....] Stopping cgroup management proxy daemon: cgproxy[....] Stopping cgroup management daemon: cgmanager[....] Stop[ ok bluetooth: /usr/sbin/bluetoothd.
> [FAIL] Stopping ISC DHCP server: dhcpd failed!
> dhcpcd[3055]: sending signal 15 to pid 2976
> dhcpcd[3055]: waiting for pid 2976 to exit
> [ ok ] Shutting down ALSA...done.
> [ ok ] Asking all remaining processes to terminate...done.
> [ ok ] All processes ended within 2 seconds...done.
> [ ok [[c[....] Stopping enhanced syslogd: rsyslogd.
> [ ok ....] Deconfiguring network interfaces...done.
> ^[[c[info] Saving the system clock.
> [info] Hardware Clock updated to Sat Jan 1 01:08:30 UTC 2000.
> [ ok ] Deactivating swap...done.
> ^[[c[ 77.289332] EXT4-fs (mmcblk0p2): re-mounted. Quota mode: none.
> [info] Will now halt.
>
> b) with reverting your patch
>
> root@letux:~# uname -a
> Linux letux 5.18.0-letux-lpae+ #9678 SMP PREEMPT Mon May 30 18:02:28 CEST 2022 armv7l GNU/Linux
> root@letux:~# poweroff
>
> Broadcast message from root@letux (console) (Sat Jan 1 01:39:15 2000):
>
> The system is going down for system halt NOW!
> INIT: Sending processes the TERM signal
> root@letux:~# [info] Using makefile-style concurrent boot in runlevel 0.
> [FAIL] Stopping cgroup management proxy daemon: cgproxy[....] Stopping ISC DHCP server: dhcpd failed!
> [....] Stopping cgroup management daemon: cgmanagerdhcpcd[3100]: sending signal 15 to pid 3013
> dhcpcd[3100]: waiting for pid 3013 to exit
> [ ok ] Stopping bluetooth: /usr/sbin/bluetoothd.
> [ ok ] Shutting down ALSA...done.
> [ ok ] Asking all remaining processes to terminate...done.
> [ ok ] All processes ended within 3 seconds...done.
> [ ok [[c[....] Stopping enhanced syslogd: rsyslogd.
> [ ok ....] Deconfiguring network interfaces...done.
> ^[[c[info] Saving the system clock.
> [info] Hardware Clock updated to Sat Jan 1 01:39:21 UTC 2000.
> [ ok ] Deactivating swap...done.
> ^[[c[ 44.563256] EXT4-fs (mmcblk0p2): re-mounted. Quota mode: none.
> [info] Will now halt.
> [ 46.917534] reboot: Power down
>
>
> What I suspect is that we have multiple mmc interfaces and have
> card detect wired up in the Pyra while it is ignored in the
> EVM. Is it possible that __mmc_stop_host() never returns in
> .shutdown_pre if card detect is set up (and potentially
> shut down earlier)?
>
> Setup of mmc is done in omap5-board-common.dtsi and omap5.dtsi.
>
> Our Pyra has a non-upstream device tree where we use
> omap5-board-common.dtsi and override it by e.g.
>
> &mmc4 { /* second (u)SD slot (SDIO capable) */
> 	status = "okay";
> 	vmmc-supply = <&ldo2_reg>;
> 	pinctrl-names = "default";
> 	pinctrl-0 = <&mmc4_pins>;
> 	bus-width = <4>;
> 	cd-gpios = <&gpio3 13 GPIO_ACTIVE_LOW>; /* gpio3_77 */
> 	wp-gpios = <&gpio3 15 GPIO_ACTIVE_HIGH>; /* gpio3_79 */
> };
>
> But I have tried to remove the cd-gpios and wp-gpios. Or to disable
> the mmc interface (but I may not have caught everything
> in the first place).
>
> Then I added some printk to mmc_stop_host() and __mmc_stop_host().
>
> mmc_stop_host() is not called but __mmc_stop_host() is called 4 times.
> There are 4 active MMC interfaces in the Pyra - 3 for (µ)SD slots
> and one for an SDIO WLAN module.
>
> Now it looks as if 3 of them are properly torn down (two of them
> seem to have host->slot.cd_irq >= 0) but on the fourth call
> cancel_delayed_work_sync(&host->detect); does not return. This is
> likely the location of the stall and why we don't see a "reboot: Power down"
>
> Any ideas?

I guess the call to cancel_delayed_work_sync() in __mmc_stop_host()
hangs for one of the mmc hosts. This shouldn't happen - and indicates
that something else is wrong.

See more suggestions below.

>
> BR and thanks,
> Nikolaus
>
> printk hack:
>
> void __mmc_stop_host(struct mmc_host *host)
> {
> 	printk("%s 1\n", __func__);
> 	if (host->slot.cd_irq >= 0) {
> 		printk("%s 2\n", __func__);
> 		mmc_gpio_set_cd_wake(host, false);
> 		printk("%s 3\n", __func__);
> 		disable_irq(host->slot.cd_irq);
> 		printk("%s 4\n", __func__);
> 	}
>
> 	host->rescan_disable = 1;
> 	printk("%s 5\n", __func__);

My guess is that it's the same mmc host that causes the hang each time.
I suggest you print the name of the host too, to verify that. Something
along the lines of the below.

printk("%s: %s 5\n", mmc_hostname(host), __func__);

> 	cancel_delayed_work_sync(&host->detect);
> 	printk("%s 6\n", __func__);

Ditto.

> }
>
> resulting log:
>
> [info] Will now halt.
> [ 282.780929] __mmc_stop_host 1
> [ 282.784276] __mmc_stop_host 2
> [ 282.787735] __mmc_stop_host 3
> [ 282.791030] __mmc_stop_host 4
> [ 282.794235] __mmc_stop_host 5
> [ 282.797369] __mmc_stop_host 6
> [ 282.800918] __mmc_stop_host 1
> [ 282.804269] __mmc_stop_host 5
> [ 282.807541] __mmc_stop_host 6
> [ 282.810715] __mmc_stop_host 1
> [ 282.813842] __mmc_stop_host 2
> [ 282.816984] __mmc_stop_host 3
> [ 282.820175] __mmc_stop_host 4
> [ 282.823302] __mmc_stop_host 5
> [ 282.826449] __mmc_stop_host 6
> [ 282.830941] __mmc_stop_host 1
> [ 282.834076] __mmc_stop_host 5
>
> --- here should be another __mmc_stop_host 6
> --- and reboot: Power down

Once you have confirmed that it's the same host that hangs, you could
try to disable that host through the DTS files (add status = "disabled"
in the device node, for example) and see if the shutdown then completes.

Kind regards
Uffe
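
For illustration, the DTS test suggested above could look roughly like
the fragment below. This is only a sketch: which host actually hangs is
not known yet, and &mmc4 is used here purely as an example because it is
the node quoted earlier in this thread. The label would have to be
replaced with whichever host the mmc_hostname() prints point to.

```dts
/*
 * Hypothetical test override for the board DTS.
 * &mmc4 is an assumption, taken only from the node quoted above;
 * use the label of the host that the debug prints identify.
 */
&mmc4 {
	status = "disabled";
};
```

With the node disabled, the host should not be probed, so
__mmc_stop_host() would be expected to run one time fewer at shutdown.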