On Sat, Aug 8, 2020 at 5:09 PM Paul E. McKenney <paulmck@xxxxxxxxxx> wrote:
>
> On Sat, Aug 08, 2020 at 04:19:42PM -0500, William Tambe wrote:
> > On Sat, Aug 8, 2020 at 4:17 PM William Tambe <tambewilliam@xxxxxxxxx> wrote:
> > >
> > > On Sat, Aug 8, 2020 at 1:21 PM William Tambe <tambewilliam@xxxxxxxxx> wrote:
> > > >
> > > > I am having an issue in my kernel where delayed_put_task_struct() used
> > > > through call_rcu() by put_task_struct_rcu_user() never gets called.
> > >
> > > I am able to trace this issue to invoke_rcu_core() not getting called
> > > in __call_rcu_core() due to rcu_is_watching() always returning true.
>
> That in fact should be the common case. Normally, you would be invoking
> call_rcu() and thus __call_rcu_core() from a context that RCU is watching.
>
> But what happens after that in __call_rcu_core()?
>
> > > Any idea why I am seeing such an issue ?
>
> One way would be if every single one of your call_rcu() invocations was
> done with irqs disabled. And if the scheduling-clock interrupt was turned
> off. And if the CPU in question never received any other interrupts.
>
> As in all of those things have to be in effect in order to indefinitely
> postpone the call to delayed_put_task_struct(). In this case, v5.8's
> __call_rcu_core() would always exit via this path:
>
>	if (irqs_disabled_flags(flags) || cpu_is_offline(smp_processor_id()))
>		return;
>
> > Also, the issue is not happening when using highres=off .
>
> Might highres=off be forcing the scheduling-clock interrupt to be
> enabled?
>
> > > Any idea ?
>
> If you are running oldish kernels and the CPU in question is a nohz_full
> CPU, the scheduling-clock interrupt would be turned off. (In more recent
> kernel versions, RCU will force it back on if things are not progressing.)

I am running v5.8. I further observed that without highres=off, the
function tick_nohz_handler() is not getting called, hence
update_process_times() and rcu_sched_clock_irq() are not getting called.
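
The conditions discussed above (call_rcu() always invoked with irqs
disabled, no scheduling-clock interrupt, no other interrupts) can be
sketched as a toy userspace model. This is not kernel code and does not
reproduce __call_rcu_core(); struct model_cpu and every model_* function
are hypothetical names made up for illustration. It only restates the
claim that all three conditions must hold together for the callback to be
postponed indefinitely.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Toy per-CPU state. Hypothetical names; these are not kernel structures. */
struct model_cpu {
	bool irqs_off_in_call_rcu; /* every call_rcu() runs with irqs disabled */
	bool tick_delivered;       /* scheduling-clock interrupt reaches the CPU */
	bool other_irqs_delivered; /* any other interrupt reaches the CPU */
};

/*
 * Returns true if, in this model, a callback queued by call_rcu() can be
 * postponed indefinitely. Following the explanation in the thread, all
 * three conditions must hold at once.
 */
static bool model_callback_can_stall(const struct model_cpu *cpu)
{
	return cpu->irqs_off_in_call_rcu &&
	       !cpu->tick_delivered &&
	       !cpu->other_irqs_delivered;
}

/* Print one row of the truth table for a given configuration. */
static void model_report(const char *label, const struct model_cpu *cpu)
{
	printf("%-28s -> %s\n", label,
	       model_callback_can_stall(cpu) ? "callback may never run"
					     : "callback makes progress");
}

/* Walk through the cases mentioned in the thread and self-check them. */
static void model_demo(void)
{
	const struct model_cpu no_tick = { true, false, false };
	const struct model_cpu tick_ok = { true, true, false };
	const struct model_cpu irqs_on = { false, false, false };

	model_report("irqs off, no tick, no irqs", &no_tick);
	model_report("irqs off, tick delivered", &tick_ok);
	model_report("call_rcu() with irqs on", &irqs_on);

	assert(model_callback_can_stall(&no_tick));
	assert(!model_callback_can_stall(&tick_ok));
	assert(!model_callback_can_stall(&irqs_on));
}
```

Calling model_demo() from a main() prints the three cases. In this
model, a tick that never reaches rcu_sched_clock_irq() behaves like
tick_delivered == false, which matches the symptom described above.
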

How can I debug why tick_nohz_handler() is not getting called when
booting without highres=off?

The timer interrupt is implemented as follows:

void timer_intr (void) {
	arch_local_irq_disable();
	irq_enter();
	struct clock_event_device *e = per_cpu(clkevtdevs, smp_processor_id());
	e->event_handler(e);
	irq_exit();
	arch_local_irq_enable();
}

>
> To say more, I would need your exact kernel version (including any
> patches and any other out-of-tree source code) and your .config file.

I am using v5.8; I am currently unable to release the out-of-tree source.
The defconfig is as follows:

CONFIG_NO_HZ_IDLE=y
CONFIG_HIGH_RES_TIMERS=y
CONFIG_PREEMPT=y
CONFIG_IKCONFIG=y
CONFIG_IKCONFIG_PROC=y
CONFIG_KALLSYMS_ALL=y
CONFIG_USERFAULTFD=y
CONFIG_EMBEDDED=y
# CONFIG_SLUB_DEBUG is not set
CONFIG_SIMHDD=y
# CONFIG_MQ_IOSCHED_DEADLINE is not set
# CONFIG_MQ_IOSCHED_KYBER is not set
CONFIG_BINFMT_MISC=y
CONFIG_NET=y
CONFIG_PACKET=y
CONFIG_PACKET_DIAG=y
CONFIG_UNIX=y
CONFIG_UNIX_DIAG=y
CONFIG_INET=y
CONFIG_INET_UDP_DIAG=y
CONFIG_INET_RAW_DIAG=y
CONFIG_INET_DIAG_DESTROY=y
# CONFIG_IPV6 is not set
CONFIG_BRIDGE=y
CONFIG_NETLINK_DIAG=y
# CONFIG_WIRELESS is not set
# CONFIG_ETHTOOL_NETLINK is not set
CONFIG_DEVTMPFS=y
CONFIG_DEVTMPFS_MOUNT=y
CONFIG_BLK_DEV_LOOP=y
CONFIG_VT_HW_CONSOLE_BINDING=y
# CONFIG_LEGACY_PTYS is not set
# CONFIG_VGA_CONSOLE is not set
# CONFIG_VIRTIO_MENU is not set
# CONFIG_VHOST_MENU is not set
CONFIG_EXT4_FS=y
CONFIG_TMPFS=y
CONFIG_TMPFS_POSIX_ACL=y
# CONFIG_MISC_FILESYSTEMS is not set
CONFIG_NFS_FS=y
CONFIG_NFS_V3_ACL=y
CONFIG_NFS_V4=y
CONFIG_NFS_V4_1=y
CONFIG_DEBUG_INFO=y
CONFIG_GDB_SCRIPTS=y
CONFIG_DEBUG_KMEMLEAK=y
CONFIG_DEBUG_KMEMLEAK_DEFAULT_OFF=y
CONFIG_SCHED_STACK_END_CHECK=y
CONFIG_DEBUG_MEMORY_INIT=y
CONFIG_PANIC_TIMEOUT=1
CONFIG_SOFTLOCKUP_DETECTOR=y
CONFIG_WQ_WATCHDOG=y
# CONFIG_RCU_TRACE is not set
CONFIG_RCU_EQS_DEBUG=y
# CONFIG_RUNTIME_TESTING_MENU is not set
CONFIG_MEMTEST=y

>
>							Thanx, Paul
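
One thing that may be worth checking for the tick_nohz_handler()
question above is which function e->event_handler points to at the time
timer_intr() runs. In v5.8's kernel/time code, tick_nohz_handler() is
installed only for the low-resolution NOHZ mode; high-resolution mode
installs hrtimer_interrupt() instead and runs the tick from an hrtimer
(tick_sched_timer()). The sketch below is a userspace toy under that
assumption, not kernel code: struct model_clkevt, its counters, and all
model_* functions are hypothetical, and it simplifies by treating every
high-res interrupt as one expired tick.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical stand-in for struct clock_event_device; not the kernel type. */
struct model_clkevt {
	void (*event_handler)(struct model_clkevt *dev);
	int low_res_ticks;  /* times the low-res NOHZ handler ran */
	int hrtimer_irqs;   /* times the high-res interrupt handler ran */
	int emulated_ticks; /* ticks run from the high-res path */
};

/* Models the low-res NOHZ case: the event handler itself is the tick. */
static void model_tick_nohz_handler(struct model_clkevt *dev)
{
	dev->low_res_ticks++;
}

/*
 * Models the high-res case: the event handler expires hrtimers, and the
 * tick is just one of them. Simplification: every interrupt expires it.
 */
static void model_hrtimer_interrupt(struct model_clkevt *dev)
{
	dev->hrtimer_irqs++;
	dev->emulated_ticks++;
}

enum model_mode { MODEL_LOWRES_NOHZ, MODEL_HIGHRES };

/* Install the handler that corresponds to the selected mode. */
static void model_switch_mode(struct model_clkevt *dev, enum model_mode mode)
{
	dev->event_handler = (mode == MODEL_HIGHRES) ? model_hrtimer_interrupt
						     : model_tick_nohz_handler;
}

/* Same shape as timer_intr() in the message: call through the pointer. */
static void model_timer_intr(struct model_clkevt *dev)
{
	if (dev->event_handler)
		dev->event_handler(dev);
}

/* Print the counters so the two modes can be compared. */
static void model_print(const char *label, const struct model_clkevt *dev)
{
	printf("%s: low_res_ticks=%d hrtimer_irqs=%d emulated_ticks=%d\n",
	       label, dev->low_res_ticks, dev->hrtimer_irqs,
	       dev->emulated_ticks);
}
```

In this model, the low-res handler never runs while the device is in
high-res mode, yet ticks still happen through the other path. If the
same holds on the real system, the more useful question is whether
hrtimer_interrupt() and tick_sched_timer() are being reached. A one-time
printk of e->event_handler using the %ps format inside timer_intr() is
one way to see which handler is actually installed.
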