On Thu, Nov 04, 2021 at 12:40:32PM -0700, Yi Fan wrote:
> Reply inline.
>
> On Thu, Nov 4, 2021 at 11:56 AM Greg KH <gregkh@xxxxxxxxxxxxxxxxxxx> wrote:
> >
> > On Thu, Nov 04, 2021 at 11:14:55AM -0700, Yi Fan wrote:
> > > Resend the email using plain text.
> > >
> > > I found some kernel performance regression issues that might be
> > > related w/ 4.14.y LTS commit.
> > >
> > > 4.14.y commit: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?h=v4.14.253&id=27d185697322f9547bfd381c71252ce0bc1c0ee4
> > >
> > > The issue is observed when "console=" is used as a kernel parameter to
> > > disable the kernel console.
> >
> > What exact "performance issue" are you seeing?
> >
> [YF] one kernel thread was randomly blocked for more than ~40
> milliseconds, causing a certain task to fail to process in time.
> [YF] the issue is highly random on a single device. But it might
> happen a few times per 24 hours on a certain percentage of devices.
> The overall percentage of devices that show the issue seems quite
> stable over a long period of time (somehow the magic number is ~40%.).
> [YF] local test on a pool of devices does not show any correlation w/
> any particular devices.
> [YF] local test after reverting the above single commit passes, no
> issue is observed.

And what type of device is this?

If you see this thread:
	https://lore.kernel.org/r/f19c18fd-20b3-b694-5448-7d899966a868@xxxxxxxxxxxx
it looks like chromeos devices have now disabled this change, and there
was a long discussion about possible issues and solutions.

Can you try the patch set referenced in that thread to see if that
resolves the issue for you or not?

Given that I have not seen any reports of this being an issue since over
a year ago, odds are it has been resolved already with some change that
we probably also need to backport to 4.14.y.  So any help in identifying
that change would be appreciated.

> > And what kernel version are you seeing it on?
> >
> [YF] it was first found on some products w/ kernel version 4.14.210.
> through bisection, we located the commit on 4.14.200.
>
> > > I browsed android common kernel logs and the upstream stable kernel
> > > tree, found some related changes.
> > >
> > > printk: handle blank console arguments passed in. (link:
> > > https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?h=v5.14.15&id=3cffa06aeef7ece30f6b5ac0ea51f264e8fea4d0)
> > > Revert "init/console: Use ttynull as a fallback when there is no
> > > console" (link:
> > > https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?h=v5.14.15&id=a91bd6223ecd46addc71ee6fcd432206d39365d2)
> > >
> > > It looks like upstream also noticed the regression introduced by the
> > > commit, and the workaround is to use "ttynull" to handle "console="
> > > case. But the "ttynull" was reverted due to some other reasons
> > > mentioned in the commit message.
> > >
> > > Any insight or recommendation will be appreciated.
> >
> > What problem exactly are you now seeing?  And does it also happen on
> > 5.15?
> >
> [YF] we do not perform any tests on 5.15 yet. so no idea about whether
> the issue happens on 5.15.

How about any other newer stable kernel version like 5.4.y or 5.10.y?

thanks,

greg k-h