On Mon, Nov 08, 2021 at 11:17:07AM -0800, Yi Fan wrote:
> On Mon, Nov 8, 2021 at 12:00 AM Greg KH <gregkh@xxxxxxxxxxxxxxxxxxx> wrote:
> >
> > On Thu, Nov 04, 2021 at 12:40:32PM -0700, Yi Fan wrote:
> > > Reply inline.
> > >
> > > On Thu, Nov 4, 2021 at 11:56 AM Greg KH <gregkh@xxxxxxxxxxxxxxxxxxx> wrote:
> > > >
> > > > On Thu, Nov 04, 2021 at 11:14:55AM -0700, Yi Fan wrote:
> > > > > Resend the email using plain text.
> > > > >
> > > > > I found some kernel performance regression issues that might be
> > > > > related w/ 4.14.y LTS commit.
> > > > >
> > > > > 4.14.y commit: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?h=v4.14.253&id=27d185697322f9547bfd381c71252ce0bc1c0ee4
> > > > >
> > > > > The issue is observed when "console=" is used as a kernel parameter to
> > > > > disable the kernel console.
> > > >
> > > > What exact "performance issue" are you seeing?
> > > >
> > > [YF] one kernel thread was randomly blocked for more than ~40
> > > milliseconds, causing a certain task to fail to process in time.
> > > [YF] the issue is highly random on a single device. But it might
> > > happen a few times per 24 hours on a certain percentage of devices.
> > > The overall percentage of devices that show the issue seems quite
> > > stable over a long period of time (somehow the magic number is ~40%.).
> > > [YF] local test on a pool of devices does not show any correlation w/
> > > any particular devices.
> > > [YF] local test after reverting the above single commit passes, no
> > > issue is observed.
> >
> > And what type of device is this?
> [YF] it happens on multiple devices on the 4.14.y kernel. (sorry
> cannot disclose the device type here.)

That's not helpful :(  Can you say "server" or "tiny device you hold in
your hand"?  How about architecture type?

> > If you see this thread:
> > https://lore.kernel.org/r/f19c18fd-20b3-b694-5448-7d899966a868@xxxxxxxxxxxx
> > it looks like chromeos devices have now disabled this change, and there
> > was a long discussion about possible issues and solutions.
> >
> > Can you try the patch set referenced in that thread to see if that
> > resolves the issue for you or not? Given that I have not seen any
> > reports of this being an issue since over a year ago, odds are it has
> > been resolved already with some change that we probably also need to
> > backport to 4.14.y.
> >
> > So any help in identifying that change would be appreciated.
> >
>
> [YF] thanks for the context. I did not find a clear patch that seems
> to solve this issue yet.
> [YF] for the time being, reverting the offending commit seems the
> safest solution for the 4.14.y.

What about for the 4.19.y kernel tree?  Why is this limited to just
4.14.y?

Can you send a patch that reverts this from 4.14 that explains why it
should be removed?

thanks,

greg k-h