On Tue, Jan 16, 2018 at 4:17 PM, Vinson Lee <vlee@xxxxxxxxxxxxxxx> wrote:
> On Wed, Jan 10, 2018 at 4:52 PM, Vinson Lee <vlee@xxxxxxxxxxxxxxx> wrote:
>> On Fri, Jan 5, 2018 at 8:32 AM, Bart Van Assche <Bart.VanAssche@xxxxxxx> wrote:
>>> On Thu, 2018-01-04 at 14:32 -0800, Vinson Lee wrote:
>>>> HP ProLiant DL360p Gen8 with Smart Array P420i boots to the login
>>>> prompt and hangs with Linux 4.13 or later. I cannot log in on console
>>>> or SSH into the machine. Linux 4.12 and older boot fine.
>>>>
>>>> I see these messages on the console.
>>>>
>>>> [ 242.843206] INFO: task scsi_eh_2:465 blocked for more than 120 seconds.
>>>> [ 242.877835] Not tainted 4.15.0-041500rc6-generic #201712312330
>>>
>>> It seems like something got stuck in the block layer. The traditional way to
>>> debug this is to analyze the information that is available under
>>> /sys/kernel/debug/block. However, since login is not possible we can't use
>>> that approach. Would it be possible for you to check whether this has been
>>> resolved in kernel v4.15-rc6, and if not, bisect this?
>>>
>>> Thanks,
>>>
>>> Bart.
>>
>> Hi.
>>
>> The machine still hangs with Linux 4.15-rc6.
>>
>> I did a bisect. The hang is introduced with Linux 4.13-rc1 commit
>> c5cb83bb337c25caae995d992d1cdf9b317f83de "genirq/cpuhotplug: Handle
>> managed IRQs on CPU hotplug".
>>
>> There is a startup script that disables hyperthreading by offlining
>> sibling CPUs.
>>
>> for CPU in $(cut -s -d, -f2 $SYS_PATH/cpu*/topology/thread_siblings_list | sort -un); do
>>     echo 0 > /sys/devices/system/cpu/cpu$CPU/online
>> done
>>
>> If the above script is not run, the machine does not hang with Linux 4.13.
>>
>> Cheers,
>> Vinson
>
> Hi.
>
> HP ProLiant DL360p Gen8 still hangs with Linux 4.15-rc8.
>
> I see machine hangs now too with another machine with Microsemi
> Adaptec RAID 71605 and aacraid driver on both Linux 4.13 and Linux
> 4.15-rc8.
>
> Cheers,
> Vinson

Hi.

Offlining CPUs still triggers hangs on Linux 4.16-rc2.

Is there more debugging info I can provide?

Cheers,
Vinson
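
For anyone trying to reproduce this, here is a minimal sketch of the offlining step from the startup script quoted above, written as shell functions so it can be pointed at a scratch directory for a dry run. The default value of SYS_PATH is an assumption (the quoted script does not show how it is set), and the function names sibling_cpus and offline_siblings are hypothetical, chosen only for illustration.

```shell
#!/bin/sh
# Sketch only: mirrors the quoted startup loop, not the exact script in use.
# ASSUMPTION: SYS_PATH points at the CPU sysfs directory; the original
# script does not show its value.
SYS_PATH="${SYS_PATH:-/sys/devices/system/cpu}"

# Print the second entry of every comma-separated thread_siblings_list,
# numerically sorted and deduplicated. Lines without a comma (no sibling
# listed in that form) are skipped by "cut -s".
sibling_cpus() {
    cut -s -d, -f2 "$SYS_PATH"/cpu*/topology/thread_siblings_list 2>/dev/null | sort -un
}

# Offline each sibling CPU by writing 0 to its "online" file.
offline_siblings() {
    for CPU in $(sibling_cpus); do
        echo "offlining cpu$CPU"
        echo 0 > "$SYS_PATH/cpu$CPU/online"
    done
}
```

Calling sibling_cpus on its own lists the CPUs that would be offlined without changing anything; calling offline_siblings as root performs the step that triggers the hang on the affected kernels.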