Abdul Haleem <abdhalee@xxxxxxxxxxxxxxxxxx> writes:
> On Wed, 2017-09-20 at 21:42 +1000, Michael Ellerman wrote:
>> Abdul Haleem <abdhalee@xxxxxxxxxxxxxxxxxx> writes:
>>
>> > Hi,
>> >
>> > Dynamic CPU remove operation resulted in Kernel Panic on today's
>> > next-20170915 kernel.
>> >
>> > Machine Type: Power 7 PowerVM LPAR
>> > Kernel : 4.13.0-next-20170915
>> > config : attached
>> > test: DLPAR CPU remove
>> >
>> >
>> > dmesg logs:
>> > ----------
>> > cpu 37 (hwid 37) Ready to die...
>> > cpu 38 (hwid 38) Ready to die...
>> > cpu 39 (hwid 39)
>> > ******* RTAS CReady to die...
>> > ALL BUFFER CORRUPTION *******
>>
>> Cool. Does that come from RTAS itself? I have never seen that happen
>> before.
>
> Not sure, the var logs does not have any messages captured. This is
> first time we hit this type of issue.

Yeah it is from RTAS:

  # lsprop /proc/device-tree/rtas/linux,rtas-base
  /proc/device-tree/rtas/linux,rtas-base
                   1eca0000 (516554752)
  # lsprop /proc/device-tree/rtas/rtas-size
  /proc/device-tree/rtas/rtas-size
                   01360000 (20316160)
  # dd if=/dev/mem bs=4096 skip=126112 count=4960 of=rtas.bin
  # strings rtas.bin | grep "RTAS CALL BUFFER"
  ******* RTAS CALL BUFFER CORRUPTION *******

So we were doing an RTAS call and RTAS itself detected that the call
buffer was corrupted. I'm not sure how it detects that, but something is
definitely screwed up.

>> Is this easily reproducible?
>
> I am unable to reproduce it again. I will keep an eye on our CI runs for
> few more runs.

OK thanks.

cheers
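
For anyone repeating the dump on another machine: the skip= and count=
values given to dd above are the two lsprop values divided by the 4096-byte
block size. A minimal sketch of that arithmetic follows. The variable names
are my own, and the hex values are simply copied from the output in this
mail; they will differ per system.

```sh
#!/bin/sh
# Values copied from the lsprop output in this thread (hex, in bytes).
# The variable names are illustrative only.
rtas_base=0x1eca0000   # linux,rtas-base
rtas_size=0x01360000   # rtas-size
block_size=4096        # matches bs=4096 in the dd command

# dd takes skip= and count= in units of bs, so divide the byte values by it.
skip=$(( $rtas_base / $block_size ))
count=$(( $rtas_size / $block_size ))

echo "skip=$skip count=$count"
```

With the values from this thread it prints skip=126112 count=4960, which
matches the dd invocation used here.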