Yinghai Lu <yinghai at kernel.org> writes:

> not sure if it is related:

I don't think it is.

> for crashing kernel, it could do early_memtest to check if some device
> are still do dma operation.

Devices doing DMA in general are not a problem in the kdump kernel,
because we are using an area of memory that has been reserved since
the beginning of time and no DMAs should be targeting it.  The
challenge is how to regain control of the IOMMU.

> When I use kexec to start second kernel, if enable the early_memtest
> in second kernel, it will find some pages RAM are BAD,
> and it will mark them and not use them. memtest=1 should be good enough.
> Fresh restart will not report there is any BAD ram in the same system.

I assume you are not talking about kdump here.  Ongoing DMA in the
case of kexec indicates that some device driver isn't shutting itself
down when its shutdown method is called.  Odds are it is a network
controller that doesn't stop DMA when it is brought down or, possibly,
a really weird disk driver.

If you are seeing this with the kdump kernel, this may indeed indicate
an IOMMU reinitialization problem.

Eric