On Thu, Mar 26, 2020 at 9:43 PM Marc Zyngier <maz@xxxxxxxxxx> wrote: > > On 2020-03-26 15:35, Prabhakar Kushwaha wrote: > > On Thu, Mar 26, 2020 at 7:49 PM Robin Murphy <robin.murphy@xxxxxxx> > > wrote: > >> > >> On 2020-03-26 1:36 pm, Prabhakar Kushwaha wrote: > >> > On Mon, Mar 23, 2020 at 10:28 PM Robin Murphy <robin.murphy@xxxxxxx> wrote: > >> >> > >> >> On 2020-03-23 3:21 pm, Prabhakar Kushwaha wrote: > >> >>> Hi All, > >> >>> > >> >>> I am facing issue on Marvell's ARM64 Thunder X2 with kdump kernel. > >> >>> Here network card is continuously giving following AER error > >> >>> [ 100.839168] igb 0000:09:00.1: AER: aer_status: 0x00004000, > >> >>> aer_mask: 0x00000000 > >> >>> [ 100.846463] igb 0000:09:00.1: AER: [14] CmpltTO (First) > >> >>> [ 100.861491] igb 0000:09:00.1: AER: aer_layer=Transaction Layer, > >> >>> aer_agent=Requester ID > >> >>> [ 100.869400] igb 0000:09:00.1: AER: aer_uncor_severity: 0x00062011 > >> >>> > >> >>> This error is not 100% reproducible. It happens 1 out of 4 try. > >> >>> > >> >>> This error goes away in following two scenarios > >> >>> A) Set iommu in bypass mode via bootargs iommu.passthrough=1 > >> >>> B) Wait for ~100ms in arm_smmu_device_reset of drivers/iommu/arm-smmu-v3.c > >> >>> if (reg & CR0_SMMUEN) { > >> >>> dev_warn(smmu->dev, "SMMU currently enabled! Resetting...\n"); > >> >>> WARN_ON(is_kdump_kernel() && !disable_bypass); > >> >>> mdelay(100); <-- Added delay > >> >>> arm_smmu_update_gbpa(smmu, GBPA_ABORT, 0); > >> >>> } > >> >>> > >> >>> From A), it is clear that it is related to IOMMU > >> >>> From B), looks like during boot of kdump kernel, network card is still > >> >>> active and it has sent some request over PCIe. > >> >>> as GPBA_ABORT bit is set, no response/completion coming to PCIe > >> >>> controller hence "CmpltTO" error. > >> >>> > >> >>> Ideally before setting GPBA_ABORT bit, there should be some check for > >> >>> active transaction. if it is not possible, a wait should be done to > >> >>> assure that no more pending transaction left. > >> >> > >> >> In general there is no way to check for active transactions, and even if > >> >> there were, waiting for them to finish could mean waiting forever (if, > >> >> say, a device is continuously streaming to/from a ring buffer). > >> >> > >> >>> why any such delay has not been considered? > >> >> > >> >> The main aim here is to block any DMA left over from the crashed kernel > >> >> as quickly as possible, to minimise any further potential corruption of > >> >> memory (consider if a device was left writing to an IOMMU virtual > >> >> address that happened to have the same value as some physical address in > >> >> the crash kernel's reserved memory). The fact that an arbitrary delay > >> >> happens to give a 'nicer' result in one particular situation on one > >> >> particular platform is neither here nor there in general. > >> >> > >> > > >> > I agree. > >> > But we are depending upon kdump boot time and expecting devices to > >> > reach to idle state before setting GBPA_ABORT bit. > >> > >> So (ideally) stop depending on that, because like I said it's fragile > >> and doesn't generalise. > >> > >> > adding a delay will be fair and make it independent of kdump boot time. > >> > >> And what delay value is "fair" and appropriate for any device on any > >> system in any circumstance? > >> > > > > it is tough question. 1sec can be thought of. > > > >> >> Besides, this is *crash* kernel, so yeah, expect errors - something's > >> >> already gone badly wrong to get us here, and everything from then on is > >> >> merely a best-effort attempt to salvage what we can. Does it even make > >> >> sense to have AER enabled at this point? > >> >> > >> > > >> > i tried by disabling AER in kdump kernel. but it did not helped as > >> > network device become out of sync with respect to tx unit causing it > >> > to be hanged and it never recovered from there. Same can happen with > >> > other devices like SATA etc > >> > >> Any devices that the kdump kernel wants to use need to be fully reset > >> to > >> get them into a sane state anyway, don't they? I mean, what if the > >> crash > >> was *caused* by once of those devices going wrong in the first place? > >> Any devices that kdump *doesn't* care about shouldn't matter, since > >> nothing should be unmasking their interrupts regardless of what state > >> they're in. > >> > >> Assume some descriptor or pagetable entry got corrupted that caused > >> your > >> network device to access an invalid physical address downstream of the > >> SMMU and get an abort from that *before* the kdump kernel starts - is > >> waiting an extra 100ms at any point after that going to help? > >> > > I agree with you. in above scenaro, where device if faulty or done > > something wrong, waiting even hours is waste. > > > > I was just going through other iommu drivers as this problem is > > generic one and i found following patch > > > > commit 091d42e43d21b6ca7ec39bf5f9e17bc0bd8d4312 > > Author: Joerg Roedel <jroedel@xxxxxxx> > > Date: Fri Jun 12 11:56:10 2015 +0200 > > iommu/vt-d: Copy translation tables from old kernel > > > > If we are in a kdump kernel and find translation enabled in > > the iommu, try to copy the translation tables from the old > > kernel to preserve the mappings until the device driver > > takes over. > > > > This supports old and the extended root-entry and > > context-table formats. > > > > Tested-by: ZhenHua Li <zhen-hual@xxxxxx> > > Tested-by: Baoquan He <bhe@xxxxxxxxxx> > > Signed-off-by: Joerg Roedel <jroedel@xxxxxxx> > > > > I believe, similar kind of solution is required for SMMU also. > > There's a much more general problem: how to preserve pre-boot DMA > configurations because they are important to the new kernel (for > whatever reason). > > And in a number of cases, it makes perfect sense: framebuffer > scanning out boot animations, ongoing DMA for other *cough* agents > *cough* in the system... > > But I really don't like the idea of preserving the page tables across > a kdump kernel because for all we know, these page tables could be > horribly corrupted and mostly only make sense in the context of > the driver that created them. If I am correct, similar approach is used in GIC-ITS for LPI tables. Probability of corruption is still there. > Oh wait, this driver has long died, > along with the rest of the original kernel. > :( if this is the case than chance of foolproof and generic solution is very less. At least a delay should be considered before setting SMMU_ABORT bit for giving a chance of DMA getting completed. --pk _______________________________________________ kexec mailing list kexec@xxxxxxxxxxxxxxxxxxx http://lists.infradead.org/mailman/listinfo/kexec