On 9/13/21 12:55 AM, Kiwoong Kim wrote:
This patch is to raise recovery in some abnormal conditions using an vendor specific interrupt for some cases, such as a situation that some contexts of a pending request in the host isn't the same with those of its corresponding UPIUs if they should have been the same exactly. The representative case is shown like below. In the case, a broken UTRD entry, for internal coherent problem or whatever, that had smaller value of PRDT length than expected was transferred to the host. So, the host raised an interrupt of transfer complete even if device didn't finish its data transfer because the host sees a fetched version of UTRD to determine if data tranfer is over or not. Then the application level seemed to recogize this as a sort of corruption and this symptom led to boot failure.
How can a UTRD entry be broken? Does that perhaps indicate memory corruption at the host side? Working around host-side memory corruption in a driver seems wrong to me. I think the root cause of the memory corruption should be fixed.
+static irqreturn_t exynos_ufs_isr(struct ufs_hba *hba) +{ + struct exynos_ufs *ufs = ufshcd_get_variant(hba); + u32 status; + unsigned long flags; + + if (!hba->priv) return IRQ_HANDLED;
Please verify patches with checkpatch before posting these on the linux-scsi mailing list. The above if-statement does not follow the Linux kernel coding style.
+ if (status & RX_UPIU_HIT_ERROR) { + pr_err("%s: status: 0x%08x\n", __func__, status); + hba->force_reset = true; + hba->force_requeue = true; + scsi_schedule_eh(hba->host); + spin_unlock_irqrestore(hba->host->host_lock, flags); + return IRQ_HANDLED; + } + return IRQ_NONE; +}
So the above code unlocks the host_lock depending on whether or not status & RX_UPIU_HIT_ERROR is true? Yikes ... Additionally, in the above code I found the following pattern: unsigned long flags; [ ... ] spin_unlock_irqrestore(hba->host->host_lock, flags); Such code is ALWAYS wrong. The value of the 'flags' argument passed to spin_unlock_irqrestore() must come from spin_lock_irqsave(). Bart.