> On Thu, Aug 27, 2015 at 7:11 AM, <ygardi@xxxxxxxxxxxxxx> wrote:
>>> On Tue, Aug 25, 2015 at 7:36 AM, <ygardi@xxxxxxxxxxxxxx> wrote:
>>>>> On Aug 21, 2015 3:10 PM, "Yaniv Gardi" <ygardi@xxxxxxxxxxxxxx> wrote:
>>>>>>
>>>>>> Add a write memory barrier to make sure descriptors prepared are
>>>>>> actually written to memory before ringing the doorbell. We have
>>>>>> also added the write memory barrier after ringing the doorbell
>>>>>> register so that controller sees the new request immediately.
>>>>>>
>>>>>> Signed-off-by: Yaniv Gardi <ygardi@xxxxxxxxxxxxxx>
>>>>>>
>>>>>> ---
>>>>>> drivers/scsi/ufs/ufshcd.c | 6 ++++++
>>>>>> 1 file changed, 6 insertions(+)
>>>>>>
>>>>>> diff --git a/drivers/scsi/ufs/ufshcd.c b/drivers/scsi/ufs/ufshcd.c
>>>>>> index fef0660..876148b 100644
>>>>>> --- a/drivers/scsi/ufs/ufshcd.c
>>>>>> +++ b/drivers/scsi/ufs/ufshcd.c
>>>>>> @@ -833,6 +833,8 @@ void ufshcd_send_command(struct ufs_hba *hba, unsigned int task_tag)
>>>>>>  	ufshcd_clk_scaling_start_busy(hba);
>>>>>>  	__set_bit(task_tag, &hba->outstanding_reqs);
>>>>>>  	ufshcd_writel(hba, 1 << task_tag, REG_UTP_TRANSFER_REQ_DOOR_BELL);
>>>>>> +	/* Make sure that doorbell is committed immediately */
>>>>>> +	wmb();
>>>>>
>>>>> Is this really necessary? Is there a measurable difference?
>>>>
>>>> I'm not sure if there is a measurable difference, but as the Door-Bell
>>>> register is the one that is actually responsible for the HW execution
>>>> of the requests, it's recommended that its value be written to memory
>>>> immediately.
>>>
>>> A barrier doesn't guarantee speed, only ordering. Unless you can
>>> measure the difference, you should not have it.
>>
>> Rob,
>> let me give an example:
>> context#1 updates the outstanding_reqs variable and does write(DOOR_BELL)
>> context#2, upon the interrupt for a request completion, does the following:
>>     report completion on each one of the bits in:
>>     outstanding_reqs ^ read(DOOR_BELL);
>>
>> 0. let's assume DOOR_BELL = 0x1 (which means 1 active request, in
>>    slot 0)
>> 1. context#1: update DOOR_BELL to be 0x3 (2 active requests: in
>>    slots 0 and 1)
>> 2. the new value 0x3 is still not written to the DR, so DOOR_BELL is
>>    still 0x1, but outstanding_reqs is already updated = 0x3
>> 3. the request in slot 0 just completed, and the interrupt happens, so
>>    DOOR_BELL is now 0 (request in slot 0 completed)
>> 4. context#2: outstanding_reqs ^ read(DOOR_BELL) = 0x3 ^ 0x0 = 0x3 =>
>>    wrong conclusion, since the request in slot 1 never completed, and
>>    actually never started.
>
> Barriers alone will never solve this problem. They may narrow the
> window possibly, but the problem is still there. What you have to have
> is a spinlock around all accesses to both outstanding_reqs and
> doorbell register. And guess what, spinlocks have appropriate barriers
> to ensure visibility of what they protect. Or perhaps the h/w provides
> another way to signal what slots have completed. Using the same
> register for doorbell and completion status is not ideal.
>

Can I assume that spin_lock_irqsave() and spin_unlock_irqrestore() both
provide barriers? I couldn't find the barrier instruction when following
the call chain...

> Rob
> --
> To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
> the body of a message to majordomo@xxxxxxxxxxxxxxx
> More majordomo info at http://vger.kernel.org/majordomo-info.html
>

--
To unsubscribe from this list: send the line "unsubscribe linux-arm-msm" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
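
For illustration only, here is a minimal userspace sketch of the scheme
discussed in this thread. It is not the ufshcd code: struct fake_hba,
fake_lock(), fake_unlock(), fake_send_command(), fake_hw_complete(),
fake_complete_requests() and racy_completed_mask() are all hypothetical
names, the "doorbell" is a plain variable rather than an MMIO register,
and a C11 atomic_flag stands in for the kernel spinlock. It shows two
things: the XOR arithmetic from steps 0-4 above, and a lock whose
acquire/release ordering is part of the lock operations themselves,
which may be why no separate barrier call shows up in a lock call chain
(the kernel's memory-barriers documentation describes lock acquisition
as an ACQUIRE operation and lock release as a RELEASE operation).

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical model of the host controller state; not the real ufs_hba. */
struct fake_hba {
	atomic_flag lock;          /* stands in for the driver's spinlock */
	uint32_t outstanding_reqs; /* software view of issued slots */
	uint32_t doorbell;         /* models REG_UTP_TRANSFER_REQ_DOOR_BELL */
};

#define FAKE_HBA_INIT { ATOMIC_FLAG_INIT, 0, 0 }

/* Toy spinlock: the acquire ordering is carried by the lock operation. */
static void fake_lock(struct fake_hba *hba)
{
	while (atomic_flag_test_and_set_explicit(&hba->lock,
						 memory_order_acquire))
		; /* spin until the flag was previously clear */
}

/* Toy unlock: the release ordering is carried by the unlock operation. */
static void fake_unlock(struct fake_hba *hba)
{
	atomic_flag_clear_explicit(&hba->lock, memory_order_release);
}

/* Issue path: both updates happen inside one critical section. */
static void fake_send_command(struct fake_hba *hba, unsigned int tag)
{
	assert(tag < 32);
	fake_lock(hba);
	hba->outstanding_reqs |= UINT32_C(1) << tag;
	hba->doorbell |= UINT32_C(1) << tag;
	fake_unlock(hba);
}

/*
 * Models the hardware clearing a doorbell bit when a slot completes.
 * Hardware takes no lock; this model is only driven from one thread.
 */
static void fake_hw_complete(struct fake_hba *hba, unsigned int tag)
{
	assert(tag < 32);
	hba->doorbell &= ~(UINT32_C(1) << tag);
}

/* Completion path: returns the mask of slots reported as completed. */
static uint32_t fake_complete_requests(struct fake_hba *hba)
{
	uint32_t done;

	fake_lock(hba);
	done = hba->outstanding_reqs ^ hba->doorbell;
	hba->outstanding_reqs &= ~done;
	fake_unlock(hba);
	return done;
}

/* The unlocked interleaving from steps 0-4, written as plain arithmetic. */
static uint32_t racy_completed_mask(void)
{
	uint32_t outstanding_reqs = 0x3; /* step 2: already updated */
	uint32_t doorbell = 0x1;         /* step 2: 0x3 has not landed yet */

	doorbell &= ~UINT32_C(0x1);         /* step 3: slot 0 completes */
	return outstanding_reqs ^ doorbell; /* step 4: 0x3 ^ 0x0 */
}
```

In this model, issuing slots 0 and 1 with fake_send_command(), completing
slot 0 with fake_hw_complete() and then calling fake_complete_requests()
reports only slot 0, whereas racy_completed_mask() reproduces the wrong
0x3 result from step 4. Whether a lock is also sufficient to order a real
MMIO doorbell write on a given architecture is a separate question that
this sketch does not address.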