[PATCH v2 3/5] mmc: dw_mmc: Add locking to the CTO timer

shawn.lin@xxxxxxxxxxxxxx (Shawn Lin) · Tue, 17 Oct 2017 08:54:05 +0800

Hi Doug

On 2017/10/13 12:20, Doug Anderson wrote:
> Shawn,
> 
> On Thu, Oct 12, 2017 at 6:32 PM, Shawn Lin <shawn.lin at rock-chips.com> wrote:
>>
>> On 2017/10/13 4:11, Douglas Anderson wrote:
>>>
>>> This attempts to instill a bit of paranoia to the code dealing with
>>> the CTO timer.  It's believed that this will make the CTO timer more
>>> robust in the case that we're having very long interrupt latencies.
>>>
>>
>> Ack. It could help fix some problems observed.
>>
>>
>>> Note that I originally thought that perhaps this patch was being
>>> overly paranoid and wasn't really needed, but then while I was running
>>> mmc_test on an rk3399 board I saw one instance of the message:
>>>     dwmmc_rockchip fe320000.dwmmc: Unexpected interrupt latency
>>>
>>> I had debug prints in the CTO timer code and I found that it was
>>> running CMD 13 at the time.
>>>
>>> ...so even though this patch seems like it might be overly paranoid,
>>> maybe it really isn't?
>>>
>>> Presumably the bad interrupt latency experienced was due to the fact
>>> that I had serial console enabled as serial console is typically where
>>> I place blame when I see absurdly large interrupt latencies.  In this
>>> particular case there was an (unrelated) printout to the serial
>>> console just before I saw the "Unexpected interrupt latency" printout.
>>>
>>> ...and actually, I managed to even reproduce the problems by running
>>> "iw mlan0 scan > /dev/null" while mmc_test was running.  That not only
>>> does a bunch of PCIe traffic but it also (on my system) outputs some
>>> SELinux log spam.
>>>> Fixes: 03de19212ea3 ("mmc: dw_mmc: introduce timer for broken command
>>
>> transfer over scheme")
>>>
>>> Tested-by: Emil Renner Berthing <kernel at esmil.dk>
>>> Signed-off-by: Douglas Anderson <dianders at chromium.org>
>>> ---
>>>
>>> Changes in v2:
>>> - Removed extra "int i"
>>>
>>>    drivers/mmc/host/dw_mmc.c | 91
>>> +++++++++++++++++++++++++++++++++++++++++------
>>>    1 file changed, 81 insertions(+), 10 deletions(-)
>>>
>>> diff --git a/drivers/mmc/host/dw_mmc.c b/drivers/mmc/host/dw_mmc.c
>>> index 16516c528a88..50148991f30e 100644
>>> --- a/drivers/mmc/host/dw_mmc.c
>>> +++ b/drivers/mmc/host/dw_mmc.c
>>> @@ -403,6 +403,7 @@ static inline void dw_mci_set_cto(struct dw_mci *host)
>>>          unsigned int cto_clks;
>>>          unsigned int cto_div;
>>>          unsigned int cto_ms;
>>> +       unsigned long irqflags;
>>>          cto_clks = mci_readl(host, TMOUT) & 0xff;
>>>          cto_div = (mci_readl(host, CLKDIV) & 0xff) * 2;
>>> @@ -413,8 +414,24 @@ static inline void dw_mci_set_cto(struct dw_mci
>>> *host)
>>>          /* add a bit spare time */
>>>          cto_ms += 10;
>>>    -     mod_timer(&host->cto_timer,
>>> -                 jiffies + msecs_to_jiffies(cto_ms) + 1);
>>> +       /*
>>> +        * The durations we're working with are fairly short so we have to
>>> be
>>> +        * extra careful about synchronization here.  Specifically in
>>> hardware a
>>> +        * command timeout is _at most_ 5.1 ms, so that means we expect an
>>> +        * interrupt (either command done or timeout) to come rather
>>> quickly
>>> +        * after the mci_writel.  ...but just in case we have a long
>>> interrupt
>>> +        * latency let's add a bit of paranoia.
>>> +        *
>>> +        * In general we'll assume that at least an interrupt will be
>>> asserted
>>> +        * in hardware by the time the cto_timer runs.  ...and if it
>>> hasn't
>>> +        * been asserted in hardware by that time then we'll assume it'll
>>> never
>>> +        * come.
>>> +        */
>>> +       spin_lock_irqsave(&host->irq_lock, irqflags);
>>> +       if (!test_bit(EVENT_CMD_COMPLETE, &host->pending_events))
>>> +               mod_timer(&host->cto_timer,
>>> +                       jiffies + msecs_to_jiffies(cto_ms) + 1);
>>> +       spin_unlock_irqrestore(&host->irq_lock, irqflags);
>>
>>
>> IIUC, this change is beacuse you move
>> mci_writel(host, CMD, cmd_flags | SDMMC_CMD_START) before
>> setting up the timer, so there is a timing gap that the cmd_done
>> already comes and handled by dw_mci_interrupt->dw_mci_cmd_interrupt.
>> At this point, we don't need the cto timer at all.
> 
> As per below, if I don't move the mci_writel() before setting up the
> timer then there's still a race.  ...and actually that race was harder
> for me to write code for, but I invite you to try to see if it's
> somehow cleaner.
> 
> 
>>>    }
>>>      static void dw_mci_start_command(struct dw_mci *host,
>>> @@ -429,11 +446,11 @@ static void dw_mci_start_command(struct dw_mci
>>> *host,
>>>          wmb(); /* drain writebuffer */
>>>          dw_mci_wait_while_busy(host, cmd_flags);
>>>    +     mci_writel(host, CMD, cmd_flags | SDMMC_CMD_START);
>>> +
>>>          /* response expected command only */
>>>          if (cmd_flags & SDMMC_CMD_RESP_EXP)
>>>                  dw_mci_set_cto(host);
>>> -
>>> -       mci_writel(host, CMD, cmd_flags | SDMMC_CMD_START);
>>
>>
>>
>> But why? If we still keep the original logic, it's always correct
>> that cmd_done comes after setting up the cto timer. So could you
>> eleborate a bit more to help me understand the real intention here?
> 
> No matter which order you put things, there's a race one way or the
> other.  You need a lock.
> 
> Let's think about the old code you wrote.  You did this:
> 1. Start the CTO timer.
> 2. Start the command.
> 
> Now if you (somehow) take 20 ms to handle the interrupt, then this happens:
> 
> 1. Start the CTO timer.
> 2. Start the command.
> 3. Interrupt is pending, but interrupt handler doesn't run yet.
> 4. CTO timer fires and enqueues CTO timeout.
> 5. Interrupt finally fires.

OK, got it.

> 
> 
> Now normally things are pretty bad if you've got an interrupt latency
> of 20 ms.  ...and, in fact, I originally wrote up a commit that simply
> explained why the race didn't matter and was thinking of posting that
> instead of this one.  I wrote up:
> 
>       * Start a timer to detect missing cmd timeout if we expect a response.
>       *
>       * Note that we need to be a little careful about race conditions here
>       * since our timer will be racing with the actual hardware interrupt
>       * and things would get confused if both of them happened.
>       *
>       * We end up avoiding races here mostly because of our 10 ms "spare
>       * time" buffer above.  That's probably reliable enough because:
>       * - There's "guaranteed" "very little" time between setting the timer
>       *   and starting the command.  We're holding a spinlock (host->lock)
>       *   in all calls to this function so we won't get preempted.  Possibly
>       *   we could get interrupts still, but that shouldn't add up to
>       *   anything like the 10 ms spare time.
>       * - We expect that when the actual interrupt fires that our interrupt
>       *   routine should get called "relatively quickly" (compared to the
>       *   10 ms buffer) and will be able to cancel this timer.
> 
> ...but then I ran a whole bunch of tests and I found that, as far as I
> could tell, we actually _were_ getting a super long interrupt latency.
> Specifically I saw the printout "Unexpected interrupt latency" in my
> patch.  In order to see that printout in my patch (which even starts
> the command _before_ the CTO timer), the only explanation is bad
> interrupt latency, right?  Also: based on my past experience I believe
> it is possible to get upwards of 100 ms interrupt latency if you've
> got serial console enabled.  printk, especially printk from an
> interrupt context, can do some funny things.
> 

Right! It makes sense to me now.

> 
> ...but this stuff is always hard to get right, so if I messed up the
> above please let me know!  I tried to think of all of the cases so it
> would work no matter if delays happened in any random place but
> concurrency is hard.

Yes, it looks hard to get concurrency right. I have a comment for your
DRTO case(patch 5). Let's do some brainstorm there.

> 
> 
> -Doug
> 
> 
>