Re: [PATCH] mmc: core: Check for timeout before checking mmc device state

Ulf Hansson <ulf.hansson@xxxxxxxxxx> · Mon, 18 May 2015 11:39:07 +0200

On 11 May 2015 at 23:55, Matt Bennett <Matt.Bennett@xxxxxxxxxxxxxxxxxxx> wrote:
> Hi Uffe,
>
> We are using the Octeon mmc host driver supplied from the Cavium SDK (I
> don't believe it is released to upstream Linux). We have both an mmc
> flash memory device and an SD card reader attached to the mmc bus.

I assume that's as two separate instances of an mmc host?

BTW, I don't think these are attached to the mmc bus. It's probably
the platform bus or another subsystem specific bus, right?

>
> In the host driver code there is a mutex which must be obtained before
> the driver can access the mmc bus. This stops the mmc flash and SD card
> reader being written to in parallel (otherwise the signal on the bus
> will be corrupted). It doesn't prevent parallel requests, it's just that
> the second request will block on this mutex until the first request has
> been completed.

I think I will stop here. This becomes too much of a hypothetical issue.
Considering the above statement, I wonder if this couldn't be handled
in the mmc host driver instead.

Anyway, before we continue to discuss $subject patch, I think we should
first worry about getting the Octeon driver upstreamed.

>
> In our specific case the following is occurring:
>
> 1. mmc_blk_part_switch() is called to switch partition on the mmc flash
> device. This calls mmc_switch with a timeout_ms value of
> 'card->ext_csd.part_time' which is 10ms in this case.
>
> 2. In __mmc_switch() the command to switch partition is sent to the mmc
> flash.
>
> 3. Between the command being sent to the flash and then the host polling
> the status of the device (no busy detection hardware) a read or write
> operation is begun on the SD card (in our case a Specification Version
> 2.00 card). In my testing I have seen the bus be blocked for up to 800ms
> while completing this operation.
>
> 4. The host polls the device for the status but blocks the first time on
> the mutex for ~800ms while the SD card operation completes.
>
> 5. Finally the host gains the mutex and gets the status from the flash
> device.
>
> In my testing at this stage the status was never still
> 'R1_STATE_PRG' (it has been 800ms since the command was sent after all).
> However the timeout check fails because it has been 800ms compared to
> the original timeout_ms value passed in of 10ms. Therefore even though
> the device has left the 'R1_STATE_PRG' state we return early with an
> error that eventually gets printed to the log. This does not affect any
> functionality as the host will simply try to switch the partition again
> and if the bus does not block again then there are no issues.
>
> By putting the timeout check before we read the status of the device
> (and potentially block for longer than the timeout) we don't return an
> early error if the device has indeed left the programming state. We
> might as well continue through the function as after we return the error
> the host is just going to issue the command again.
>

Thanks for elaborating!
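
To check that I read the scenario correctly, here is a minimal user-space
sketch of the two orderings. It is not the __mmc_switch() code and not the
patch itself: struct sim, sim_card_busy(), SIM_ETIMEDOUT and both
switch_check_*() helpers are hypothetical names, the clock is simulated, and
"sample the timeout before the status read, fail only if the card is still
busy" is my interpretation of the description above.

```c
#include <assert.h>
#include <stdbool.h>

#define SIM_ETIMEDOUT (-1)	/* hypothetical stand-in for -ETIMEDOUT */

/* Hypothetical simulation state, all times in milliseconds. */
struct sim {
	unsigned long now_ms;		/* simulated clock */
	unsigned long bus_block_ms;	/* how long the first status poll waits on the host mutex */
	unsigned long prg_done_ms;	/* time at which the card leaves the programming state */
};

/*
 * Simulated status poll. The first call blocks for bus_block_ms (the other
 * card owns the bus), later calls return immediately.
 * Returns true while the card is still in the programming state.
 */
static bool sim_card_busy(struct sim *s)
{
	s->now_ms += s->bus_block_ms;
	s->bus_block_ms = 0;
	return s->now_ms < s->prg_done_ms;
}

/* Ordering described as the current behaviour: read status, then check the timeout. */
static int switch_check_after(struct sim *s, unsigned long timeout_ms)
{
	unsigned long deadline = s->now_ms + timeout_ms;
	bool busy;

	assert(timeout_ms > 0);
	do {
		busy = sim_card_busy(s);
		if (s->now_ms > deadline)
			return SIM_ETIMEDOUT;	/* fires even if the card is idle by now */
		if (busy)
			s->now_ms += 1;		/* simulated delay between polls */
	} while (busy);
	return 0;
}

/* Assumed reading of the proposal: sample the timeout before the status read. */
static int switch_check_before(struct sim *s, unsigned long timeout_ms)
{
	unsigned long deadline = s->now_ms + timeout_ms;
	bool busy, expired;

	assert(timeout_ms > 0);
	do {
		expired = s->now_ms > deadline;	/* taken before the poll can block */
		busy = sim_card_busy(s);
		if (expired && busy)
			return SIM_ETIMEDOUT;	/* only when the card is really stuck */
		if (busy)
			s->now_ms += 1;		/* simulated delay between polls */
	} while (busy);
	return 0;
}
```

With bus_block_ms = 800, prg_done_ms = 5 and timeout_ms = 10, the first
helper returns SIM_ETIMEDOUT although the card is idle, while the second
returns 0. A card that never leaves the programming state still times out
with both orderings.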

> Please excuse me if I have missed something fundamental.
>
> Thanks,
> Matt
>

Kind regards
Uffe