Re: [PATCH] mmc: sd: Meet alignment requirements for raw_ssr DMA

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



[...]

>> > diff --git a/drivers/mmc/core/sd.c b/drivers/mmc/core/sd.c
>> > index 73c762a..f6f40a1 100644
>> > --- a/drivers/mmc/core/sd.c
>> > +++ b/drivers/mmc/core/sd.c
>> > @@ -223,6 +223,7 @@ static int mmc_decode_scr(struct mmc_card *card)
>> >
>> >  static int mmc_read_ssr(struct mmc_card *card)
>> >  {
>> >
>> >         unsigned int au, es, et, eo;
>> >
>> > +       u32 *raw_ssr;
>> >
>> >         int i;
>> >
>> >         if (!(card->csd.cmdclass & CCC_APP_SPEC)) {
>> >
>> > @@ -231,14 +232,21 @@ static int mmc_read_ssr(struct mmc_card *card)
>> >
>> >                 return 0;
>> >
>> >         }
>> >
>> > -       if (mmc_app_sd_status(card, card->raw_ssr)) {
>>
>> So the struct mmc_card, which contains "u32 raw_ssr[16]", is already
>> allocated via using a kzalloc().
>> As kzalloc() returns a physically contiguous allocated memory, using
>> DMA for reading the data into the memory pointed to by card->raw_ssr
>> should be safe.
>>
>> Unless I am missing something, of course.
>
> Sadly you are. This is incorrect because the raw_ssr field of struct mmc_card
> may not begin or end at a cache-line boundary. On systems where mapping a
> buffer for DMA requires cache invalidation we may therefore lose data if it
> shares a cache line with some other data. In this case let's say we have

Yes, of course - you are right! Apologize for my ignorance!

> something like this:
>
> 0x1000: u32 raw_scr[2];
> 0x1008: u32 raw_ssr[16];
>
> Now let's say we have 16 byte cache lines and we try to map raw_ssr for DMA.
> If the architecture requires that we invalidate the memory being mapped in
> caches (in order to avoid any writebacks clobbering the DMA'd data) then we
> have a problem because our choice is either to:
>
> - Writeback the line that overlaps with raw_scr, then invalidate it along with
>   the rest of the cache lines covering raw_ssr. This won't do because there's
>   nothing stopping us from pulling the line back into the cache whilst the DMA
>   occurs on some systems - say we access raw_scr whilst the DMA to raw_ssr
>   happens or the CPU just speculatively prefetches something. The DMA API
>   cannot detect the former & would have no way to deal with it if it could,
>   and on systems where the latter is possible we have no choice but to
>   invalidate the cache lines again after the DMA is complete. What would we do
>   with the first line then? We can't write it back to preserve raw_scr because
>   it includes some potentially stale copy of the start of raw_ssr, which we
>   would then clobber the DMA'd data with.
>
> - Or, we just invalidate the line which overlaps with raw_scr. We can't do
>   this because we might throw away some part of raw_scr.
>
> So we have no choice but to align the buffers used for DMA at a cache line
> boundary. This is what the paragraph in Documentation/DMA-API.txt is about &
> the MMC code currently fails to meet the requirements it sets out. Allocating
> the buffer to be DMA'd to with kmalloc will meet that alignment requirement.
> One possible alternative would be to ensure that the raw_ssr field of struct
> mmc_card is always aligned at an ARCH_DMA_MINALIGN boundary within the struct,
> with nothing after it until another ARCH_DMA_MINALIGN boundary. That seems
> like it'd be a bit messy though, and probably easy for someone to break.
>
> Of course when a system has DMA that is coherent with caches there's generally
> no need to worry about any of this, but users of the DMA API generally can't
> rely on that, again as Documentation/DMA-API.txt says. Any buffers mapped for
> DMA should be ensured to be cache-line aligned at minimum giving the
> architecture code backing the DMA API a chance to do the right thing for the
> given system.

Thanks a lot for providing this detailed description to the problem.
It really helped to refresh my mind for how this works!

>
> Thanks,
>     Paul
>
>>
>> > +       raw_ssr = kmalloc(sizeof(card->raw_ssr), GFP_KERNEL);
>> > +       if (!raw_ssr)
>> > +               return -ENOMEM;
>> > +

Creating this bounce buffer does of course comes with a small cost.
Maybe we could neglect its impact!?

However we could also make the card->raw_ssr be cacheline aligned, via
using the __cacheline_aligned attribute for it.

Also, this isn't the only case when the mmc core creates bounce
buffers while reading card registers during the card initialization.

Perhaps a better method is to use the __cacheline_aligned for these
buffers that needs to be DMA-able (requests which is
MMC_DATA_READ|WRITE)). Of course that change would affect/increase the
number of bytes allocated for each card struct (due to padding), but
since this is just a single struct being allocated per card, this
shouldn't be an issue.

What do you think?

>> > +       if (mmc_app_sd_status(card, raw_ssr)) {
>> >
>> >                 pr_warn("%s: problem reading SD Status register\n",
>> >
>> >                         mmc_hostname(card->host));
>> >
>> > +               kfree(raw_ssr);
>> >
>> >                 return 0;
>> >
>> >         }
>> >
>> >         for (i = 0; i < 16; i++)
>> >
>> > -               card->raw_ssr[i] = be32_to_cpu(card->raw_ssr[i]);
>> > +               card->raw_ssr[i] = be32_to_cpu(raw_ssr[i]);
>> > +
>> > +       kfree(raw_ssr);
>> >
>> >         /*
>> >
>> >          * UNSTUFF_BITS only works with four u32s so we have to offset the
>> >
>> > --
>> > 2.10.2
>>

Kind regards
Uffe
--
To unsubscribe from this list: send the line "unsubscribe linux-mmc" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html



[Index of Archives]     [Linux USB Devel]     [Linux Media]     [Video for Linux]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]

  Powered by Linux