On Tue, Mar 19, 2019 at 9:40 AM Daniel Drake <drake@xxxxxxxxxxxx> wrote:
> On Mon, Mar 18, 2019 at 7:01 PM Oleksij Rempel <linux@xxxxxxxxxxxxxxxx> wrote:
> > First of all, I would suggest to investigate if DMA is working.
>
> So it's only requesting a single block of data. I found another hint
> in mmc_blk_data_prep() which lead me to alcor_init_mmc():
>
>     mmc->max_segs = AU6601_MAX_DMA_SEGMENTS;
>     mmc->max_seg_size = AU6601_MAX_DMA_BLOCK_SIZE;
>     mmc->max_blk_size = mmc->max_seg_size;
>     mmc->max_blk_count = mmc->max_segs;
>
> and AU6601_MAX_DMA_SEGMENTS is 1.
>
> So max_blk_count is 1 and this means block.c will always choose a
> single block read, which alcor.c does not DMA-accelerate.
>
> I'll come back to look in more detail, but if you have any quick
> comments at this stage that would be handy.

This means you actually have two separate problems:

- as you noted, transfers are done using MMIO rather than DMA, so
  it takes longer to get the data over the wire.
- in addition, writing a single 512-byte block at a time will lead to
  bad write performance on many devices, since they need to write
  less than a page at a time, and have to copy data around
  a lot internally.

If the hardware cannot do scatter/gather DMA, you probably want to
do this in software and copy the data into a temporary buffer
so you can always transfer up to e.g. 64KB at a time.

       Arnd
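
For illustration, the software gather step suggested above could look
roughly like the following. This is a minimal user-space sketch, not
code from alcor.c: struct seg, bounce_gather(), BOUNCE_SIZE and
MAX_BLOCKS are hypothetical names, and the 64KB size is just the
example figure from the text. In the kernel, the scatterlist helpers
(e.g. sg_copy_to_buffer()) would be the natural way to do the copy.

```c
#include <stddef.h>
#include <string.h>

/* Assumed sizes: a 64 KiB bounce buffer and 512-byte blocks. */
#define BOUNCE_SIZE  (64 * 1024)
#define BLOCK_SIZE   512
#define MAX_BLOCKS   (BOUNCE_SIZE / BLOCK_SIZE)	/* 128 blocks per transfer */

/* Hypothetical stand-in for one scatterlist entry. */
struct seg {
	const void *addr;
	size_t len;
};

/*
 * Copy the segments back to back into one contiguous bounce buffer,
 * so the hardware only ever sees a single DMA region. Stops when the
 * buffer is full; the last segment is truncated to the space left.
 * Returns the number of bytes placed in the bounce buffer.
 */
size_t bounce_gather(unsigned char *bounce, size_t bounce_len,
		     const struct seg *segs, size_t nsegs)
{
	size_t off = 0;
	size_t i;

	for (i = 0; i < nsegs && off < bounce_len; i++) {
		size_t n = segs[i].len;

		if (n > bounce_len - off)
			n = bounce_len - off;
		memcpy(bounce + off, segs[i].addr, n);
		off += n;
	}
	return off;
}
```

With a buffer like this, the driver could in principle advertise more
than one segment and a max_blk_count of up to MAX_BLOCKS, assuming the
controller can DMA a contiguous region of that size, which is not
confirmed anywhere in this thread.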