On Mon, 22 Sep 2008, David Brownell wrote: > On Monday 22 September 2008, Eric Miao wrote: >>>> + ? ? data->rx_buf = kmalloc(MAX1111_RX_BUF_SIZE, GFP_KERNEL); >>>> + ? ? if (!data->rx_buf) { >>>> + ? ? ? ? ? ? kfree(data->tx_buf); >>>> + ? ? ? ? ? ? return -ENOMEM; >>>> + ? ? } >>> >>> Allocating such small buffers using kmalloc seems pretty inefficient. >>> At the very least, I would allocate both buffers at once. But quite >>> frankly I don't get why you don't just make these buffers part of >>> struct max1111_data. This would even make the structure smaller! >>> >> >> I originally place the buffer within "struct max1111_data" but received >> a mail from David Brownell suggesting using a kmalloc() buffer, so that >> DMA mode will work better with the cache alignment and trailing bytes, >> though PIO can just work happily. I don't know the specific reason for >> this, honestly. > > The phrase is "cache line sharing". If you make sure the buffer doesn't > share its cache line with something the CPU may modify while the DMA is > happening, you're OK ... and using a dedicated kmalloc buffer guarantees > that. (So will sticking buffers at the end of a struct, like max1111_data, > annotated as "____cacheline_aligned", in many cases.) Another alternative: typedef ____cacheline_aligned char cacheline_barrier_t[0]; struct max1111_data { u8 buffer[BUF_SIZE]; cacheline_barrier_t barrier; int whatever; ... long something_else; cacheline_barrier_t barrier2; u8 also_gets_its_own_cacheline[BUF_SIZE]; }; One could also have ARM define: typedef ____cacheline_aligned char dma_padding_t[0]; And x86, etc. would define: typedef char dma_padding_t[0]; Then other arches don't get wasted space.