At Thu, 04 Sep 2008 12:23:54 +0100, Pawel MOLL wrote: > > Hi Guys, > > Recently I was investigating an issue with capturing audio using USB > Audio Class device on a sh4-based board. "Bad voice quality" was > reported... > > Finally I have traced the problem to something which is (unfortunately) > well known to sh developers as a D-cache aliasing (or synonym) problem. > > Briefly speaking: due to some MMU design decisions, one can have two > different virtual address pointing to the same physical location, which > is fine, but going via different cache slots! So if there was a value of > "0" in the memory and user "A" will write "1" there, user "B" will still > read "0"... > > The solution is to ensure all TLB entries (so virtual memory areas) are > beginning from a 16kB-aligned virtual address. Otherwise it is necessary > to flush the cache between accesses from "A" and "B" sides. > > And now. The USB Audio Class driver (sound/usb/usbaudio.c) is allocating > the sound buffer like this... > > static int snd_pcm_alloc_vmalloc_buffer(struct snd_pcm_substream *subs, size_t size) > { > [...] > runtime->dma_area = vmalloc(size); > [...] > } > > ... and vmalloc will return a page(4k)-aligned pointer, possibly not > 16k-aligned one. This is the source of all evil ;-) Well, the main problem is the cache coherency with mmap in general, and not about the page size. The 4k and 16k things are just a coincidence, I'd say. > When using RW transfers everything is fine, as data is memcopied between > two independent buffers. > > In a case of MMAP mode we will end up with such a situation: > > 1. The driver will memcpy data from an URB to the buffer. It will > populate several cache lines. Let's call it a "kernel cache". > 2. As the library has mapped the non-16k-aligned address via different > cache line ("user cache"), it will read some rubbish from the physical > memory, populating the "user cache" by the way. > 3. Some time later the "kernel cache" is flushed, writing the data to > the memory. > 4. Some new data from URB is entering "kernel cache". > 5. The library will access mmap-ed area again, going via the "user > cache", which may or may not reflect the correct data (depending on the > fact was the "user cache" flushed or not) etc. > > Of course this cycle is completely not deterministic, so some of the > "kernel cache" lines will be flushed before being accessed from user > space, other not... The final effect is... hmm... bizarre :-) First, of > course, you will get a (probably loud) glitch (rubbish from buffer's > underlying memory, before the first valid data is written back), and > then something that could be described as an "echo" ;-) I mean - you are > capturing "1 2 3 4 5 6 7..." and the result is (G stands for Glitch ;-) > "G 1 23 3 4 4 6 67..." > > As a quick-and-dirty work-around I have modified > snd_pcm_alloc_vmalloc_buffer() to allocate always 12kB more and then use > 16k-aligned pointer: > > static int snd_pcm_alloc_vmalloc_buffer(struct snd_pcm_substream *subs, size_t size) > { > struct snd_pcm_runtime *runtime = subs->runtime; > if (runtime->dma_addr) { > if (runtime->dma_bytes >= size) > return 0; /* already large enough */ > vfree((void *)runtime->dma_addr); > } > runtime->dma_addr = (unsigned long)vmalloc(size + 12 * 1024); > if (!runtime->dma_addr) > return -ENOMEM; > runtime->dma_area = (void *)ALIGN(runtime->dma_addr, 16 * 1024); > runtime->dma_bytes = size; > return 0; > } > > Of course it cannot be regarded as anything more as a hack. And > definitely not as a solution... So my question is: > > Any idea how the "proper solution" should look like? I see no obvious > method to get an arch-independent "mmap-compliant" buffer. This problem > was found on a sh arch, using an USB Audio Class device, however out > there some other architectures are suffering from this MMU "feature" as > well (ie. see http://docs.hp.com/en/B3906-90006/ch07s09.html) and > possibly other drivers could behave "wrongly" (quoted as the driver is > actually innocent...) Currently, it's PITA on Linux to get a non-cached mmap properly in a generic way. Especially together with coherent memory allocations, there is no API for that. But, in the case of usb-audio, the problem is the vmaloc'ed intermediate buffer, and this could be a bit easier. One thing we can try is a patch like below. But, I'm not sure whether this is correct over all architectures, too. At best, a generic API would be helpful for such a thing... thanks, Takashi diff --git a/sound/core/pcm_native.c b/sound/core/pcm_native.c index 9cf81bc..cecf4b4 100644 --- a/sound/core/pcm_native.c +++ b/sound/core/pcm_native.c @@ -3136,6 +3136,7 @@ static int snd_pcm_default_mmap(struct snd_pcm_substream *substream, struct vm_area_struct *area) { area->vm_ops = &snd_pcm_vm_ops_data; + area->vm_page_prot = pgprot_noncached(area->vm_page_prot); area->vm_private_data = substream; area->vm_flags |= VM_RESERVED; atomic_inc(&substream->mmap_count); _______________________________________________ Alsa-devel mailing list Alsa-devel@xxxxxxxxxxxxxxxx http://mailman.alsa-project.org/mailman/listinfo/alsa-devel