> > Thanks for the update. I think the alignment check should be based on > > the following note in copy_user_nocache. > > > > * Note: Cached memory copy is used when destination or size is not > > * naturally aligned. That is: > > * - Require 8-byte alignment when size is 8 bytes or larger. > > * - Require 4-byte alignment when size is 4 bytes. > > > > So, I think the code may be something like this. I also made the following > changes: > > Thanks! > > > - Mask with 7, not 8. > > Yes, good catch. > > > - ALIGN with cacheline size, instead of 8. > > - Add (bytes > flushed) test since calculation with unsigned long still results > in a negative > > value (as a positive value). > > > > if (bytes < 8) { > > if ((dest & 3) || (bytes != 4)) > > arch_wb_cache_pmem(addr, 1); > > } else { > > if (dest & 7) { > > dest = ALIGN(dest, boot_cpu_data.x86_clflush_size); > > Why align the destination to the next cacheline? As far as I can see > the ALIGN_DESTINATION macro in arch/x86/include/asm/asm.h only aligns > to the next 8-byte boundary. The clflush here flushes for the cacheline size. So, we do not need to flush the same cacheline again when the unaligned tail is in the same line. Thanks, -Toshi