On January 11, 2016 3:28:01 AM PST, Chris Wilson <chris@xxxxxxxxxxxxxxxxxx> wrote: >On Sat, Jan 09, 2016 at 02:36:03PM -0800, Andy Lutomirski wrote: >> On Sat, Jan 9, 2016 at 12:01 AM, Chris Wilson ><chris@xxxxxxxxxxxxxxxxxx> wrote: >> > On Thu, Jan 07, 2016 at 02:32:23PM -0800, H. Peter Anvin wrote: >> >> On 01/07/16 14:29, H. Peter Anvin wrote: >> >> > >> >> > I would be very interested in knowing if replacing the final >clflushopt >> >> > with a clflush would resolve your problems (in which case the >last mb() >> >> > shouldn't be necessary either.) >> >> > >> >> >> >> Nevermind. CLFLUSH is not ordered with regards to CLFLUSHOPT to >the >> >> same cache line. >> >> >> >> Could you add a sync_cpu(); call to the end (can replace the final >mb()) >> >> and see if that helps your case? >> > >> > s/sync_cpu()/sync_core()/ >> > >> > No. I still see failures on Baytrail and Braswell (Pineview is not >> > affected) with the final mb() replaced with sync_core(). I can >reproduce >> > failures on Pineview by tweaking the clflush_cache_range() >parameters, >> > so I am fairly confident that it is validating the current code. >> > >> > iirc sync_core() is cpuid, a heavy serialising instruction, an >> > alternative to mfence. Is there anything that else I can infer >about >> > the nature of my bug from this result? >> >> No clue, but I don't know much about the underlying architecture. >> >> Can you try clflush_cache_ranging one cacheline less and then >manually >> doing clflushopt; mb on the last cache line, just to make sure that >> the helper is really doing the right thing? You could also try >> clflush instead of clflushopt to see if that makes a difference. > >I had looked at increasing the range over which clflush_cache_range() >runs (using roundup/rounddown by cache lines), but it took something >like +/- 256 bytes to pass all the tests. And also did >s/clflushopt/clflush/ to confirm that made no differnce. > >Bizarrely, > >diff --git a/arch/x86/mm/pageattr.c b/arch/x86/mm/pageattr.c >index 6000ad7..cf074400 100644 >--- a/arch/x86/mm/pageattr.c >+++ b/arch/x86/mm/pageattr.c >@@ -141,6 +141,7 @@ void clflush_cache_range(void *vaddr, unsigned int >size) > for (; p < vend; p += clflush_size) > clflushopt(p); > >+ clflushopt(vend-1); > mb(); > } > EXPORT_SYMBOL_GPL(clflush_cache_range); > >works like a charm. >-Chris That clflushopt touches a cache line already touched and therefore serializes with it. -- Sent from my Android device with K-9 Mail. Please excuse brevity and formatting. _______________________________________________ dri-devel mailing list dri-devel@xxxxxxxxxxxxxxxxxxxxx http://lists.freedesktop.org/mailman/listinfo/dri-devel