>> ... and the non-temporal version is the optimal one even though we're
>> defaulting to copy_user_enhanced_fast_string for memcpy on modern Intel
>> CPUs...?

My current generation cpu has a bit of an issue with recovering from a
machine check in a "rep mov" ... so I'm working with a version of memcpy
that unrolls into individual mov instructions for now.

> At least the pmem driver use case does not want caching of the
> source-buffer since that is the raw "disk" media. I.e. in
> pmem_do_bvec() we'd use this to implement memcpy_from_pmem().
> However, caching the destination-buffer may prove beneficial since
> that data is likely to be consumed immediately by the thread that
> submitted the i/o.

I can drop the "nti" from the destination moves. Does "nti" work
on the load from source address side to avoid cache allocation?

On another topic raised by Boris ... is there some CONFIG_PMEM*
that I should use as a dependency to enable all this?

-Tony