On Wed, Feb 24, 2016 at 12:15:34AM +0200, Boaz Harrosh wrote:
> On 02/23/2016 11:47 PM, Dave Chinner wrote:
> <>
> >
> > i.e. what we've implemented right now is a basic, slow,
> > easy-to-make-work-correctly brute force solution. That doesn't mean
> > we always need to implement it this way, or that we are bound by the
> > way dax_clear_sectors() currently flushes cachelines before it
> > returns. It's just a simple implementation that provides the
> > ordering the *filesystem requires* to provide the correct data
> > integrity semantics to userspace.
> >
>
> Or it can be written properly with movnt instructions and be even
> faster than a simple memset, and no need for any cl_flushing let alone
> any radix-tree locking.

Precisely my point - semantics of persistent memory durability are
going to change from kernel to kernel, architecture to architecture,
and hardware to hardware.

Assuming applications are going to handle all these wacky
differences to provide their users with robust data integrity is a
recipe for disaster. If application writers can't even use fsync
properly, I can guarantee you they are going to completely fuck up
data integrity when targeting pmem.

> That said your suggestion above is 25%-100% slower than current code
> because the cl_flushes will be needed eventually, and the atomics of a
> lock takes 25% the time of a full page copy.

So what? We can optimise for performance later, once we've provided
correct and resilient infrastructure. We've been fighting against
premature optimisation for performance from the start with DAX -
we've repeatedly had to undo stuff that was fast but broken, and
we're not doing that any more. Correctness comes first, then we can
address the performance issues via iterative improvement, like we do
with everything else.

> You are forgetting we are
> talking about memory and not harddisk. the rules are different.

That's bullshit, Boaz.
I'm sick and tired of people saying "but pmem is
different" as justification for not providing correct, reliable data
integrity behaviour. Filesystems on PMEM have to follow all the same
rules as any other type of persistent storage we put filesystems on.

Yes, the speed of the storage may expose the fact that an
unoptimised correct implementation is a lot more expensive than
ignoring correctness, but that does not mean we can ignore
correctness. Nor does it mean that a correct implementation will be
slow - it just means we haven't optimised for speed yet because
getting it correct is a hard problem and our primary focus.

Cheers,

Dave.
--
Dave Chinner
david@xxxxxxxxxxxxx