On Mon, Oct 29, 2012 at 03:37:05AM -0700, Andi Kleen wrote:
> > Agreed, if we're going to add an xattr, then we might as well store
>
> I don't think an xattr makes sense for this. It's sufficient to keep
> this state in memory.
>
> In general these error paths are hard to test and it's important
> to keep them as simple as possible. Doing IO and other complexities
> just doesn't make sense. Just have the simplest possible path
> that can do the job.

It's actually pretty easy to test this particular one, and certainly
one of the things I'd strongly encourage in this patch series is the
introduction of an interface via madvise and fadvise that allows us to
simulate an ECC hard error event. So I don't think "it's hard to
test" is a reason not to do the right thing. Let's make it easy to
test, and then include it in xfstests. All of the core file systems
are regularly running regression tests via xfstests, so if we define a
standard way of testing this function, all of them can exercise it in
their regular regression runs (this is why I suggested fadvise/madvise
instead of an ioctl, but in a pinch we could do this via an ioctl
instead).

Note that the problem that we're dealing with is buffered writes; so
it's quite possible that the process which wrote the file, thus
dirtying the page cache, has already exited; so there's no way we can
guarantee we can inform the process which wrote the file via a signal
or an error code return.

It's also possible that part of the file has been written out to the
disk, so forcibly crashing the system and rebooting isn't necessarily
going to save the file from being corrupted.

Also, if you're going to keep this state in memory, what happens if
the inode gets pushed out of memory? Do we pin the inode? Or do we
just say that it's stored in memory until, at some non-deterministic
time (non-deterministic as far as userspace programs are concerned; if
they are running in a tight memory cgroup, it might be a very short
time later), the state suddenly gets lost?

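The eviction concern can be made concrete with a toy userspace model
in C. This is only a sketch, not kernel code: `toy_inode`,
`toy_disk_inode`, and the `toy_*` helpers are names invented for
illustration, and the single `corrupt_bit` stands in for whatever
persistent bit a file system might choose to allocate.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy userspace model, NOT kernel code. Every name here is hypothetical. */

struct toy_disk_inode {         /* the part that survives eviction */
    bool corrupt_bit;           /* assumed persistent "data corrupted" bit */
};

struct toy_inode {              /* the in-memory copy */
    bool cached;                /* is the in-memory inode present? */
    bool corrupt;               /* in-memory error state */
    struct toy_disk_inode *disk;
};

/* Record a memory error that hit a dirty page of this file. */
void toy_mark_corrupt(struct toy_inode *ino, bool persist)
{
    assert(ino->cached);        /* the error is noticed while the inode is in memory */
    ino->corrupt = true;
    if (persist)
        ino->disk->corrupt_bit = true;  /* a real fs would write the inode here */
}

/* Memory pressure drops the in-memory inode; memory-only state is lost. */
void toy_evict(struct toy_inode *ino)
{
    ino->cached = false;
    ino->corrupt = false;
}

/* A later open re-reads the inode from disk. */
void toy_reload(struct toy_inode *ino)
{
    assert(!ino->cached);
    ino->cached = true;
    ino->corrupt = ino->disk->corrupt_bit;
}

/* Does a reader still see the error after evict + reload? */
bool toy_error_survives_eviction(bool persist)
{
    struct toy_disk_inode disk = { .corrupt_bit = false };
    struct toy_inode ino = { .cached = true, .corrupt = false, .disk = &disk };

    toy_mark_corrupt(&ino, persist);
    toy_evict(&ino);
    toy_reload(&ino);
    return ino.corrupt;
}
```

Under these assumptions, `toy_error_survives_eviction(false)` returns
false, since the error indication is gone once the inode has been
evicted and reloaded, while `toy_error_survives_eviction(true)` returns
true. Pinning the inode would amount to never calling `toy_evict()`.
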
I think it's fair that if there are file systems that don't have a
single bit they can allocate in the inode, we can either accept
Jun'ichi's suggestion to just forcibly crash the system, or we can
allow the state to be stored in memory. But I suspect the core file
systems that might be used by enterprise-class workloads will want to
provide something better.

I'm not that convinced that we need to insert an xattr; after all, not
all file systems support xattrs at all, and I think a single bit
indicating that the file has corrupted data is sufficient. But I
think it would be useful to at least optionally support persistent
storage of this bit.

					- Ted