Hi Gionatan,

On Thu, Jun 15, 2017 at 05:04:48PM +0200, Gionatan Danti wrote:
> On 15/06/2017 16:10, Carlos Maiolino wrote:
> >
> > Disregard this comment, I messed up with some tests, so, basically, the
> > application is responsible for the user data, and need to use fsync/fdatasync to
> > ensure the data is properly written, this is not FS responsibility.
> >
> > cheers
>
> Hi Carlos,
> I fully agree that it is application responsibility to issue appropriate
> fsync(). However, knowing that this not always happens in real-world, I am
> trying to be as much "fail-safe" as possible.
>

Yeah, unfortunately, the real world has lots of badly written applications :(

> From my understanding of your previous message, a full thin pool with
> --errorwhenfull=y should return ENOSPC to the filesystem. Does this work on
> normal cached/buffered/async writes, or with O_DIRECT writes only?
>

AFAIK, it will return ENOSPC with O_DIRECT, yes. With async writes, you won't
get any error returned until you issue an fsync/fdatasync, which, per my
understanding, will return EIO.

> If it is not the case, how can I prevent further writes to a data-full thin
> pool? With ext4, I can use "data=journal,errors=remount-ro" to catch any
> write errors and stop the filesystem (or remount it read-only), losing only
> some seconds worth of data. This *will* works even for applications that do
> not issue fsync(), as the read-only filesystem will not let the write()
> syscall to complete successfully.
>

It 'works' on ext4 because it will journal the data first, and at some point
it will try to allocate blocks for metadata, and that will fail, which helps
ext4 catch this corner case. Although, IIRC, 'data=journal' mode isn't
supported at all. I even heard rumors of the possibility of having this option
removed from ext4, but I don't follow ext4 development closely enough to tell
you whether this is just a rumor or they are really considering it.

> On XFS (which I would *really* use, because it is quite more advanced), all
> writes directed to a full thin-pool will basically end on /dev/null and, as
> write() succeeded, the application/user will *not* be alerted on any way. If
> the thin-pool can communicate its "end of free space" to the filesystem, the
> problem can be avoided.
>

The application won't be alerted in any way unless it uses
fsync()/fdatasync(), whatever filesystem is being used. Even with data=journal
in ext4 this won't happen: ext4 gets remounted read-only because there were
'metadata' errors when writing the file to the journal. But again, it is not a
fix for a faulty application, and it is not even a reliable way of shutting
down the filesystem the way you are thinking. Whether the filesystem shuts
down depends on the number of blocks being allocated: even with data=journal,
if the blocks that could be allocated are enough to hold the metadata but not
the data, you will see the same problem as you are seeing with XFS (or ext4
without data=journal). So, don't rely on it.

> If this can not be done, the only remaining possibility is to instruct the
> filesystem to stop itself on data writeout errors. So, we got full-circle
> about my original question: how can I stop XFS when writes return I/O
> errors? Please note that I tried to set any
> /sys/fs/xfs/dm-8/error/metadata/*/max_retries tunable to 0, but I can not
> get the filesystem to suspend itself, even when dmesg reported metadata
> write errors.

Yes, these options won't help, because they are configuration options for
metadata errors, not data errors.

Please bear in mind that your question should be: "how can I stop a
filesystem when async writes return I/O errors", because this isn't an XFS
issue.

But again, there isn't much you can do here; async writes are supposed to
behave this way. And whoever is writing "data" to the device is supposed to
take care of their own data.

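To make "taking care of their own data" concrete, here is a minimal sketch of
an application that checks the result of fsync() instead of trusting a
successful write(). Python is used only for brevity, and `write_durably` is a
hypothetical helper name, not something from this thread. The assumption,
following the behaviour described above, is that a buffered write to a full
thin pool succeeds and the failure only surfaces as an OSError (EIO or ENOSPC)
when fsync() is called.

```python
import os


def write_durably(path: str, payload: bytes) -> None:
    """Write payload to path; raise OSError if it did not reach stable storage.

    Hypothetical helper for illustration only.
    """
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
    try:
        view = memoryview(payload)
        while len(view) > 0:
            # write() may be partial, and it only copies into the page cache:
            # success here says nothing about the underlying device.
            written = os.write(fd, view)
            view = view[written:]
        # fsync() is where a deferred writeback error (e.g. EIO or ENOSPC)
        # is reported to the application as an OSError.
        os.fsync(fd)
    finally:
        os.close(fd)
```

A caller would wrap `write_durably()` in `try`/`except OSError` and treat an
exception as "the data is not safely on disk", for example by retrying,
alerting, or refusing to continue. An application that skips the fsync() never
gets that chance, whichever filesystem is underneath.
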
Imagine, for example, a situation where you have 2 applications using the same
filesystem (quite common, right?). Applications A and B both issue buffered
writes and, for some reason, application A's data hits an IO error: maybe a
too-busy storage, a missed SCSI command, whatever, anything that can be
retried. Then the filesystem shuts down because of that, which will also
affect application B, even if nothing wrong happened with application B. One
of the goals of multitasking is having applications running at the same time
without affecting each other.

Now, consider that application B is a well-written application and
application A isn't. App B cares for its data to be written to disk, while app
A doesn't. In case of a casual error, app B will retry writing its data, while
app A won't. Should we really shut down the filesystem here, affecting
everything on the system, because application A is not caring for its own
data?

Shutting a filesystem down has basically one purpose: avoiding corruption. We
basically only shut down a filesystem when keeping it alive can cause a
problem for everything using it (really, really simple explanation here).

Surely this can be improved, but in the end, the application will always need
to check for its own data.

I am not really a device-mapper developer and I don't know its code in depth.
But I know it will issue warnings when there is no more space left, and you
can configure a watermark too, to warn the admin when the space used reaches
that watermark.

For now, I believe the best solution is to have a reasonable watermark set on
the thin device, and have the admin take the appropriate action whenever this
watermark is reached.

Cheers.

>
> Thank you very much.
>

-- 
Carlos