* Steven Whitehouse:

> On 27/06/2020 11:00, Florian Weimer wrote:
>> * Josef Bacik:
>>
>>> As for your ENOSPC issue, I've made improvements on that area. I
>>> see this in production as well, I have monitoring in place to deal
>>> with the machine before it gets to this point. That being said if
>>> you run the box out of metadata space things get tricky to fix.
>>> I've been working my way down the list of issues in this area for
>>> years, this last go around of patches I sent were in these corner
>>> cases.
>> Is there anything we need to do in userspace to improve the behavior
>> of fflush and similar interfaces?
>>
>> This is not strictly a btrfs issue: Some of us are worried about
>> scenarios where the write system call succeeds and the data never
>> makes it to storage *without a catastrophic failure*. (I do not
>> consider running out of disk space a catastrophic failure.) NFS
>> apparently has this property, and you have to call fsync or close the
>> descriptor to detect this. fsync is not desirable due to its
>> performance impact.
>
> It doesn't matter which filesystem you use, you can't be sure that the
> data is really safe on disk without calling fsync. In the case of a
> new inode, that means fsync on the file and on the containing
> directory.

In my opinion, there is a conceptual difference between the machine or
storage crashing hard, and just running out of disk space.

> There can be performance issues depending on how that is done, however
> there are a number of solutions to those issues which can reduce the
> performance effects to the point where they are usually no longer a
> problem. That is with the caveat that slow storage will always be
> slow, of course!
>
> The usual tricks are to avoid doing lots of small fsyncs, by gathering
> up smaller files, ideally sorting them into inode number order for
> local filesystems, and then issuing fsyncs asynchronously, waiting for
> them all only once all the fsyncs have been issued.
> Also fadvise/madvise can be useful in these situations too,

None of this applies to shell utilities such as grep and cat. They
work around data loss as a result of the write system call not
reporting ENOSPC errors: they close stdout and stderr underneath
glibc, which leads to a different class of problems.

It turns out that on Linux, close does more space checks than write,
so this allows the shell utilities to check for ENOSPC without issuing
fsyncs. At present, lack of space checks from write seems to
primarily happen with NFS.

So let me rephrase: Does btrfs report ENOSPC during write? If it does
not, what can we do to check for sufficient space during fflush and
similar operations?

If we change the shell utilities to do an fsync on close, we get
traditional UNIX behavior with traditional UNIX performance. I don't
think that's what people want.

Thanks,
Florian
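
For reference, the close-based check discussed in this message can be
sketched in C. This is only an illustration, not how grep, cat, or
glibc actually implement it: close_stream_checked is a hypothetical
helper name, and reporting EIO when an earlier stdio error has lost
its errno is an assumption of this sketch. It flushes and closes a
stream, checks each step, and never calls fsync, so it can only catch
errors that the kernel reports at write or close time and makes no
durability promise.

```c
#include <errno.h>
#include <stdio.h>

/*
 * Hypothetical helper (not part of glibc or coreutils): flush and close
 * a stdio stream, reporting any error seen along the way.
 * Returns 0 on success, -1 with errno set on failure.
 * No fsync is issued, so success does not imply the data is on disk.
 */
static int close_stream_checked(FILE *fp)
{
    /* Sticky error flag left by an earlier fwrite/fputs/fprintf. */
    int prev_error = ferror(fp);
    int failed = 0;
    int saved_errno = 0;

    /* Push buffered data to the kernel; ENOSPC may surface here. */
    if (fflush(fp) != 0) {
        failed = 1;
        saved_errno = errno;
    }

    /* fclose ends in close(2), where some filesystems report late
       errors. The stream must not be used again, even on failure. */
    if (fclose(fp) != 0) {
        if (!failed)
            saved_errno = errno;
        failed = 1;
    }

    /* Assumption of this sketch: an earlier stream error whose errno
       is no longer known is reported as EIO. */
    if (!failed && prev_error) {
        failed = 1;
        saved_errno = EIO;
    }

    if (failed) {
        errno = saved_errno;
        return -1;
    }
    return 0;
}
```

A utility would call this on stdout just before exiting, print a
diagnostic with perror if it returns -1, and exit with a failure
status. Whether that is enough on btrfs is exactly the open question
above.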