On Wed, Apr 11, 2018 at 07:17:52PM -0700, Andres Freund wrote:
> Hi,
> 
> On 2018-04-11 15:52:44 -0600, Andreas Dilger wrote:
> > On Apr 10, 2018, at 4:07 PM, Andres Freund <andres@xxxxxxxxxxx> wrote:
> > > 2018-04-10 18:43:56 Ted wrote:
> > >> So for better or for worse, there has not been as much investment in
> > >> buffered I/O and data robustness in the face of exception handling of
> > >> storage devices.
> > > 
> > > That's a bit of a cop out. It's not just databases that care. Even more
> > > basic tools like SCM, package managers and editors care whether they can
> > > get proper responses back from fsync that imply things actually were synced.
> > 
> > Sure, but it is mostly PG that is doing (IMHO) crazy things like writing
> > to thousands(?) of files, closing the file descriptors, then expecting
> > fsync() on a newly-opened fd to return a historical error.
> 
> It's not just postgres. dpkg (underlying apt, on Debian-derived distros),
> to take an example I just randomly guessed, does too:
>         /* We want to guarantee the extracted files are on the disk, so that the
>          * subsequent renames to the info database do not end up with old or zero
>          * length files in case of a system crash. As neither dpkg-deb nor tar do
>          * explicit fsync()s, we have to do them here.
>          * XXX: This could be avoided by switching to an internal tar extractor. */
>         dir_sync_contents(cidir);
> 
> (a bunch of other places too)
> 
> Especially on ext3, but also on newer filesystems, it's performance-wise
> entirely infeasible to fsync() every single file individually - the
> performance becomes entirely atrocious if you do that.

Is that still true if you're able to use some kind of parallelism?  (async
io, or fsync from multiple processes?)

--b.
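
For reference, a minimal, hypothetical C sketch of the close-then-reopen-
and-fsync pattern Andreas describes above.  The file name, flags, and
error handling here are illustrative only; this is not code taken from
postgres or dpkg.

#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

int main(void)
{
        const char buf[] = "important data\n";

        /* Writer side: write the data, then close without fsync(). */
        int fd = open("datafile", O_WRONLY | O_CREAT | O_TRUNC, 0644);
        if (fd < 0 || write(fd, buf, strlen(buf)) < 0)
                return 1;
        close(fd);      /* dirty pages may still be pending writeback */

        /* Much later (possibly in another process): reopen and fsync(),
         * hoping to be told about any writeback failure that happened in
         * the meantime.  Depending on kernel and filesystem, an error
         * already reported (and cleared) against the old file may not be
         * seen on this new descriptor, so fsync() can return 0 even
         * though the data never reached stable storage. */
        fd = open("datafile", O_WRONLY);
        if (fd < 0)
                return 1;
        if (fsync(fd) < 0)
                perror("fsync");
        close(fd);
        return 0;
}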