* Christoph Hellwig:

> On Tue, Jul 30, 2024 at 07:03:50PM +0200, Florian Weimer wrote:
>> > The only relevant exception is probably ext4 in ext2/ext3 mode, where
>> > the latter might still have users left running real workloads on it
>> > and not using it for usb disks or VM images.
>>
>> Why doesn't the kernel perform allocation in these cases?  There doesn't
>> seem to be a file-system-specific reason why it's impossible to do.
>
> Because in general it's a really stupid idea.  You don't get a better
> allocation pattern, but you are writing every block twice, making things
> significantly slower and wearing the device out in the process if it
> is flash based.

I would assume the applications that do pre-allocation before mmap with
random writes had a good reason to do it even when it was slow.

>> At the very least, we should have a variant of ftruncate that never
>> truncates, likely under the fallocate umbrella.  It seems that that's
>> how posix_fallocate is used sometimes, for avoiding SIGBUS with mmap.
>> To these use cases, whether extents are allocated or not does not
>> matter.
>
> I don't see how that is related.

Opening a file, calling posix_fallocate to the desired size, and then
using mmap seems to be somewhat common.  More often, people use
ftruncate, but that can unexpectedly shrink the file.

>> If we removed the fallback code from glibc today, it would just be
>> EOPNOTSUPP that leaks to applications, so it's structurally the same
>> issue.
>
> Not really.  EOPNOTSUPP is a valid error code, that has historically
> been returned by other operating systems and even other libc
> implementations for Linux

I don't see EOPNOTSUPP handling code in Ceph, Beanstalk, Bitcoin Core,
or Transmission.  Most of them seem to just ignore errors (except
perhaps Ceph).

This might not be a problem in the end, but it seems that existing
software (even portable software) does not check for EOPNOTSUPP.

Thanks,
Florian
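
For illustration, the open / posix_fallocate / mmap pattern discussed
above, with explicit EOPNOTSUPP handling, might look roughly like the
sketch below.  The helper names grow_to_at_least and map_file_of_size
are hypothetical and not taken from any of the projects mentioned.  The
fallback to ftruncate is only an assumption about what an application
could do if the error reached it.

```c
#define _POSIX_C_SOURCE 200809L  /* expose posix_fallocate, ftruncate */
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: make the file behind fd at least `size` bytes
 * long without ever shrinking it.  Returns 0 or an error number. */
static int grow_to_at_least(int fd, off_t size)
{
    /* posix_fallocate returns the error number directly and does not
     * set errno.  It never reduces the file size. */
    int err = posix_fallocate(fd, 0, size);
    if (err != EOPNOTSUPP)
        return err;

    /* The file system cannot allocate.  Fall back to extending the
     * file with ftruncate, but only if it is currently too short. */
    struct stat st;
    if (fstat(fd, &st) != 0)
        return errno;
    if (st.st_size >= size)
        return 0;

    /* Caveats: no blocks are reserved here, so a store through the
     * mapping can still raise SIGBUS when the disk is full.  The
     * fstat/ftruncate pair is also racy if another process grows the
     * file in between. */
    return ftruncate(fd, size) == 0 ? 0 : errno;
}

/* Hypothetical helper: size the file, then map it read/write and
 * shared.  Returns 0 and stores the address in *out_addr, or returns
 * an error number. */
static int map_file_of_size(int fd, size_t size, void **out_addr)
{
    assert(out_addr != NULL);
    if (size == 0)
        return EINVAL;

    int err = grow_to_at_least(fd, (off_t)size);
    if (err != 0)
        return err;

    void *p = mmap(NULL, size, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    if (p == MAP_FAILED)
        return errno;

    *out_addr = p;
    return 0;
}
```

A caller would open the file with O_RDWR, call map_file_of_size(fd,
size, &addr), and check the return value instead of ignoring it.  The
race noted in the fallback path is the kind of problem that a
never-truncating ftruncate variant in the kernel would avoid.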