On Fri, Aug 22, 2014 at 04:15:14PM +0100, Daniel P. Berrange wrote:
> On Fri, Aug 22, 2014 at 10:56:47AM -0400, John Ferlan wrote:
> >
> >
> > On 08/22/2014 10:46 AM, Daniel P. Berrange wrote:
> > > On Mon, Aug 11, 2014 at 04:30:19PM -0400, John Ferlan wrote:
> > >> Currently the safezero() function uses build conditionals to choose either
> > >> the posix_fallocate() or mmap() with a fallback to safewrite() in order to
> > >> preallocate a file.
> > >>
> > >> This patch will modify the logic in order to allow fallbacks in the
> > >> event that posix_fallocate() or the ftruncate() and mmap() doesn't work
> > >> properly. The fallback will be to use the slow safewrite of zero filled
> > >> buffers to the file.
> > >
> > > Have you actually encountered failing of posix_fallocate() in the
> > > real world? It is supposed to automatically fall back to the
> > > equivalent of writing zeros if the filesystem / kernel does not
> > > support it, so we should not have to do runtime fallback ourselves.
> > > The existence of fallback is the main distinction between the
> > > posix_fallocate() and fallocate() system calls.
> > >
> >
> > It wasn't so much a "failure" as "unexpected results" - the key being
> > that the resulting created (or resized) file was not sized as expected.
> >
> > For an NFS target the results are not what was expected. I've left some
> > history in the prior set of patches with the following probably having
> > the most details:
> >
> > http://www.redhat.com/archives/libvir-list/2014-August/msg00367.html
>
> So, IIUC, the bug happens when the rsize mount option to NFS is not 4k.
>
> strace'ing libvirtd on an NFS volume in this case shows:
>
>   open("/var/lib/libvirt/images/lettuce/foo", O_RDWR|O_CREAT|O_EXCL, 0600) = 24
>   fstat(24, {st_mode=S_IFREG|0600, st_size=0, ...}) = 0
>   ftruncate(24, 1073741824) = 0
>   fallocate(24, 0, 0, 1073741824) = -1 EOPNOTSUPP (Operation not supported)
>   fallocate(24, 0, 0, 1073741824) = -1 EOPNOTSUPP (Operation not supported)
>   fstat(24, {st_mode=S_IFREG|0600, st_size=1073741824, ...}) = 0
>   fstatfs(24, {f_type="NFS_SUPER_MAGIC", f_bsize=1048576, f_blocks=118342, f_bfree=71002, f_bavail=65632, f_files=7678560, f_ffree=5495931, f_fsid={0, 0}, f_namelen=255, f_frsize=1048576}) = 0
>   pread(24, "\0", 1, 1048575) = 1
>   pwrite(24, "\0", 1, 1048575) = 1
>   pread(24, "\0", 1, 2097151) = 1
>   pwrite(24, "\0", 1, 2097151) = 1
>   pread(24, "\0", 1, 3145727) = 1
>
>
> So we can see glibc here trying fallocate() and then falling back to
> writing zeros. Since the volume does not come out at the right size
> this seems to show a bug in glibc.
>
> So I think we really ought to report that bug to glibc to be fixed
> there rather than working around it in libvirt, as there are many
> more applications besides libvirt that will be impacted by this
> bug.
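
For a glibc bug report, a small standalone reproducer may help. The
following is only a minimal sketch, not code from libvirt or glibc: the
helper names prealloc_and_check() and check_path() are hypothetical, and
it assumes the symptom is simply that st_size after posix_fallocate()
differs from the requested length. Unlike the strace above, it does not
call ftruncate() first.

```c
#define _POSIX_C_SOURCE 200809L

#include <assert.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: preallocate len bytes from offset 0 on fd and
 * compare the resulting file size with len.
 * Returns 0 if st_size == len, 1 on a size mismatch, -1 if a call failed.
 */
static int prealloc_and_check(int fd, off_t len)
{
    struct stat st;
    int rc;

    /* Precondition of this sketch: a positive length. */
    assert(len > 0);

    /* posix_fallocate() returns the error number; it does not set errno. */
    rc = posix_fallocate(fd, 0, len);
    if (rc != 0) {
        fprintf(stderr, "posix_fallocate: %s\n", strerror(rc));
        return -1;
    }

    if (fstat(fd, &st) < 0) {
        perror("fstat");
        return -1;
    }

    if (st.st_size != len) {
        fprintf(stderr, "size mismatch: expected %lld, got %lld\n",
                (long long)len, (long long)st.st_size);
        return 1;
    }
    return 0;
}

/*
 * Hypothetical helper: create path with the same open() flags as in the
 * strace, run the check, then close and remove the file.
 */
static int check_path(const char *path, off_t len)
{
    int fd = open(path, O_RDWR | O_CREAT | O_EXCL, 0600);
    int rc;

    if (fd < 0) {
        perror("open");
        return -1;
    }
    rc = prealloc_and_check(fd, len);
    close(fd);
    unlink(path);
    return rc;
}
```

Calling check_path() with a file name on the affected NFS mount and a
length of 1073741824 would mirror the request in the strace; a return
value of 1 would indicate the size mismatch described earlier in the
thread.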
Oops, meant to include the stack trace to show where the pread/pwrites
are coming from:

(gdb) bt
#0  pread64 () at ../sysdeps/unix/syscall-template.S:81
#1  0x00007f55a29f9c5e in internal_fallocate (fd=fd@entry=24, offset=1048575, len=1072693248) at ../sysdeps/posix/posix_fallocate.c:78
#2  0x00007f55a29f9cc7 in posix_fallocate (fd=fd@entry=24, offset=<optimized out>, len=<optimized out>) at ../sysdeps/unix/sysv/linux/wordsize-64/posix_fallocate.c:62
#3  0x00007f55a6071026 in safezero (fd=fd@entry=24, offset=<optimized out>, len=<optimized out>) at util/virfile.c:1031
#4  0x00007f55916258c2 in createRawFile (inputvol=0x0, vol=0x7f5570008280, fd=24) at storage/storage_backend.c:389
#5  virStorageBackendCreateRaw (conn=<optimized out>, pool=<optimized out>, vol=0x7f5570008280, inputvol=0x0, flags=<optimized out>) at storage/storage_backend.c:450

Regards,
Daniel
-- 
|: http://berrange.com -o- http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org -o- http://virt-manager.org :|
|: http://autobuild.org -o- http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org -o- http://live.gnome.org/gtk-vnc :|