On Wed, Jul 9, 2008 at 6:30 PM, Shehjar Tikoo <shehjart@xxxxxxxxxxxxxxx> wrote:
> Neil Brown wrote:
>>
>> On Wednesday July 9, shehjart@xxxxxxxxxxxxxxx wrote:
>>>
>>> Neil Brown wrote:
>>>>
>>>> So what exactly is this new export option that you want to add?
>>>
>>> As the option's name suggests, the idea is to use fallocate support in
>>> ext4 and XFS, to pre-allocate disk blocks. I feel this might help nfsd sync
>>> writes where each write request has to go to disk almost ASAP. Because NFSv3
>>> writes have to be stable(..not sure about NFSv4..), the write-to-disk and
>>> block allocation must happen immediately. It is possible that the blocks
>>> being allocated for each NFS sync write are not as contiguous as they could
>>> be for say, local buffered writes.
>>> I am hoping that by using some form of adaptive pre-allocation we can
>>> improve the contiguity of disk blocks for nfsd writes.
>>>
>>
>> NFSv3 writes do not have to be stable. The client will usually
>> request DATA_UNSTABLE, and then send a COMMIT a while later. This
>> should give the filesystem time to do delayed allocation.
>> NFSv4 is much the same.
>> NFSv2 does require stable writes, but it should not be used by anyone
>> interested in good write performance on large files.
>>
>> It isn't clear to me that this is something that should be an option
>> in /etc/exports.
>
> For now, I only need this option so I dont have to rebuild the kernel each
> time I want to toggle the "prealloc" option.
>
>> When would a sysadmin want to turn it off? Or if a sysadmin did want
>> control, sure the level of control required would be the size of the
>> preallocation.
>
> It might be a good idea to turn it off if the block allocation algorithm
> slows things down when allocating large number of blocks.
>
> True. If needed, we should be able to add entries in /proc that control min,
> max and other limits on preallocation size.

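For illustration only, here is a rough userspace sketch of what "adaptive pre-allocation with min and max limits" could look like. It is not the proposed nfsd patch. The function names, the doubling policy, and the limit values are all hypothetical. It uses Python's os.posix_fallocate(), which, unlike fallocate(2) with FALLOC_FL_KEEP_SIZE, also extends the visible file size, so it only approximates what a kernel-side implementation would do.

```python
import os


def prealloc_window(write_end, reserved_end, min_bytes, max_bytes):
    """Pick the byte range to preallocate after a write (hypothetical policy).

    write_end    -- offset just past the last byte the client wrote
    reserved_end -- offset up to which blocks are already reserved
    min_bytes / max_bytes -- assumed tunables, like the /proc limits
                             mentioned in the thread

    Returns (offset, length), or None when the write stayed inside
    space that is already reserved.
    """
    if write_end <= reserved_end:
        return None
    # Assumed heuristic: reserve twice the amount by which the write
    # overshot the reserved region, clamped to [min_bytes, max_bytes].
    overshoot = write_end - reserved_end
    length = max(min_bytes, min(2 * overshoot, max_bytes))
    return reserved_end, length


def preallocate(fd, offset, length):
    """Ask the file system to allocate blocks for [offset, offset + length).

    Note: posix_fallocate() grows the file size if the range extends past
    EOF; a kernel implementation would more likely keep the size unchanged.
    """
    os.posix_fallocate(fd, offset, length)
```

With limits of 64 KiB and 1 MiB, a 4 KiB write into an empty file would reserve the 64 KiB minimum, while a 10 MiB write would be capped at 1 MiB.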
Usually options specific to a particular physical file system are
handled with mount options on the server. NFS export options are used
to tune NFS-specific behavior. Couldn't you specify a mount option
that enables preallocation when mounting the file system you want to
export?

I can see having a file system callback for the NFS server that
provides a hint that "the client just extended this file and wrote a
bunch of data -- so preallocate blocks for the data, and I will commit
the data at some later point". Most file systems would make this a
no-op.

But I don't think this would help small synchronous writes... it would
improve block allocation for large writes.

--
Chuck Lever