On Fri, Dec 09, 2016 at 01:28:05PM -0700, Andreas Dilger wrote:
> On Dec 8, 2016, at 6:25 PM, Dave Chinner <david@xxxxxxxxxxxxx> wrote:
> >
> > On Wed, Dec 07, 2016 at 07:34:17PM +0100, Sven Joachim wrote:
> >> On 2016-12-07 11:16 -0700, Andreas Dilger wrote:
> >>
> >>> Add debian-dpkg mailing list to CC.
> >>>
> >>> On Dec 7, 2016, at 10:58 AM, Andreas Dilger <adilger@xxxxxxxxx> wrote:
> >>>>
> >>>> On Dec 7, 2016, at 2:52 AM, Renaud Mariana <rmariana@xxxxxxxxxx> wrote:
> >>>>>
> >>>>> Here are my answers, hope it will help solve this issue, thanks.
> >>>>>
> >>>>> Recap:
> >>>>> dpkg kibana on ext4 over a nbd device takes 10 minutes
> >>>>> with xfs it's only 30s.
> >>>>> with ext4 no extents only 30s.
> >>>>>
> >>>>>
> >>>>> kernels :
> >>>>> 4.5.7 has this issue as older kernel like 4.4.34
> >>>>> The issue is also when nbd client & server run on same host
> >>>>>
> >>>>>
> >>>>> How small are the files?
> >>>>> here is the histogram of file sizes : http://pasteboard.co/6HC3nKyk2.png
> >>>>> We can see 5000 files around 512 Bytes.
> >>>>
> >>>> Definitely there is no value to use fallocate for 512-byte files, or any
> >>>> of the files that can be written in a single write() syscall. I'd expect
> >>>> any reasonable tool to be using a write buffer of at least 2-4MB these
> >>>> days to get good performance, so writes below the buffer size shouldn't
> >>>> use fallocate() at all.
> >>
> >> It should be noted that the latest dpkg (1.18.15) only uses fallocate
> >> for files which are at least 16 KiB in size[1], so it would be nice if
> >> Renaud could recheck with that version, or cherry-pick the patch into
> >> whatever version he uses.
> >
> > The fallocate() call should be removed completely. Applications
> > should not be attempting to control file allocation like this as it
> > defeats all the optimisations that filesystems use to optimise IO
> > patterns and minimise filesystem fragmentation (e.g. delayed
> > allocation).
> >
> > There is /rarely/ a need for applications to use fallocate() to
> > manage fragmentation - especially as excessive use of fallocate()
> > actively harms performance and accelerates filesystem aging.
> >
> > Unless an application has a specific, repeatable performance problem
> > due to file fragmentation, it should not be using fallocate() to
> > allocate file space.
>
> I'm not sure I'd go so far as to say that fallocate() should be removed
> completely. Isn't that the best (only) way for an application to tell
> the filesystem that it is about to write a file of X size

That's most definitely not what preallocation is for. Filesystems
optimise the "growing file via sequential writes at EOF" case just
fine - using fallocate for this sort of thing simply defeats all
the writeback optimisations and improvements we've developed over
the past 20 years for this /very common/ workload...

> and try to
> find a suitable amount of free space for it?

fallocate() does give a guarantee that a subsequent write won't
ENOSPC, but "suitable" is very dependent on context. This context is
something applications don't have - they have no idea what
allocation optimisations are required to provide fast, efficient IO,
and have no idea that different filesystems will require /different
optimisations/. e.g. btrfs will probably also suffer horribly under
fallocate usage like what dpkg is doing, and I can tell you for
certain it will make a mess of XFS filesystems, too....

> Otherwise, if the file
> is large and/or written slowly and/or the system has memory pressure
> the filesystem (even with delalloc) can't make a good decision about
> allocation.

None of which are the case for dpkg. Nor is it the case for /most
applications/. And fallocate actually makes memory pressure
problems worse, because it defeats writeback optimisations to
maximise dirty page cleaning rates...

Preallocation is *not a general purpose tool*.
It's for applications that have performance problems caused by
known, repeatable fragmentation or file layout issues.

> However, fallocate() won't really help if the file size
> is small (e.g. a few MB) since that can easily fit into RAM and will
> be written to disk in a single chunk.

In my experience, the list of "where fallocate is harmful" is quite
a bit larger than the list of "where fallocate is beneficial". This
is just one example of where it's harmful.

-Dave.
--
Dave Chinner
david@xxxxxxxxxxxxx
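
For illustration only, the two policies argued over in this thread can
be sketched in a few lines of Python. This is not dpkg's code: the
names write_file and PREALLOC_THRESHOLD are hypothetical, and the
16 KiB value simply mirrors the figure quoted above for dpkg 1.18.15.
The default path is a plain sequential write that leaves allocation to
the filesystem; preallocation is opt-in and skipped for small files.

```python
import os

# Hypothetical threshold, mirroring the 16 KiB figure quoted for dpkg 1.18.15.
PREALLOC_THRESHOLD = 16 * 1024


def write_file(path, data, preallocate=False):
    """Write data to path sequentially; optionally preallocate large files.

    Returns True only if preallocation was requested, applicable, and
    succeeded. By default no preallocation is attempted, so the
    filesystem is free to use delayed allocation.
    """
    did_prealloc = False
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
    try:
        if (preallocate
                and len(data) >= PREALLOC_THRESHOLD
                and hasattr(os, "posix_fallocate")):
            try:
                # Reserve space up front so later writes cannot hit ENOSPC.
                os.posix_fallocate(fd, 0, len(data))
                did_prealloc = True
            except OSError:
                # Preallocation is only a hint here; fall back to plain writes.
                pass
        # Sequential writes at EOF, handling short writes.
        view = memoryview(data)
        while len(view) > 0:
            written = os.write(fd, view)
            view = view[written:]
    finally:
        os.close(fd)
    return did_prealloc
```

Under this sketch, write_file(path, data) matches the "just write the
file" position, while write_file(path, data, preallocate=True)
corresponds to the size-gated behaviour described for newer dpkg.
Whether the opt-in path ever helps depends on the workload and the
filesystem, which is the point under debate above.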