On Apr 28, 2021, at 11:33 PM, Mike Frysinger <vapier@xxxxxxxxxx> wrote:
>
> i started running e4defrag out of curiosity on some large files that i'm
> archiving long term. its results seem exceedingly optimistic and i have
> a hard time agreeing with it. am i pessimistic ?
>
> for example, i have a ~4GB archive:
> $ e4defrag -c ./foo.tar.xz
> <File>                                  now/best       size/ext
> ./foo.tar.xz
>                                          39442/2          93 KB
>
>  Total/best extents                      39442/2
>  Average size per extent                 93 KB
>  Fragmentation score                     34
>  [0-30 no problem: 31-55 a little bit fragmented: 56- needs defrag]
>  This file (./foo.tar.xz) does not need defragmentation.
>  Done.
>
> i have a real hard time seeing this file as barely "a little bit fragmented".
> shouldn't the fragmentation score be higher ?

I would tend to agree. A 4GB file with 39k extents of ~100KB each is not
great. On an HDD with 125 IOPS (not counting track buffers and such) this
would take about 300s to read at a whopping 13MB/s. On flash, small
writes do lead to increased wear, but the seeks are free and you may not
care.

IMHO, anything below 1MB/extent is sub-optimal in terms of IO
performance, and a sign of filesystem fragmentation (or a very poor IO
pattern), since mballoc should try to do allocation in 8MB chunks for
large writes.

In many respects, if the extents are large enough, the "cost" of a seek
is hidden by the device bandwidth (e.g. 250 MB/s / 125 seeks/sec = 2MB
for a good HDD today, scale linearly for RAID-5/6), so any extent larger
than this is not limited by seeks.

Should 1024 x 4MB extents in a 4GB file be considered fragmented or not?
Definitely 108KB/extent should be. However, the "ideal = 2" case is
bogus, since extents are max size 128MB, so you would need at least 32
for a perfect 4GB file.

In that respect, e4defrag is at best a "working prototype", but I don't
think many people use it, and it has not gotten many improvements since
it first landed.
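
For what it's worth, the back-of-the-envelope numbers above can be
reproduced with a few lines of Python. This is only a rough sketch under
the assumptions already stated (one seek per extent, 125 IOPS, 250 MB/s
streaming bandwidth, 128MB maximum extent size). The function names are
made up for illustration and are not part of e2fsprogs.

```python
# Rough model of seek-bound reads on a spinning disk.
# The device numbers are the illustrative figures from this mail,
# not measurements, and the helper names are hypothetical.

MB = 1000 * 1000
MiB = 1024 * 1024
GiB = 1024 * MiB


def seek_bound_read(file_bytes, extents, iops):
    """Return (seconds, bytes_per_second), assuming one seek per extent
    and ignoring transfer time, track buffers, and readahead."""
    seconds = extents / iops
    return seconds, file_bytes / seconds


def break_even_extent_bytes(bandwidth_bytes_per_s, iops):
    """Extent size at which transferring one extent takes as long as one seek."""
    return bandwidth_bytes_per_s / iops


def min_extents(file_bytes, max_extent_bytes):
    """Fewest extents a file can have given a maximum extent size."""
    return -(-file_bytes // max_extent_bytes)  # ceiling division


if __name__ == "__main__":
    secs, rate = seek_bound_read(4 * GiB, 39442, 125)
    print(f"seek-bound read: ~{secs:.0f}s at ~{rate / MB:.1f} MB/s")

    size = break_even_extent_bytes(250 * MB, 125)
    print(f"break-even extent size: ~{size / MB:.0f} MB")

    print(f"minimum extents for 4GB: {min_extents(4 * GiB, 128 * MiB)}")
```

With those inputs the read time comes out a little over 300s, the
break-even extent size is 2MB, and the minimum extent count is 32,
matching the figures above.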

If you have a better idea for a "fragmentation score" I would be open to
looking at it, doubly so if it comes in the form of a patch.

You could check the actual file layout using "filefrag -v" before/after
running e4defrag to see how the allocation was changed. This would tell
you if it is actually helping or not. I've thought for a while that it
would be useful to add the same "fragmentation score" to filefrag, but
that would be contingent on the score actually making sense.

You can also use "e2freefrag" to check the filesystem as a whole to see
whether the free space is badly fragmented (i.e. most free chunks < 8MB).
In that case, running e4defrag _may_ help you, but it is not "smart" like
the old DOS defrag utilities, since it just rewrites each file separately
instead of having a "plan" for how to defrag the whole filesystem.

> as a measure of "how fragmented is it really", if i copy the file and then
> delete the original, there's a noticeable delay before `rm` finishes.

Yes, that would be totally clear if you ran filefrag on the file first.

Cheers, Andreas