On Mon, Dec 21, 2020 at 4:04 AM Theodore Y. Ts'o <tytso@xxxxxxx> wrote:
>
> So that implies that your experiment may not be repeatable; did you
> make sure the file system was freshly reformatted before you wrote out
> the files in the directory you are deleting? And was the directory
> written out in exactly the same way? And did you make sure all of the
> writes were flushed out to disk before you tried timing the "rm -rf"
> command? And did you make sure that there weren't any other processes
> running that might be issuing other file system operations (either
> data or metadata heavy) that might be interfering with the "rm -rf"
> operation? What kind of storage device were you using? (An SSD; a
> USB thumb drive; some kind of Cloud emulated block device?)
>

I got another machine with a faster NVMe disk. I discarded the whole
drive before partitioning it; this drive is very fast at discarding
blocks:

# time blkdiscard -f /dev/nvme0n1p1

real    0m1.356s
user    0m0.003s
sys     0m0.000s

Also, the drive is pretty big compared to the dataset size, so it's
unlikely to be fragmented:

# lsblk /dev/nvme0n1
NAME        MAJ:MIN RM  SIZE RO TYPE MOUNTPOINT
nvme0n1     259:0    0  1.7T  0 disk
└─nvme0n1p1 259:1    0  1.7T  0 part /media

# df -h /media
Filesystem      Size  Used Avail Use% Mounted on
/dev/nvme0n1p1  1.8T  1.2G  1.7T   1% /media

# du -sh /media/linux-5.10/
1.1G    /media/linux-5.10/

I'm issuing sync + sleep(10) after the extraction, so the writes
should all be flushed. Also, I repeated the test three times, with
very similar results:

# dmesg | grep EXT4-fs
[12807.847559] EXT4-fs (nvme0n1p1): mounted filesystem with ordered data mode. Opts: data=ordered,discard

# tar xf ~/linux-5.10.tar ; sync ; sleep 10
# time rm -rf linux-5.10/

real    0m1.607s
user    0m0.048s
sys     0m1.559s

# tar xf ~/linux-5.10.tar ; sync ; sleep 10
# time rm -rf linux-5.10/

real    0m1.634s
user    0m0.080s
sys     0m1.553s

# tar xf ~/linux-5.10.tar ; sync ; sleep 10
# time rm -rf linux-5.10/

real    0m1.604s
user    0m0.052s
sys     0m1.552s

# dmesg | grep EXT4-fs
[13133.953978] EXT4-fs (nvme0n1p1): mounted filesystem with writeback data mode. Opts: data=writeback,discard

# tar xf ~/linux-5.10.tar ; sync ; sleep 10
# time rm -rf linux-5.10/

real    1m29.443s
user    0m0.073s
sys     0m2.520s

# tar xf ~/linux-5.10.tar ; sync ; sleep 10
# time rm -rf linux-5.10/

real    1m29.409s
user    0m0.081s
sys     0m2.518s

# tar xf ~/linux-5.10.tar ; sync ; sleep 10
# time rm -rf linux-5.10/

real    1m19.283s
user    0m0.068s
sys     0m2.505s

> Note that benchmarking file system operations is *hard*. When I
> worked with a graduate student on a paper describing a prototype of a
> file system enhancement to optimize ext4 for drive-managed SMR
> drives[1], the graduate student spent *way* more time getting
> reliable, repeatable benchmarks than making changes to ext4 for the
> prototype. (It turns out the SMR GC operations caused variations in
> write speeds, which meant the writeback throughput measurements would
> fluctuate wildly, which then influenced the writeback cache ratio,
> which in turn massively influenced how aggressively the writeback
> threads would behave, which in turn massively influenced the filebench
> and postmark numbers.)
>
> [1] https://www.usenix.org/conference/fast17/technical-sessions/presentation/aghayev
>

Interesting!

Cheers,

--
per aspera ad upstream
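
P.S. In case it helps anyone reproduce the comparison, below is a
minimal sketch of how the runs above could be scripted. The device,
mount point and tarball path are just the ones from this box, and
reformatting with mkfs.ext4 between the two journal modes is an
assumption I'm adding so each mode starts from a fresh filesystem; it
is not necessarily how the numbers above were produced.

#!/bin/bash
# Compare "rm -rf" of a kernel tree under data=ordered vs
# data=writeback, both mounted with -o discard, three runs each.
# Needs root; DEV/MNT/TARBALL are placeholders from the session above.
DEV=/dev/nvme0n1p1
MNT=/media
TARBALL=$HOME/linux-5.10.tar

for mode in ordered writeback; do
    umount "$MNT" 2>/dev/null
    mkfs.ext4 -F "$DEV"                       # fresh filesystem per mode (assumption)
    mount -o data=$mode,discard "$DEV" "$MNT"
    cd "$MNT"
    for run in 1 2 3; do
        tar xf "$TARBALL"
        sync
        sleep 10                              # let writeback settle before timing
        echo "data=$mode, run $run:"
        time rm -rf linux-5.10/
    done
    cd /
done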