Hi,

On upgrading from CentOS 7.6 to CentOS 8.2, mkfs slowed down by orders of magnitude.

E.g. a 35GB partition went from under 8s to 4m+ on the same host.

Most of the time is spent writing the journal to the disk.

We have an strace (quoted below) which shows that each block is zeroed with fallocate, and each invocation of fallocate takes about 10ms; this accumulates, of course.

We have found that setting UNIX_IO_NOZEROOUT=1 to affect libext2fs brings the timings back down to seconds.

If this is not a known bug, I can send more details.

It looks like calling fallocate for each block is very inefficient on some systems.
In our case this is a Dell R640 (Skylake) with a mechanical disk.

Kind Regards,

Maciej

On Mon, 10 Aug 2020 at 13:35, Maciej Jablonski <mafjmafj@xxxxxxxxx> wrote:
>
> Hi,
>
> On upgrading from centos 7.6 to centos 8.2 mkfs slowed down by orders of magnitude.
>
> e.g. 35GB partition from under 8s to 4m+ on the same host.
>
> Most time is spent on writing the journal to the disk.
>
> strace shows the following:
>
> 16:19:49.827056 prctl(PR_GET_DUMPABLE) = 1 (SUID_DUMP_USER)
> 16:19:49.827112 fallocate(3, FALLOC_FL_ZERO_RANGE, 3383296, 4096) = 0
> 16:19:49.835203 pwrite64(3, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 4096, 3362816) = 4096
> 16:19:49.835321 getuid() = 0
> 16:19:49.835403 geteuid() = 0
> 16:19:49.835463 getgid() = 0
> 16:19:49.835513 getegid() = 0
> 16:19:49.835582 prctl(PR_GET_DUMPABLE) = 1 (SUID_DUMP_USER)
> 16:19:49.835657 fallocate(3, FALLOC_FL_ZERO_RANGE, 3387392, 4096) = 0
> 16:19:49.843471 pwrite64(3, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 4096, 3366912) = 4096
> 16:19:49.843562 getuid() = 0
> 16:19:49.843619 geteuid() = 0
> 16:19:49.843669 getgid() = 0
> 16:19:49.843715 getegid() = 0
> 16:19:49.843785 prctl(PR_GET_DUMPABLE) = 1 (SUID_DUMP_USER)
> 16:19:49.843836 fallocate(3, FALLOC_FL_ZERO_RANGE, 3391488, 4096) = 0
> 16:19:49.851885 pwrite64(3, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 4096, 3371008) = 4096
>
>
> Each invocation of fallocate takes 10ms, this accumulates of course.
> We have found that using
>
> UNIX_IO_NOZEROOUT=1 to affect libext2fs
>
> Brings the timings back in line down to seconds.
>
> If this is not a known bug I can send more details,
>
> Looks that calling fallocate for each block is very inefficient on some system.
> In our case this is dellr640 (skylake) with a mechanical disk.
>
> Kind Regards,
>
> Maciej
>
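
For anyone who wants to check the per-call cost against the trace above, here is a minimal Python sketch. It is not part of e2fsprogs; the function names and the block count used for the projection are made up for illustration. It takes the three fallocate/pwrite64 pairs from the trace (write buffers abbreviated to "...") and measures the gap from each fallocate timestamp to the next syscall. Since these timestamps mark the start of each call, the gap is only an upper bound on the time spent in fallocate. On this excerpt the gaps come out at roughly 8ms, in the same range as the 10ms figure quoted above.

```python
import re
from datetime import datetime

# fallocate/pwrite64 pairs copied from the strace output quoted above
# (the zero-filled pwrite64 buffers are abbreviated to "...").
STRACE_EXCERPT = """\
16:19:49.827112 fallocate(3, FALLOC_FL_ZERO_RANGE, 3383296, 4096) = 0
16:19:49.835203 pwrite64(3, ..., 4096, 3362816) = 4096
16:19:49.835657 fallocate(3, FALLOC_FL_ZERO_RANGE, 3387392, 4096) = 0
16:19:49.843471 pwrite64(3, ..., 4096, 3366912) = 4096
16:19:49.843836 fallocate(3, FALLOC_FL_ZERO_RANGE, 3391488, 4096) = 0
16:19:49.851885 pwrite64(3, ..., 4096, 3371008) = 4096
"""

# Matches "HH:MM:SS.ffffff syscall(" at the start of a line.
LINE_RE = re.compile(r"^(\d{2}:\d{2}:\d{2}\.\d{6}) (\w+)\(")


def parse_events(excerpt):
    """Return a list of (timestamp, syscall_name) tuples."""
    events = []
    for line in excerpt.splitlines():
        match = LINE_RE.match(line.strip())
        if match:
            ts = datetime.strptime(match.group(1), "%H:%M:%S.%f")
            events.append((ts, match.group(2)))
    return events


def fallocate_gaps_ms(events):
    """Gap in milliseconds between each fallocate and the following syscall."""
    gaps = []
    for (t0, name), (t1, _next_name) in zip(events, events[1:]):
        if name == "fallocate":
            gaps.append((t1 - t0).total_seconds() * 1000.0)
    return gaps


def projected_seconds(mean_gap_ms, n_blocks):
    """Total time if every block costs one fallocate of mean_gap_ms."""
    return mean_gap_ms * n_blocks / 1000.0


if __name__ == "__main__":
    gaps = fallocate_gaps_ms(parse_events(STRACE_EXCERPT))
    mean_gap = sum(gaps) / len(gaps)
    print("gaps (ms):", [round(g, 3) for g in gaps])
    print("mean gap (ms):", round(mean_gap, 3))
    # HYPOTHETICAL block count, chosen only to show how the cost adds up;
    # it is not the actual journal size on the affected system.
    n_blocks = 32768
    print("projected seconds:", round(projected_seconds(mean_gap, n_blocks), 1))
```

To run this over a full trace rather than the excerpt, read the strace log into the string passed to parse_events. The projection only shows how a per-block cost of a few milliseconds can add up to minutes.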