Parallel FSCK Project current status

Written by harshads@ and further updated by tytso@

Background
==========

Ext4 fsck has traditionally been a single-threaded program. On large (and especially fragmented) disks, this single-threaded fsck takes a long time to complete. Fortunately, there has been some upstream work on parallelizing fsck [1]. However, that patchset is very long (around 50 patches) and it did not completely make it into e2fsck. Ted added threading support to e2fsprogs [3], which added the following features:

* The patchset made libext2fs thread-aware
* The patchset added parallel bitmap loading

However, the upstream changes added by Ted only parallelize bitmap loading. File system checking is still single-threaded. Reviewing and merging a massive patchset is extremely hard, which is why Ted suggested on the mailing list [4] that we first add multithreading support to libext2fs. This will allow us to add unit tests for the parallel libext2fs changes independently of parallel e2fsck. Once that goes in, we can rebase the rest of the patches on top of the libext2fs changes.

Saranya spent some effort cleaning up Wang Shilong's patches, and there is a working version of those patches, based on a recent version of e2fsprogs (just before fast_commit support was integrated), at [2]. However, when we looked more closely at that patchset, we found a fundamental issue: the changes to e2fsck that enable multithreaded access to the internal data structures of libext2fs make the patches extremely fragile, since they expose the internal data abstractions of libext2fs to e2fsck.

Problem Definition
==================

The top-level object holding critical information in e2fsprogs is called ext2_filsys. Every application that links against libext2fs allocates, updates, and frees this struct using the libext2fs API [5].

To make any libext2fs application thread-aware, we first need to add the ability in libext2fs to clone this structure so that multiple threads can make progress in parallel. Once all the threads finish, we also need the ability to merge these structures back. In other words, we need to add the following APIs to libext2fs:

    /* Clone fs object into dest based on flags */
    errcode_t ext2fs_clone_fs(ext2_filsys fs, ext2_filsys *dest, int flags);

    /*
     * Try to free the FS object. If this object is a clone, merge it
     * with the parent.
     */
    errcode_t ext2fs_free_fs(ext2_filsys fs);

Saranya was working on this project; the commit [6] is a work in progress that implements this design. We can either take that code and modify it, or start from scratch and use that code as a reference.

Outcome and Future Direction
============================

At the end of this project, we will have an upstream-ready patchset. Once these changes are in, the next step would be to drop some patches from Wang's original e2fsck patchset [1] and rebase the rest of the series on top of this patchset.

REFERENCES
==========

[1] Wang Shilong's original parallel e2fsck patchset:
    http://patchwork.ozlabs.org/project/linux-ext4/list/?series=169193

[2] Wang Shilong's patches rebased and cleaned up versus a relatively recent version of e2fsprogs:
    https://github.com/tytso/e2fsprogs/tree/pfsck
    git fetch https://github.com/tytso/e2fsprogs.git pfsck

[3] Patches sent by Ted that add parallel bitmap support:
    https://www.spinics.net/lists/linux-ext4/msg75716.html

[4] Ted's suggested next steps:
    http://patchwork.ozlabs.org/project/linux-ext4/patch/20201118153947.3394530-11-saranyamohan@xxxxxxxxxx/#2584340

[5] libext2fs API:
    https://github.com/tytso/e2fsprogs/blob/master/lib/ext2fs/ext2fs.h

[6] Saranya's WIP commit that adds clonefs support:
    https://github.com/srnym/e2fsprogs/commit/3007ba6c47a5caf2e2346d4eb2e05f1333663c2f