On Wed, Oct 16, 2024 at 04:02:40PM +0800, Baokun Li wrote:
> As server clusters get larger and larger, server maintenance becomes very
> difficult. Therefore, timely detection of problems (i.e. online scanning,
> similar to e2fsck -fn) and timely and non-stop fixing of problems (i.e.
> online fsck, similar to e2fsck -a) have always been the requirements of
> our customers. Thus online fsck has been on our TODO list, and it's really
> time to start doing it. 😀

As far as online scanning is concerned, if you are using LVM, we can
use a combination of dm-snapshot and e2fsck -fn --- that is what the
e2scrub command automates.

Online fsck is much harder, since it would require back pointers to
do this efficiently. A general way of solving this would involve
some kind of generalized b-tree or b+tree where changes are managed
via jbd2. This could be used so that (for example) if we had a tree
which maps block ranges to an inode number, then given a block number,
we can figure out which inode "owns" that block. The harder part is
those objects that have multiple forward pointers --- for example an
inode might have multiple hard links to multiple directories, so we
need to handle this somehow.

If we had the jbd2-aware b+tree, we could also use this to add support
for reflink/clone, which would also be cool.

If this is something that your team really wants to work on, what I'd
suggest is to create a rough design of what the journaled b+tree would
look like, and then implement it first, since this is the prerequisite
for a huge number of advanced file system features. Implementation
should be done in a way that makes it easy for the code to be usable
both in the kernel and in e2fsprogs, since life will be much easier if
we have e2fsck and debugfs support for the new file system data
structures from the very beginning of the development.

If your company is willing to invest in the engineering effort to do
this, great!
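
To make the "block ranges to inode number" idea a bit more concrete,
here is a rough sketch of the lookup such a tree would have to answer.
It is only an illustration: it uses a sorted in-memory list rather
than a b+tree, nothing in it is journaled via jbd2, and the names
(BlockOwnerMap, add_extent, owner) are hypothetical and do not exist
in ext4 or e2fsprogs. It also assumes each block has a single owner,
so it does not address the multiple-forward-pointer case (hard links).

```python
import bisect


class BlockOwnerMap:
    """Toy reverse map: disjoint block ranges -> owning inode number."""

    def __init__(self):
        self._starts = []   # sorted first block of each range
        self._ranges = []   # parallel list of (start, length, inode)

    def add_extent(self, start, length, inode):
        """Record that blocks [start, start + length) belong to inode."""
        if length <= 0:
            raise ValueError("length must be positive")
        i = bisect.bisect_left(self._starts, start)
        # Reject overlap with the previous and the next range.
        if i > 0:
            p_start, p_len, _ = self._ranges[i - 1]
            if p_start + p_len > start:
                raise ValueError("overlaps an existing range")
        if i < len(self._starts) and start + length > self._starts[i]:
            raise ValueError("overlaps an existing range")
        self._starts.insert(i, start)
        self._ranges.insert(i, (start, length, inode))

    def owner(self, block):
        """Return the inode owning block, or None if it is unmapped."""
        # Find the last range whose start is <= block.
        i = bisect.bisect_right(self._starts, block) - 1
        if i < 0:
            return None
        start, length, inode = self._ranges[i]
        return inode if block < start + length else None


rmap = BlockOwnerMap()
rmap.add_extent(1000, 8, inode=12)
rmap.add_extent(2048, 16, inode=34)
print(rmap.owner(1003))   # 12
print(rmap.owner(1500))   # None
```

The point of the sketch is just the query shape: given a block number,
find the range that covers it and report the owner. A real
implementation would need the same lookup on an on-disk tree whose
updates go through the journal.
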
But I have to point out that an alternative approach that you should
consider is whether XFS might be a closer match for some of your
customers' needs.

The advantage of ext4 is that it is much simpler and easier to
understand than XFS. But as we add these new features, ext4 will get
more complex. And so one of the design considerations we should keep
in mind is to keep ext4 as simple and maintainable as possible, even as
we add new functionality.

Cheers,

                                        - Ted