Hi All,

So I've used debugfs to manually shrink my 64TB (corrupt) filesystem by one block group in order to be able to use resize2fs. As I understand it, the process basically works as follows (either in a single pass, or even on a per-block-group basis, and in theory for part of a block group):

* Identify blocks that are in use that will no longer be available.
* Identify inodes that are in use that will no longer be available.
* Perform an inode scan over the whole filesystem, doing the following:
  - re-allocate any extents (blocks) to new locations (affects only a single inode).
  - find links to inodes that won't be available any more.
* For each no-longer-available inode, re-allocate a new inode and update all references to it.
* Update the superblock to indicate the updated filesystem size.

My issue is that I'm busy shrinking 64TB-128MB down to 56TB, and it's been in excess of 72 hours now.

Using debugfs (git master + previously posted custom patch), a check for in-use blocks (testb block count) takes almost 11 minutes, but most of this time is spent opening the filesystem and the actual check takes a few seconds. I can't imagine that testi is much more complicated than this, and checking a few hundred inodes should also take seconds: there is a bitmap indicating use, and while testi takes a filespec and can only test a single inode, a variant that takes inode numbers and uses the bitmaps should be possible, so this too should be a matter of seconds.
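Roughly what I have in mind, as an untested sketch using libext2fs directly (the argument handling and the "first dropped block/inode" cut-off semantics are made up for illustration; this is not an existing debugfs command):

/* batch_test.c - sketch: count in-use blocks/inodes above a cut-off using
 * the on-disk bitmaps, instead of testing them one at a time via testb/testi.
 * Build (assuming the e2fsprogs development headers are installed):
 *   gcc batch_test.c -o batch_test -lext2fs -lcom_err
 */
#include <stdio.h>
#include <stdlib.h>
#include <ext2fs/ext2fs.h>
#include <et/com_err.h>

int main(int argc, char **argv)
{
	ext2_filsys fs;
	errcode_t err;
	blk64_t blk, first_dropped_block;    /* first block past the new size */
	ext2_ino_t ino, first_dropped_inode; /* first inode past the new size */
	unsigned long long busy_blocks = 0, busy_inodes = 0;

	if (argc != 4) {
		fprintf(stderr, "usage: %s device first_dropped_block first_dropped_inode\n",
			argv[0]);
		return 1;
	}
	first_dropped_block = strtoull(argv[2], NULL, 0);
	first_dropped_inode = strtoul(argv[3], NULL, 0);

	/* EXT2_FLAG_64BITS is needed for filesystems with the 64bit feature. */
	err = ext2fs_open(argv[1], EXT2_FLAG_64BITS, 0, 0, unix_io_manager, &fs);
	if (err) {
		com_err(argv[0], err, "while opening %s", argv[1]);
		return 1;
	}

	/* This is where the ~11 minutes go; everything after it is in-memory. */
	err = ext2fs_read_bitmaps(fs);
	if (err) {
		com_err(argv[0], err, "while reading bitmaps");
		ext2fs_close(fs);
		return 1;
	}

	/* Walk the block bitmap from the cut-off to the end of the filesystem. */
	for (blk = first_dropped_block; blk < ext2fs_blocks_count(fs->super); blk++)
		if (ext2fs_test_block_bitmap2(fs->block_map, blk))
			busy_blocks++;

	/* Same for the inode bitmap. */
	for (ino = first_dropped_inode; ino <= fs->super->s_inodes_count; ino++)
		if (ext2fs_test_inode_bitmap2(fs->inode_map, ino))
			busy_inodes++;

	printf("in-use blocks past cut-off: %llu\n", busy_blocks);
	printf("in-use inodes past cut-off: %llu\n", busy_inodes);

	ext2fs_close(fs);
	return 0;
}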
A full walk of the inode tree takes approximately 10-12 hours, and that is for each of icheck and ncheck. In this case we don't care about the names of in-use blocks, so both these scans can be combined, and since based on previous checks it's mostly "small reads" that are time-consuming, I guess we can assume that a combined scan will take <20 hours.

Given that worst case 8TB of data needs to be copied (statistically 7TB), and I've seen reads max out at 700MB/s+ on this system, with writes frequently reaching 450MB/s, I'm going to guess that a migration rate of 200MB/s is not completely unreasonable. Which means that 8TB worth of block migrations comes to ~42000 seconds, or just under 12 hours. So a full shrink should be approximately 32 hours in total, or about two days at 100MB/s. I'm now over 3 days. Disk writes seem to be going at a few KB/s, and CPU usage isn't high either, so I can only deduce minuscule reads + writes currently. Unfortunately I did not pass -p to the resize2fs command.

In terms of an on-line shrink (in which case I personally don't care if a single shrink takes a week), I've been wondering about the following, also based on comments from Ted regarding preventing a mounted filesystem from allocating from the high block group which I needed to drop to get back to a non-corrupted filesystem. Not sure if it's worth the effort, but still wondering about it. And seeing that a single shrink for me is now sitting at >72 hours and I'll need at least 7 more such iterations, possibly closer to 10 ... it might be worth it.

- Add code to mark the maximum block and inode numbers available for allocation. In other words, stop allocating from space that will no longer be available.
- Re-purpose the defrag code that can migrate blocks online to migrate currently in-use blocks/extents. Might just as well attempt to defrag a little whilst doing this anyway.

The tricky (read: hard, based on what I know) part will be to free the inodes, since additional links can be added at any point in time. So code may need to be added so that anything adding a link adds it to the new inode instead, and a remapping will need to be kept in-kernel for the duration of the operation. This can also result in inode numbers for files changing from a userspace perspective, which for most applications is unlikely to be a problem, but what about tools like find or tar that utilize these numbers to determine whether two files are hard-links? Or du, which uses this to report actual storage used rather than apparent size? My use-case is predominantly rsync, where inode numbers may very well also be utilized to determine hard-links (-H option).

Another big problem here is that I suspect this will affect general performance negatively even when a resize operation is not in progress.

Would love opinions. And yes, I am well aware that shrinking filesystems is not an operation that is performed frequently. In my case it's used as part of a migration to lower the number of inodes per block group.

Kind Regards,
Jaco