On Apr 21, 2015, at 4:56 AM, Jan Kara <jack@xxxxxxx> wrote:
> On Mon 20-04-15 10:35:01, Andreas Dilger wrote:
>> On Apr 20, 2015, at 6:25 AM, Jan Kara <jack@xxxxxxx> wrote:
>>> On Fri 17-04-15 17:53:03, Andreas Dilger wrote:
>>>> What do you think about making the on-disk orphan inode numbers store
>>>> 64-bit values? That would be easy to do now, and would avoid a format
>>>> change in the future if we wanted to use 64-bit inodes.
>>>>
>>>> That said, if the orphan inode is deleted after orphan recovery (see
>>>> more below) the only thing needed for compatibility is to store the
>>>> inode number size into the orphan inode somewhere so it could be
>>>> changed. Maybe i_version and/or i_generation since they are not
>>>> directly user accessible.
>>>
>>> So orphan entry is cleared once inode isn't orphan anymore. So a clean
>>> filesystem currently has completely zeroed out orphan file. Switching to
>>> 64-bit inode numbers would be trivial then and you can just pick the
>>> format of the orphan file based on the 64BIT_INODE incompat feature
>>> we'll have to have in sb anyway. So I don't think we need to do
>>> anything in that regard now.
>>
>> But if someone wants to enable 64BIT_INODE then they need to set this
>> flag on the superblock, and it would confuse the kernel into thinking
>> that the orphan inode has 64-bit inode numbers, when it still only has
>> 32-bit inodes.
>
> So I'm a bit confused. When you set 64BIT_INODE flag, you still need to
> walk over all the directory structure and convert all the directories.
> Also you presumably enforce the filesystem is clean. At that point the
> orphan file is full of zeros so when you mount the fs, kernel will just
> start looking at those zeros as 64-bit numbers which is fine. When we have
> inode number size also stored within the orphan file, we have to
> explicitly convert it.

The dir_data feature allows storing extra data for each dirent separately.
That would allow enabling 64-bit inodes individually as needed, without
the need to convert the whole filesystem at once, or the need to store the
64-bit value for 32-bit inode numbers.

>> It seems safer to store the inode number size with the orphan inode.
>> One option is to put it in the low byte of the proposed per-block magic,
>> so if the inode number size changes the magic will change as well.
>
> So I don't really mind having inode number as a part of magic but I'm
> just wondering about the advantage...

Whether the filesystem needs to be clean or not when 64BIT_INODE is turned
on is a separate issue that could be decided when that feature is added.

Making the last byte of the magic number "4" today is easily done and
can be handled in ext4_inode_per_orphan_block() as easily as using
"sizeof(u32)" (it would probably be better to change that function to take
"struct inode" as the argument instead of "struct super_block"). This gives
us flexibility in the future for little effort today.

Cheers, Andreas
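
For illustration only, here is a rough userspace sketch of the magic-byte
idea described above. It is not taken from the patch under discussion: the
magic value, the tail layout (assumed to hold nothing but the magic), and
all names (ORPHAN_MAGIC_BASE, orphan_block_tail, orphan_magic(),
orphan_ino_size(), inodes_per_orphan_block()) are hypothetical. The only
point is that the low byte of the per-block magic can carry the inode
number size, and the per-block entry count can be derived from it instead
of from a hard-coded sizeof(u32).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical magic: the upper 24 bits identify an orphan block, the low
 * byte holds the size in bytes of each stored inode number. The value is
 * made up for this sketch.
 */
#define ORPHAN_MAGIC_BASE 0x0b10ca00u

/* Hypothetical block tail; assumed here to contain only the magic. */
struct orphan_block_tail {
	uint32_t ob_magic;
};

/* Build the magic for a given inode number size (4 today, 8 later). */
static uint32_t orphan_magic(unsigned ino_size)
{
	return ORPHAN_MAGIC_BASE | (ino_size & 0xffu);
}

/*
 * Return the inode number size encoded in the magic, or 0 if the magic
 * is not recognized or the size is not one this sketch supports.
 */
static unsigned orphan_ino_size(uint32_t magic)
{
	unsigned size = magic & 0xffu;

	if ((magic & ~0xffu) != ORPHAN_MAGIC_BASE)
		return 0;
	return (size == 4 || size == 8) ? size : 0;
}

/* Number of inode slots in one orphan block, derived from the magic. */
static size_t inodes_per_orphan_block(size_t block_size, uint32_t magic)
{
	unsigned size = orphan_ino_size(magic);

	assert(block_size > sizeof(struct orphan_block_tail));
	if (size == 0)
		return 0;
	return (block_size - sizeof(struct orphan_block_tail)) / size;
}
```

Under these assumptions a 4096-byte block holds 1023 entries with 4-byte
inode numbers and 511 entries with 8-byte inode numbers, and a block
written with one size is never misread as the other because the magic
itself differs.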