Here's a completely different approach to ensuring inode uniqueness.
This one was inspired by a suggestion by Al Viro. I'll refer to my
earlier email for a description of the problem...

We already have what could be considered a unique number for each
inode -- the inode pointer address. The problem is converting that into
an i_ino value. With this patch, when new_inode is called, we pretend
that all of kernel memory is one huge array of inode structs, and
determine what the position of this inode would be in that array. We
then take that value and mask off everything above the low 32 bits.
Obviously this is a much cheaper operation than keeping track of what's
been allocated.

Since we're masking off the high bits, we have a chance for collisions
when those bits become significant. On my x86_64 FC6 machine, an inode
struct is 720 bytes according to slabinfo. The next lowest power of two
is 512 (2^9), so we automatically get 9 bits for "free": 32 bits of
array index plus at least 9 bits of object size. So this scheme can cope
with any situation where two inode addresses are not more than 2^41
bytes (2 terabytes) apart. This calculation was done quickly, so I might
be off by one exponentially, but still I think we'd probably be OK for
the next several years with this scheme. inode structs are smaller on
32-bit boxes, but they won't have 64-bit pointers so this won't be an
issue there.

There are a couple of problems, but I think this patch should address
them too:

1) Because the slab allocator tends to reuse slab objects quickly,
i_ino values get reused quickly. The patch copes with this by removing
the initialization of i_generation from alloc_inode, and having
new_inode increment that value. This should make sure that when an
inode slab object is reused, it at least has a different i_generation
than before (barring major page allocation/release churn in the slab).
There may be callers of new_inode that assume that the i_generation
they get is 0. They'll need to be fixed with this scheme, but that
should be fairly easy.

2) This scheme would effectively leak inode addresses into userspace.
I'm not sure if that would be exploitable, but it's probably best not to
do it. The patch adds a static unsigned int that is initialized to a
random value at boot time. We'll xor the inode offset with this value.
That should allow for a unique i_ino value, but since the xor mask would
be secret, it shouldn't be possible to turn it back into an address.
There may be a more secure way to do this. I'm definitely open to
suggestions here.

Again, the patch is still a little rough, but this one shouldn't need
much work if it looks good. Comments, thoughts, suggestions appreciated.

Thanks,
Jeff

--- linux-2.6.18.noarch/fs/inode.c.ino2uint
+++ linux-2.6.18.noarch/fs/inode.c
@@ -22,6 +22,7 @@
 #include <linux/bootmem.h>
 #include <linux/inotify.h>
 #include <linux/mount.h>
+#include <linux/random.h>
 
 /*
  * This is needed for the following functions:
@@ -98,6 +99,15 @@ static DEFINE_MUTEX(iprune_mutex);
 struct inodes_stat_t inodes_stat;
 
 static kmem_cache_t * inode_cachep __read_mostly;
+static unsigned int inode_xor_mask;
+
+/* convert an inode address into an unsigned int and xor it with a random value
+ * determined at boot time */
+static inline unsigned int inode_to_uint (struct inode *inode)
+{
+	return ((((unsigned long) (inode - (struct inode *) 0))
+			^ inode_xor_mask) & 0xffffffff);
+}
 
 static struct inode *alloc_inode(struct super_block *sb)
 {
@@ -125,7 +135,6 @@ static struct inode *alloc_inode(struct
 		inode->i_size = 0;
 		inode->i_blocks = 0;
 		inode->i_bytes = 0;
-		inode->i_generation = 0;
 #ifdef CONFIG_QUOTA
 		memset(&inode->i_dquot, 0, sizeof(inode->i_dquot));
 #endif
@@ -546,7 +555,6 @@ repeat:
  */
 struct inode *new_inode(struct super_block *sb)
 {
-	static unsigned long last_ino;
 	struct inode * inode;
 
 	spin_lock_prefetch(&inode_lock);
@@ -557,7 +565,8 @@ struct inode *new_inode(struct super_blo
 		inodes_stat.nr_inodes++;
 		list_add(&inode->i_list, &inode_in_use);
 		list_add(&inode->i_sb_list, &sb->s_inodes);
-		inode->i_ino = ++last_ino;
+		inode->i_ino = inode_to_uint(inode);
+		inode->i_generation++;
 		inode->i_state = 0;
 		spin_unlock(&inode_lock);
 	}
@@ -1393,6 +1402,9 @@ void __init inode_init(unsigned long mem
 
 	for (loop = 0; loop < (1 << i_hash_shift); loop++)
 		INIT_HLIST_HEAD(&inode_hashtable[loop]);
+
+	/* initialize the xor mask for unique inode generation */
+	get_random_bytes(&inode_xor_mask, sizeof(inode_xor_mask));
 }
 
 void init_special_inode(struct inode *inode, umode_t mode, dev_t rdev)
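
For anyone who wants to poke at the arithmetic outside the kernel, here
is a rough userspace sketch of the mapping. It is only a model, not
kernel code: addr_to_ino, toy_inode, toy_new_inode and INODE_SIZE are
made-up names, addresses are passed in as plain 64-bit integers, and the
object size is assumed to be the 720 bytes that slabinfo reported on my
machine.

```c
#include <stdint.h>

/* Assumption: object size taken from the 720-byte slabinfo figure. */
#define INODE_SIZE 720u

/*
 * Hypothetical userspace model of inode_to_uint(): treat the address as
 * an index into an imaginary array of INODE_SIZE-byte objects starting
 * at address 0, xor it with the boot-time mask, keep the low 32 bits.
 */
static inline uint32_t addr_to_ino(uint64_t addr, uint32_t xor_mask)
{
	/* models the pointer subtraction (inode - (struct inode *) 0) */
	uint64_t slot = addr / INODE_SIZE;

	return (uint32_t)((slot ^ xor_mask) & 0xffffffffu);
}

/* Hypothetical stand-in for the two struct inode fields the patch touches. */
struct toy_inode {
	uint32_t i_ino;
	uint32_t i_generation;
};

/*
 * Models the new_inode() change: i_ino is derived from the object's
 * address, and i_generation is bumped rather than reset, so a reused
 * slab object gets the same i_ino but a different generation.
 */
static inline void toy_new_inode(struct toy_inode *obj, uint64_t addr,
				 uint32_t xor_mask)
{
	obj->i_ino = addr_to_ino(addr, xor_mask);
	obj->i_generation++;
}
```

With this model, two objects 720 bytes apart or 2^41 bytes apart map to
different values, while two objects exactly 720 * 2^32 bytes apart land
on the same one, which is the collision case described above. Calling
toy_new_inode twice with the same address gives the same i_ino with
i_generation going from 1 to 2.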